[R] Re: Importing big plain files from ERP-System/Data Mining with R

Tue Oct 26 10:23:39 CEST 2004

Hi,
regarding R, data mining, and large databases, you may
find these resources useful:

Diego Kuonen, Introduction au data mining avec R :
vers la reconquête du `knowledge discovery in
databases' par les statisticiens. Bulletin of the
Swiss Statistical Society, 40:3-7, 2001.
http://www.statoo.com/en/publications/2001.R.SSS.40/

Diego Kuonen and Reinhard Furrer, Data mining avec R
dans un monde libre. Flash Informatique Spécial Été,
pages 45-50, September 2001.
http://sawww.epfl.ch/SIC/SA/publications/FI01/fi-sp-1/sp-1-page45.html

Brian D. Ripley, Datamining: Large Databases and
Methods, in Proceedings of "useR! 2004 - The R User
Conference", May 2004.
http://www.ci.tuwien.ac.at/Conferences/useR-2004/Keynotes/Ripley.pdf

Brian D. Ripley, Using Databases with R, R News,
January 2001, pp. 18-20.
http://cran.r-project.org/doc/Rnews/Rnews_2001-1.pdf

B. D. Ripley, R. M. Ripley, Applications of R Clients
and Servers, in Proceedings of the Distributed
Statistical Computing 2001 Workshop, 2001, Vienna
University of Technology.
http://www.ci.tuwien.ac.at/Conferences/DSC-2001/Proceedings/Ripley.pdf

Torsten Hothorn, David A. James, Brian D. Ripley, R/S
Interfaces to Databases, in Proceedings of the
Distributed Statistical Computing 2001 Workshop,
2001, Vienna University of Technology.
http://www.ci.tuwien.ac.at/Conferences/DSC-2001/Proceedings/HothornJamesRipley.pdf

Luís Torgo, Data Mining with R. Learning by case
studies, May 2003.
http://www.liacc.up.pt/~ltorgo/DataMiningWithR/
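
As a rough starting point (this is not taken from the papers
above), here is a minimal sketch of reading the pipe-delimited
export quoted below in chunks, so that the whole file never has to
fit in memory at once. The function names split_sap_line and
read_sap_export are hypothetical. The sketch assumes that the file
on disk is not line-wrapped the way it is in this mail, that the
first 5 lines are header lines with the field names in line 4, and
that "|" never occurs inside a field value.

```r
# Hypothetical helper: split one "|a|b|c|" line into trimmed fields.
# strsplit() yields an empty first element (before the leading "|")
# and drops the empty remainder after the trailing "|".
split_sap_line <- function(line) {
  parts <- strsplit(line, "|", fixed = TRUE)[[1]]
  trimws(parts[-1])
}

# Hypothetical reader: process the export chunk by chunk.
# `process` is called on each chunk (a data frame of character
# columns); its return values are collected in a list.
read_sap_export <- function(path, chunk_lines = 10000L,
                            process = identity) {
  con <- file(path, open = "r")
  on.exit(close(con))

  # Assumption: 5 header lines, field names in line 4.
  header <- readLines(con, n = 5L)
  fields <- split_sap_line(header[4])

  results <- list()
  repeat {
    lines <- readLines(con, n = chunk_lines)
    if (length(lines) == 0L) break

    # Keep data rows only; this drops the dashed rule lines.
    lines <- lines[substr(lines, 1L, 1L) == "|"]
    if (length(lines) == 0L) next

    # Assumption: every row has as many fields as the header.
    rows <- do.call(rbind, lapply(lines, split_sap_line))
    chunk <- as.data.frame(rows, stringsAsFactors = FALSE)
    names(chunk) <- fields

    results[[length(results) + 1L]] <- process(chunk)
  }
  results
}
```

With the default process = identity all chunks are kept, which
defeats the purpose for a file of several GB. In practice you would
probably pass a function that reduces each chunk (for example
function(d) table(d$PERIV)) or that appends it to a database table,
as discussed in the papers on database interfaces listed above.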

Best
Vito

You wrote:

Hi,

how can I import really big plain text data files
(several GB) from an ERP system (SAP tables) into R?
The headers of these files are always similar, for
example:

Tabelle:        T009
Angezeigte Felder:  7 von  7  Feststehende Führungsspalten: 2  Listbreite 0250
----------------------------------------------------------------------
|X|MANDT|PERIV|XKALE|XJABH|ANZBP|ANZSP|LTEXT                         |
----------------------------------------------------------------------
|X|001  |01   |X    |     |012  |02   |ABC                           |
|X|001  |V9   |     |     |012  |04   |Okt. - Sep., 4 Sonderperioden |
|X|001  |WK   |     |X    |053  |00   |Kalenderwochen                |
----------------------------------------------------------------------

(Each downloaded table starts with these 5 header rows,
and row 4 holds the field names. A single row can be
longer than 1023 bytes, there can be more than 256
fields, a file is several GB in size, and it contains
several million records.)

What is an appropriate way to read such tables in?

Greetings
Stefan

P.S. I am a beginner with R. Until now I have used ACL
(http://www.acl.com)
for data mining purposes, and I am now making my first
attempt with R.
Yes, I have
[X] Read R Data Import/Export
[X] Read Using R for Data Analysis
[X] Read Simple R
[X] Read Manuals
[X] Read read.table() and scan() command

=====
Becoming builders of solutions

"The business of the statistician is to catalyze 
the scientific learning process."  
George E. P. Box
