[R] big data file getting truncated

Martin Maechler maechler at stat.math.ethz.ch
Wed Aug 13 09:46:25 CEST 2003


>>>>> "Dibakar" == Dibakar Ray <dibakar at hub.nic.in>
>>>>>     on Wed 13 Aug 2003 12:33:21 +0530 (IST) writes:

    Dibakar> I am very new to R. I was trying to load some
    Dibakar> publicly available Expression data in to R.

    Dibakar> I used the following commands
    Dibakar> mydata<-read.table("dataALLAMLtrain.txt", header=TRUE, sep
    Dibakar>                    ="\t",row.names=NULL)
    Dibakar> It reads data without any error

(Really? How do you know?
 It seems you are trying to check this via the following:
)
    Dibakar> Now if I use
    Dibakar> edit(mydata)
    Dibakar> It shows only 3916 entries, whereas the actual file
    Dibakar> contains 7129 entries. My data is something like

    Dibakar> Gene Description Gene Accession

    Dibakar> Number	1	2	3	4	5	6	7	8	9	10	11	12	13	14	15	16	17	18	19	20	21	22	23	24	25	26	27	34	35	36	37	38	28	29	30	31	32	33
    Dibakar> AFFX-BioB-5_at (endogenous
    Dibakar> control)	AFFX-BioB-5_at	-214	-139	-76	-135	-106	-138	-72	-413	5	-88	-165	-67	-92	-113	-107	-117	-476	-81	-44	17	-144	-247	-74	-120	-81	-112	-273	-20	7	-213	-25	-72	-4	15	-318	-32	-124	-135

(this probably has an extraneous  "wrap-around" in your post).

    Dibakar> So it seems R is truncating the data. How can I
    Dibakar> load the complete file?

edit() has had problems with large data, but only with more
than 65535 rows.

HOWEVER, using edit() after read.table() to check your data is
not recommended.
Use 	dim(mydata)
	str(mydata)
and possibly also
	names(mydata)
	summary(mydata)
	
to check whether the data frame is okay *before* you edit it
with edit().
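
As a minimal sketch of such a check (not run on your file): the
temporary file and its values below are made up for illustration, and
only the read.table() call mirrors the one in your post. It compares
the number of rows read with the number of data lines in the file, and
uses count.fields() to see how many fields each line is split into.

```r
# Hypothetical stand-in for the real file: a small tab-separated table
# written to a temporary file (all values are made up).
tmp <- tempfile(fileext = ".txt")
writeLines(c("Gene Description\tGene Accession Number\t1\t2",
             "probe A\tA_at\t-214\t-139",
             "probe B\tB_at\t-58\t-1",
             "probe C\tC_at\t88\t283"), tmp)

# Same call as in the original post, applied to the stand-in file
mydata <- read.table(tmp, header = TRUE, sep = "\t", row.names = NULL)

# Number of data lines in the file, not counting the header line
n_lines <- length(readLines(tmp)) - 1

dim(mydata)       # rows and columns actually read
str(mydata)       # column types and the first few values
names(mydata)
summary(mydata)

# TRUE when read.table() produced one row per data line
nrow(mydata) == n_lines

# Tabulate how many fields each line of the file is split into
table(count.fields(tmp, sep = "\t"))
```

If nrow(mydata) is smaller than the number of data lines, or
count.fields() reports uneven field counts, the problem is in how the
file is being parsed rather than in edit(). Quote or comment characters
inside the text columns are one common cause, though I cannot confirm
that for your file.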

Martin Maechler <maechler at stat.math.ethz.ch>	http://stat.ethz.ch/~maechler/
Seminar fuer Statistik, ETH-Zentrum  LEO C16	Leonhardstr. 27
ETH (Federal Inst. Technology)	8092 Zurich	SWITZERLAND
phone: x-41-1-632-3408		fax: ...-1228			<><



