[R] [Rd] Scan data from a .txt file

Marc Schwartz (via MN) mschwartz at mn.rr.com
Thu Nov 17 17:24:47 CET 2005


I have a feeling that Vasu wants (mistakenly) this:

dat <- read.table("clipboard", header = FALSE)

> dat
      V1     V2     V3     V4
1   Name Weight Height Gender
2   Anne    150     65      F
3    Rob    160     68      M
4 George    180     65      M
5   Greg    205     69      M

> str(dat)
`data.frame':   5 obs. of  4 variables:
 $ V1: Factor w/ 5 levels "Anne","George",..: 4 1 5 2 3
 $ V2: Factor w/ 5 levels "150","160","180",..: 5 1 2 3 4
 $ V3: Factor w/ 4 levels "65","68","69",..: 4 1 2 1 3
 $ V4: Factor w/ 3 levels "F","Gender","M": 2 1 3 3 3

> dat$V1
[1] Name   Anne   Rob    George Greg
Levels: Anne George Greg Name Rob

> dat$V2
[1] Weight 150    160    180    205
Levels: 150 160 180 205 Weight

> dat$V3
[1] Height 65     68     65     69
Levels: 65 68 69 Height

> dat$V4
[1] Gender F      M      M      M
Levels: F Gender M


With header = FALSE, the column names end up as part of the data frame columns themselves.

Vasu, note however that all of the columns become factors, which you can
convert to character, for example:

> as.character(dat$V1)
[1] "Name"   "Anne"   "Rob"    "George" "Greg"

neither of which I suspect is what you really want.
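
To see why neither form is convenient, here is a minimal sketch that
reproduces the header = FALSE read without needing the clipboard. It
assumes a current R in which read.table() accepts a `text` argument, and
the name `txt` is made up for illustration. Recent R versions (4.0.0 and
later) return character columns by default rather than factors, so the
sketch goes through as.character() to behave the same either way.

```r
# Hypothetical inline copy of the data, standing in for the clipboard
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

# With header = FALSE the header line is read as an ordinary data row
dat <- read.table(text = txt, header = FALSE)

# The weights now share a column with the string "Weight", so the column
# cannot be numeric; go through character to get at the values
w <- as.character(dat$V2)

# Dropping the label recovers the numbers, at the cost of manual bookkeeping
weights <- as.numeric(w[-1])
```

Every numeric column would need the same treatment before any
arithmetic, which is the main argument against this layout.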


You can access the column names of the data frame using colnames():

> dat <- read.table("clipboard", header = TRUE)

> dat
    Name Weight Height Gender
1   Anne    150     65      F
2    Rob    160     68      M
3 George    180     65      M
4   Greg    205     69      M

> colnames(dat)
[1] "Name"   "Weight" "Height" "Gender"


This keeps the column names separate from the actual data, which, unless
we are missing something here, is the proper way to do this. Think of a
data frame as a rectangular data set, which can contain more than one
data type across the columns, much like a spreadsheet.  The difference
from a spreadsheet is that the first row of a data frame does not hold
the column names/labels. They are stored separately from the data itself,
which in a typical spreadsheet would start on row 2.
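
If the goal is to have a column's label and its values available
together, a sketch along the following lines keeps the data numeric. The
inline `txt` and the list element names `label` and `values` are
illustrative choices, not anything from this thread, and
read.table(text = ...) assumes a reasonably recent R.

```r
# Hypothetical inline copy of the data, standing in for the clipboard
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

dat <- read.table(text = txt, header = TRUE)

# Column names are stored alongside the data, not in row 1
cols <- colnames(dat)

# Pair a column's label with its (still numeric) values
b <- list(label = cols[2], values = dat[[2]])

b$label         # the column name
mean(b$values)  # arithmetic works because the values were never coerced to text
```

In practice dat$Weight or dat[["Weight"]] gives the same values
directly, so the extra list is only needed if the label really must
travel with the vector.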

Note, as Andy pointed out, that in this case you should use
read.table(), not scan().
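
For completeness, scan() can be made to cope if the header line is read
separately from the body. This is only a sketch of that workaround,
assuming the `text`, `nlines` and `skip` arguments of scan() as found in
current R; the variable names are made up, and read.table() remains the
simpler route.

```r
# Hypothetical inline copy of the data, standing in for table.txt
txt <- "Name Weight Height Gender
Anne 150 65 F
Rob 160 68 M
George 180 65 M
Greg 205 69 M"

# Read only the first line, as character: these are the column labels
hdr <- scan(text = txt, what = "", nlines = 1, quiet = TRUE)

# Skip the header line, then read each record as character, numeric,
# numeric, character (Gender is text, so its template is "" and not 0)
vals <- scan(text = txt, what = list("", 0, 0, ""), skip = 1, quiet = TRUE)

# Attach the labels to the list components
names(vals) <- hdr
```

After this, vals$Weight is a numeric vector and names(vals) holds the
labels, which is roughly what read.table() with header = TRUE provides in
one step.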

Review "An Introduction To R" and the "R Data Import/Export" manuals for
more information. Both are available with your installation and/or from
the main R web site under Documentation.

HTH,

Marc Schwartz


On Thu, 2005-11-17 at 10:41 -0500, Liaw, Andy wrote:
> [Re-directing to R-help, as this is more appropriate there.]
> 
> I tried copying the snippet of data into the windows clipboard and tried it:
> 
> > dat <- read.table("clipboard", header=T)
> > dat
>     Name Weight Height Gender
> 1   Anne    150     65      F
> 2    Rob    160     68      M
> 3 George    180     65      M
> 4   Greg    205     69      M
> > str(dat)
> `data.frame':   4 obs. of  4 variables:
>  $ Name  : Factor w/ 4 levels "Anne","George",..: 1 4 2 3
>  $ Weight: int  150 160 180 205
>  $ Height: int  65 68 65 69
>  $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat <- read.table("clipboard", header=T, row=1)
> > str(dat)
> `data.frame':   4 obs. of  3 variables:
>  $ Weight: int  150 160 180 205
>  $ Height: int  65 68 65 69
>  $ Gender: Factor w/ 2 levels "F","M": 1 2 2 2
> > dat
>        Weight Height Gender
> Anne      150     65      F
> Rob       160     68      M
> George    180     65      M
> Greg      205     69      M
> 
> Don't see how it "doesn't work".  Please give more detail on what "doesn't
> work" means.
> 
> Andy
> 
> > From: Vasundhara Akkineni
> > 
> > Hi all,
> > I am trying to read data from a .txt file in such a way that I can
> > access the column names too. For example, the data in the table.txt
> > file is as below:
> >  Name Weight Height Gender
> > Anne 150 65 F
> > Rob 160 68 M
> > George 180 65 M
> > Greg 205 69 M
> >  I used the following commands:
> >  data<-scan("table.txt",list("",0,0,0),sep="")
> > a<-data[[1]]
> > b<-data[[2]]
> > c<-data[[3]]
> > d<-data[[4]]
> >  But this doesn't work because of a type mismatch. I want to pull the
> > column names into the respective lists as well. For example, I want 'b'
> > to hold (weight,150,160,180,205) so that I can access the column name
> > and also the individual weights. I tried using the read.table method
> > too, but couldn't get this working. Can someone suggest a way to do this?
> > Thanks,
> > Vasu.
> >



