[R] column names with rbind loop

Vining, Kelly Kelly.Vining at oregonstate.edu
Tue Aug 30 21:46:02 CEST 2011


Thanks much for your help! This almost works. However, now I am getting the following error:

> for(i in all.files) {
+ if (i==all.files[1]) new.data <- read.table(i,header=TRUE) else {
+ new.data <- rbind(new.data, read.table(i))}}
Error in match.names(clabs, names(xi)) : 
  names do not match previous names

I am wondering if this is because R adds row numbers as a numerical column to the table of the first file it reads?
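
A likely explanation, offered as a sketch rather than a confirmed diagnosis: the row numbers R prints are row names, not an extra column. The mismatch more plausibly comes from the second read.table(i) call, which has no header=TRUE, so its columns are named V1, V2, ... while the first table carries the real header names. The example below uses two small made-up files in a temporary directory (demo.dir and the file contents are hypothetical stand-ins for the .rpkm files) to reproduce the error and then runs the same loop with header=TRUE in both branches.

```r
# Hypothetical stand-in files; the real inputs are the *.rpkm files.
demo.dir <- tempfile("rpkm_demo")
dir.create(demo.dir)
writeLines(c("seq_id start end", "scaffold_1 12639 13384"),
           file.path(demo.dir, "1.rpkm"))
writeLines(c("seq_id start end", "scaffold_2 22190 22516"),
           file.path(demo.dir, "2.rpkm"))
all.files <- list.files(demo.dir, full.names = TRUE)

first  <- read.table(all.files[1], header = TRUE)  # names taken from the header line
second <- read.table(all.files[2])                 # no header=TRUE: default V* names

names(first)
names(second)
ncol(first)   # row numbers are row names, not an additional column

# rbind() on data frames matches columns by name, so mismatched names stop it.
res <- try(rbind(first, second), silent = TRUE)
inherits(res, "try-error")

# The same loop, with header=TRUE in both branches.
for (i in all.files) {
  if (i == all.files[1]) new.data <- read.table(i, header = TRUE) else {
    new.data <- rbind(new.data, read.table(i, header = TRUE))
  }
}
new.data
```

With header=TRUE on every read, each file's header line is consumed as column names instead of being read as a data row.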


________________________________________
From: Weidong Gu [anopheles123 at gmail.com]
Sent: Tuesday, August 30, 2011 12:00 PM
To: Vining, Kelly
Cc: r-help at r-project.org
Subject: Re: [R] column names with rbind loop

How about adding a conditional statement to get the header from the first file?

for(i in all.files) {
if (i==all.files[1]) new.data <- read.table(i,header=TRUE) else {
new.data <- rbind(new.data, read.table(i))}}
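
A loop-free variant of the same idea, sketched under the assumption that every file starts with the same header line: read each file with header=TRUE into a list, then bind the list once. The directory demo.dir and its three small files are hypothetical stand-ins for the real .rpkm files.

```r
# Hypothetical stand-in files in a fresh temporary directory.
demo.dir <- tempfile("rpkm_demo")
dir.create(demo.dir)
for (n in c("1", "2", "10")) {
  writeLines(c("seq_id start end", paste0("scaffold_", n, " 100 200")),
             file.path(demo.dir, paste0(n, ".rpkm")))
}

# Collect the file names, read each one with its header, and bind them once.
all.files <- list.files(demo.dir, pattern = "\\.rpkm$", full.names = TRUE)
tables    <- lapply(all.files, read.table, header = TRUE)
new.data  <- do.call(rbind, tables)

names(new.data)
nrow(new.data)
```

Binding once at the end also avoids growing the data frame inside the loop, which can matter when there are many files.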


Weidong Gu


On Tue, Aug 30, 2011 at 1:42 PM, Vining, Kelly
<Kelly.Vining at oregonstate.edu> wrote:
> Hello R users.
>
> This is a fairly basic question:
>
> I am concatenating data from sets of files in a directory using a loop. The column names in all files are exactly the same. My understanding is that rbind takes column names from the first file it reads. However, my output is showing that the column names are treated as a first data row, not treated as headers.
>
> I compile my file names like this:
>
>> all.files <- list.files()
>> all.files
>  [1] "1.rpkm"  "10.rpkm" "11.rpkm" "12.rpkm" "13.rpkm" "14.rpkm"
>  [7] "15.rpkm" "16.rpkm" "17.rpkm" "18.rpkm" "19.rpkm" "2.rpkm"
> [13] "3.rpkm"  "4.rpkm"  "5.rpkm"  "6.rpkm"  "7.rpkm"  "8.rpkm"
> [19] "9.rpkm"
>
> Then loop through them like this:
>> new.data <- NULL
>> for(i in all.files) {
> + in.data <- read.table(i)
> + new.data <- rbind(new.data, in.data)}
>> head(new.data)
>         V1               V2        V3     V4     V5    V6     V7
> 1     seq_id           source      type  start    end score strand
> 2 scaffold_1 Ptrichocarpav2_0 gene_body  12639  13384     .      +
> 3 scaffold_1 Ptrichocarpav2_0 gene_body  22190  22516     .      +
> 4 scaffold_1 Ptrichocarpav2_0 gene_body  74076  75893     .      +
> 5 scaffold_1 Ptrichocarpav2_0 gene_body  80207  81289     .      -
> 6 scaffold_1 Ptrichocarpav2_0 gene_body 105236 107712     .      +
>
>
> As you can see, R is putting a "V1, V2..." header row here because I didn't say "header=TRUE" in my read.table command. But if I do this within the loop, I get an error. If I try to delete the V1, V2 row after the fact by
>
> new.data <- new.data[-1,]
>
> R deletes my "real" header row.
>
> How can I get the header that I want?
>
> Thanks for any help,
> --Kelly V.
>


