[R] How to insert filename as column in a file

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Tue Apr 24 18:21:53 CEST 2012


Programmatically dealing with large numbers of separately named objects leads to syntactically complicated code that is hard to read and maintain.

Load the data frames into a list so you can access them by numeric or named index; getting at the loaded data will then be much easier.

library(sqldf)  # provides read.csv.sql
fnames <- list.files(path = getwd())
# preallocate the list for efficiency (execution speed)
dtalist <- vector("list", length(fnames))
for (i in seq_along(fnames)) {
  dtalist[[i]] <- read.csv.sql(fnames[i], sql = "select * from file where V3 == 'XXX' and V5 == 'YYY'", header = FALSE, sep = "|", eol = "\n")
  # the first 8 characters of the file name are the yyyymmdd date
  dtalist[[i]]$date <- substr(fnames[i], 1, 8)
}
names(dtalist) <- fnames
# for a file named 20120424.csv you can now optionally refer to dtalist$`20120424.csv` or dtalist[["20120424.csv"]] if you wish.
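
Once the data frames are in a list, the "combine all the datasets" step from the original question becomes a one-liner. The following is only a minimal sketch under stated assumptions: it uses base read.table instead of read.csv.sql so that it runs without sqldf, and the demo directory, the two tiny files, and the helper name read_one are hypothetical, invented purely for illustration.

```r
# Hypothetical demo data: two small pipe-delimited files named by date.
demo_dir <- file.path(tempdir(), "date_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("1|a|XXX", "2|b|ZZZ"), file.path(demo_dir, "20120423.csv"))
writeLines(c("3|c|XXX"), file.path(demo_dir, "20120424.csv"))

fnames <- list.files(path = demo_dir, pattern = "\\.csv$")

# Read one file, keep the rows of interest, and tag them with the
# date taken from the first 8 characters of the file name.
# (read_one is an illustrative helper, not part of any package.)
read_one <- function(fname) {
  dta <- read.table(file.path(demo_dir, fname), header = FALSE, sep = "|",
                    stringsAsFactors = FALSE)
  dta <- dta[dta$V3 == "XXX", , drop = FALSE]
  # rep() keeps the assignment valid even when no rows survive the filter
  dta$date <- rep(substr(fname, 1, 8), nrow(dta))
  dta
}

dtalist <- lapply(fnames, read_one)
names(dtalist) <- fnames

# Stack all per-file data frames into a single data frame.
combined <- do.call(rbind, dtalist)
rownames(combined) <- NULL
```

Each row of combined then carries the date of the file it came from. If a real Date column is wanted rather than a string, as.Date(combined$date, format = "%Y%m%d") converts it.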
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.



Shivam <shivamsingh at gmail.com> wrote:

>Reposting in hope of a reply.
>
>On Tue, Apr 24, 2012 at 1:12 AM, Shivam <shivamsingh at gmail.com> wrote:
>
>> Thanks for the quick response. It works for an individual data frame, but I
>> have many data frames. This is the code so far:
>>
>> fnames = list.files(path = getwd())
>> for (i in 1:length(fnames)){
>> assign(paste("file",i,sep=""),read.csv.sql(fnames[i], sql = "select * from file where V3 == 'XXX' and V5=='YYY'",header = FALSE, sep= '|', eol = "\n"))
>> }
>>
>> This generates data frames named file1, file2, ..., file250. Is there a
>> way to do something like below within the same loop?
>>
>> file1$date = substr(fnames[1],1,8)
>> file2$date = substr(fnames[2],1,8)
>> .
>> .
>> file250$date = substr(fnames[250],1,8)
>>
>> assign(paste("file",i,sep="")$date doesn't work.
>>
>> Any help?
>>
>>
>>
>>
>>
>> On Tue, Apr 24, 2012 at 12:01 AM, MacQueen, Don <macqueen1 at llnl.gov> wrote:
>>
>>> This little example might help.
>>>
>>> > foo <- data.frame(a=1:10, b=letters[1:0])
>>> > foo
>>>    a b
>>> 1   1 a
>>> 2   2 a
>>> 3   3 a
>>> 4   4 a
>>> 5   5 a
>>> 6   6 a
>>> 7   7 a
>>> 8   8 a
>>> 9   9 a
>>> 10 10 a
>>> > foo$date <- '20120423'
>>> > foo
>>>    a b     date
>>> 1   1 a 20120423
>>> 2   2 a 20120423
>>> 3   3 a 20120423
>>> 4   4 a 20120423
>>> 5   5 a 20120423
>>> 6   6 a 20120423
>>> 7   7 a 20120423
>>> 8   8 a 20120423
>>> 9   9 a 20120423
>>> 10 10 a 20120423
>>>
>>>
>>> In other words, immediately after reading the data into a data frame, add
>>> a date column as in the example. You'll have to extract the date from the
>>> filename, of course.
>>>
>>> -Don
>>>
>>>
>>> --
>>> Don MacQueen
>>>
>>> Lawrence Livermore National Laboratory
>>> 7000 East Ave., L-627
>>> Livermore, CA 94550
>>> 925-423-1062
>>>
>>>
>>>
>>>
>>>
>>> On 4/23/12 9:29 AM, "Shivam" <shivamsingh at gmail.com> wrote:
>>>
>>> >Hi,
>>> >
>>> >I am relatively new to R. I have scoured the help files and the web but
>>> >haven't been able to find a solution.
>>> >
>>> >I have around 250 csv files, one file for each date. They have columns of
>>> >all types, numeric, string etc. The name of each file is the date in the
>>> >form of 'yyyymmdd'. There is no column within the file which helps me
>>> >identify the date on which the file was generated, only the filename has
>>> >that info.
>>> >
>>> >I am selecting some data (using read.csv.sql) from each file and creating
>>> >a dataset for each day. Ultimately I will combine all the datasets. I can
>>> >accomplish the select and combine part, but after combining I won't have a
>>> >record as to the date corresponding to the data.
>>> >
>>> >Hence I want to insert the filename as a column in the respective file to
>>> >help me identify which date each data row belongs to.
>>> >
>>> >Sorry for the long mail, but wanted to make myself clear. Any help would
>>> >be greatly appreciated.
>>> >
>>> >Thanks in advance,
>>> >Shivam
>>> >
>>> >       [[alternative HTML version deleted]]
>>> >
>>> >______________________________________________
>>> >R-help at r-project.org mailing list
>>> >https://stat.ethz.ch/mailman/listinfo/r-help
>>> >PLEASE do read the posting guide
>>> >http://www.R-project.org/posting-guide.html
>>> >and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>> --
>> *Victoria Concordia Crescit*
>>
>
>
>
>-- 
>*Victoria Concordia Crescit*
>


