[R] lapply getting names of the list

Joshua Wiley jwiley.psych at gmail.com
Thu Dec 9 19:06:30 CET 2010


Hi Sashi,

On Thu, Dec 9, 2010 at 9:44 AM, Sashi Challa <challa at ohsu.edu> wrote:
> Hello All,
>
> I have a toy dataframe like this. It has 8 columns separated by tab.
>
> Name    SampleID        Al1     Al2     X       Y       R       Th
> rs191191        A1      A       B       0.999   0.09    0.78    0.090
> abc928291       A1      B       J       0.3838  0.3839  0.028   0.888
> abcnab  A1      H       K       0.3939  0.939   0.3939  0.77
> rx82922 B1      J       K       0.3838  0.393   0.393   0.00
> rcn3939 B1      M       O       0.000   0.000   0.000   0.77
> tcn39399        B1      P       I       0.393   0.393   0.393   0.56
>
> Note that the SampleID is repeating. So I want to be able to split the dataset based on the SampleID and write the split dataset of every SampleID into a new file.
> I tried split followed by lapply to do this.
>
> infile <- read.csv("test.txt", sep="\t", as.is = TRUE, header = TRUE)
> infile.split  <- split(infile, infile$SampleID)
> names(infile.split[1])  ## outputs “A1”

Correct. infile.split[1] is a one-element list, so names() on it
returns just "A1"; names(infile.split) returns the top-level names
of infile.split (i.e., the names of both data frames).

> ## now A1, B1 are two lists in infile.split as I understand it. Correct me if I am wrong.

It is a single named list containing two data frames (A1 and B1).
A data frame is itself built from a list (of columns), so I suppose
in a way it contains two lists, but that is not really the point.
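
A minimal sketch of that structure, using a cut-down copy of the toy
data (only the Name, SampleID and X columns, typed in by hand rather
than read from test.txt, so the data frame here is an assumption for
illustration):

```r
# Cut-down toy data modelled on the table in the question.
infile <- data.frame(
  Name     = c("rs191191", "abc928291", "abcnab",
               "rx82922", "rcn3939", "tcn39399"),
  SampleID = c("A1", "A1", "A1", "B1", "B1", "B1"),
  X        = c(0.999, 0.3838, 0.3939, 0.3838, 0.000, 0.393),
  stringsAsFactors = FALSE
)
infile.split <- split(infile, infile$SampleID)

class(infile.split)        # "list"
names(infile.split)        # "A1" "B1"
names(infile.split[1])     # `[` keeps a one-element list: "A1"
class(infile.split[[1]])   # `[[` extracts the element: "data.frame"
names(infile.split[[1]])   # its column names: "Name" "SampleID" "X"
```

The difference between `[` (returns a sub-list, names kept) and `[[`
(returns the element itself) is what matters in the lapply() call
further down.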

>
> lapply(infile.split,function(x){
>              filename <- names(x) #### here I expect to see A1 or B1, I didn’t, I tried (names(x)[1]) and that gave me “Name” and not A1 or B1.

By using lapply() on the actual object, your function is getting each
element of the list (a data frame), without the element's name.
That is, x is in turn:

infile.split[[1]]
infile.split[[2]]

Trying names() on those:

names(infile.split[[1]])

should show what you are getting: the column names of the data frame
("Name", "SampleID", ...), not "A1".
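
To make that concrete, here is a small sketch with a hand-built list
(the `demo` object and its two tiny data frames are made up for
illustration; they are not your data):

```r
# A named list of two small data frames, standing in for infile.split.
demo <- list(
  A1 = data.frame(Name = "rs191191", X = 0.999),
  B1 = data.frame(Name = "rx82922",  X = 0.3838)
)

# Inside the function, x is the bare data frame; the list name
# ("A1" or "B1") is not passed along with it.
seen <- lapply(demo, function(x) names(x))

seen$A1       # "Name" "X"  (column names, not "A1")
names(seen)   # "A1" "B1"   (lapply() only reattaches names to its result)
```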

>              final_filename <- paste(filename,"toy_set.txt",sep="_")
>              write.table(x, file = paste(path, final_filename,sep="/", row.names=FALSE, quote=FALSE,sep="\t")

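FYI I think you are missing a parenthesis in there: the inner paste()
is not closed after its sep argument, so row.names, quote and the
second sep end up inside paste() and write.table( is left open.
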
>  } )
>
> In lapply I wanted to give a unique filename to all the split Sample Ids, i.e. name them here as A1_toy_set.txt, B1_toy_set.txt.
> How do I get those names, i.e. A1, B1 to create a filename like above.

Try this:

## read your data from the clipboard (obviously you do not need to)
infile <- read.table("clipboard", header = TRUE)
split.infile <- split(infile, infile$SampleID) #split data
path <- "~" # generic path

## rather than applying to the data itself, instead apply to the names
lapply(names(split.infile), function(x) {
  write.table(x = split.infile[[x]],
    file = paste(path, paste(x, "toy_set.txt", sep = "_"), sep = "/"),
    row.names = FALSE, quote = FALSE, sep = "\t")
  cat("wrote ", x, fill = TRUE)
})

It will return a list of two NULLs (the return value of cat()), but
that is fine because it should have written the files.
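
If you would rather have both the data frame and its name inside the
function, Map() is one alternative to looping over the names. This is
only a sketch: `write_one` is a hypothetical helper name, the two
small data frames are made up, and it writes to tempdir() instead of
your real path.

```r
# Stand-in for the result of split(); the values are illustrative only.
split.infile <- list(
  A1 = data.frame(Name = c("rs191191", "abcnab"), X = c(0.999, 0.3939)),
  B1 = data.frame(Name = c("rx82922", "rcn3939"), X = c(0.3838, 0.000))
)
path <- tempdir()  # assumption: a scratch directory instead of "~"

# Hypothetical helper: writes one data frame, returns the file it wrote.
write_one <- function(dat, id) {
  out <- file.path(path, paste(id, "toy_set.txt", sep = "_"))
  write.table(dat, file = out, row.names = FALSE, quote = FALSE, sep = "\t")
  out
}

# Map() walks the data frames and their names in parallel.
written <- Map(write_one, split.infile, names(split.infile))
names(written)   # "A1" "B1"
```

Because the helper returns the file path, the result is a list of
paths rather than a list of NULLs; wrap the call in invisible() if
you do not want anything printed.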

> When I write each of the element in the list obtained after split into a file, the column names would have names like A1.Name, A1.SampleID, A1.Al1, ….. Can I get rid of “A1” in the column names within the lapply (other than reading in the file again and changing the names) ?

Can you report the results of str(yourdataframe) ?  I did not have
that issue just copying and pasting from your email and using the code
I showed above.
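
One way such prefixed names can arise (a guess on my part, since I
cannot see the code that produced them) is when the whole list, or a
one-element sub-list taken with `[`, is converted to a data frame
instead of a single element taken with `[[`. A sketch with made-up
data:

```r
# Made-up pieces with equal row counts, standing in for the split data.
pieces <- list(
  A1 = data.frame(Name = c("rs191191", "abcnab"), X = c(0.999, 0.3939)),
  B1 = data.frame(Name = c("rx82922", "rcn3939"), X = c(0.3838, 0.000))
)

# Converting the list itself prefixes each column with the list name.
names(as.data.frame(pieces))      # "A1.Name" "A1.X" "B1.Name" "B1.X"
names(as.data.frame(pieces[1]))   # "A1.Name" "A1.X"

# Extracting one element with [[ ]] keeps the original column names.
names(pieces[["A1"]])             # "Name" "X"
```

If that is what happened, passing split.infile[[x]] (as in the code
above) to write.table() should avoid the prefix.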

Cheers,

Josh

>
> Thanks for your time,
>
> Regards
> Sashi



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/


