[R] formatting a list

Charles C. Berry cberry at tajo.ucsd.edu
Fri Oct 26 20:24:00 CEST 2007


On Fri, 26 Oct 2007, Tomas Vaisar wrote:

> Hi Chuck,
>
> I finally got to install v 2.6.0 and tried your initial suggestions - with 
> the new version the
>
> dat <- as.data.frame( matrix( scan('tmp.txt'), nr=19) )
>
> did not make the list in the desired format, however the other two worked.

Tomas,

I am glad to hear that those were successful.

I believe that

 	 dat <- as.data.frame( <etc> )

did indeed create a list in the 'desired format'. This use of 
'as.data.frame' is a standard trick for turning a matrix into a list 
whose components are the columns of the matrix (which in the above case 
are the rows of your data file).
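
A minimal sketch of that trick on a small stand-in file may make it concrete. The 3-row, 4-column input and the names 'tmp' and 'm' are assumptions for illustration only, not the 7000 x 19 file from this thread:

```r
# Hypothetical small stand-in for the data file: 3 rows of 4 values each.
tmp <- tempfile()
for (i in 1:3) cat(seq(from = i, by = 1, length.out = 4), "\n",
                   sep = "\t", file = tmp, append = TRUE)

# scan() reads the file row by row into one long vector; matrix() fills
# column-wise, so with nrow = 4 each matrix column holds one file row.
m <- matrix(scan(tmp, quiet = TRUE), nrow = 4)

# as.data.frame() turns the matrix columns into list components.
dat <- as.data.frame(m)

is.list(dat)   # TRUE
length(dat)    # 3, one component per row of the file
dat[[2]]       # the second row of the file: 2 3 4 5
unlink(tmp)
```

The number passed as 'nrow' has to match the number of values per line of the file (19 in the case discussed here).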

But I suspect that you printed it (or several elements like 'dat[1:3]' ) 
out and were fooled by what you saw.

This would happen because in this case class(dat) =='data.frame'.

data.frames are lists - try

 	is.list(dat)

There is a print method for data.frame, so the appearance of

 	print( dat[ 1:3 ] )

and

 	print( unclass( dat[ 1:3 ] ) )

on your screen is rather different.
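
As a rough illustration of that difference, here is a sketch on a small made-up data frame (the 4 x 3 matrix is an assumption chosen only to keep the output short). The last step assumes a plain list without the 'data.frame' class is what is wanted downstream:

```r
# Small made-up example: 3 components of 4 values each.
dat <- as.data.frame(matrix(1:12, nrow = 4))

class(dat)                  # "data.frame"
print(dat[1:2])             # tabular layout from the data.frame print method
print(unclass(dat[1:2]))    # plain list layout: $V1, $V2, plus attributes

# as.list() drops the data.frame class and row names, leaving a plain
# list of vectors that prints as a list by default.
lst <- as.list(dat)
class(lst)                  # "list"
```

The contents are the same in both printouts; only the print method that gets dispatched changes.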

Chuck



>
> Thanks a lot again.
>
> Tomas
>
> Charles C. Berry wrote:
>>
>>  Tomas,
>>
>>  Are you using R-2.6.0 ??
>>
>>  Each method works for me, producing a list of 7000 vectors.
>>
>>  The file I used to test this is created by:
>>
>>  for (i in 1:7000) cat( seq(from=i,by=1,length=19),"\n",
>>      sep='\t',file="tmp.tab",append=TRUE)
>>
>>  The first line is:
>> 
>> >  scan("tmp.tab",nlines=1)
>>  Read 19 items
>>   [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
>> > 
>>
>>  The last line is
>> 
>> >  scan("tmp.tab",skip=6999,nlines=1)
>>  Read 19 items
>>  [1] 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013
>>  7014 7015 7016 7017 7018
>> > 
>>
>>  and each method recapitulates this:
>> 
>> >  dat[[1]]
>>   [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19
>> >  dat[[7000]]
>>  [1] 7000 7001 7002 7003 7004 7005 7006 7007 7008 7009 7010 7011 7012 7013
>>  7014 7015 7016 7017 7018
>> > 
>>
>>  The second method threw lots of warnings because open connections must be
>>  closed. Those could be eliminated by explicitly opening and closing the
>>  connection. Before using the second method, closeAllConnections() was
>>  sometimes needed, but the error it reported differs from the one you
>>  mention.
>>
>>  I am using
>> 
>> >  version
>>                 _
>>  platform       i386-pc-mingw32
>>  arch           i386
>>  os             mingw32
>>  system         i386, mingw32
>>  status
>>  major          2
>>  minor          6.0
>>  year           2007
>>  month          10
>>  day            03
>>  svn rev        43063
>>  language       R
>>  version.string R version 2.6.0 (2007-10-03)
>> > 
>> 
>>
>>  Chuck
>>
>>  On Mon, 22 Oct 2007, Tomas Vaisar wrote:
>> 
>> >  Hi Chuck,
>> > 
>> >  thanks for your responses.   I did not ignore your suggestions - I did
>> >  try them and they did not produce what I need.
>> > 
>> >  The first one produced a table with the same format as read.table would
>> >  generate, not a list of lists.
>> >  Second one gave me an error after returning Read 19 items multiple times
>> > :  Error in textConnection(x) : all connections are in use
>> >  The last one gave me similar error on the first step - Error in
>> >  file(con, "r") : all connections are in use
>> > 
>> >  However, your last suggestion to make list of lists seems that it
>> >  works.  I will have to test more.
>> > 
>> >  Cheers,
>> > 
>> >  Tomas
>> > 
>> >  Charles C. Berry wrote:
>> > > 
>> > >  Tomas,
>> > > 
>> > >  Three different ways to create a list of 7000 vectors from a file of
>> > >  7000 rows and 19 columns are given here:
>> > > 
>> > >      http://article.gmane.org/gmane.comp.lang.r.general/97032
>> > > 
>> > >  which I think is what you are asking for.
>> > > 
>> > >  If you truly need  a list of 7000 lists each of length 1 containing a
>> > >  vector of length 19, then do this:
>> > > 
>> > >  list.of.lists.of.one.vector.each <- lapply( list.of.vectors, list )
>> > > 
>> > > 
>> > >  BTW, as this thread appears in
>> > > 
>> > >      http://news.gmane.org/gmane.comp.lang.r.general
>> > > 
>> > >  the above article was the first reply to your original query. I am
>> > >  puzzled as to why you did not simply implement one of the three
>> > >  methods shown there.
>> > > 
>> > >  Chuck
>> > > 
>> > >  On Mon, 22 Oct 2007, Tomas Vaisar wrote:
>> > > 
>> > > >  Hi Jim,
>> > > > 
>> > > >  I really appreciate your help.
>> > > >  From the input file I have - 19 columns, 7000 rows - the scan gives 
>> > > >  me
>> > > >  the desired format of a list consisting of 19 lists with 7000 values
>> > > >  each.
>> > > >  However I need a list of 7000 lists with 19 values each. (e.g. each 
>> > > >  row
>> > > >  of my input file should be a separate list bound in a list of all 
>> > > >  these
>> > > >  lists)
>> > > >  I use both commands you suggested -
>> > > >  x <- scan('temp.txt', what=c(rep(list(0), 19)))
>> > > >  followed by
>> > > >  x.matrix <- do.call('rbind', x)  # gives 7000 x 19 matrix.
>> > > > 
>> > > >  Although this makes a matrix of the correct dimensions it is not the
>> > > >  "list of lists" the ROCR package expects as input.  Can you convert 
>> > > >  this
>> > > >  matrix into a "list of lists"?  Or is there a simple way in R to 
>> > > >  convert
>> > > >  a table into such a "list of lists"?
>> > > > 
>> > > >  Thanks again,
>> > > > 
>> > > >  Tomas
>> > > > 
>> > > > 
>> > > >  jim holtman wrote:
>> > > > >  That is what I thought and that is the format that the 'scan' 
>> > > > >  approach
>> > > > >  should provide.  I was just confused when you said that you were 
>> > > > >  going
>> > > > >  to have to transpose it, write it and then read it back in for 
>> > > > >  some
>> > > > >  reason.  I understand that Excel can not handle 7000 columns, but 
>> > > > >  was
>> > > > >  wondering where that came into play.
>> > > > > 
>> > > > >  On 10/21/07, Tomas Vaisar <tvaisar at u.washington.edu> wrote:
>> > > > > 
>> > > > > >  The data I have is tab delimited file with 7000 lines of 19 
>> > > > > >  values
>> > > > > >  each
>> > > > > >  (representing 7000 permutations on 19 variables). I want to get 
>> > > > > >  it
>> > > > > >  into
>> > > > > >  the ROCR package which expects the data to be in lists - single
>> > > > > >  list of
>> > > > > >  19 values for each permutation, e.g. list of 7000 lists of 19
>> > > > > >  values each.
>> > > > > > 
>> > > > > >  I hope this is little clearer.
>> > > > > > 
>> > > > > >  Tomas
>> > > > > > 
>> > > > > >  jim holtman wrote:
>> > > > > > 
>> > > > > > >  What is it that you want to do?  The 'scan' statement gives you 
>> > > > > > >  a list
>> > > > > > >  of length 7000 with 19 entries each.  Do you want to create a 
>> > > > > > >  matrix
>> > > > > > >  that has 7000 rows by 19 columns?  If so, then you just have 
>> > > > > > >  to take
>> > > > > > >  the output of the 'scan' and do:
>> > > > > > > 
>> > > > > > >  x.matrix <- do.call('rbind', x)  # gives 7000 x 19 matrix.
>> > > > > > > 
>> > > > > > >  So I am still not sure exactly what your input is and what you
>> > > > > > >  want to
>> > > > > > >  do with it.
>> > > > > > > 
>> > > > > > >  On 10/21/07, Tomas Vaisar <tvaisar at u.washington.edu> wrote:
>> > > > > > > 
>> > > > > > > 
>> > > > > > > >  Hi Jim,
>> > > > > > > > 
>> > > > > > > >  thanks a lot.  It works, however - my other problem is that 
>> > > > > > > >  I
>> > > > > > > >  need to
>> > > > > > > >  transpose the original table before reading it into the list
>> > > > > > > >  because the
>> > > > > > > >  data come from Excel and it can't handle 7000 columns.  I 
>> > > > > > > >  could
>> > > > > > > >  read it
>> > > > > > > >  in R, transpose and write into a new tab delim file and then 
>> > > > > > > >  read
>> > > > > > > >  it back
>> > > > > > > >  in,  but I would think that there might be a way in R to do 
>> > > > > > > >  both.
>> > > > > > > >  Would you know about the way?
>> > > > > > > > 
>> > > > > > > >  Tomas
>> > > > > > > > 
>> > > > > > > >  jim holtman wrote:
>> > > > > > > > 
>> > > > > > > > 
>> > > > > > > > >  another choice is:
>> > > > > > > > > 
>> > > > > > > > >  x <- scan('temp.txt', what=c(rep(list(0), 19)))
>> > > > > > > > > 
>> > > > > > > > >  On 10/20/07, Tomas Vaisar <tvaisar at u.washington.edu> 
>> > > > > > > > >  wrote:
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > > > >  Hi,
>> > > > > > > > > > 
>> > > > > > > > > >  I am new to R and need to read in a file with 19 columns 
>> > > > > > > > > >  and
>> > > > > > > > > >  7000 rows
>> > > > > > > > > >  and make it into a list of 7000 lists with 19 items 
>> > > > > > > > > >  each.  For a
>> > > > > > > > > >  simpler case of 10 by 10 table I used x <-scan("file",
>> > > > > > > > > >  list(0,0,0,0,0,0,0,0,0,0)), perhaps clumsy, but it did 
>> > > > > > > > > >  the job.
>> > > > > > > > > >  However with the large 19x7000 (which needs to be 
>> > > > > > > > > >  transposed) I
>> > > > > > > > > >  am not
>> > > > > > > > > >  sure how to go about it.
>> > > > > > > > > > 
>> > > > > > > > > >  Could somebody suggest a way?
>> > > > > > > > > > 
>> > > > > > > > > >  Thanks,
>> > > > > > > > > > 
>> > > > > > > > > >  Tomas
>> > > > > > > > > > 
>> > > > > > > > > >  ______________________________________________
>> > > > > > > > > >  R-help at r-project.org mailing list
>> > > > > > > > > >  https://stat.ethz.ch/mailman/listinfo/r-help
>> > > > > > > > > >  PLEASE do read the posting guide
>> > > > > > > > > >  http://www.R-project.org/posting-guide.html
>> > > > > > > > > >  and provide commented, minimal, self-contained, 
>> > > > > > > > > >  reproducible code.
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > > 
>> > > > > > > > > 
>> > > > > > > 
>> > > > > > > 
>> > > > > 
>> > > > > 
>> > > > > 
>> > > > 
>> > > > 
>> > > 
>> > >  Charles C. Berry                            (858) 534-2098
>> > >                                              Dept of Family/Preventive
>> > >  Medicine
>> > >  E mailto:cberry at tajo.ucsd.edu                UC San Diego
>> > >  http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego
>> > >  92093-0901
>> > > 
>> > > 
>> > 
>> > 
>>
>>  Charles C. Berry                            (858) 534-2098
>>                                              Dept of Family/Preventive
>>  Medicine
>>  E mailto:cberry at tajo.ucsd.edu                UC San Diego
>>  http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901
>> 
>> 
>

Charles C. Berry                            (858) 534-2098
                                             Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu	            UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901


