[R] From data frame to list object

David Winsemius dwinsemius at comcast.net
Mon Jan 31 20:31:21 CET 2011


On Jan 31, 2011, at 2:26 PM, David Winsemius wrote:

>
> On Jan 31, 2011, at 2:18 PM, Bogaso Christofer wrote:
>
>> Sorry if I did not clarify that. Here I have a data frame with many columns,
>> which was taken from some outside DB. Now I want to split that data frame
>> and create a "list" object (to make my further calculation easier), on the
>> basis of a typical column of that DB. I cannot post my original DB here (for
>> security reasons and, of course, because of its huge size), so I have posted
>> an artificial DB.
>>
>> Here my artificial DB with 3 columns:
>>
>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>> z=rep(c("x", "y", "z"), each=2))
>>
>> I would like to create a list object where each element is a matrix or
>> data frame, based on that "y" column. The 1st element of that list will be a
>> data frame with the observations of the "x" and "z" columns that correspond
>> to "y = a". Similarly for the other two.
>
> It is a job for split. The result is a data.frame but that is, after all,
> just a list with certain attributes.
>
Er, correction: the result is a list of data frames, not a single data.frame.
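
A quick way to check this, as a minimal sketch using the example dfrm from the quoted message (set.seed() and the name `res` are my additions, only there so the run is repeatable and easy to inspect):

```r
set.seed(1)  # assumption: added only for reproducibility
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# split() on a data frame returns a plain list, one element per level of y
res <- split(dfrm, dfrm$y)

class(res)           # the container is a list
names(res)           # one name per group in y
sapply(res, class)   # each element is itself a data.frame
```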

> > split(dfrm, dfrm$y)
> $a
>            x y z
> 1 -1.16790385 a x
> 2 -0.84831139 a x
> 3 -0.64312051 a y
> 4 -1.66841121 a y
> 5  0.03737404 a z
> 6 -0.42165643 a z
>
> $b
>            x y z
> 7   1.1045024 b x
> 8   1.4787933 b x
> 9   0.5278083 b y
> 10  0.1770083 b y
> 11 -0.5054573 b z
> 12 -0.6512499 b z
>
> $c
>             x y z
> 13  0.61225420 c x
> 14 -0.45032691 c x
> 15  0.36502921 c y
> 16  0.33505288 c y
> 17  0.02189088 c z
> 18 -0.53893624 c z
>
> > str(split(dfrm, dfrm$y)$a)
> 'data.frame':	6 obs. of  3 variables:
> $ x: num  -1.1679 -0.8483 -0.6431 -1.6684 0.0374 ...
> $ y: Factor w/ 3 levels "a","b","c": 1 1 1 1 1 1
> $ z: Factor w/ 3 levels "x","y","z": 1 1 2 2 3 3
>
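
If each element should hold only the "x" and "z" columns, as the request describes, one option is to subset the columns before splitting. This is a sketch under that reading of the request; the name `xz_by_y` is hypothetical:

```r
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

# Keep only x and z, but still group the rows by the y column
xz_by_y <- split(dfrm[, c("x", "z")], dfrm$y)

names(xz_by_y)     # group labels taken from y
names(xz_by_y$a)   # columns left in each piece
```

The grouping vector does not have to be a column of the data frame being split; it only needs one value per row.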
> -- 
> David
>>
>> I hope I have made my intentions clearer.
>>
>> Any idea how I can achieve that?
>>
>> Thanks,
>>
>>
>>
>> -----Original Message-----
>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>> Sent: 01 February 2011 00:13
>> To: Bogaso Christofer
>> Cc: r-help at r-project.org
>> Subject: Re: [R] From data frame to list object
>>
>>
>> On Jan 31, 2011, at 1:56 PM, Bogaso Christofer wrote:
>>
>>> Thanks David for this reply. If my data frame has only 2 columns
>>> then it works fine. However, it does not work in a general setting:
>>>
>>> dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6),
>>> z=rep(c("x", "y", "z"), each=2))
>>> tapply(dfrm[,1], dfrm$y, c) # this is working fine
>>>
>>>> tapply(dfrm[,c(1,3)], dfrm$y, c)  # this is giving error!
>>> Error in tapply(dfrm[, c(1, 3)], dfrm$y, c) :
>>> arguments must have same length
>>>
>>> Can you please help me how to modify that?
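
A likely reason for the error quoted above: tapply() checks that X and INDEX have the same length, and length() of a data frame counts its columns, not its rows. A minimal sketch of that mismatch (the names `two_cols` and `msg` are hypothetical, used only for illustration):

```r
dfrm <- data.frame(x = rnorm(18),
                   y = rep(c("a", "b", "c"), each = 6),
                   z = rep(c("x", "y", "z"), each = 2))

two_cols <- dfrm[, c(1, 3)]
length(two_cols)   # number of columns in the subset
length(dfrm$y)     # number of rows

# Capture the error message instead of halting the script
msg <- tryCatch(tapply(two_cols, dfrm$y, c),
                error = function(e) conditionMessage(e))
```

Because the two lengths differ, tapply() has no way to pair rows with groups, which is why a function that works on whole data frames is needed here.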
>>
>> You will need to specify what your goals are. What do you want to
>> happen to those two columns referred to by dfrm[, c(1,3)]? It's
>> possible that split() may be the answer, but clarify the goals first.
>> You should provide an example that represents the complexity of the
>> task.
>>
>>>
>>> Thanks,
>>>
>>> -----Original Message-----
>>> From: David Winsemius [mailto:dwinsemius at comcast.net]
>>> Sent: 31 January 2011 23:26
>>> To: Bogaso Christofer
>>> Cc: r-help at r-project.org
>>> Subject: Re: [R] From data frame to list object
>>>
>>>
>>> On Jan 31, 2011, at 1:03 PM, Bogaso Christofer wrote:
>>>
>>>> Dear all, let's say I have the following data frame:
>>>>
>>>>
>>>
>>> > dfrm <- data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>> > tapply(dfrm$x, dfrm$y, c)
>>> $a
>>> [1]  0.9711995  1.4018345 -1.4355713 -0.5106138 -0.8470171
>>> [6]  1.1634586
>>>
>>> $b
>>> [1] -0.8058164  0.4977112  1.1556391  0.8158588  0.2549273
>>> [6]  3.0758260
>>>
>>> $c
>>> [1]  0.437345128 -0.415874363  0.003230285 -0.737117910
>>> [5]  1.247972964  0.903001077
>>>
>>>>
>>>> > data.frame(x=rnorm(18), y=rep(c("a", "b", "c"), each=6))
>>>>               x y
>>>> 1  -1.072152537 a
>>>> 2   0.382985265 a
>>>> 3   0.058877377 a
>>>> 4  -0.006911939 a
>>>> 5  -2.355269051 a
>>>> 6  -0.303095553 a
>>>> 7   0.484038422 b
>>>> 8   0.733928931 b
>>>> 9  -1.136014346 b
>>>> 10  0.503552090 b
>>>> 11  1.708609658 b
>>>> 12 -0.294599403 b
>>>> 13  1.239308497 c
>>>> 14  0.754081946 c
>>>> 15 -0.237346858 c
>>>> 16 -0.051011439 c
>>>> 17 -0.618675146 c
>>>> 18  0.537612359 c
>>>>
>>>>
>>>>
>>>> From this data frame I want to create a "list" of length 3, where each
>>>> element of this list will be a vector corresponding to one value of y.
>>>> For example, the 1st element will be all "x" values corresponding to
>>>> "y=a", and similarly for the other elements of this list. Can somebody
>>>> show me how to do this without a "for" loop?
>>>>
>>>>
>>>>
>>>> Thanks and regards,
>>>>
>>>>
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>> David Winsemius, MD
>>> West Hartford, CT
>>>

David Winsemius, MD
West Hartford, CT


