[R] Split dataframe into new dataframes

David Winsemius dwinsemius at comcast.net
Thu Feb 9 00:57:49 CET 2012


On Feb 8, 2012, at 6:29 PM, Johannes Radinger wrote:

>
> On 08.02.2012 at 23:47, David Winsemius wrote:
>
>>
>> On Feb 8, 2012, at 5:06 PM, Johannes Radinger wrote:
>>
>>>
>>> On 08.02.2012 at 22:19, David Winsemius wrote:
>>>
>>>>
>>>> On Feb 8, 2012, at 4:11 PM, Johannes Radinger wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I want to split a dataframe based on a grouping variable (in one  
>>>>> column). The resulting new
>>>>> dataframes should be stored in a new variable. I tried to split  
>>>>> the dataframe using split() and
>>>>> to store it using a for loop, but that's not working so far:
>>>>>
>>>>> df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4))
>>>>>
>>>>> Fsplit <- function(x,y){
>>>>> 	ls <- split(x,f=x$y)
>>>>> 	for (i in names(ls)){
>>>>> 		i <- ls$i
>>>>> 	}
>>>>> }
>>>>>
>>>>> Fsplit(df,A) #1st argument is dataframe to split, 2nd argument  
>>>>> grouping variable
>>>>>
>>>>
>>>> It appears you want the name of the levels of df$A to be the  
>>>> names of separate variables in the global environment. If that is  
>>>> correct, then see the FAQ. I'm not sure which one it is among the  
>>>> Miscellaneous section, but you should be looking for the one that  
>>>> tells you how to construct a named variable.
>>>>
>>>
>>> Your hint about the global environment put me on track. It  
>>> seems that this task can be done with list2env(), although there  
>>> is still a problem with my function. How
>>> can I pass the name of the dataframe and the column name to the  
>>> function...
>>>
>>> df <- data.frame(A=c("A1","A1","A2","A2"),B=seq(1:4))
>>>
>>> Fsplit <- function(x,y){
>>> 	ls <- split(x,f=x$y)
>>> 	list2env(ls,envir = .GlobalEnv)
>>> }
>>>
>>> Fsplit(df,A)
>>
>> I still have not figured out what you really want to do. The simple  
>> answer to what you ask for in your written request is simply:
>>
>> dfvar <- split(df, df$A)
>>
>> So what is it about that result that is not useful for your (as yet  
>> unstated)  destination?
>>
>> > split(df, df$A)
>> $A1
>>    A B
>> 1 A1 1
>> 2 A1 2
>>
>> $A2
>>    A B
>> 3 A2 3
>> 4 A2 4
>>
>>
>
> Sorry for not being clear enough, and you are
> right: "split(df, df$A)" is what I want. Additionally, I want to  
> then store
> the individual objects of the list in new dataframes,
> where variable name = name of list object (which can be done with  
> list2env()).
> Is that clear enough so far?

If you want to put that list in an environment, it's fine with me. Or  
you can access it from the split-object-list-of-dataframes, dfvar,  
using with()

 >  with(dfvar, A1)
    A B
1 A1 1
2 A1 2

Note: with() does not work well inside other functions.
For programming purposes this might be safer:

 > new.env <- environment()
 > list2env(dfvar, new.env)
<environment: R_GlobalEnv>
 > new.env$A1
    A B
1 A1 1
2 A1 2
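
One caveat on the snippet above: called at top level, environment() returns the global environment (hence the R_GlobalEnv printout), so the data frames still end up in the workspace. A minimal sketch that keeps them in a separate environment, assuming that is what is wanted; the name `e` is only illustrative:

```r
df <- data.frame(A = c("A1", "A1", "A2", "A2"), B = 1:4)
dfvar <- split(df, df$A)

# new.env() creates a fresh, empty environment;
# `e` is a name chosen for illustration only
e <- new.env()
list2env(dfvar, envir = e)

ls(e)   # names of the objects now stored in e
e$A1    # the rows of df where A is "A1"
```

Using a separate environment also avoids silently overwriting objects in the workspace that happen to share a name with a level of df$A.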




>
> What I want exactly is to express those two operations (split,  
> list2env) within
> one function. I need the function for other tasks in R.
>
> /johannes
>
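
The two steps can be wrapped in one function roughly as sketched below. The reason x$y fails in the earlier attempts is that `$` does not evaluate its right-hand side, so it looks for a column literally named "y". Passing the column name as a character string and indexing with [[ ]] avoids that. The function name split_to_env and its arguments are hypothetical, chosen only for this example:

```r
# split_to_env is a hypothetical helper name, not an existing R function
split_to_env <- function(x, col, envir = new.env()) {
  # col is the grouping column name as a string, so x[[col]] finds it;
  # x$col would look for a column literally called "col"
  pieces <- split(x, x[[col]])
  # list2env() copies each list element into envir and returns envir
  list2env(pieces, envir = envir)
}

df <- data.frame(A = c("A1", "A1", "A2", "A2"), B = 1:4)
res <- split_to_env(df, "A")
ls(res)   # names of the data frames created
res$A2    # the rows of df where A is "A2"
```

Calling split_to_env(df, "A", envir = .GlobalEnv) would instead create A1 and A2 directly in the workspace, as in the list2env() attempt quoted above.
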
>>
>>
>>
>>>
>>> /johannes
>>>
>>>> Or:
>>>>
>>>> ? assign
>>>>
>>>> -- 
>>>> David Winsemius, MD
>>>> West Hartford, CT
>>>>
>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>

David Winsemius, MD
West Hartford, CT


