[R] Split a dataframe by rownames and/or colnames

Sergio Fonda sergio.fonda99 at gmail.com
Mon Feb 23 13:12:56 CET 2015


Did you try "dplyr" package?
Sergio
On 23 Feb 2015 13:05, "Tim Richter-Heitmann" <trichter at uni-bremen.de> wrote:

> Thank you very much for the line. It did the split as suggested.
> However, I want to release all the dataframes to the environment (later
> on, some dozen lines of code will be carried out for each dataframe, and I
> don't know how to do that with lapply or a for loop, so I do it separately):
>
> list2env(split(df, sub(".+_","", rownames(df))), envir=.GlobalEnv)
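
On the lapply point: one option is to keep the result of split() as a list and run the per-dataframe code on each element, instead of exporting every piece to the global environment. A minimal sketch, assuming the "some dozen lines" can be wrapped in a function; process_one is a hypothetical placeholder (here it only computes column means), and set.seed is added only to make the example reproducible:

```r
# Example data as in the original post
set.seed(1)  # assumption: fixed seed, not part of the original example
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1", "a_p", "b_p", "c_p", "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

# Split by the part of the row name after the underscore
pieces <- split(df, sub(".+_", "", rownames(df)))

# Hypothetical stand-in for the code that is run on each dataframe
process_one <- function(d) {
  colMeans(d)
}

# Apply it to every piece; the result is a list with the same names
results <- lapply(pieces, process_one)
names(results)
```

Each result can then be picked out by name, e.g. results[["o1"]] or results[["1"]], so numeric names are not a problem inside a list.
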
>
> Anyway, some of the dataframes now have numeric names and cannot be
> easily accessed because of that.
> How would the line be altered to add a "df_" prefix to each of the dataframe
> names resulting from list2env?
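
One possible way to get the prefix, sketched with base R only: rename the list returned by split() before handing it to list2env(). The intermediate name pieces is illustrative, and set.seed is added only for reproducibility:

```r
# Example data as in the original post
set.seed(1)  # assumption: fixed seed, not part of the original example
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1", "a_p", "b_p", "c_p", "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

# Split by rows, then prefix every list name with "df_"
pieces <- split(df, sub(".+_", "", rownames(df)))
names(pieces) <- paste0("df_", names(pieces))

# Export the renamed pieces; objects df_1, df_o1 and df_p are created
list2env(pieces, envir = .GlobalEnv)
ls(pattern = "^df_", envir = .GlobalEnv)
```

Without the prefix, an object with a numeric name such as 1 can still be reached with backticks (`1`) or with get("1"), but the prefixed names are easier to type.
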
>
> Thank you very much!
>
>
>
> On 20.02.2015 20:36, David Winsemius wrote:
>
>> On Feb 20, 2015, at 9:33 AM, Tim Richter-Heitmann wrote:
>>
>>> Dear List,
>>>
>>> Consider this example
>>>
>>> df <- data.frame(matrix(rnorm(9*9), ncol=9))
>>> names(df) <- c("c_1", "d_1", "e_1", "a_p", "b_p", "c_p", "1_o1", "2_o1",
>>> "3_o1")
>>> row.names(df) <- names(df)
>>>
>>>
>>> indx <- gsub(".*_", "", names(df))
>>>
>>> I can split the dataframe by the index that is given in the column names
>>> after the underscore "_".
>>>
>>> list2env(
>>>   setNames(
>>>     lapply(split(colnames(df), indx), function(x) df[x]),
>>>     paste('df', sort(unique(indx)), sep="_")),
>>>   envir=.GlobalEnv)
>>>
>>> However, I changed my mind and now want to do it by rownames. Exchanging
>>> colnames with rownames does not work; it gives exactly the same output (9
>>> rows x 3 columns). I could do
>>> as.data.frame(t(df_x)),
>>> but maybe that is not elegant.
>>> What would be the solution for splitting the dataframe by rows?
>>>
>> The split.data.frame method seems to work perfectly well with a
>> rownames-derived index argument:
>>
>> split(df, sub(".+_", "", rownames(df)))
>>
>> $`1`
>>        c_1   d_1  e_1   a_p   b_p   c_p  1_o1 2_o1  3_o1
>> c_1 -0.11 -0.04 1.33 -0.87 -0.16 -0.25 -0.75 0.34  0.14
>> d_1 -0.62 -0.94 0.80 -0.78 -0.70  0.74  0.11 1.44 -0.33
>> e_1  0.98 -0.83 0.48  0.19 -0.32 -1.01  1.28 1.04 -2.16
>>
>> $o1
>>         c_1   d_1   e_1   a_p   b_p   c_p  1_o1  2_o1  3_o1
>> 1_o1 -0.93 -0.02  0.69 -0.67  1.04  1.04 -1.50 -0.36  0.50
>> 2_o1  0.02 -0.16 -0.09 -1.50 -0.02 -1.04  1.07 -0.45  1.56
>> 3_o1 -1.42  0.88 -0.05  0.85 -1.35  0.21  1.35  0.92 -0.76
>>
>> $p
>>        c_1   d_1   e_1   a_p  b_p   c_p  1_o1  2_o1  3_o1
>> a_p -1.35  0.91 -0.58 -0.63 0.94 -1.13  0.71  0.25  0.82
>> b_p -0.25 -0.73 -0.41 -1.71 1.28  0.19 -0.35  1.74 -0.93
>> c_p -0.01 -1.11 -0.12  0.58 1.51  0.03 -0.99 -0.23 -0.03
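
As a side note on why swapping colnames for rownames in the lapply() version changes nothing: df[x] with a character vector always selects columns, and in this example the row names and column names are identical, so the same columns are picked either way. Selecting rows needs df[x, ]. A small sketch of the difference (by_col and by_row are illustrative names, and set.seed is added only for reproducibility):

```r
# Example data as in the original post
set.seed(1)  # assumption: fixed seed, not part of the original example
df <- data.frame(matrix(rnorm(9 * 9), ncol = 9))
names(df) <- c("c_1", "d_1", "e_1", "a_p", "b_p", "c_p", "1_o1", "2_o1", "3_o1")
row.names(df) <- names(df)

indx <- gsub(".*_", "", names(df))

# df[x] picks columns: each piece is 9 rows x 3 columns
by_col <- lapply(split(colnames(df), indx), function(x) df[x])

# df[x, ] picks rows: each piece is 3 rows x 9 columns
by_row <- lapply(split(rownames(df), indx), function(x) df[x, ])

sapply(by_col, dim)
sapply(by_row, dim)
```

The by_row pieces have the same shape as the output of split(df, ...) shown above, which is the shorter way to get them.
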
>>
>>> Thank you very much!
>>>
>>> --
>>> Tim Richter-Heitmann
>>>
>>>
>
> --
> Tim Richter-Heitmann (M.Sc.)
> PhD Candidate
>
>
>
> International Max-Planck Research School for Marine Microbiology
> University of Bremen
> Microbial Ecophysiology Group (AG Friedrich)
> FB02 - Biologie/Chemie
> Leobener Straße (NW2 A2130)
> D-28359 Bremen
> Tel.: 0049(0)421 218-63062
> Fax: 0049(0)421 218-63069
>
