[R] How to extract same columns from identical dataframes in a list?

Wolfgang Waser waser at frankenfoerder-fg.de
Tue Feb 9 10:03:01 CET 2016


Hi,

Sorry if my description was too short or unclear.

> I have a list of 7 data frames, each data frame having 24 rows (hour of
> the day) and 5 columns (weeks) with a total of 5 x 24 values

[1]
	week1	week2	week3	...
1	x	a	m	...
2	y	b	n
3	z	c	o
.	.	.	.
.	.	.	.
.	.	.	.
24	.	.	.


[2]
	week1	week2	week3	...
1	x2	a2	m2	...
2	y2	b2	n2
3	z2	c2	o2
.	.	.	.
.	.	.	.
.	.	.	.
24	.	.	.


[3]
...

.
.
.


[7]
...



I would now like to extract, e.g., the week2 column of every data frame in
the list and combine these columns in a new data frame using cbind.

new data frame

week2 ([1])	week2 ([2])	week2 ([3])	...
a		a2		.
b		b2		.
c		c2		.
.
.
.

I will then do further row-wise calculations using e.g. apply(x,1,mean),
the result being a vector of 24 values.
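
A minimal sketch of the workflow described above, assuming the list elements are true data frames with named columns. The random data and the names week2_cols and hourly_mean are made up for illustration; only list_of_dataframes and the week2 column name come from the description.

```r
# Hypothetical stand-in for the real list: 7 data frames, 24 rows x 5 week columns
set.seed(1)
list_of_dataframes <- lapply(1:7, function(i) {
  as.data.frame(matrix(rnorm(24 * 5), nrow = 24,
                       dimnames = list(NULL, paste0("week", 1:5))))
})

# Pull the week2 column out of every data frame, then bind the columns together
week2_cols <- lapply(list_of_dataframes, function(d) d[["week2"]])
week2 <- as.data.frame(do.call(cbind, week2_cols))

# Row-wise calculation: one value per hour of the day
hourly_mean <- apply(week2, 1, mean)
length(hourly_mean)
```

For a plain mean, rowMeans(week2) should give the same numbers; apply(x, 1, FUN) is the more general form for other row-wise functions.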


I have not found a way to extract specific columns of the data frames in
a list.


As mentioned I can use

sapply(list_of_dataframes,"[",1:24)

which will pick the first 24 values (first column) of each data frame in
the list and arrange them as an array of 24 rows and 7 columns (7 data
frames are in the list).
To pick the second column (week2) using sapply I have to use the next 24
values from 25 to 48:

sapply(list_of_dataframes,"[",25:48)


It seems that sapply treats the data frames in the list as vectors. I
can of course extract all consecutive weeks using consecutive blocks of
24 values, but this seems cumbersome.
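
One possible explanation for this behaviour is that the list elements are matrices rather than true data frames. That is an assumption; it cannot be confirmed from the description. A matrix is stored as one long vector, column by column, so a single index such as 25:48 walks through the second column. On a true data frame, a single index selects whole columns instead, and 1:24 would fail on a data frame with only 5 columns. The sketch below uses a made-up 24 x 5 matrix m to show the difference.

```r
# Hypothetical 24 x 5 matrix standing in for one list element
m <- matrix(seq_len(24 * 5), nrow = 24,
            dimnames = list(NULL, paste0("week", 1:5)))

# A matrix is a vector with a dim attribute, stored column by column,
# so a single index runs down column 1, then column 2, and so on
first_block_is_week1  <- identical(m[1:24],  unname(m[, "week1"]))
second_block_is_week2 <- identical(m[25:48], unname(m[, "week2"]))

# On a true data frame a single index selects columns instead
d <- as.data.frame(m)
selected <- names(d[1:2])   # the first two columns, not the first two values
```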


The question remains how to select specific columns from data frames in
a list, e.g. column 3 of every data frame in the list.
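
One way this could be done, sketched under the assumption that every list element has the same columns in the same order: pass sapply a small function that indexes by column instead of passing "[" with a range of values. The helper name extract_column and the random data are hypothetical.

```r
# Hypothetical stand-in for the real list: 7 data frames, 24 rows x 5 week columns
set.seed(1)
list_of_dataframes <- lapply(1:7, function(i) {
  as.data.frame(matrix(rnorm(24 * 5), nrow = 24,
                       dimnames = list(NULL, paste0("week", 1:5))))
})

# Hypothetical helper: column j of every list element, as a 24 x 7 matrix.
# x[, j] works for data frames and for matrices alike.
extract_column <- function(lst, j) sapply(lst, function(x) x[, j])

week3 <- extract_column(list_of_dataframes, 3)
dim(week3)

# All weeks at once: a list with one 24 x 7 matrix per week
weeks <- colnames(list_of_dataframes[[1]])
per_week <- lapply(setNames(weeks, weeks),
                   function(w) extract_column(list_of_dataframes, w))
```

Each element of per_week can then go straight into apply(x, 1, FUN), for example apply(per_week$week3, 1, mean).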


Reformatting (unlist(), dim()) into one data frame with one column for
each week does not help, since I'm not calculating colMeans etc., but
doing row-wise calculations using apply(x,1,FUN) ("applying a function to
margins of an array or matrix").
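
That said, a three-dimensional array built as in the quoted reply below might still support row-wise calculations, because one week can be sliced out as an hour-by-list-element matrix. A rough sketch, reusing mylist and myarray from that reply; the names week2, hourly_mean and hourly_mean_all are made up here.

```r
set.seed(1)
mylist <- list(a = data.frame(week1 = rnorm(24), week2 = rnorm(24)),
               b = data.frame(week1 = rnorm(24), week2 = rnorm(24)))

# Same construction as in the quoted reply: hour x week x list element
myarray <- unlist(mylist, use.names = FALSE)
dim(myarray) <- c(nrow(mylist$a), ncol(mylist$a), length(mylist))
dimnames(myarray) <- list(hour = rownames(mylist$a),
                          week = colnames(mylist$a),
                          other = names(mylist))

# Fixing the week leaves a 24 x 2 matrix (hour x list element)
week2 <- myarray[, "week2", ]

# Row-wise calculation over that slice
hourly_mean <- apply(week2, 1, mean)

# The same for every week in one call: a 24 x 2 matrix (hour x week)
hourly_mean_all <- apply(myarray, c(1, 2), mean)
```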



Thanks for your help and suggestions!


Wolfgang



On 08/02/16 18:00, Dénes Tóth wrote:
> Hi,
> 
> Although you did not provide any reproducible example, it seems you
> store the same type of values in your data.frames. If this is true, it
> is much more efficient to store your data in an array:
> 
> mylist <- list(a = data.frame(week1 = rnorm(24), week2 = rnorm(24)),
>                b = data.frame(week1 = rnorm(24), week2 = rnorm(24)))
> 
> myarray <- unlist(mylist, use.names = FALSE)
> dim(myarray) <- c(nrow(mylist$a), ncol(mylist$a), length(mylist))
> dimnames(myarray) <- list(hour = rownames(mylist$a),
>                           week = colnames(mylist$a),
>                           other = names(mylist))
> # now you can do:
> mean(myarray[, "week1", "a"])
> 
> # or:
> colMeans(myarray)
> 
> 
> Cheers,
>   Denes
> 
> 
> On 02/08/2016 02:33 PM, Wolfgang Waser wrote:
>> Hello,
>>
>> I have a list of 7 data frames, each data frame having 24 rows (hour of
>> the day) and 5 columns (weeks) with a total of 5 x 24 values
>>
>> I would like to combine all 7 columns of week 1 (and 2 ...) in a
>> separate data frame for hourly calculations, e.g.
>>> apply(new.data.frame,1,mean)
>>
>> In some way sapply (lapply) works, but I cannot directly select columns
>> of the original data frames in the list. As a workaround I have to
>> select a range of values:
>>
>>> sapply(list_of_dataframes,"[",1:24)
>>
>> Values 1:24 give the first column, 25:48 the second and so on.
>>
>> Is there an easier / more direct way to select for specific columns
>> instead of selecting a range of values, avoiding loops?
>>
>>
>> Cheers,
>>
>> Wolfgang
>>
> 

-- 
Frankenförder Forschungsgesellschaft mbH
Dr. Wolfgang Waser
Wissenschaftsbereich Berlin
Chausseestraße 10
10115 Berlin
Tel.:  +49(0)30 2809 1936
Fax.:  +49(0)30 2809 1940
E-Mail: waser at frankenfoerder-fg.de

Frankenförder Forschungsgesellschaft mbH (FFG)
Sitz: Luckenwalde,Amtsgericht Potsdam, HRB: 6499
Geschäftsführerin: Dipl. Agraring. Doreen Sparborth
Tel.: +49(0)30 2809 1931, E-Mail: info at frankenfoerder-fg.de
http://www.frankenfoerder-fg.de


