[R] iterating over groups of columns

09wkj Bill.K.Jannen at williams.edu
Wed Jun 9 00:32:24 CEST 2010


In the code fragment, I used 'by' to actually compute the min value (the part of the statement inside the eval), and I agree that an apply would work wonderfully there.
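
For reference, a small sketch of that swap on made-up data. The toy data.frame 's' and its values are assumptions for illustration, not the real measurements; it only shows that by(), apply() and pmin() should give the same row-wise minimum:

```r
# Toy stand-in for one subset of columns (values are made up).
s <- data.frame(k.3.1 = c(5, 2, 9),
                k.3.2 = c(4, 8, 1),
                k.3.3 = c(7, 3, 6))

# Row-wise minimum via by(), as in the original loop.
min.by <- as.vector(by(s, 1:nrow(s), min))

# The same row-wise minimum via apply() over rows (MARGIN = 1).
min.apply <- unname(apply(s, 1, min))

# pmin() is a vectorised alternative that avoids looping over rows.
min.pmin <- do.call(pmin, s)
```

All three vectors should agree; pmin() is usually the fastest of the three on a data.frame with many rows.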

However, my hope was to use an apply for the subsetting of the data.frame's columns, so that I could then use an apply to compute the min across each row of the subsets.

Something that would give me the results of the following, but programmatically:
apply(the.data[,1,drop=FALSE], 1, min)  #min of the first column (drop=FALSE keeps it 2-d for apply)
apply(the.data[,2:3], 1, min)           #min of the next 2 columns
apply(the.data[,4:6], 1, min)           #min of the next 3 columns
apply(the.data[,7:10], 1, min)          #min of the next 4 columns
...
apply(the.data[,46:55], 1, min)         #min of the next 10 columns



For example, can I make a vector of levels with 'rep(1:10,1:10)', and then apply the function across all columns in each level? And how could I then cbind the results together?
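
One way this could look, as a minimal sketch on made-up data. The 5-row 'the.data', the random values, and the names 'groups', 'idx' and 'mins' are assumptions for illustration and are not part of the original post:

```r
# Made-up stand-in for the.data: 5 rows, 55 columns named k.<group>.<index>.
set.seed(1)
groups <- rep(1:10, 1:10)                       # group id of each column
col.names <- paste("k", groups, sequence(1:10), sep = ".")
the.data <- as.data.frame(matrix(runif(5 * 55), nrow = 5,
                                 dimnames = list(NULL, col.names)))

# Column indices per group: 1, 2:3, 4:6, ..., 46:55.
idx <- split(seq_along(groups), groups)

# Row-wise min within each group. drop = FALSE keeps the one-column
# group a data.frame, so apply() still sees two dimensions.
mins <- sapply(idx, function(j) apply(the.data[, j, drop = FALSE], 1, min))
colnames(mins) <- paste("min.k", names(idx), sep = ".")

# Bind the 10 new columns onto the original 55.
the.data <- cbind(the.data, mins)
```

Here split() turns the vector of levels into a list of column indices, sapply() collects one column of minima per level into a matrix, and a single cbind() attaches them, so neither the for loop nor eval is needed.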


Thanks for any help,
Bill




On Jun 8, 2010, at 5:08 PM, Jannis wrote:

> you should have found a solution for that in the help page of apply.
> 
> just run
> 
> min.values = apply(the.data,1,min)
> 
> the '1' marks the direction (i.e. whether apply is applied to rows or columns): 1 means rows, 2 means columns. Check that yourself in the apply documentation.
> 
> Then run cbind(the.data,min.values) (cbind rather than rbind, since min.values holds one value per row) and you get what you want.
> 
> 09wkj schrieb:
>> I am mainly a Java/C++ programmer, so my mind is used to iterating over data with for loops. After a long break, I am trying to get back into the "R mindset", but I could not find a solution in the documentation for the applys, aggregate, or by.
>> 
>> I have a data.frame where each row is an entry with 10 groups of measurements. The first measurement spans 1 column, the second spans 2 columns, third 3, and so on (55 total columns). What I want to do is add to my data.frame 10 new columns containing the minimum value of each measurement.
>> 
>> dim(the.data)
>> [1] 1679  55
>> 
>> colnames(the.data)
>>  [1] "k.1.1"   "k.2.1"   "k.2.2"   "k.3.1"   "k.3.2"   "k.3.3"   "k.4.1"
>>  [8] "k.4.2"   "k.4.3"   "k.4.4"   "k.5.1"   "k.5.2"   "k.5.3"   "k.5.4"
>> [15] "k.5.5"   "k.6.1"   "k.6.2"   "k.6.3"   "k.6.4"   "k.6.5"   "k.6.6"
>> [22] "k.7.1"   "k.7.2"   "k.7.3"   "k.7.4"   "k.7.5"   "k.7.6"   "k.7.7"
>> [29] "k.8.1"   "k.8.2"   "k.8.3"   "k.8.4"   "k.8.5"   "k.8.6"   "k.8.7"
>> [36] "k.8.8"   "k.9.1"   "k.9.2"   "k.9.3"   "k.9.4"   "k.9.5"   "k.9.6"
>> [43] "k.9.7"   "k.9.8"   "k.9.9"   "k.10.1"  "k.10.2"  "k.10.3"  "k.10.4"
>> [50] "k.10.5"  "k.10.6"  "k.10.7"  "k.10.8"  "k.10.9"  "k.10.10"
>> 
>> I want to add to the.data new columns: min.k.1, min.k.2, ..., min.k.10
>> 
>> This is the section of code I would like to improve, hopefully getting rid of the eval and the for loop:
>> 
>> for(k in 1:10){
>>    s <- subset(the.data, select=paste("k", k, 1:k, sep="."))
>>    eval(parse(text = paste("the.data$min.k.", k, "<-as.vector(by(s, 1:nrow(s), min))", sep="")))
>> }
>> 
>> Thanks for any help,
>> Bill


