[R] OK - I got the data - now what? :-)

Uwe Ligges ligges at statistik.tu-dortmund.de
Sun Jul 5 20:38:41 CEST 2009



David Winsemius wrote:
> 
> On Jul 5, 2009, at 1:19 PM, Uwe Ligges wrote:
> 
>>>> snipped preamble
>>>>
>>>> test <- data.frame(A=1:10, B=100, C1=runif(10), C2=runif(10),
>>>> C3=runif(10), C4=runif(10), C5=runif(10), C6=runif(10))
>>>> test<-round(test,2)
>>>>
>>>> #Make array ragged
>>>> test$C3[2]<-0;test$C4[2]<-0;test$C5[2]<-0;test$C6[2]<-0
>>>> test$C4[3]<-0;test$C5[3]<-0;test$C6[3]<-0
>>>> test$C6[7]<-0
>>>> test$C4[8]<-0;test$C5[8]<-0;test$C6[8]<-0
>>>>
>>>> test
>>>>
>>>> #C1 always the same so calculate it only once
>>>> StartCol <- which(names(test)=="C1")
>>>>
>>>> #Print row 3 explicitly
>>>> test[3,][StartCol :(which(test[3,] == 0.0)[1]-1)]
>>>>
>>>> #Row 6 fails because 0 is not found
>>>> test[6,][StartCol :(which(test[6,] == 0.0)[1]-1)]
>>>>
>>>> EndCol <- which(test[6,] == 0.0)[1]-1
>>>> EndCol
>>>>
>>> It's getting a bit Baroque, but here is a solution that handles an NA:
>>> test[6,][StartCol :ifelse(is.na( which(test[6,] == 0.0)[1]) ,
>>>                              ncol(test),   which(test[6,] == 0.0)[1]-1 )
>>>            ]
>>> #####-----
>>>    C1   C2   C3   C4   C5   C6
>>> 6 0.33 0.84 0.51 0.86 0.84 0.15
>>> Maybe an R-meister can offer something more compact?
>>
>>
>> So let's wait for some R-meister, I'd write even more ....
>>
>> Reason: testing for exactly zero after possible calculations is a bit 
>> dangerous and ifelse() is designed for vectorized operations but is 
>> not efficient for scalar operations, particularly since both 
>> expressions are evaluated, so if() else would be preferable, but we 
>> could use min() instead. Finally, a:b could end up in 5:3 without a 
>> warning and I'd use seq() instead.
>>
>> Hence I'd prefer:
>>
>> temp <- which(sapply(test[6,],
>>                      function(x, y) isTRUE(all.equal(x, y)), 0))[1]
> 
> This appears to be a learning moment for me. Do I have it correct that 
> the first argument to sapply, the vector test[6,], gets passed 
> element-wise to the first parameter of the function, x, 


Yes.


> and the second 
> argument, 0, is getting passed via recycling to the second parameter, y, 
> through the , ...)  mechanism of the sapply function?


No, each time the whole thing (which is just 0 here) is passed on by 
sapply to the function as y, not via recycling.
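
A minimal sketch of that behaviour, using made-up values rather than the 
test data from this thread: whatever follows the function in the sapply() 
call is handed over unchanged on every call, so a length-2 extra argument 
arrives as length 2 each time. The names len_y, row and is_zero exist only 
for this illustration.

```r
# Each call receives the complete extra argument as y, whatever its length.
len_y <- sapply(1:3, function(x, y) length(y), c(10, 20))
len_y

# For a one-row data frame, sapply walks over the columns, so x is one cell
# and y is the same 0 on every call.
row <- data.frame(C1 = 0.33, C2 = 0, C3 = 0.51)
is_zero <- sapply(row, function(x, y) isTRUE(all.equal(x, y)), 0)
is_zero

# Position of the first column that is numerically zero.
which(is_zero)[1]
```

If y were recycled, the first call would report a length of 1 for y, which 
it does not.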



>> test[6, seq(from = StartCol,
>>             to = min(c(temp - 1, ncol(test)), na.rm = TRUE), by = 1)]
> 
> I had tried a min() solution and got Inf in return when there was an NA 
> in the vector, but did not realize that it had an na.rm mode.
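

For what it is worth, a small sketch of how min() treats missing values. 
The numbers are invented for illustration (8 stands in for ncol(test)), 
and the variable names are not from the code above.

```r
# which(...)[1] yields NA when nothing matches; integer(0)[1] mimics that.
first_zero <- integer(0)[1]

# Without na.rm the missing value propagates to the result.
with_na <- min(c(first_zero - 1, 8))

# With na.rm = TRUE the NA is dropped before the minimum is taken.
without_na <- min(c(first_zero - 1, 8), na.rm = TRUE)

# Inf shows up only when nothing is left after the NAs are removed
# (R also issues a warning in that case, suppressed here).
nothing_left <- suppressWarnings(min(c(NA_real_, NA_real_), na.rm = TRUE))
```

Including ncol(test) in the vector guarantees that at least one non-missing 
value always remains.
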
> 
> Thanks for the meisterhaft corrections.


:-)
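
Putting the pieces together, here is a rough sketch of the same idea 
wrapped in a function. The helper name leading_nonzero, its arguments and 
the small data frame are assumptions made for this example only. I also 
added drop = FALSE, which the line above does not have, so that a single 
remaining column still comes back as a data frame.

```r
# Hypothetical helper: columns of row i from start_col up to, but not
# including, the first value that is numerically zero.
leading_nonzero <- function(df, i, start_col) {
  is_zero <- sapply(df[i, ], function(x, y) isTRUE(all.equal(x, y)), 0)
  first_zero <- which(is_zero)[1]        # NA if the row contains no zero
  end_col <- min(c(first_zero - 1, ncol(df)), na.rm = TRUE)
  # seq() with by = 1 stops with an error if end_col < start_col,
  # whereas start_col:end_col would silently count downwards.
  df[i, seq(from = start_col, to = end_col, by = 1), drop = FALSE]
}

# Made-up data in the same shape as the example from the thread.
small <- data.frame(A = 1:3, B = 100,
                    C1 = c(0.2, 0.4, 0.6),
                    C2 = c(0.1, 0.0, 0.3),
                    C3 = c(0.5, 0.0, 0.0))
start_col <- which(names(small) == "C1")

r1 <- leading_nonzero(small, 1, start_col)   # no zero in this row
r2 <- leading_nonzero(small, 2, start_col)   # zero right after C1
r3 <- leading_nonzero(small, 3, start_col)   # zero in the last column

# The colon operator counts down without complaint; seq() refuses.
down <- 5:3
seq_fails <- inherits(try(seq(from = 5, to = 3, by = 1), silent = TRUE),
                      "try-error")
```

Whether an error or an empty result is the better outcome for a row that 
starts with zero depends on what is done with the rows afterwards.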


Uwe


>>
>>
>> Best,
>> Uwe Ligges
> 
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>



