[R] problems subsetting

David Winsemius dwinsemius at comcast.net
Thu Nov 18 18:12:10 CET 2010


On Nov 18, 2010, at 10:42 AM, David Winsemius wrote:

>
> On Nov 18, 2010, at 10:25 AM, Martin Tomko wrote:
>
>> Hi Gerrit,
>> indeed, that works. Excellent tip!
>>
>> For reference, I did this:
>>
>> subset1<-subset(summarystats,(Type==1)&(Class==1)&(Category==1))
>>
>> I am still not totally sure when one uses "&" and when "&&" - I  
>> was under the impression that && stands for logical AND....
>
> Both stand for logical AND. "&" is used for vectorized comparisons,  
> while "&&" will only compare the first elements of the two sides  
> (usually, but apparently not always) with a warning if there are  
> longer objects than expected.

A little bird (actually more like an eagle in these parts) has  
suggested that I mention that the reason for two different types of  
logical operators is not just to confuse the unwary: the "&&" and "||"  
versions do not evaluate their second argument when the first argument  
already determines the result. Since these forms are mostly used within  
the if( ...  && ... ){} else{} construction, this can be more efficient  
when the second argument is an involved function. It won't need to be  
evaluated if the first argument to "&&" is FALSE or the first to "||"  
is TRUE.
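
A rough sketch of that short-circuit behaviour, in case it helps. The  
function slow_check() and the counter calls are invented for this  
illustration; they stand in for any involved function and simply record  
how often it was actually called.

```r
# Hypothetical "involved" function that counts its own calls
calls <- 0
slow_check <- function() {
  calls <<- calls + 1
  TRUE
}

r1 <- FALSE && slow_check()   # left side is FALSE: slow_check() is skipped
r2 <- TRUE  || slow_check()   # left side is TRUE:  slow_check() is skipped
skipped <- calls              # no calls have happened yet

r3 <- TRUE  && slow_check()   # result depends on the right side, so it runs
r4 <- FALSE || slow_check()   # same here
ran <- calls                  # two calls in total

# A common use: guard a test that is only meaningful if the first one passes
x <- NULL
ok <- !is.null(x) && x > 0    # x > 0 is never evaluated
```

The last two lines show the usual reason to reach for "&&" inside if():  
the right-hand test is only attempted once the left-hand one has passed.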

-- 
David.
>
> > c(1,0,1,0,1) & c(0,0,1,1,-1)
> [1] FALSE FALSE  TRUE FALSE  TRUE
>
> > c(1,0,1,0,1) && c(0,0,1,1,-1)
> [1] FALSE
>
> > c(1,0,1,0,1) && c(1,0,1,1,-1)
> [1] TRUE
>
> -- 
> David.
>
>>
>> Thanks a lot.
>>
>>
>> Martin
>>
>> On 11/18/2010 3:58 PM, Gerrit Eichner wrote:
>>> Hello, Martin,
>>>
>>> as to your first problem, look at function subset(), and  
>>> particularly at its argument "subset".
>>>
>>> HTH,
>>>
>>> Gerrit
>>>
>>>
>>> On Thu, 18 Nov 2010, Martin Tomko wrote:
>>>
>>>> Dear all,
>>>> I have searched the forums for an answer - and there are plenty of  
>>>> questions along the same lines - but none of the approaches shown  
>>>> worked for my problem:
>>>>
>>>> I have a data frame that I get from a csv:
>>>>
>>>> summarystats<-as.data.frame(read.csv(file=f_summary));
>>>>
>>>> where I have the columns Dataset, Class, Type, Category,..
>>>> Problem1:  I want to find a subset of this frame, based on values  
>>>> in multiple columns
>>>> What I do currently is:
>>>>
>>>> subset1 <- summarystats
>>>> subset1<-subset1[subset1$Class == 1,]
>>>> subset1<-subset1[subset1$Type == 1,]
>>>> subset1<-subset1[subset1$Category == 1,]
>>>>
>>>> Now, this works, but is UGLY! I tried using "&&" or "&", for  
>>>> instance:
>>>> subset1<-subset1[ (subset1$Class == 1) && (subset1$Category == 1),]
>>>> but it returns an empty data frame.
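
For what it's worth, here is a small sketch of why the "&&" attempt came  
back empty. The data frame stats and its values are made up for  
illustration and are not Martin's data. With "&&" only the first  
elements were consulted, so the row index was a single FALSE. Recent R  
versions signal an error for longer arguments instead, which is why the  
sketch spells out the [1] explicitly.

```r
# Hypothetical stand-in for summarystats
stats <- data.frame(Dataset  = c("a", "b", "c", "d"),
                    Class    = c(2, 1, 1, 1),
                    Type     = c(1, 1, 1, 2),
                    Category = c(1, 1, 2, 1))

# "&" compares element by element, giving one TRUE/FALSE per row
keep <- (stats$Class == 1) & (stats$Type == 1) & (stats$Category == 1)
by_index  <- stats[keep, ]
by_subset <- subset(stats, Class == 1 & Type == 1 & Category == 1)

# What "&&" amounted to here: a test on the first elements only
first_only <- (stats$Class[1] == 1) && (stats$Category[1] == 1)
empty <- stats[first_only, ]   # a single FALSE index selects no rows
```

Both by_index and by_subset keep only the rows where all three  
conditions hold, while empty has the right columns and no rows.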
>>>>
>>>> Anyway, the main problem is
>>>> Problem2:
>>>> I have a second data frame - a square matrix (rownames ==  
>>>> colnames), distm:
>>>>
>>>> distm<-read.table(file=f_simmatrix, sep = ",");
>>>> What I want is to select ONLY the row and column entries matching  
>>>> the above subset1:
>>>>
>>>> subset2<-distm[subset1$Dataset,subset1$Dataset]
>>>>
>>>> This returns a matrix of the correct size, but with incorrect  
>>>> entries (established by visual inspection).
>>>>
>>>> this is the same as:
>>>> selectedrows<-as.vector(subset1$Dataset)
>>>> subset2<-distm[selectedrows,selectedrows]
>>>>
>>>> also verified using:
>>>> rownames(subset2)%in% selectedrows
>>>>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>>>
>>>> What am I missing?
>>>>
>>>> Thanks
>>>> Martin
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>
>>>
>>> ---------------------------------------------------------------------
>>> AOR Dr. Gerrit Eichner               Mathematical Institute, Room 212
>>> gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
>>> Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
>>> Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
>>> ---------------------------------------------------------------------
>>>
>>
>>
>> -- 
>> Martin Tomko
>> Postdoctoral Research Assistant
>>
>> Geographic Information Systems Division
>> Department of Geography
>> University of Zurich - Irchel
>> Winterthurerstr. 190
>> CH-8057 Zurich, Switzerland
>>
>> email: 	martin.tomko at geo.uzh.ch
>> site:	http://www.geo.uzh.ch/~mtomko
>> mob: 	+41-788 629 558
>> tel: 	+41-44-6355256
>> fax: 	+41-44-6356848
>>
>
> David Winsemius, MD
> West Hartford, CT
>

David Winsemius, MD
West Hartford, CT


