[R] problems subsetting

Martin Tomko martin.tomko at geo.uzh.ch
Thu Nov 18 16:25:38 CET 2010


Hi Gerrit,
indeed, that works. Excellent tip!

For reference, I did this:

subset1<-subset(summarystats,(Type==1)&(Class==1)&(Category==1))

I am still not totally sure when one uses "&" and when "&&" - I was 
under the impression that && stands for logical AND....
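
On the "&" versus "&&" question, a minimal sketch may help. Both are logical AND; the difference is that "&" works element by element on whole vectors (which is what row selection needs), while "&&" is meant for single values, as in an if() condition. The data frame below is a toy stand-in: the column names follow this thread, but the values are invented for illustration.

```r
# Toy data frame; column names follow the thread, the values are invented.
summarystats <- data.frame(
  Dataset  = c("a", "b", "c", "d"),
  Class    = c(2, 1, 1, 1),
  Type     = c(1, 1, 2, 1),
  Category = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# "&" compares element by element and returns one TRUE/FALSE per row.
keep <- (summarystats$Class == 1) & (summarystats$Type == 1) &
  (summarystats$Category == 1)
print(keep)

# The same row selection written with "[" and with subset().
by_bracket <- summarystats[keep, ]
by_subset  <- subset(summarystats, Class == 1 & Type == 1 & Category == 1)

# "&&" is for single values. Older R versions silently used only the first
# element of each side; R 4.3.0 and later raise an error for longer vectors.
# Taking the first elements explicitly reproduces the old behaviour here.
first_only <- (summarystats$Class[1] == 1) && (summarystats$Category[1] == 1)
print(first_only)

# A single FALSE as row index selects no rows: the "empty data frame".
print(nrow(summarystats[first_only, ]))
```

In this toy data the first row has Class 2, so the "&&" condition is a single FALSE and the bracket selection comes back empty, which would match the symptom described in the original question.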

Thanks a lot.


Martin

On 11/18/2010 3:58 PM, Gerrit Eichner wrote:
> Hello, Martin,
>
> as to your first problem, look at function subset(), and particularly 
> at its argument "subset".
>
> HTH,
>
> Gerrit
>
>
> On Thu, 18 Nov 2010, Martin Tomko wrote:
>
>> Dear all,
>> I have searched the forums for an answer - and there are plenty of 
>> questions along the same lines - but none of the approaches shown 
>> worked for my problem:
>>
>> I have a data frame that I get from a csv:
>>
>> summarystats<-as.data.frame(read.csv(file=f_summary));
>>
>> where I have the columns Dataset, Class, Type, Category,..
>> Problem1:  I want to find a subset of this frame, based on values in 
>> multiple columns
>> What I do currently is:
>>
>> subset1 <- summarystats
>> subset1<-subset1[subset1$Class == 1,]
>> subset1<-subset1[subset1$Type == 1,]
>> subset1<-subset1[subset1$Category == 1,]
>>
>> Now, this works, but is UGLY! I tried using "&&" or "&", for instance:
>>
>> subset1<-subset1[ (subset1$Class == 1)&& (subset1$Category == 1),]
>>
>> but it returns an empty data frame.
>>
>> Anyway, the main problem is
>> Problem2:
>> I have a second data frame - a square matrix (rownames == colnames), 
>> distm:
>>
>> distm<-read.table(file=f_simmatrix, sep = ",");
>>
>> What I want is to select ONLY the row and column entries matching the 
>> above subset1:
>>
>> subset2<-distm[subset1$Dataset,subset1$Dataset]
>>
>> This returns a matrix of the correct size, but with incorrect entries 
>> (established by visual inspection).
>>
>> this is the same as:
>> selectedrows<-as.vector(subset1$Dataset)
>> subset2<-distm[selectedrows,selectedrows]
>>
>> also verified using:
>> rownames(subset2)%in% selectedrows
>>  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>> [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>> [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>> [37] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
>>
>> What am I missing?
>>
>> Thanks
>> Martin
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide 
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
> ---------------------------------------------------------------------
> AOR Dr. Gerrit Eichner               Mathematical Institute, Room 212
> gerrit.eichner at math.uni-giessen.de   Justus-Liebig-University Giessen
> Tel: +49-(0)641-99-32104          Arndtstr. 2, 35392 Giessen, Germany
> Fax: +49-(0)641-99-32109        http://www.uni-giessen.de/cms/eichner
> ---------------------------------------------------------------------
>
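
Problem 2 in the quoted message is not answered in this thread. One possible explanation, offered only as an assumption, is that the index is being used by position rather than by name: in R a numeric index picks rows and columns by position, while a character index picks them by their names. The sketch below illustrates that difference on a toy matrix; the IDs and values are invented, and a plain matrix stands in for the data frame returned by read.table().

```r
# Toy square matrix whose row and column names are dataset IDs.
# IDs and values are invented; the rows are deliberately not in ID order.
ids   <- c("3", "1", "4", "2")
distm <- matrix(1:16, nrow = 4, dimnames = list(ids, ids))

# Numeric dataset IDs, as subset1$Dataset might hold (an assumption).
selected <- c(1, 2)

# A numeric index selects by POSITION: rows/columns 1 and 2,
# which here carry the names "3" and "1".
by_position <- distm[selected, selected]
print(rownames(by_position))

# A character index selects by NAME: the rows/columns labelled "1" and "2".
by_name <- distm[as.character(selected), as.character(selected)]
print(rownames(by_name))

# Check that every requested ID exists before trusting the result.
stopifnot(all(as.character(selected) %in% rownames(distm)))
```

Both results have the same size, but only the second holds the requested entries. If the dataset names sit in the first row and first column of the file, read.table() may also need header = TRUE and row.names = 1 so that they become the dimnames; whether that applies to the file used here is not stated in the thread.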


-- 
Martin Tomko
Postdoctoral Research Assistant

Geographic Information Systems Division
Department of Geography
University of Zurich - Irchel
Winterthurerstr. 190
CH-8057 Zurich, Switzerland

email: 	martin.tomko at geo.uzh.ch
site:	http://www.geo.uzh.ch/~mtomko
mob: 	+41-788 629 558
tel: 	+41-44-6355256
fax: 	+41-44-6356848


