[R] [External] Help with sub-setting

Burgess, Jamie Jamie.Burgess using liverpool.ac.uk
Tue May 26 14:41:18 CEST 2020


Dear all,


Apologies for the late reply - I have just got back from my shift. I am unfortunately a little sleep-deprived, hehe.


Hi Bert,

Thank you for your reply.


Yes, apologies - the syntax was lost in translation whilst changing the names of the groups, imported data-set file name and variables.


data <- data.frame(X = sample(1:2, 100, TRUE), Y = sample(1:2, 100, TRUE),
                   Z = rnorm(100))

by(data$Z,data[,c("X","Y")],summary)


In your example, if one of my variables recorded integer data and the other continuous data, does "1:2" specify columns, and "100" the number of entries I would like to select?
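
To make the arguments in that call concrete, here is a small sketch (the seed and the object name `x` are assumptions for illustration only). In `sample(1:2, 100, TRUE)`, `1:2` is the set of values to draw from, `100` is the number of draws, and `TRUE` requests sampling with replacement. None of these refer to columns or rows of an existing data set; the call only generates example data.

```r
set.seed(1)                  # assumed seed, only so the sketch is reproducible
x <- sample(1:2, 100, TRUE)  # 100 draws from the values 1 and 2, with replacement

length(x)                    # how many values were drawn
table(x)                     # how often each of the two values occurs
```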


Dear Richard,


Thank you for your reply.


I have had previous success sub-grouping by one variable using the following:


group1<-subset(dataset1,dataset1$A==1)


I have subsequently been summarising the data using:


table(group1$variable) or summary(group1$variable)


Using your suggestion I have managed to sub-group using the following:


GroupAB <- subset(data, A == 1 & !is.na(B))




I will also try your suggestion, datasubset <- data[data$A == 1 & data$B == 1, ]; it is much appreciated. Does my entry do the same thing as yours?
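
A minimal sketch of how the two filters differ, using a hypothetical toy data frame (`toy`, with columns `A`, `B` and `Z`, is an assumption and not the real data set). The `is.na()` version keeps every row where A is 1 and B has any recorded value, whereas the `data$B == 1` version keeps only rows where B is exactly 1. In addition, `subset()` drops rows where the condition evaluates to NA, while plain `[` indexing returns them as all-NA rows unless the condition is wrapped in `which()`.

```r
# Hypothetical toy data; A and B stand in for the two grouping variables
toy <- data.frame(A = c(1, 1, 1, 2, NA),
                  B = c(1, 2, NA, 1, 1),
                  Z = c(10, 20, 30, 40, 50))

# Rows where A is 1 and B is recorded (any non-missing value)
ab_present <- subset(toy, A == 1 & !is.na(B))

# Rows where A is 1 and B is exactly 1; NA conditions come back as NA rows
ab_equal <- toy[toy$A == 1 & toy$B == 1, ]

# which() keeps only the rows where the condition is TRUE
ab_equal_clean <- toy[which(toy$A == 1 & toy$B == 1), ]

nrow(ab_present)
nrow(ab_equal)
nrow(ab_equal_clean)
```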


I thought the problem was to do with the size of my data-set (4.9 GB) and the presence of ~500,000 entries. However, as another command worked, I am unsure what the problem was.

I am only actually interested in around one third of these, which is the reason I wish to sub-group by the two variables I have selected.

I was wondering why this new script worked.



Kind regards,


Jamie



________________________________
From: Bert Gunter <bgunter.4567 using gmail.com>
Sent: 25 May 2020 18:36:18
To: Richard M. Heiberger
Cc: Burgess, Jamie; r-help using r-project.org
Subject: Re: [R] [External] Help with sub-setting

Yes. In particular:

data$variable==1 & data

makes no sense (data is a data frame). A typo perhaps? Or as Richard indicated, consult references/tutorials to learn proper syntax for (vectorized) predicates.
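
As an illustration of the difference, here is a sketch on a hypothetical toy data frame (the names `toy`, `other`, `ok` and `cellwise` are assumptions, not the poster's data). A vectorized predicate yields one logical value per row, whereas combining a vector with the whole data frame using `&` is evaluated column by column and yields one value per cell. On a data frame with ~500,000 rows the cell-wise result is far larger than a row-wise predicate, which may have contributed to the allocation error, though that link is an assumption.

```r
# Hypothetical toy data with numeric columns only
toy <- data.frame(variable = c(1, 2, 1, 2), other = c(0, 1, 1, 0))

# A proper row-wise predicate: one TRUE/FALSE per row
ok <- toy$variable == 1 & toy$other == 1
length(ok)

# Vector & data frame: the result has one entry per cell of the data frame
cellwise <- (toy$variable == 1) & toy
dim(cellwise)
```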

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Mon, May 25, 2020 at 10:20 AM Richard M. Heiberger <rmh using temple.edu> wrote:
I think the syntax you are looking for is

datasubset <- data[data$A == 1 & data$B == 1, ]

This gives the subset of your original data for variable A with value
1 and variable B with value 1.
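
Once the subset exists, it can be summarised against a third variable. A minimal sketch, assuming simulated data and a hypothetical third column `C` (the seed, the data and the column names are all illustrative):

```r
set.seed(42)   # assumed seed, only so the sketch is reproducible
toy <- data.frame(A = sample(1:2, 20, TRUE),
                  B = sample(1:2, 20, TRUE),
                  C = rnorm(20))

# Keep rows where both A and B equal 1 (toy has no missing values)
datasubset <- toy[toy$A == 1 & toy$B == 1, ]

summary(datasubset$C)              # numeric summary of the third variable
table(datasubset$A, datasubset$B)  # check that only the A = 1, B = 1 cell remains
```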


On Mon, May 25, 2020 at 12:57 PM Burgess, Jamie
<Jamie.Burgess using liverpool.ac.uk> wrote:
>
> Dear all,
>
> I hope this message finds you well. I am currently trying to subset my data by two variables, so far, I have tried two different ways to stratify participants into groups. I would like to use the "summary" and "table" arguments to characterise the data of participants based on the presence of two variables and summarise this sub-set against a third variable.
> I have used this method:
>
> dgb001<-subset(data,data$variable==1 & data,data$variable)
>
>
> However, I get the following error: "Error: cannot allocate vector of size 16.0 Gb". Is there another method I can try?
>
>
> Kind regards,
>
>
> Jamie Burgess
>
> PhD Student Endocrinology and Diabetes
>
> University of Liverpool
>
> Aintree University Hospital &
>
> The Walton Centre
>
> Institute of Ageing & Chronic Disease
>
> 0151 529 5936
>
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help using r-project.org<mailto:R-help using r-project.org> mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
