[R] Filtering using multiple rows in dplyr

Ulrik Stervbo U|r|k@Stervbo @end|ng |rom ruhr-un|-bochum@de
Thu May 31 10:17:10 CEST 2018


Hi Sumitrajit,

dplyr has a function for this: filter().

For each group you can count the rows with SNR > 3 (sum() on a logical
vector counts the TRUE values). You can filter on the result directly or
add a column as you plan. The latter might make your intention clearer.
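
A minimal sketch of the idea, using made-up toy data that only borrows the column names from your data frame. Note that a plain count does not check that the three values are consecutive, so the sketch also uses base R's rle() for that part; has_run() is a hypothetical helper written for this example, not a dplyr function, and sorting by L2 within each group is an assumption about what "consecutive" means here.

```r
library(dplyr)

# Toy data with the same column names as in the question (values are made up).
h <- data.frame(
  subject = rep(c("S1", "S2"), each = 5),
  freq    = 2L,
  L2      = rep(seq(0, 8, by = 2), times = 2),
  SNR     = c(5, 4, 6, 1, 2,    # S1: three consecutive values above 3
              5, 1, 4, 1, 6)    # S2: three values above 3, never consecutive
)

# Hypothetical helper: TRUE if x contains a run of at least n TRUE values.
has_run <- function(x, n = 3) {
  r <- rle(x)
  any(r$values & r$lengths >= n, na.rm = TRUE)
}

h_flagged <- h %>%
  group_by(subject, freq) %>%
  arrange(L2, .by_group = TRUE) %>%            # order rows by L2 within each group
  mutate(
    n_above = sum(SNR > 3, na.rm = TRUE),      # plain count of SNR > 3 per group
    clean   = if_else(has_run(SNR > 3), "Y", "N")
  ) %>%
  ungroup()

# Keep only the rows of groups that qualified.
h_clean <- filter(h_flagged, clean == "Y")
```

In the toy data both groups have three values above 3, but only S1 has them at consecutive L2 values, so only S1 ends up with clean == "Y". If the count alone is enough for your purpose, filter(n_above >= 3) would do.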

HTH
Ulrik

On 2018-05-30 18:18, Sumitrajit Dhar wrote:

> Hi Folks,
> 
> I have just started using dplyr and could use some help getting
> unstuck. It could well be that dplyr is not the package to be using,
> but let me just pose the question and seek your advice.
> 
> Here is my basic data frame.
> 
> head(h)
>    subject ageGrp ear hearingGrp sex freq L2       Ldp     Phidp        NF       SNR
> 1 HALAF032      A   L          A   F    2  0 -23.54459  55.56005 -43.08282 19.538232
> 2 HALAF032      A   L          A   F    2  2 -32.64881  86.22040 -23.31558 -9.333224
> 3 HALAF032      A   L          A   F    2  4 -18.91058  42.12168 -35.60250 16.691919
> 4 HALAF032      A   L          A   F    2  6 -23.85937 297.94499 -20.70452 -3.154846
> 5 HALAF032      A   L          A   F    2  8 -14.45381 181.75329 -24.17094  9.717128
> 6 HALAF032      A   L          A   F    2 10 -20.42384  67.12998 -35.77357 15.349728
> 
> 'subject' and 'freq' together make a set of data and I am interested
> in how the last four columns vary as a function of L2. So I grouped by
> 'subject' and 'freq' and can look at basic summaries.
> 
> h_byFunc <- h %>% group_by(subject, freq)
> 
>> h_byFunc %>% summarize(l = mean(Ldp), s = sd(Ldp) )
> 
> # A tibble: 1,175 x 4
> # Groups:   subject [?]
>    subject   freq       l     s
>    <fct>    <int>   <dbl> <dbl>
>  1 HALAF032     2 -13.8    8.39
>  2 HALAF032     4 -15.8   11.0
>  3 HALAF032     8 -23.4    6.51
>  4 HALAF033     2 -14.2    9.64
>  5 HALAF033     4 -12.3    8.92
>  6 HALAF033     8  -6.55  12.3
>  7 HALAF036     2 -14.9   12.6
>  8 HALAF036     4 -16.7   11.2
>  9 HALAF036     8 -21.7    6.56
> 10 HALAF039     2   0.242 12.4
> # ... with 1,165 more rows
> 
> What I would like to do is filter some groups out based on various
> criteria. For example, if SNR > 3 at three consecutive L2 values within
> a group, that group qualifies and I would add a column, say "clean", and
> assign it the value "Y". Is there a way to do this in dplyr, or should I
> be looking at a different approach?
> 
> Thanks in advance for your help.
> 
> Regards,
> Sumit
> 