[R] help with grouping data and calculating the means

Bert Gunter bgunter.4567 using gmail.com
Thu Nov 15 21:50:55 CET 2018


On further thought -- and subject to my prior interpretation -- I
think a foolproof way of truncating to 4 decimal digits is to treat
the values as character strings rather than numerics and use regex
operations:

> with(df, tapply(TK.QUADRANT, sub("(\\.[[:digit:]]{4}).*", "\\1", as.character(LAT)), mean))
54.9249 54.9749 55.0249 55.0749
10158.5 10156.5  9163.5  9161.5

I should have realized this before!!!!
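
A minimal sketch of the regex truncation above, rebuilt on the posted
example data so that it runs on its own. The helper name trunc4 is
hypothetical, and the data frame is built directly with the TK.QUADRANT
column name used in the call above. It assumes that as.character() prints
the values in plain decimal notation (not scientific notation); values
with fewer than four decimals pass through unchanged.

```r
# Example data from the original post (LON omitted, it is not used here).
df <- data.frame(
  TK.QUADRANT = c(9161, 9162, 9163, 9164, 10152, 10154, 10161, 10163),
  LAT = c(55.07496, 55.07496, 55.02495, 55.02496,
          54.97496, 54.92495, 54.97496, 54.92496)
)

# Hypothetical helper: keep the decimal point plus the first four digits
# after it and drop everything that follows.
trunc4 <- function(x) sub("(\\.[[:digit:]]{4}).*", "\\1", as.character(x))

key <- trunc4(df$LAT)                     # character grouping key
res <- tapply(df$TK.QUADRANT, key, mean)  # group means, sorted by key
print(res)
```

Because the key is a string, negative values are cut toward zero (for
example, trunc4(-1.23456) gives "-1.2345"), so the sign issue with
floor() discussed below does not arise.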

Cheers,
Bert





Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Thu, Nov 15, 2018 at 12:19 PM Bert Gunter <bgunter.4567 using gmail.com> wrote:
>
> On Thu, Nov 15, 2018 at 10:40 AM Boris Steipe <boris.steipe using utoronto.ca> wrote:
> >
> > Use round() with the appropriate "digits" argument. Then use unique() to define your groups.
>
> No.
> > round(c(.124,.126),2)
> [1] 0.12 0.13
>
> As I understand it, the OP said he wanted the last decimal to be ignored.
>
> The OP also did not specify what he wanted to calculate means of. I
> assume TK-QUADRANT. It is also not clear whether the calculations are
> to be done separately by latitude and longitude, or both together.
> I'll assume separately. In that case, the TK-QUADRANT means, grouped
> by latitude truncated to 4 decimal digits, could be calculated as
> follows (using the provided example data):
> (Note: ignore all that follows if my interpretation is incorrect)
>
> > with(df, tapply(TK.QUADRANT, floor(1e4*LAT),mean))
>  549249  549749  550249  550749
> 10158.5 10156.5  9163.5  9161.5
>
> ## Note that this assumes positive values of latitude, because:
> > floor(c(-1.2,1.2))
> [1] -2  1
>
> This could easily be modified if both positive and negative values
> were used, e.g.:
> > x <-c(-1.2,1.2)
> > sign(x)*floor(abs(x))
> [1] -1  1
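
As a side note, sign(x) * floor(abs(x)) is truncation toward zero, which
base R also provides as trunc(). A small sketch of the difference for
grouping keys; the negative latitude is a made-up value, not part of the
OP's data:

```r
x <- c(-1.2, 1.2)

# The sign/floor/abs construction and trunc() both cut toward zero.
toward_zero <- sign(x) * floor(abs(x))
same_as_trunc <- identical(toward_zero, trunc(x))

# Hypothetical latitudes on both sides of the equator.
lat <- c(-54.92496, 54.92496)

key_floor <- floor(1e4 * lat)  # rounds the negative value away from zero
key_trunc <- trunc(1e4 * lat)  # drops the trailing digit for both signs

print(rbind(key_floor, key_trunc))
```

With trunc(), the two hypothetical latitudes get mirror-image keys, which
is what "ignore the last decimal" suggests for negative coordinates.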
>
> Confession: I suspect that this multiply-and-floor() procedure
> might fail with lots of decimal places due to the usual issues of
> binary representations of decimals. But maybe it fails even here. If
> so, I would appreciate someone pointing this out and, if possible,
> providing a better strategy.
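
The concern is justified in general, although the posted latitudes are
not affected: multiplied by 1e4, each one ends in .5 or .6, far from an
integer boundary. A small sketch of the failure mode on a well-known
case, plus one possible guard. The guard (rounding to 6 digits before
floor()) is an assumption that works only when the data carry far fewer
than 6 digits beyond the cut; it is not a general fix.

```r
# (0.1 + 0.7) * 10 is stored as a double slightly below 8,
# so floor() returns 7 rather than the intended 8.
naive <- floor((0.1 + 0.7) * 10)

# Possible guard (assumption): round away representation noise first.
guarded <- floor(round((0.1 + 0.7) * 10, 6))

# On the posted latitudes, the plain and the guarded version agree.
LAT <- c(55.07496, 55.07496, 55.02495, 55.02496,
         54.97496, 54.92495, 54.97496, 54.92496)
agree <- identical(floor(1e4 * LAT), floor(round(1e4 * LAT, 6)))

print(c(naive = naive, guarded = guarded))
print(agree)
```

Working on the character representation, as in the regex approach, avoids
the multiplication altogether.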
>
> Cheers,
> Bert
>
>
>
> >
> > HTH,
> > B.
> >
> >
> > > On 2018-11-15, at 11:48, sasa kosanic <sasa.kosanic using gmail.com> wrote:
> > >
> > > Dear All,
> > >
> > > I would very much appreciate help with the following:
> > > I need to calculate the mean of different lat/long points that should be
> > > grouped.
> > > However, I would like R to exclude values that differ in
> > > only the last decimal.
> > > So instead of 4 values in a group, it would calculate the mean of only 3
> > > (excluding the ones that differ in only one decimal).
> > > # construct the dataframe
> > > `TK-QUADRANT` <- c(9161,9162,9163,9164,10152,10154,10161,10163)
> > > LAT <- c(55.07496,55.07496,55.02495,55.02496,
> > >          54.97496,54.92495,54.97496,54.92496)
> > > LON <- c(8.37477,8.458109,8.37477,8.45811,
> > >          8.291435,8.291437,8.374774,8.374774)
> > > df <- data.frame(`TK-QUADRANT`=`TK-QUADRANT`,LAT=LAT,LON=LON)
> > >
> > >
> > > I would like to group the data and calculate the means by group, but in a
> > > way that excludes every number that differs in only the last decimal.
> > >
> > >
> > > Please also see the attached PDF example.
> > >
> > > Many thanks!
> > > Best wishes,
> > > Sasha
> > >
> > > --
> > >
> > > Dr Sasha Kosanic
> > > Ecology Lab (Biology Department)
> > > Room M644
> > > University of Konstanz
> > > Universitätsstraße 10
> > > D-78464 Konstanz
> > > Phone: +49 7531 883321 & +49 (0)175 9172503
> > >
> > > http://cms.uni-konstanz.de/vkleunen/
> > > https://tinyurl.com/y8u5wyoj
> > > https://tinyurl.com/cgec6tu
> > > <dataset example.pdf>
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
> >


