Re: [R] Rank and extract data from a series
"Unternährer Thomas, uth"
uth at zhwin.ch
Tue Sep 23 14:23:48 CEST 2003
Hi,
>I would like to rank a time-series of data, extract the top ten data items from this series, determine the
>corresponding row numbers for each value in the sample, and take a mean of these *row numbers* (not the data).
>I would like to do this in R, rather than pre-process the data on the UNIX command line if possible, as I need to
>calculate other statistics for the series.
>I understand that I can use 'sort' to order the data, but I am not aware of a function in R that would allow me
>to extract a given number of these data and then determine their positions within the original time series.
>e.g.
>Time series:
>1.0 (row 1)
>4.5 (row 2)
>2.3 (row 3)
>1.0 (row 4)
>7.3 (row 5)
>Sort would give me:
>1.0
>1.0
>2.3
>4.5
>7.3
>I would then like to extract the top two data items:
>4.5
>7.3
>and determine their positions within the original (unsorted) time series:
>4.5 = row 2
>7.3 = row 5
>then take a mean:
>2 and 5 = 3.5
>Thanks in advance.
>James Brown
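X <- c(1, 4.5, 2.3, 1, 7.3)
X1 <- sort(X, decreasing=TRUE)[1:2]  # the two largest values
X2 <- match(X1, X)                   # their positions in X (first match only, so tied values repeat an index)
mean(X2)                             # mean of the row numbers: 3.5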
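If the series can contain tied values among the top items, match() reports the first position for each of them, so the same row number may appear more than once. A minimal sketch of one alternative, assuming order() is acceptable here; the names x, top_n, idx and y are chosen for illustration and are not from the original post:

```r
# Data from the example in the question
x <- c(1, 4.5, 2.3, 1, 7.3)
top_n <- 2  # hypothetical name: how many of the largest items to keep

# order() returns the positions that would sort x, so the first
# top_n entries are the row numbers of the largest values.
idx <- order(x, decreasing = TRUE)[seq_len(top_n)]

idx        # row numbers in the original series
x[idx]     # the corresponding values
mean(idx)  # mean of the row numbers

# With tied top values, order() reports each position once,
# whereas match() repeats the first matching position.
y <- c(7.3, 1, 7.3)
idx_order <- order(y, decreasing = TRUE)[1:2]
idx_match <- match(sort(y, decreasing = TRUE)[1:2], y)
```

This works on the positions directly, so no lookup back into the original series is needed.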
Hope this helps
Thomas
___________________________________________
James Brown
Cambridge Coastal Research Unit (CCRU)
Department of Geography
University of Cambridge
Downing Place
Cambridge
CB2 3EN, UK
Telephone: +44 (0)1223 339776
Mobile: 07929 817546
Fax: +44 (0)1223 355674
E-mail: jdb33 at cam.ac.uk
E-mail: james_510 at hotmail.com
http://www.geog.cam.ac.uk/ccru/CCRU.html
___________________________________________
On Wed, 10 Sep 2003, Jerome Asselin wrote:
> On September 10, 2003 04:03 pm, Kevin S. Van Horn wrote:
> >
> > Your method looks like a naive reimplementation of integration, and
> > won't work so well for distributions that have the great majority of
> > the probability mass concentrated in a small fraction of the sample
> > space. I was hoping for something that would retain the
> > adaptability of integrate().
>
> Yesterday I suggested using approxfun(). Did you consider my
> suggestion? Below is an example.
>
> N <- 500
> x <- rexp(N)
> y <- rank(x)/(N+1)
> empCDF <- approxfun(x,y)
> xvals <- seq(0,4,.01)
> plot(xvals,empCDF(xvals),type="l",
>      xlab="Quantile",ylab="Cumulative Distribution Function")
> lines(xvals,pexp(xvals),lty=2)
> legend(2,.4,c("Empirical CDF","Exact CDF"),lty=1:2)
>
>
> It's possible to tune some parameters of approxfun() to better
> match your preferences. Have a look at help(approxfun) for
> details.
>
> HTH,
> Jerome Asselin
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
>
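As one illustration of the approxfun() parameters mentioned in the quoted message, the sketch below uses the yleft and yright arguments so that the interpolated CDF returns 0 and 1 outside the sampled range instead of NA. The fixed seed and the names cdf_default and cdf_clamped are assumptions added here, not part of the quoted example:

```r
set.seed(1)  # fixed seed, only so the sketch is reproducible
N <- 500
x <- rexp(N)
y <- rank(x) / (N + 1)

# Default behaviour: NA outside [min(x), max(x)]
cdf_default <- approxfun(x, y)

# yleft / yright set the values returned below min(x) and above max(x)
cdf_clamped <- approxfun(x, y, yleft = 0, yright = 1)

cdf_default(-1)   # outside the sampled range
cdf_clamped(-1)   # below the smallest observation
cdf_clamped(100)  # far above the largest observation
```

Whether clamping to 0 and 1 is appropriate depends on how the interpolated function is used afterwards; help(approxfun) describes the other options.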