[Rd] problem using "by" with custom function?

thalarctos kmiddel at gmail.com
Mon Dec 10 19:57:23 CET 2007

I'm relatively new to R and R development, so please forgive me for any
obvious errors.  

What I am trying to do is use the command dpik within the package KernSmooth
to estimate bandwidth parameters for GPS telemetry data.  I have been able
to get this to work on a case by case basis without any problem, but would
like to extend this so that I can batch process many different animals for
pre-determined time periods (months of the year).  I have written a function
that first standardises the data based on the X and Y values, then uses dpik
to calculate the bandwidth for each variable.  The result is the average of
the X and Y estimates.  I then use the command "by" to run the function on a
data frame which has X and Y in columns 5 and 6, and a grouping variable
"animonth" to separate the data by animal and month.  When I run the by
command on a small data table (only a few different levels of animonth) it
works perfectly.

The problem is when I try to run it on all the data (or more than a few
levels) at once.  I get the error posted below.  However, if I run a simple
built-in function like summary within by, there is no error.  Can anyone
provide me with some assistance in interpreting this error?  Any suggestions
on alternative commands to use would be appreciated as well, as I'm not
committed to using by; it was just the one that seemed to work.
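
One possible alternative to by is to split the data by animonth and apply the function level by level inside tryCatch, so that a failing level is reported by name instead of stopping the whole run. The sketch below is only an illustration in base R: the data frame toy, its values, and the function est (a hypothetical stand-in for kern.est that fails when a coordinate has no spread) are assumptions, not part of the original code.

```r
# Toy data; values are invented for illustration.
toy <- data.frame(
  animonth = rep(c("A_1", "A_2"), times = c(5, 2)),
  x = c(1, 3, 4, 7, 9, 2, 2),
  y = c(2, 5, 6, 8, 9, 4, 4)
)

# Hypothetical stand-in for kern.est: stops when x or y has no spread.
est <- function(d) {
  if (sd(d$x) == 0 || sd(d$y) == 0) stop("no spread in x or y")
  mean(c(sd(d$x), sd(d$y)))
}

# One data frame of coordinates per level of animonth.
groups <- split(toy[, c("x", "y")], toy$animonth)

# Apply est to each level; on error, print the level name and return NA.
res <- sapply(names(groups), function(g) {
  tryCatch(est(groups[[g]]), error = function(e) {
    message(g, ": ", conditionMessage(e))
    NA_real_
  })
})

# Names of the levels for which the function failed.
failed <- names(res)[is.na(res)]
```

With the real data, est would be replaced by kern.est and toy by w079.all; the levels listed in failed are the ones to inspect more closely.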

Data table example:
    uniqid  animal month animonth       x        y
1   11748   W079    12  W079_12 1494206 12134126
2   11749   W079    12  W079_12 1494123 12134051
3   11750   W079    12  W079_12 1493639 12133705
4   11751   W079    12  W079_12 1493353 12135892
5   11752   W079    12  W079_12 1495157 12137797
6   11753   W079    12  W079_12 1498039 12132112
7   11754   W079    12  W079_12 1497991 12131842
8   11755   W079    12  W079_12 1497918 12131631
9   11756   W079    12  W079_12 1498019 12131638
10  11757   W079    12  W079_12 1498017 12131633

Function for calculating bandwidth:
> kern.est
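function(data) {
  # Standardise each coordinate by its standard deviation
  x.var <- data$x / sd(data$x)
  y.var <- data$y / sd(data$y)
  # Plug-in bandwidth (dpik) for each standardised coordinate
  dpik.x <- dpik(x.var, gridsize = round((max(data$x) - min(data$x))/100))
  dpik.y <- dpik(y.var, gridsize = round((max(data$y) - min(data$y))/100))
  # Return the average of the two bandwidths
  (dpik.x + dpik.y)/2
}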

by command used:
junk3 <- by(w079.all[,5:6], w079.all$animonth, kern.est)

output from small files (only a few levels of animonth):
w079.all$animonth: W079_1
[1] 0.2117635
w079.all$animonth: W079_12
[1] 0.2837849

Error on larger files:
Error in rep(0, P - 2 * L - 1) : invalid 'times' argument
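
One thing that may be worth checking is the gridsize that kern.est passes to dpik, since round(range/100) becomes very small (or zero) for levels of animonth with few fixes or little movement. Whether this is what triggers the rep() error is an assumption and has not been confirmed here. A minimal sketch in base R, using made-up data (toy) and a hypothetical helper (grid.size) that repeats the gridsize formula from kern.est:

```r
# Toy data standing in for w079.all; values are made up for illustration.
set.seed(1)
toy <- data.frame(
  animonth = rep(c("A_1", "A_2", "A_3"), times = c(50, 50, 2)),
  x = c(rnorm(50, 1494000, 2000), rnorm(50, 1497000, 2000),
        1498017, 1498019),
  y = c(rnorm(50, 12134000, 2000), rnorm(50, 12132000, 2000),
        12131633, 12131638)
)

# Hypothetical helper: the gridsize kern.est computes for one coordinate.
grid.size <- function(v) round((max(v) - min(v)) / 100)

# One row per level of animonth: number of fixes and both grid sizes.
check <- do.call(rbind, lapply(split(toy, toy$animonth), function(d) {
  data.frame(n = nrow(d), grid.x = grid.size(d$x), grid.y = grid.size(d$y))
}))
print(check)
```

Levels where grid.x or grid.y is zero or only a handful of points would be the first candidates to look at.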

Thanks in advance for any help,
View this message in context: http://www.nabble.com/problem-using-%22by%22-with-custom-function--tp14259137p14259137.html
Sent from the R devel mailing list archive at Nabble.com.
