[R] Need a faster function to replace missing data

Tim Clark mudiver1200 at yahoo.com
Fri May 22 06:45:26 CEST 2009

Dear List,

I need some help coming up with a function that takes two data sets, determines whether a value is missing in the first, finds a value in the second that was recorded at about the same time, and substitutes that second value where the first should have been.

My problem comes from a fish tracking study.  We put acoustic tags in fish and track them for several days.  Location data is supposed to be recorded automatically every time we detect a "ping" from the fish.  Unfortunately the GPS had some problems, and sometimes the fish's depth was recorded but not its location.  Fortunately, I had a back-up GPS that was taking location data every five minutes.  I would like to merge the two files, replacing the missing values in the vscan (automatic) file with the locations from the garmin file.

Since we were getting vscan records every 1-2 seconds and garmin records every 5 minutes, I need to find the right place in the vscan file for each garmin record, i.e. the record closest in time, but no more than 5 minutes away.  I have written a function that does this.  It works with my test data but locks up my computer with my real data, which has several million vscan records and several thousand garmin records.  Is there a better way to do this?

My function and test data:


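minute.diff <- 1/24/12   # Time diff is in days, so this is 5 minutes
for (k in seq_len(nrow(myvscan))) {
  if (is.na(myvscan$Latitude[k])) {
    gap <- abs(mygarmin$DateTime - myvscan$DateTime[k])   # gap to every garmin record
    # Assumption: the original loop body is cut off here; copying the nearest garmin Latitude is the presumed intent
    if (min(gap) < minute.diff) myvscan$Latitude[k] <- mygarmin$Latitude[which.min(gap)]
  }
}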
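
The matching described above (nearest garmin record in time, accepted only if it is within 5 minutes) might also be done without a loop. The following is only a sketch in base R under several assumptions: DateTime can be treated as a plain numeric number of days, both files have a Longitude column alongside Latitude, the garmin file has at least one row, and the toy data and the function name fill_from_garmin are made up for illustration. It relies on findInterval(), which does a binary search on the sorted garmin times instead of scanning every garmin record for every vscan row.

```r
# Hypothetical toy data; DateTime is assumed to be numeric time in days.
myvscan <- data.frame(
  DateTime  = c(0.0000, 0.0010, 0.0020, 0.0150, 0.0200),
  Latitude  = c(21.30, NA, NA, NA, 21.34),
  Longitude = c(-157.80, NA, NA, NA, -157.84)   # Longitude column is an assumption
)
mygarmin <- data.frame(
  DateTime  = c(0.0000, 0.0035, 0.0070),
  Latitude  = c(21.30, 21.31, 21.32),
  Longitude = c(-157.80, -157.81, -157.82)
)
minute.diff <- 1/24/12   # 5 minutes expressed in days

# Hypothetical helper: fill missing vscan locations from the nearest garmin fix.
fill_from_garmin <- function(vscan, garmin, tol) {
  garmin <- garmin[order(garmin$DateTime), ]   # findInterval() needs sorted times
  miss <- which(is.na(vscan$Latitude))         # rows that need a location
  tm <- vscan$DateTime[miss]
  n <- nrow(garmin)

  # Index of the last garmin record at or before each time (0 if there is none)
  lo <- findInterval(tm, garmin$DateTime)
  before <- pmax(lo, 1L)        # neighbouring record on the earlier side
  after  <- pmin(lo + 1L, n)    # neighbouring record on the later side

  # Keep whichever neighbour is closer in time
  use_after <- abs(garmin$DateTime[after] - tm) < abs(garmin$DateTime[before] - tm)
  nearest <- ifelse(use_after, after, before)

  # Accept the match only when it falls within the tolerance
  ok <- abs(garmin$DateTime[nearest] - tm) < tol
  vscan$Latitude[miss[ok]]  <- garmin$Latitude[nearest[ok]]
  vscan$Longitude[miss[ok]] <- garmin$Longitude[nearest[ok]]
  vscan
}

filled <- fill_from_garmin(myvscan, mygarmin, minute.diff)
print(filled)
```

In this toy example rows 2 and 3 are filled from the first and second garmin records, while row 4 stays NA because its nearest garmin record is more than 5 minutes away. The work grows roughly with the number of missing rows times the log of the number of garmin records, which should be far lighter than the row-by-row loop on several million records.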

I appreciate your help and advice.



Tim Clark
Department of Zoology 
University of Hawaii
