[R] Read files in a folder when new data files come

Barry Rowlingson b.rowlingson at lancaster.ac.uk
Sun Jan 24 21:21:35 CET 2010


On Sun, Jan 24, 2010 at 8:05 PM, jlfmssm <jlfmssm at gmail.com> wrote:
> Hello,
>
> I am working on a project. New data files arrive as the data
> collectors gather data, and the collectors then put these new files
> in a folder. I need to read these new data files once they are in
> the folder. So far I have done this job manually: each time I go to
> that folder and find new data files, I use my R program to read
> them. I am wondering if anyone knows how to perform this job
> automatically in R.

Without resorting to operating-system-specific hackery, the easiest
way would be to use 'list.files()' and look for new files every so
many minutes or seconds (depending on how urgent it is), or to check
file.info() on your directory and test its modification time. You'd
then write that into a .R file and run it in the background using
your operating system's background job facility (as a 'service' on
Windows, or as a background process on Unix). Use Sys.sleep(seconds)
to wait in your loop. Something like (totally untested):

# 'dumpLocation' is the watched folder; doSomethingWithStuffIn() stands
# for whatever reading/processing you do on the files in it.
lastChange = file.info(dumpLocation)$mtime
while(TRUE){
  currentM = file.info(dumpLocation)$mtime
  if(currentM != lastChange){
    # the folder's modification time changed - new files have arrived
    lastChange = currentM
    doSomethingWithStuffIn(dumpLocation)
  }
  # try again in 10 minutes
  Sys.sleep(600)
}
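
A variation on the list.files() idea is to track which file names you
have already processed and only read the new ones. A rough sketch
along the same lines (also untested; 'dumpLocation' and processFile()
are placeholders for your own folder path and reading code):

seen = character(0)
while(TRUE){
  current = list.files(dumpLocation, full.names = TRUE)
  newFiles = setdiff(current, seen)   # files not seen on earlier passes
  for(f in newFiles){
    processFile(f)                    # e.g. read.csv(f) plus whatever follows
  }
  seen = union(seen, current)
  # try again in 10 minutes
  Sys.sleep(600)
}

Either loop, saved as (say) watcher.R, could be left running in the
background with something like 'nohup Rscript watcher.R &' on Unix.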

There are ways for programs to get directory-change events when files
appear in a folder, but they will probably be very operating-system
specific. There's also the problem of your code firing up when a file
is only half-uploaded - what do you do then? Does your data format
have an 'end of data' marker?
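
One simple (though not bulletproof) way around the half-uploaded
problem, assuming the collectors write each file in place rather than
renaming it when complete, is to wait until a file's size has stopped
changing before reading it. A sketch (the pause length is a guess
you'd tune):

waitUntilStable = function(f, pause = 5){
  repeat{
    size1 = file.info(f)$size
    Sys.sleep(pause)
    size2 = file.info(f)$size
    # unchanged size after the pause - assume the upload has finished
    if(!is.na(size1) && size1 == size2) break
  }
}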

 Barry

-- 
blog: http://geospaced.blogspot.com/
web: http://www.maths.lancs.ac.uk/~rowlings
web: http://www.rowlingson.com/
twitter: http://twitter.com/geospacedman
pics: http://www.flickr.com/photos/spacedman


