[BioC] R: Can an R script be run through a cron job ?

mauede at alice.it mauede at alice.it
Fri Nov 20 15:49:26 CET 2009


I have reattached my script. I attached it to an earlier message, which may have been overlooked.

As you can see for yourself, I scan a big data set, named hsTargets, that contains many target gene
transcript IDs, each with a handle to the related miRNA.
I process this data set one miRNA at a time. That is, I gather all the transcript IDs for the current miRNA
and query biomaRt for the 3'UTR of all those transcripts, passing their ENST IDs to the query as a vector. Therefore I do use the vectorized capabilities of R, don't I?

My mistake is keeping the connection to biomaRt open while processing as many miRNAs as I can.
I acknowledge that I have to improve my script: catch the exception, delete the file currently being written (since in general it will be incomplete), and have the script die gracefully.
Then I have to make the script pause and disconnect from biomaRt regularly to avoid hammering the
service.
Eventually the process could even terminate itself instead of sleeping, after saving its current state.
However, I would then need to set up the task scheduler to restart it some time later ...
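
The plan above (catch the failure, delete the incomplete file, save the state, pause between requests, resume on the next scheduled run) could be sketched roughly as follows. This is a minimal sketch and not the attached script: process_mirnas, fetch_fun, out_dir, state_file, pause_sec and max_per_run are hypothetical names, and fetch_fun stands in for whatever function actually queries biomaRt for one miRNA.

```r
# Hypothetical helper: process miRNAs one at a time in a resumable way.
# `fetch_fun(id)` is an assumed stand-in for the real biomaRt query; it must
# return a data frame for one miRNA or signal an error if the request fails.
process_mirnas <- function(mirna_ids, fetch_fun, out_dir,
                           state_file = file.path(out_dir, "done.rds"),
                           pause_sec = 5, max_per_run = 50) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

  # Load the IDs finished by earlier runs, if any.
  done <- if (file.exists(state_file)) readRDS(state_file) else character(0)

  # Limit the work per run so the remote service is not hammered.
  todo <- head(setdiff(mirna_ids, done), max_per_run)

  for (id in todo) {
    out_file <- file.path(out_dir, paste0(id, ".txt"))
    ok <- tryCatch({
      res <- fetch_fun(id)
      write.table(res, out_file, sep = "\t", quote = FALSE, row.names = FALSE)
      TRUE
    }, error = function(e) {
      # Remove a possibly incomplete output file, then stop quietly.
      if (file.exists(out_file)) unlink(out_file)
      message("Stopping at ", id, ": ", conditionMessage(e))
      FALSE
    })
    if (!ok) break

    # Record progress after every miRNA so a later run can resume here.
    done <- c(done, id)
    saveRDS(done, state_file)
    Sys.sleep(pause_sec)  # pause between requests
  }
  invisible(done)
}
```

Because progress is kept in state_file, a scheduler can simply rerun the same script (for example with Rscript), and each run continues where the previous one stopped.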

Regards,
Maura






-----Original Message-----
From: Kasper Daniel Hansen [mailto:khansen at stat.berkeley.edu]
Sent: Fri 20/11/2009 15:12
To: mauede at alice.it
Cc: Bioconductor  List
Subject: Re: [BioC] Can an R script be run through a cron job ?
 
Maura

Unfortunately you never showed us your code, despite repeated requests  
to do so.  That makes it hard to help (and frankly, ignoring requests  
for information from people trying to help you is extremely  
counterproductive).

Your comments in your last email in the last thread indicate that you
have code that essentially does this:

for(i in 1:100)
   getBM(...)

If this is true (which we would know if we could see the code), this is
why your script fails.  There are two problems with this: (1) you are
not using the vectorized capabilities of R, but more important is (2)
you are sending many requests to Biomart and typically such behaviour
might mean your IP address will be banned temporarily.  They don't
like people hammering their services with repeated requests.

Instead you should create a query that essentially asks for all your  
return objects in one request.  That should be easy to write, and will  
be much faster.  You might think that processing the output is  
slightly harder, but that is the thing to do (and with more R  
experience, processing a big output is actually easier).
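
As a rough illustration of the single-request approach, the sketch below asks for every transcript once and only afterwards groups the result by miRNA. It rests on assumptions: utr_by_mirna and fetch_fun are hypothetical names, the column names mirna, enst, ensembl_transcript_id and utr3 are assumed rather than taken from the actual script, and fetch_fun stands in for a wrapper around one getBM() call that receives the whole vector of IDs as its values argument.

```r
# `targets`: data frame with one row per (miRNA, transcript) pair; the column
#   names `mirna` and `enst` are assumptions.
# `fetch_fun(ids)`: assumed wrapper around a single biomaRt request for all
#   IDs; it returns a data frame with columns `ensembl_transcript_id` and
#   `utr3` (assumed names).
utr_by_mirna <- function(targets, fetch_fun) {
  ids <- unique(targets$enst)   # each transcript is requested only once
  utr <- fetch_fun(ids)         # one request instead of one per miRNA

  # Attach the sequences to the (miRNA, transcript) pairs.
  merged <- merge(targets, utr, by.x = "enst", by.y = "ensembl_transcript_id")

  # Process the big output locally: one data frame per miRNA.
  split(merged[, c("enst", "utr3")], merged$mirna)
}
```

The result is a named list with one element per miRNA, so the per-miRNA output files can be written from it without any further remote requests.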

Regarding your actual question in this email, you seem to be very  
confused regarding the meaning of a batch job.  This word has many  
different interpretations (not related to R), so it is hard to google  
for.  What you are specifically asking for has everything to do with  
what operating system you are using (Windows, Linux, OS X) and nothing  
to do with R.

Kasper


On Nov 19, 2009, at 18:24 , <mauede at alice.it> <mauede at alice.it> wrote:

> I am running a script that extracts many long strings from remote  
> data bases.
> Every now and then the remote data base gets out of sync and closes  
> the connection.
> I have been advised to implement an R script that queries the data  
> base in batch modality.
> I never ran an R script in batch modality. I think I have to use R  
> CMD BATCH or something similar.
> Given the amount of data I am extracting, I am concerned about  
> having to parse a huge data file looking for the
> information I need.
> The less painful modification would consist in running the R script  
> as is but through a cron job. So that the script
> should be set to sleep  on an established frequency and when  
> awakened it should resume from where it was interrupted.
> Is such a scheme doable in R ? If it is then what are the most  
> important commands to make a script sleep and wake up
> on a regular basis ?
>
> Thank you in advance,
> Maura
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor





