[BioC] R: Can an R script be run through a cron job ?

Francois Pepin fpepin at cs.mcgill.ca
Fri Nov 20 16:48:15 CET 2009


Hi Maura,

Your attachment was scrubbed by the list software; it wasn't overlooked.
You would do better to paste the relevant parts into the body of your e-mail instead.

Kasper is referring to the fact that you are sending a separate query
for each miRNA. Group everything together so that you send only a
single query.

This is basically Cei's suggestion, although I would suggest limiting 
yourself to the Ensembl transcript IDs of interest as opposed to 
querying all unique IDs.
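
A minimal sketch of what that could look like. The script itself was not shown, so the layout of hsTargets here is an assumption: one row per miRNA/transcript pair, in columns I am calling mirna and enst. The IDs, the selection in `wanted`, and the helper names fetch_utrs and attach_utrs are made up for illustration. The biomaRt call is only defined, not run, since it needs the package and network access.

```r
# Made-up stand-in for hsTargets: one row per (miRNA, target transcript) pair.
hsTargets <- data.frame(
  mirna = c("hsa-miR-1", "hsa-miR-1", "hsa-miR-2", "hsa-miR-3"),
  enst  = c("ENST00000000001", "ENST00000000002",
            "ENST00000000001", "ENST00000000003"),
  stringsAsFactors = FALSE
)

# Restrict to the miRNAs of interest (hypothetical selection), then
# request each transcript ID only once.
wanted  <- c("hsa-miR-1", "hsa-miR-2")
targets <- hsTargets[hsTargets$mirna %in% wanted, ]
ids     <- unique(targets$enst)

# One request for all IDs. Needs the biomaRt package and network access,
# so it is only defined here, not called.
fetch_utrs <- function(ids) {
  mart <- biomaRt::useMart("ensembl", dataset = "hsapiens_gene_ensembl")
  biomaRt::getSequence(id = ids, type = "ensembl_transcript_id",
                       seqType = "3utr", mart = mart)
}

# Join the single result back onto the miRNA/transcript pairs locally.
# Assumes the result has columns "3utr" and "ensembl_transcript_id".
attach_utrs <- function(targets, utrs) {
  merge(targets, utrs, by.x = "enst", by.y = "ensembl_transcript_id",
        all.x = TRUE)
}
```

With that in place, something like merged <- attach_utrs(targets, fetch_utrs(ids)) followed by split(merged[["3utr"]], merged$mirna) should give the sequences per miRNA from one request instead of one request per miRNA.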

Francois

On 11/20/2009 09:49 AM, mauede at alice.it wrote:
> I reattached my script. I had attached it to an earlier message, where it may have been overlooked.
>
> As you can see for yourself, I scan a big data set, named hsTargets, that contains many target gene
> transcript IDs, each with a reference to its related miRNA.
> I process this data set one miRNA at a time. That is, I gather all the transcript IDs for the current miRNA
> and query biomaRt for the 3' UTRs of all those transcripts, passing their ENST IDs as a vector to the query. Therefore I do use the vectorized capabilities of R, don't I?
>
> My mistake is keeping the connection to biomaRt open while processing as many miRNAs as I can.
> I acknowledge that I have to improve my script so that it catches the exception, deletes the file currently being written (as it will generally be incomplete), and exits gracefully.
> I also have to make my script pause and disconnect from biomaRt regularly to avoid hammering the
> service.
> Eventually my process could even terminate instead of sleeping, after saving its current state.
> However, I would then need to set up the task scheduler to restart it some time later ...
>
> Regards,
> Maura
>
> -----Original message-----
> From: Kasper Daniel Hansen [mailto:khansen at stat.berkeley.edu]
> Sent: Fri 20/11/2009 15.12
> To: mauede at alice.it
> Cc: Bioconductor  List
> Subject: Re: [BioC] Can an R script be run through a cron job ?
>
> Maura
>
> Unfortunately you never showed us your code, despite repeated requests
> to do so.  That makes it hard to help (and frankly, ignoring requests
> for information from people trying to help you is extremely
> counterproductive).
>
> Your comments in your last email in the last thread indicate that you
> have code that essentially does this:
>
> for(i in 1:100)
>     getBM(...)
>
> If this is true (which we would know if we could see the code), this is
> why your script fails.  There are two problems with this: (1) you are
> not using the vectorized capabilities of R, but more important is (2)
> you are sending many requests to Biomart, and typically such behaviour
> might mean your IP address will be banned temporarily.  They don't
> like people hammering their services with repeated requests.
>
> Instead you should create a query that essentially asks for all your
> return objects in one request.  That should be easy to write, and will
> be much faster.  You might think that processing the output is
> slightly harder, but that is the thing to do (and with more R
> experience, processing a big output is actually easier).
>
> Regarding your actual question in this email, you seem to be very
> confused regarding the meaning of a batch job.  This word has many
> different interpretations (not related to R), so it is hard to google
> for.  What you are specifically asking for has everything to do with
> what operating system you are using (Windows, Linux, OS X) and nothing
> to do with R.
>
> Kasper
>
>
> On Nov 19, 2009, at 18:24, <mauede at alice.it> wrote:
>
>> I am running a script that extracts many long strings from remote
>> databases.
>> Every now and then the remote database gets out of sync and closes
>> the connection.
>> I have been advised to implement an R script that queries the
>> database in batch mode.
>> I have never run an R script in batch mode. I think I have to use R
>> CMD BATCH or something similar.
>> Given the amount of data I am extracting, I am concerned about
>> having to parse a huge data file looking for the
>> information I need.
>> The least painful modification would be to run the R script
>> as is, but through a cron job, so that the script
>> sleeps at a set interval and, when
>> awakened, resumes from where it was interrupted.
>> Is such a scheme doable in R? If so, what are the most
>> important commands to make a script sleep and wake up
>> on a regular basis?
>>
>> Thank you in advance,
>> Maura
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor


