[BioC] retrieving mRNA sequences via biomaRt

Wolfgang Huber whuber at embl.de
Thu Aug 6 17:57:47 CEST 2009


Hi Simon,

With all due respect, for a first contact with the Bioconductor project 
I'd also recommend studying some of the documentation.

Two (slightly biased) starting points are the book "Bioconductor 
Case Studies" by Hahne, Huber, Gentleman and Falcon, and the paper 
"Mapping identifiers for the integration of genomic datasets with the 
R/Bioconductor package biomaRt" by Durinck et al., Nature Protocols 
2009;4(8):1184-91.
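
As a rough illustration of the tasks quoted below, here is a minimal offline sketch in Python rather than R. It assumes the 5'UTR, coding and 3'UTR sequences of each transcript have already been retrieved (for example with biomaRt's getSequence and the seqType values "5utr", "coding" and "3utr"), and that the coding sequence includes its stop codon. All function and field names (annotate_transcript, passes_filters, write_hits, transcript_id and so on) are hypothetical and chosen only for this example.

```python
import csv
import io
from itertools import islice

FIELDS = ["transcript_id", "mrna_length", "cds_length",
          "start_codon_pos", "stop_codon_pos", "sequence"]


def annotate_transcript(transcript_id, utr5, cds, utr3):
    """Join the three regions and report 1-based codon positions.

    Assumption: `cds` starts with the start codon and ends with the
    stop codon, so both positions follow from the region lengths.
    """
    mrna = utr5 + cds + utr3
    return {
        "transcript_id": transcript_id,
        "mrna_length": len(mrna),
        "cds_length": len(cds),
        # first base of the start codon within the full mRNA
        "start_codon_pos": len(utr5) + 1,
        # first base of the stop codon (last three bases of the CDS)
        "stop_codon_pos": len(utr5) + len(cds) - 2,
        "sequence": mrna,
    }


def passes_filters(rec, mrna_range=(3000, 5000), cds_range=(2000, 3000)):
    """Apply the length limits from Task 1 (inclusive bounds)."""
    return (mrna_range[0] <= rec["mrna_length"] <= mrna_range[1]
            and cds_range[0] <= rec["cds_length"] <= cds_range[1])


def write_hits(records, handle, max_hits=200):
    """Write at most `max_hits` records as tab-separated text (Task 2)."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS,
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    written = 0
    for rec in islice(records, max_hits):
        writer.writerow(rec)
        written += 1
    return written


if __name__ == "__main__":
    # Toy sequences, far shorter than real transcripts.
    toy = annotate_transcript("toy_transcript", "GG", "ATGAAATAA", "CC")
    hits = [r for r in [toy]
            if passes_filters(r, mrna_range=(10, 20), cds_range=(9, 9))]
    out = io.StringIO()
    write_hits(hits, out)
    print(out.getvalue())
```

A tab-separated table is only one possible answer to the file-format question; FASTA with the positions in the header line would be another. Which identifier to put in the transcript_id column is left open here.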

	Best wishes
	Wolfgang




Simon wrote:
> Hello everybody,
> 
> I am trying to solve the following tasks as a first contact with the 
> Bioconductor project:
> 
> # Task 1:
> # find:
> #   * mRNA sequence (5'UTR, Coding region, 3'UTR)
> #   * position of start codon in sequence
> #   * position of stop codon in sequence
> #   * ID (Which ID(s) would I choose to reference my
> #     sequence hits? Embl, ensembl transcript id,
> #     Entrez Gene id, RefSeq, etc.?)
> #   * name of associated protein product
> #
> #  where:
> #   * origin is human
> #     Entrez Search would be: human[ORGN]
> #   * sequence is mRNA transcript
> #     Entrez Search for Molecule Type: biomol_mRNA[PROP]?
> #   * mRNA sequence length is 3000 to 5000 nt
> #     Entrez Search for Sequence Length: 3000:5000[SLEN]
> #   * coding region length is 2000 to 3000 nt
> #     Entrez Search Field for start and stop of
> #     coding region: start:stop[CDS]
> #
> #
> # Task 2:
> # store the retrieved information to file for the first 200 hits
> # (Which would be a suitable file format?)
> 
> I started by using and playing around with the biomaRt package for R, 
> but I got overwhelmed by its many possibilities.
> 
> I would be glad to get any feedback on how to start, or even solve, my 
> tasks.
> 
> Best regards,
> Simon
> 

-- 


-------------------------------------------------------
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber


