[BioC] How to set up genes argument for makeTranscriptDb( ) function in GenomicFeatures?

Tue Oct 9 18:57:15 CEST 2012

Hi Ying,

Since you are describing two transcripts, it should be the 1st one.  The 
second would for describing the exons that go with these transcripts.

But just out of curiousity, why are you starting with a GAF?  I see you 
using UCSC IDs, so why not just use one of these known gene packages?:

http://www.bioconductor.org/packages/release/BiocViews.html#___TranscriptDb

   Marc

On 10/09/2012 09:21 AM, Ying Chen wrote:
> Sorry, one more question regarding the transcripts argument:
>
> For the following two transcripts:
>
> 	uc001aaa.3	chr1:11874-12227,12613-12721,13221-14409:+
> 	uc010nxq.1	chr1:11874-12227,12595-12721,13403-14409:+
>
> What should be the format for tx_start and tx_end?
>
> transcripts<- data.frame(
> 		tx_id=c("uc001aaa.3"," uc010nxq.1"),
> 		tx_chrom="chr1",
> 		tx_strand="+",
> 		tx_start=c(11874, 11874),
> 		tx_end=c(14409, 14409))
>
> or:
>
> transcripts<- data.frame(
> 		tx_id=c("uc001aaa.3","uc001aaa.3","uc001aaa.3","uc010nxq.1","uc010nxq.1","uc010nxq.1"),
> 		tx_chrom="chr1",
> 		tx_strand="+",
> 		tx_start=c(11874,12613,13221,11874,12595,13403),
> 		tx_end=c(12227,12721,14409,12227,12721,14409))
>
> Thanks a lot for the help!
>
> Ying
>
> -----Original Message-----
> From: bioconductor-bounces at r-project.org [mailto:bioconductor-bounces at r-project.org] On Behalf Of Ying Chen
> Sent: Tuesday, October 09, 2012 10:57 AM
> To: bioconductor at r-project.org
> Subject: [BioC] How to set up genes argument for makeTranscriptDb( ) function in GenomicFeatures?
>
> Hi guys,
>
> I am new to the GenomicsFeatures package and I am trying to make my own txdb out of TCGA GAF file using the function makeTranscriptDb(transcripts, splicings,genes) in GenomicsFeatures. The manual says that genes is a data frame containing the genes associated to a set of transcripts. I just wonder what format should it be. Will the following work?
>
>   genes<- data.frame(
>                     tx_id=c(txid1,txid2, txid3, txid4, txid5, txid6, txid7, txid8, txid9,............),                                                              # first 9 tx_ids for 3 gene_ids are listed
>                    gene_id=c(geneid1, geneid1, geneid2, geneid2, geneid2, geneid3, geneid3, geneid3, geneid3,.......))     # first 3 gene_ids are listed
>
> Thanks a lot for the help!
>
> Ying
>
> Confidentiality Note:\ This e-mail, and any attachment t...{{dropped:19}}
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor