[BioC] using easyRNASeq to calculate RPKM values

Fatemehsadat Seyednasrollah fatsey at utu.fi
Tue Jan 22 17:10:43 CET 2013


Many Thanks.
________________________________________
From: Nicolas Delhomme [delhomme at embl.de]
Sent: Tuesday, January 22, 2013 4:46 PM
To: Fatemehsadat Seyednasrollah
Cc: bioconductor at r-project.org
Subject: Re: [BioC] using easyRNASeq to calculate RPKM values

Dear Fatemehsadat,

It is indeed possible. The function RPKM does that for you. Have a look at its help page by typing ?RPKM after loading easyRNASeq. The last example there takes three arguments: a matrix (your count table), the gene sizes (or the sizes of whatever feature you used, e.g. transcripts) and the sizes of your RNA-Seq libraries. The last two arguments should be named vectors whose names match the rownames and colnames of your count table, respectively. The library sizes can be obtained simply by summing the columns, i.e. colSums(count.table).
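To make the arithmetic concrete, here is a minimal base-R sketch of the RPKM calculation itself. It does not call easyRNASeq (check ?RPKM for the exact arguments of the package function). The helper name rpkm and the gene lengths are made up for illustration, and the counts are a few rows and columns copied from the table in the question.

```r
# Toy count matrix: three genes and four samples taken from the question.
count.table <- matrix(
  c(200, 93, 246, 102,
     24, 28,  16,  32,
    100, 71,  98,  97),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("A1BG", "A1BG-AS1", "A2LD1"),
                  c("V2", "V3", "V4", "V5"))
)

# ASSUMPTION: made-up gene lengths in bp, for illustration only.
# Real lengths would come from the annotation used for counting.
gene.size <- c("A1BG" = 2000, "A1BG-AS1" = 1000, "A2LD1" = 4000)

# Library sizes as the column sums of the count table.
lib.size <- colSums(count.table)

# Hypothetical helper: reads per kilobase of feature per million reads.
rpkm <- function(counts, feature.size, lib.size) {
  # The named vectors must line up with the rows and columns of the matrix.
  stopifnot(identical(names(feature.size), rownames(counts)),
            identical(names(lib.size), colnames(counts)))
  per.kb <- sweep(counts, 1, feature.size / 1e3, "/")  # divide rows by length in kb
  sweep(per.kb, 2, lib.size / 1e6, "/")                # divide columns by millions of reads
}

rpkm.table <- rpkm(count.table, gene.size, lib.size)
print(round(rpkm.table, 1))
```

With only three genes the column sums are tiny, so the values printed here are far larger than on a real table. For a data frame like the one in the question, the gene names in column V1 would first have to become the rownames of a numeric matrix.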

A word of caution, though: RPKM is a correction and not a normalization, so it is fine for visualizing the data, but I would not use it as input to statistical tools such as DESeq or edgeR, which expect raw counts. Moreover, depending on how you counted your reads per feature, you might have counted some reads multiple times, in which case it is better to retrieve your library sizes from your original BAM files using samtools.
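One possible way to get library sizes from the BAM files, sketched from R by calling the samtools command-line tool. The helper name count_mapped_reads and the file names are hypothetical, and counting only primary mapped alignments (-F 260) is an assumption about what "library size" should mean here. Adjust the flags to your own definition.

```r
# Hypothetical helper: count alignments in a BAM file with samtools, which
# must be installed and on the PATH.
# -c prints a count instead of the records; -F 260 skips unmapped reads
# (flag 4) and secondary alignments (flag 256).
count_mapped_reads <- function(bam.file) {
  out <- system2("samtools",
                 c("view", "-c", "-F", "260", shQuote(bam.file)),
                 stdout = TRUE)
  as.numeric(out)
}

# ASSUMPTION: made-up file names, one BAM file per sample, named after the
# columns of the count table so the result can be used as library sizes.
bam.files <- c(V2 = "sample1.bam", V3 = "sample2.bam")

# Only run the counting when samtools and the files are actually available.
if (nzchar(Sys.which("samtools")) && all(file.exists(bam.files))) {
  lib.size <- vapply(bam.files, count_mapped_reads, numeric(1))
  print(lib.size)
}
```

The resulting named vector can replace colSums(count.table) as the library size when computing RPKM.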

HTH,

Nico

---------------------------------------------------------------
Nicolas Delhomme

Genome Biology Computational Support

European Molecular Biology Laboratory

Tel: +49 6221 387 8310
Email: nicolas.delhomme at embl.de
Meyerhofstrasse 1 - Postfach 10.2209
69102 Heidelberg, Germany
---------------------------------------------------------------





On Jan 22, 2013, at 2:03 PM, Fatemehsadat Seyednasrollah wrote:

> Dear list,
>
> I have used HTSeq to get the count table of an RNA-seq dataset which has 8 samples from two conditions (4 biological replicates per condition). The count table looks like this:
>
>> head(a)
>
>           V1  V2 V3  V4  V5 V6  V7 V8  V9
> 1 1/2-SBSRNA4   3  5   4   4  2   3  1   1
> 2        A1BG 200 93 246 102 86  46 58  85
> 3    A1BG-AS1  24 28  16  32 17  10 19  14
> 4        A1CF   1  1   1   2  1   0  0   1
> 5       A2LD1 100 71  98  97 59 128 88 114
> 6         A2M   5  5  23   1  5   6 10   5
>
> To get familiar with the expression level of each gene, I would like to calculate RPKM values. Can I use the easyRNASeq package on the above count table to calculate them?
>
> Thank you in advance



