[BioC] finding differential expression pattern between paired columns of data frame.

Jakub Stanislaw Nowak jakub.nowak at ed.ac.uk
Wed Aug 6 18:10:56 CEST 2014


Hi Jim,

Thanks for your reply. It become clear for me.

I computed the fold changes and added them to the data frame

> LineA.LineD <- data.selected.list$LineA / data.selected.list$LineD
> LineB.LineC <- data.selected.list$LineB/data.selected.list$LineC
> data.merged <- cbind(data.selected.list, LineAD=LineA.LineD, LineBC=LineB.LineC)
> View(data.merged)
> data.merged
                ID       LineA       LineB      LineC      LineD       LineAD     LineBC
1   mmu-miR-23a-5p -2.28461513 -2.75356783 -2.0132004  0.1397216 -16.35119386 1.36775647
2    mmu-miR-22-3p  1.65924854  0.03717303  0.4909068  2.4100528   0.68846978 0.07572319
3     mmu-miR-5113 -0.91337275 -0.87022608 -1.3224488  1.3906263  -0.65680676 0.65804140
4   mmu-miR-324-3p  1.44133874 -0.07767235 -0.2187064 -0.2201993  -6.54561061 0.35514432
5 mmu-miR-365-1-5p -0.03644659  0.72755490  1.0445618  1.3071159  -0.02788321 0.69651688
6  mmu-miR-3068-5p  0.49474466 -0.28506641 -0.2995997  1.3754639   0.35969294 0.95149110
> 

I want to ask for suggestion what would be the best way to sort this data frame now by:
1. Descending by fold changes in LineAD column 
2. In the same time removing these miRNAs that fold change do not differ so much in LineBC column

Many thanks for your great help,

Jakub


On 6 Aug 2014, at 15:26, James W. MacDonald <jmacdon at uw.edu> wrote:

> Hi Jakub,
> 
> The problem here is that you don't have any replication, so you cannot fit a model. The only thing you can do is compute fold changes, which are not very reliable.
> 
> Best,
> 
> Jim
> 
> 
> 
> 
> On Wed, Aug 6, 2014 at 10:02 AM, Jakub Stanislaw Nowak <jakub.nowak at ed.ac.uk> wrote:
> Hello Bioconductor list members,
> 
> I am trying to analyse output of my smallRNA sequencing data. My starting point is a table with differentially expressed miRs between two stages of different cell types.
> My goal is to identify miRs that are changing in LineA - Line D and not changing in LineB - LineC and reciprocal situation.
> 
> Below is a head of my data.frame that I work on.
> 
> > data.selected.list
>                 ID                      Line A        Line B        Line C     Line D
> 1   mmu-miR-23a-5p -2.28461513 -2.75356783 -2.0132004  0.1397216
> 2    mmu-miR-22-3p  1.65924854  0.03717303  0.4909068  2.4100528
> 3     mmu-miR-5113 -0.91337275 -0.87022608 -1.3224488  1.3906263
> 4   mmu-miR-324-3p  1.44133874 -0.07767235 -0.2187064 -0.2201993
> 5 mmu-miR-365-1-5p -0.03644659  0.72755490  1.0445618  1.3071159
> 6  mmu-miR-3068-5p  0.49474466 -0.28506641 -0.2995997  1.3754639
> 
> As this very much looked for me as microarray experiment I originally was thinking to apply some tools for microarray analysis to get this done.
> 
> So i did as below
> 1. Constructed matrix
> > design <- model.matrix(~-1+factor(c(1,2,3,4)))
> > colnames(design) <- c("LineA","LineB","LineC","LineD")
> > design
>   LineA LineB LineC LineD
> 1     1     0     0     0
> 2     0     1     0     0
> 3     0     0     1     0
> 4     0     0     0     1
> attr(,"assign")
> [1] 1 1 1 1
> attr(,"contrasts")
> attr(,"contrasts")$`factor(c(1, 2, 3, 4))`
> [1] “contr.treatment”
> 
> 2. Constructed comparison matrix
> > contrast.matrix <- makeContrasts(LineA-LineD, LineB-LineC, levels=design)
> > contrast.matrix
>        Contrasts
> Levels  LineA - LineD LineB - LineC
>   LineA             1             0
>   LineB             0             1
>   LineC             0            -1
>   LineD            -1             0
> 
> 3. Perform fitting and of course got benched for my naive thinking that this is going to work
> 
> > fit <- lmFit(data.selected[,(2:5)],design)
> > fit2 <- contrasts.fit(fit, contrast.matrix)
> > fit2 <- eBayes(fit2)
> Error in ebayes(fit = fit, proportion = proportion, stdev.coef.lim = stdev.coef.lim,  :
>   No residual degrees of freedom in linear model fits
> 
> I thought that maybe there is something wrong with the input for my fitting and I checked that limma is using normalised log data so I followed my very naive thinking and try to normalise the table
> 
> > eset<-rma(data.selected.list[,2:5])
> Error in (function (classes, fdef, mtable)  :
>   unable to find an inherited method for function ‘probeNames’ for signature ‘"data.frame”’
> 
> As it hasn’t work I have two questions:
> 1. Can I somehow edit my data.frame that it will become a better fit for limma package and perform comparison I want.
> 2. I won’t be surprised if you say that this approach is not valid therefore I will welcome any suggestion about other models/methods I can use to make this comparison for my initial table.
> 
> Thank you very much for help,
> 
> Jakub
> 
> 
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
> 
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> 
> 
> 
> -- 
> James W. MacDonald, M.S.
> Biostatistician
> University of Washington
> Environmental and Occupational Health Sciences
> 4225 Roosevelt Way NE, # 100
> Seattle WA 98105-6099 

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/bioconductor/attachments/20140806/8cf821aa/attachment.pl>


More information about the Bioconductor mailing list