[BioC] ExpressionSet error message: "featureNames differ"

Martin Morgan mtmorgan at fhcrc.org
Sun Nov 17 22:23:20 CET 2013


On 11/17/2013 12:11 PM, Jobin K. Varughese wrote:
> Hello again. I was informed that I should share this as a public Dropbox
> link rather than as an attachment. This link can be found here:
> https://dl.dropboxusercontent.com/u/6322354/ExpressionSet_error.zip

Hi Jobin -- I installed and loaded the package then

   debug(loadReadCounts)
   source("stats.R")

and then stepped through the function until I got to

	res <- new("shrnaSet", assayData=adata, phenoData=pdata, featureData=fdata)

when I ran this I got the error

Browse[2]> new("shrnaSet", assayData=adata, phenoData=pdata, featureData=fdata)
Error in validObject(.Object) :
   invalid class "shrnaSet" object: featureNames differ between assayData and featureData

so I looked at the rownames of the objects inside adata, and the rownames of 
fdata, which are supposed to be the same

Browse[2]> eapply(adata, function(x) head(rownames(x)))
$readCounts1
[1] "X4501844.NM_001088.1.38903" "X4501844.NM_001088.1.38904"
[3] "X4501844.NM_001088.1.38905" "X4501844.NM_001088.1.38906"
[5] "X4501844.NM_001088.1.38907" "X4501844.NM_001088.1.54352"

$readCounts2
[1] "X4501844.NM_001088.1.38903" "X4501844.NM_001088.1.38904"
[3] "X4501844.NM_001088.1.38905" "X4501844.NM_001088.1.38906"
[5] "X4501844.NM_001088.1.38907" "X4501844.NM_001088.1.54352"

Browse[2]> head(rownames(pData(fdata)))
[1] "4501844|NM_001088.1|38903" "4501844|NM_001088.1|38904"
[3] "4501844|NM_001088.1|38905" "4501844|NM_001088.1|38906"
[5] "4501844|NM_001088.1|38907" "4501844|NM_001088.1|54352"
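
The same comparison can also be done programmatically rather than by eye. This is only a sketch: the two vectors below are copied from the output above, and the variable names are made up for illustration. In the debugger they would come from rownames() of one of the matrices in adata and from rownames(pData(fdata)).

```r
# First two names from each side, copied from the output above
assayNames   <- c("X4501844.NM_001088.1.38903", "X4501844.NM_001088.1.38904")
featureNames <- c("4501844|NM_001088.1|38903", "4501844|NM_001088.1|38904")

# The validity check requires the two sets of names to agree exactly
identical(assayNames, featureNames)        # FALSE

# How many names do the two sides have in common?
length(intersect(assayNames, featureNames))  # 0

# Which feature-data names have no counterpart in the assay data?
head(setdiff(featureNames, assayNames))
```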

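They're not the same: the assay data names start with 'X' and have '.' where 
the feature data names have '|'. This is because earlier in the code the package 
author had made the names of the assay data with 'make.names', which prepends 
'X' to names starting with a digit and replaces invalid characters with '.'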

     x <- read.delim(file=paste(dataPath,condition2,sep="/"), header=TRUE)
     condition2 <- as.matrix(x[,2:dim(x)[2]])
     rownames(condition2) <- make.names(x[,1],unique=TRUE)

but the names of the feature data were not mangled

     fdat = read.delim(file=paste(dataPath,shRNAData,sep="/"), header=TRUE)
     rownames(fdat) <- fdat[,1]
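
To see the effect in isolation, here is a small sketch using two of the identifiers shown above (the variable names are just for illustration):

```r
# Identifiers as they appear in the first column of the input files
ids <- c("4501844|NM_001088.1|38903", "4501844|NM_001088.1|38904")

# make.names() turns them into syntactically valid R names: 'X' is
# prepended because they start with a digit, and '|' becomes '.'
mangled <- make.names(ids, unique = TRUE)
mangled
# [1] "X4501844.NM_001088.1.38903" "X4501844.NM_001088.1.38904"

# The mangled names no longer match the originals
identical(mangled, ids)   # FALSE
```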

I guess a work-around would be to mangle the names in the first column of 
shRNALibrary.txt using make.names(), so that both sets of names are altered in 
the same way. I think I would have written the code above without mangling 
names as

     condition2 = as.matrix(read.delim(file.path(dataPath, condition2),
                                       header=TRUE, row.names=1))

and (with similar changes elsewhere in this code)

     fdat = read.delim(file.path(dataPath,shRNAData), header=TRUE, row.names=1)
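
As a rough check that reading with row.names=1 leaves the identifiers alone, here is a self-contained sketch. The two temporary files are hypothetical two-row stand-ins for raw_ctrl.txt and shRNALibrary.txt; the column names and values are invented for illustration and are not taken from the real data.

```r
# Identifiers of the kind found in the first column of the real files
ids <- c("4501844|NM_001088.1|38903", "4501844|NM_001088.1|38904")

# Write two small tab-delimited files (hypothetical contents)
countsFile  <- tempfile(fileext = ".txt")
libraryFile <- tempfile(fileext = ".txt")
write.table(data.frame(id = ids, rep1 = c(10, 20), rep2 = c(12, 18)),
            countsFile, sep = "\t", quote = FALSE, row.names = FALSE)
write.table(data.frame(id = ids, gene = c("geneA", "geneB")),
            libraryFile, sep = "\t", quote = FALSE, row.names = FALSE)

# Read both files, taking row names directly from the first column
counts <- as.matrix(read.delim(countsFile, header = TRUE, row.names = 1))
fdat   <- read.delim(libraryFile, header = TRUE, row.names = 1)

# Row names are kept as-is, so the two sides agree
identical(rownames(counts), rownames(fdat))   # TRUE
identical(rownames(counts), ids)              # TRUE
```

One difference from the make.names(..., unique=TRUE) version: read.delim() stops with an error if the first column contains duplicated values, so this approach assumes the identifiers are unique.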

Martin

>
> Thank you.
>
> Jobin
>
>
> On Sat, Nov 16, 2013 at 1:25 PM, Jobin K. Varughese <jobinv at gmail.com> wrote:
>
>> Hello. My name is Jobin K. Varughese MD PhD, and I work at the University
>> of Bergen, Norway.
>>
>> My issue is that I am having trouble with the ExpressionSet class. It
>> appears when running the following command (adapted from the shRNAseq
>> <http://rock.icr.ac.uk/software/shrnaseq.jsp> vignette):
>>
>>> library(shRNAseq)
>>> dataPath <- ".."
>>>
>>> x <- loadReadCounts(
>> +   screenData="phenoData.txt",
>> +   shRNAData="shRNALibrary.txt",
>> +   condition1="raw_tmz.txt", condition1Name="TMZ",
>> +   condition2="raw_ctrl.txt", condition2Name="Control",
>> +   dataPath=dataPath)
>>
>> Error in validObject(.Object) :
>>    invalid class “shrnaSet” object: featureNames differ between assayData and featureData
>>
>> I've made sure that the featureNames are indeed identical between the
>> various files. Next, I tried contacting the developer of shRNAseq and even
>> sent him my raw data files. He attempted several strategies to solve the
>> issue: *"I've changed the names in the first columns to numbers 1:27500
>> and made sure they match exactly between the files - didn't help. I've
>> checked line endings (Mac/Unix/Windows) - didn't help. I've removed the
>> final blank lines - didn't help. I've stripped quotes - didn't help."*
>>
>> Eventually he also was unable to get past the error message, but could
>> provide the following: *"The shRNAset object is basically an expression
>> set (Eset) and the methods are inherited from that (including the error
>> message)."*
>>
>> The error is reproducible with the sample files that I've attached
>> (compressed with 7zip). I have also attached the output from sessionInfo()
>> and traceback().
>>
>> Can anyone advise me as to what might be the problem? I would be very
>> grateful indeed.
>>
>> Sincerely,
>> Jobin K. Varughese
>>
>


-- 
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793


