[BioC] regarding package ArrayExpress

Thu Sep 10 15:29:47 CEST 2009

Hi,

This would work, assuming the featureData is kept synchronised with the
assayData. I guess the alternative would be to take the mean or median of
the duplicated reporters, which might be more useful in some cases.
Perhaps that could be added as an option? I know quite a few
custom-printed arrays have duplicated reporter identifiers such as
these; it should be less of a problem for commercial arrays.
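
A minimal sketch of the averaging option, assuming a tab-delimited table
whose first column holds the reporter identifiers. The toy values and the
names `txt`, `d`, `avg` and `expr` are made up for illustration and are not
part of the ArrayExpress package:

```r
# Toy processed-data table with one duplicated reporter identifier
# (values are invented for the example).
txt <- "reporter\ts1\ts2
R:A-MEXP-58:210099\t1\t10
R:A-MEXP-58:210100\t2\t20
R:A-MEXP-58:210099\t3\t30"

# row.names = NULL keeps the identifiers as an ordinary column, so
# duplicated values do not trigger the 'row.names' error.
d <- read.table(text = txt, header = TRUE, sep = "\t",
                row.names = NULL, stringsAsFactors = FALSE)

# Collapse duplicated reporters to one row each by taking the mean
# (FUN = median would give the other option mentioned above).
avg <- aggregate(d[, -1], by = list(reporter = d$reporter), FUN = mean)

# The identifiers are unique now, so they can safely become row names.
expr <- as.matrix(avg[, -1])
rownames(expr) <- avg$reporter
```

If the result goes into an ExpressionSet, the featureData would need the
same collapsing so that it stays in step with the assayData.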

Cheers,

Tim

2009/9/10 Misha Kapushesky <ostolop at ebi.ac.uk>:
> Hi,
>
> Without tweaking read.table, you'd have to read the row names in as one of
> the data columns, then call make.names on that set of names and set the row
> names to the modified ones. So, something like
>
> d <- read.table("foo.tab") ## if read.table("foo.tab", row.names=1) fails
>
> rownames(d) <- make.names(d[,1], unique=TRUE)
>
> d <- d[,-1]                ## to remove the column used
>
> Whether these newly made "unique" row names are what you need is a good
> question... :)
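
On that question, a small sketch of what the renaming does, using
identifiers taken from the warning message in the original post.
`make.names` rewrites characters that are not valid in R names as well as
adding suffixes to duplicates, whereas base R's `make.unique` only adds the
suffixes. The variable name `ids` is just for illustration:

```r
ids <- c("R:A-MEXP-58:210099", "R:A-MEXP-58:210100", "R:A-MEXP-58:210099")

# make.names() replaces ':' and '-' with '.' and then de-duplicates.
make.names(ids, unique = TRUE)
# "R.A.MEXP.58.210099" "R.A.MEXP.58.210100" "R.A.MEXP.58.210099.1"

# make.unique() leaves the original characters alone and only adds a suffix.
make.unique(ids)
# "R:A-MEXP-58:210099" "R:A-MEXP-58:210100" "R:A-MEXP-58:210099.1"
```

Either way, the suffixed names no longer match the original reporter
identifiers exactly, which matters if they are later looked up in the array
annotation.
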
>
> --Misha
>
> On Thu, 10 Sep 2009, audrey at ebi.ac.uk wrote:
>
>> Dear Amit,
>>
>> You are not making any mistakes. This is the proper way of calling the
>> functions to create an object from a processed dataset. However, the
>> problem comes from the dataset itself. It contains duplicate probe
>> identifiers as row names, which is not allowed by the function read.table
>> that is used in the procset function.
>> Unfortunately, I do not know how to prevent this. Does someone
>> know how I could allow duplicate row names in my function?
>>
>> Best regards,
>> Audrey
>>
>> --
>> Audrey Kauffmann
>> EMBL - EBI
>> Cambridge UK
>> +44 (0) 1223 492 631
>> http://www.ebi.ac.uk/~audrey
>>
>>> Hello list,
>>>
>>> I am trying to build an object from ArrayExpress processed data using
>>> the Bioconductor package ArrayExpress. I did the following:
>>>
>>> CAGE99d = getAE("E-GAGE-99",type="processed")
>>> colname = getcolproc(CAGE99d)
>>> CAGE99p = procset(CAGE99d, colname[3])
>>>
>>> and I got the following error:
>>> Error in `row.names<-.data.frame`(`*tmp*`, value = c(6995L, 7017L, 7006L,  :
>>>   duplicate 'row.names' are not allowed
>>> In addition: Warning message:
>>> non-unique values when setting 'row.names': ‘R:A-MEXP-58:210099’,
>>> ‘R:A-MEXP-58:210100’, ‘R:A-MEXP-58:210111’,
>>> ‘R:A-MEXP-58:210123’, ‘R:A-MEXP-
>>> [... truncated]
>>>
>>> I am not able to figure out what mistake I am making. Please help!
>>> Amit
>>>
>>> _______________________________________________
>>> Bioconductor mailing list
>>> Bioconductor at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>> Search the archives:
>>> http://news.gmane.org/gmane.science.biology.informatics.conductor