[BioC] edgeR import error

James W. MacDonald jmacdon at uw.edu
Tue Feb 19 15:07:25 CET 2013



On 2/19/2013 4:20 AM, ES [guest] wrote:
> Hi,
>
> I have been trying to create a DGEList object in edgeR, using the following command:
>
>> tar<-read.delim("test_target.txt")
>> b<-readDGE(tar)
> Error in readDGE(tar) : Repeated tag sequences in tmp1.txt
>
> where test_target.txt is
>   files groups
> 1 tmp1.txt    fem
> 2 tmp2.txt    fem
> 3 tmp3.txt   male
> 4 tmp4.txt   male
>
> and tmp1.txt is a tab-delimited text file with 2 columns (col1: geneID, col2: raw counts).
>
> target_id       uniq_counts
> 0	51
> 1	0
> 10	24
> 100	13
> 1000	5
> 10000	227
> 10001	7
> 10002	28
> 10003	21
>
> I have checked that the geneID/tags are unique using awk '{print$1}' | sort|uniq and yet I keep getting this error.

Try

x <- read.delim("tmp1.txt", stringsAsFactors = FALSE)
any(duplicated(x[,1]))

and possibly

x[duplicated(x[,1]),]

because R thinks you have duplicates.
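
One way this can happen, offered only as a guess since I haven't seen the full file: when the ID column looks numeric, read.delim() converts it to numbers by default, so IDs that differ only as text (say "1" and "01") end up identical in R even though sort | uniq treats them as distinct. A minimal sketch with made-up file contents (the IDs below are hypothetical, not taken from tmp1.txt):

```r
# Hypothetical example file: "1" and "01" are distinct strings on disk
tmp <- tempfile(fileext = ".txt")
writeLines(c("target_id\tuniq_counts",
             "1\t0",
             "01\t7",
             "1000\t5"), tmp)

# Default type conversion parses the ID column as integers,
# so "01" becomes 1 and collides with "1"
x <- read.delim(tmp, stringsAsFactors = FALSE)
dup_default <- any(duplicated(x[, 1]))

# Show every row involved in a collision, not just the later copy
dup_rows <- x[duplicated(x[, 1]) | duplicated(x[, 1], fromLast = TRUE), ]

# Forcing the ID column to character keeps the IDs exactly as written
y <- read.delim(tmp, colClasses = c("character", "numeric"))
dup_char <- any(duplicated(y[, 1]))
```

If dup_default is TRUE while dup_char is FALSE for your real file, the duplicates come from the numeric conversion rather than from the file itself.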

Best,

Jim
>
> I have used edgeR successfully in the past and am currently also able to use it with other older data files.
>
> Thanks
> E
>
>
>   -- output of sessionInfo():
>
> R version 2.15.2 (2012-10-26)
> Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
>
> locale:
> [1] C
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] BiocInstaller_1.8.3 edgeR_3.0.8         limma_3.14.4
>
> loaded via a namespace (and not attached):
> [1] tools_2.15.2
>
> --
> Sent via the guest posting facility at bioconductor.org.
>
> _______________________________________________
> Bioconductor mailing list
> Bioconductor at r-project.org
> https://stat.ethz.ch/mailman/listinfo/bioconductor
> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor

-- 
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099


