[Rd] read.table with ":" in column names (PR#8511)

Peter Dalgaard p.dalgaard at biostat.ku.dk
Fri Jan 20 12:22:53 CET 2006


peverlorenvanthemaat at amc.uva.nl writes:

> Full_Name: emiel ver loren
> Version: 2.2.0
> OS: Windows XP
> Submission from: (NULL) (145.117.31.248)
> 
> 
> Dear R-community and developers,
> 
> I have been trying to read in a tab-delimited file where the column names and
> the row names are of the form "GO:0000051" (gene ontology IDs). When using:
> 
> > gomat<-read.table("test.txt")
> > colnames(gomat)[1]
> [1] "GO.0000051"
> > rownames(gomat)[1]
> [1] "GO:0000002"
> 
> Which means that ":" is transformed into a "." !! This seems like Excel when it
> is trying to guess what I really meant (and turning 1/1/1 into 1-1-2001).

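This is what check.names=FALSE is for: by default, read.table runs the
column names through make.names(), which replaces characters that are
not valid in R names (such as ":") with ".". Row names are not checked,
which is why they come through unchanged. (This is NOT a bug; please
don't abuse the bug repository, use the mailing lists.)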
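
A minimal sketch of the difference, assuming a small tab-delimited file
shaped like the one described. The file contents and the temporary file
are made up for illustration; only read.table and its check.names
argument come from the discussion above.

```r
# Hypothetical stand-in for "test.txt": a header line with one field
# fewer than the data lines, so read.table treats it as a header and
# uses the first column as row names.
tmp <- tempfile(fileext = ".txt")
writeLines(c("GO:0000051\tGO:0000280",
             "GO:0000002\t1\t0",
             "GO:0000003\t0\t1"), tmp)

# Default: column names are passed through make.names().
checked <- read.table(tmp, sep = "\t")

# check.names = FALSE: column names are kept exactly as in the file.
unchecked <- read.table(tmp, sep = "\t", check.names = FALSE)

print(colnames(checked)[1])
print(colnames(unchecked)[1])
print(rownames(unchecked)[1])

unlink(tmp)
```

Note that names such as "GO:0000051" are not syntactic, so columns have
to be addressed as gomat[["GO:0000051"]] or with backticks.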
 
> Furthermore, I found the following quite strange as well:
> 
> > gomat2<-read.delim2("test.txt",header=FALSE)
> > gomat2[1,1:2]
>           V1         V2
> 1 GO:0000051 GO:0000280
> >  as.character(gomat2[1,1:2])
> [1] "8" "2"
> > as.character(gomat2[1,1])
> [1] "GO:0000051"
> 
> I have found a way to work around it, but I am wondering what's happening....

Yes, this is a bit nasty, but... What is happening is similar to this:

> d <- data.frame(a=factor(LETTERS), b=factor(letters))
> d[1,]
  a b
1 A a
> as.character(d[1,])
[1] "1" "1"
> as.character(d[1,1])
[1] "A"
> as.character(d[1,1,drop=FALSE])
[1] "1"

or this:

> l <- list(a=factor("x"),b=factor("y"))
> l
$a
[1] x
Levels: x

$b
[1] y
Levels: y

> as.character(l)
[1] "1" "1"

The thing is that as.character on a list (and a row of a data frame is
a list) first reduces each factor to its integer codes, then converts
those codes to character. I'm not sure whether there could be a
rationale for it, but it isn't S-PLUS compatible (not 6.2.1 anyway,
which is the most recent one that I have access to).
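
If the labels rather than the codes are wanted, one possible workaround
is to convert each column on its own, or to avoid factors in the first
place. This is only a sketch on the toy data frame from above, not
necessarily the workaround the original poster found; with read.table,
the documented as.is=TRUE argument should have a similar effect to
stringsAsFactors=FALSE here.

```r
# Same toy data frame as in the example above: two factor columns.
d <- data.frame(a = factor(LETTERS), b = factor(letters))

# Apply as.character() to each column of the one-row data frame
# separately, so it sees a factor (and returns its label), not a list.
labels <- vapply(d[1, ], as.character, character(1))
print(labels)

# Alternative: keep the columns as character vectors from the start,
# so there are no integer codes to fall back on.
d2 <- data.frame(a = LETTERS, b = letters, stringsAsFactors = FALSE)
labels2 <- unlist(d2[1, ])
print(labels2)
```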


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907


