[R] factors and characters when attaching data...more info.

Gary Collins gco at eortc.be
Thu Apr 5 11:05:50 CEST 2001


R-helpers...
Please find an amendment to a problem I posted yesterday (04/04/01).
Unfortunately I received only one response, so I will give some more details
on the problem.
I have read in some data called Version3.Studies and, to make life slightly
easier and the programming less wordy, I want to attach the data frame. When I
do, however, all of my character fields are forced into factors.

> Version3.Studies <- read.table("c:\\Version3.Studies.dat",
+     header=TRUE, as.is=TRUE, strip.white=TRUE)
> summary(Version3.Studies$Group)
   Length      Mode 
     3103 character 
> is.character(Version3.Studies$Group) # Just to make sure...
[1] TRUE
> unique(Version3.Studies$Group)
 [1] "Lung"         "Mesothelioma" "Breast"       "HeadandNeck"  "Oesophagus"
 [6] "Ovary"        "Brain"        "Prostate"     "Testes"       "Stomach"
[11] "ColonRectum"
> 
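For readers without the original file, the read step above can be imitated with a small stand-in. This is only a sketch: the three rows and the PF values are invented for illustration, and the `text=` argument of `read.table` is assumed to be available (it is in current versions of R).

```r
# Hypothetical three-row stand-in for Version3.Studies.dat (values invented)
toy_text <- "Group PF
Lung 1
Breast 2
Lung 3"

# as.is = TRUE asks read.table to leave character columns as character
toy <- read.table(text = toy_text, header = TRUE, as.is = TRUE)

group_is_char <- is.character(toy$Group)
group_values  <- unique(toy$Group)
print(group_is_char)
print(group_values)
```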

Now I attach the data...

> attach(Version3.Studies)
> is.character(Group)
[1] FALSE
> is.factor(Group)
[1] TRUE
> 
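One way to narrow this down might be to compare the class of each column inside the data frame with the class of the object that the bare name resolves to after attaching. The sketch below uses an invented toy data frame, not the real data. On a recent R installation the two classes would be expected to agree, which may well differ from the behaviour reported here.

```r
# Toy data frame (hypothetical values) with an explicit character column
toy <- data.frame(Group = c("Lung", "Breast", "Lung"),
                  PF    = c(1, 2, 3),
                  stringsAsFactors = FALSE)

# Class of every column as stored in the data frame
classes_in_frame <- sapply(toy, class)

attach(toy)
class_when_attached <- class(Group)   # class of what the bare name finds
where_found <- find("Group")          # search-path entries defining Group
detach(toy)

print(classes_in_frame)
print(class_when_attached)
print(where_found)
```

If `find()` lists more than one entry, the bare name may be picking up a different object from the column that was just attached.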

> unique(Group)
 [1]         Lung Mesothelioma       Breast  HeadandNeck   Oesophagus
 [6]        Ovary        Brain     Prostate       Testes      Stomach
[11]  ColonRectum 
Levels:          Lung        Brain        Ovary       Breast       Testes      Stomach     Prostate   Oesophagus  ColonRectum  HeadandNeck Mesothelioma

Now consider the following simple example. I want to extract another field,
say PF, from Version3.Studies, indexing by a label in Group, say Lung.
Without attaching the data, I can simply do

> Version3.Studies$PF[Version3.Studies$Group=="Lung"]

and this returns the appropriate data.

After attaching the data, to retrieve the same data, I need to do

> PF[Group=="        Lung"]

inserting the necessary white space.
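If the aim is mainly to avoid repeating the data frame name, `with()` may be an alternative to attaching. It evaluates an expression using the columns of the data frame directly. The sketch below assumes a version of R that provides `with()`, and uses an invented toy data frame rather than the real data.

```r
# Toy data frame (hypothetical values) standing in for Version3.Studies
toy <- data.frame(Group = c("Lung", "Breast", "Lung"),
                  PF    = c(10, 20, 30),
                  stringsAsFactors = FALSE)

# Long form, as in the un-attached example
by_dollar <- toy$PF[toy$Group == "Lung"]

# Shorter form: columns are looked up inside toy, nothing is attached
by_with <- with(toy, PF[Group == "Lung"])

print(by_dollar)
print(by_with)
```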

My question is: why is R forcing my character fields to factors when I
attach a data frame? Is this what is supposed to happen, and is there a way
around it that keeps my original character fields as character rather than
factor?
Trying to coerce Group back to character still keeps the white space that
was created when attaching the data frame.

> unique(as.character(Group))
 [1] "        Lung" "Mesothelioma" "      Breast" " HeadandNeck" "  Oesophagus"
 [6] "       Ovary" "       Brain" "    Prostate" "      Testes" "     Stomach"
[11] " ColonRectum"
> 
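As a stopgap, the leading padding could be stripped from the factor levels with `sub()`. This does not explain where the padding comes from. The sketch below rebuilds a padded factor by hand from a few of the values shown above, purely for illustration.

```r
# Hand-built factor imitating the padded levels shown above
padded <- factor(c("        Lung", "Mesothelioma", "      Breast", "        Lung"))

# Remove leading spaces from the levels only; the underlying codes are untouched
cleaned <- padded
levels(cleaned) <- sub("^ +", "", levels(cleaned))

lung_rows <- which(cleaned == "Lung")   # positions matching the unpadded label
as_char   <- as.character(cleaned)      # back to a plain character vector

print(lung_rows)
print(as_char)
```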
Any help would be greatly appreciated.

Gary Collins.
__________________________________________________
Dr. Gary S. Collins,
Statistics Research Fellow,
Quality of Life Unit, 
European Organisation for Research and Treatment of Cancer, 
EORTC Data Center, 
Avenue E. Mounier 83, bte. 11,
B-1200 Brussels, Belgium.

Tel: +32 2 774 1 606
Fax: +32 2 779 4 568
Email: gco at eortc.be
http://www.eortc.be/home/qol/
__________________________________________________

