[R] Changing entries of column of type "factor"/Adding a new level to a factor

Bert Gunter gunter.berton at gene.com
Mon Aug 27 19:18:51 CEST 2012


Well ... see below.

-- Cheers, Bert

On Mon, Aug 27, 2012 at 9:19 AM, David Winsemius <dwinsemius at comcast.net> wrote:
>
> On Aug 27, 2012, at 3:09 AM, Fridolin wrote:
>
>> What is a smart way to change an entry inside a column of a dataframe or
>> matrix which is of type "factor"?
>>
>> Here is my script incl. input data:
>>>
>>> #set working directory:
>>> setwd("K:/R")
>>>
>>> #read in data:
>>> input<-read.table("Exampleinput.txt", sep="\t", header=TRUE)
>>>
>>> #check data:
>>> input
>>
>>   Ind      M1      M2      M3
>> 1    1   96/98 120/120     0/0
>> 2    2 102/108 120/124 305/305
>> 3    3  96/108 120/120     0/0
>> 4    4     0/0 116/120 300/305
>> 5    5  96/108 120/130 300/305
>> 6    6   98/98 116/120 300/305
>> 7    7  98/108 120/120 305/305
>> 8    8  98/108 120/120 305/305
>> 9    9  98/102 120/124 300/300
>> 10  10 108/108 120/120 305/305
>>>
>>> str(input)
>>
>> 'data.frame':   10 obs. of  4 variables:
>> $ Ind: int  1 2 3 4 5 6 7 8 9 10
>> $ M1 : Factor w/ 8 levels "0/0","102/108",..: 5 2 4 1 4 8 7 7 6 3
>> $ M2 : Factor w/ 4 levels "116/120","120/120",..: 2 3 2 1 4 1 2 2 3 2
>> $ M3 : Factor w/ 4 levels "0/0","300/300",..: 1 4 1 3 3 3 4 4 2 4
>>>
>>>
>>> #replace 0/0 by 999/999:
>>> for (r in 1:10)
>> +   for (c in 2:4)
>> +     if (input[r,c]=="0/0") input[r,c]<-"999/999"
>> Warning messages:
>> 1: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
>>  invalid factor level, NAs generated
>> 2: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
>>  invalid factor level, NAs generated
>> 3: In `[<-.factor`(`*tmp*`, iseq, value = "999/999") :
>>  invalid factor level, NAs generated
>>>
>>> input
>>
>>   Ind      M1      M2      M3
>> 1    1   96/98 120/120    <NA>
>> 2    2 102/108 120/124 305/305
>> 3    3  96/108 120/120    <NA>
>> 4    4    <NA> 116/120 300/305
>> 5    5  96/108 120/130 300/305
>> 6    6   98/98 116/120 300/305
>> 7    7  98/108 120/120 305/305
>> 8    8  98/108 120/120 305/305
>> 9    9  98/102 120/124 300/300
>> 10  10 108/108 120/120 305/305
>>
>>
>> I want to replace all "0/0" by "999/999". My code should work for columns
>> of type "character" and "integer". But to make it work for a "factor"-column
>> I would need to add the new level of "999/999" at first, I guess. How do I
>> add a new level?
>
>
> ?levels
>
> levels(input$M1) <- c(levels(input$M1), "999/999")

This adds an additional level; then you have to replace the "0/0"
entries with the new level; then you have to drop the now-unused
"0/0" level (e.g. with droplevels() or by calling factor() again).

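For concreteness, those three steps might look like the sketch below. It
uses a toy vector m1 standing in for input$M1 (the values are borrowed from
the posted data; the name m1 is just for illustration). Using droplevels()
for the last step is one option; factor(m1) would also discard the unused
level.

```r
# Toy column standing in for input$M1 (illustrative values only)
m1 <- factor(c("96/98", "0/0", "102/108", "0/0"))

# Step 1: append the new level to the existing levels
levels(m1) <- c(levels(m1), "999/999")

# Step 2: replace the entries; no warning now that the level exists
m1[m1 == "0/0"] <- "999/999"

# Step 3: drop the "0/0" level, which is no longer used
m1 <- droplevels(m1)

m1
```
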
I think the following slight tweak may be preferred, as illustrated
with a little example (opinions?):

> x <- factor(letters[1:3])
> x
[1] a b c
Levels: a b c

## create a new levels vector
> newlvl <- levels(x)
> newlvl[newlvl == "a"] <- "d"

## Create the new factor and replace the old with it

> x <- factor(newlvl[x])
> x
[1] d b c
Levels: b c d
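
Applied to the original question, the same idea could be wrapped in a small
function and run over every factor column. This is only a sketch: the data
frame below is a cut-down stand-in for the posted input, and recode_level is
a hypothetical helper name, not an existing R function.

```r
# Cut-down stand-in for the poster's data frame (a few of the posted values)
input <- data.frame(
  Ind = 1:4,
  M1  = factor(c("96/98", "102/108", "96/108", "0/0")),
  M3  = factor(c("0/0", "305/305", "0/0", "300/305"))
)

# Hypothetical helper: recode one level by editing the levels vector
# and rebuilding the factor from it
recode_level <- function(f, old, new) {
  newlvl <- levels(f)
  newlvl[newlvl == old] <- new
  factor(newlvl[f])   # indexing by a factor uses its integer codes
}

# Apply the helper to the factor columns only; other columns are untouched
is_fac <- vapply(input, is.factor, logical(1))
input[is_fac] <- lapply(input[is_fac], recode_level,
                        old = "0/0", new = "999/999")

input
```

This avoids the row-by-row loop and leaves non-factor columns such as Ind
alone.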

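Note, however, as Bill D. said, in either case your level ordering --
which will be used, e.g. in printing and displaying -- may not be what
you want: with levels() the new level is appended at the end, and
factor() re-sorts the levels alphabetically (note "d" moving to the
end above).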
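
If the position of the recoded level matters, two ways of keeping it where
the old level was can be sketched as follows. Both are offered as options
rather than as the recommended fix; the variable names y and z are just for
illustration.

```r
x <- factor(letters[1:3])

# Option A: rename the level in place; its position among the levels
# is unchanged
y <- x
levels(y)[levels(y) == "a"] <- "d"

# Option B: the tweak from above, but passing the levels explicitly so
# that factor() does not re-sort them; unique() guards against the new
# name duplicating an existing level
newlvl <- levels(x)
newlvl[newlvl == "a"] <- "d"
z <- factor(newlvl[x], levels = unique(newlvl))

levels(y)
levels(z)
```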



>
> --
>
> David Winsemius, MD
> Heritage Laboratories
> West Hartford, CT
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm



