[R] Extracting part of a factor

Sarah Goslee sarah.goslee at gmail.com
Fri Mar 4 22:32:37 CET 2016


You're not saving the result of mutate(). You're just printing it to the screen.
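
That is also where the "replacement has 0 rows" error comes from: since the mutate() result was never stored, test has no place column, so test$place is NULL and factor(NULL) has length zero. A minimal base R sketch of that chain follows; the cut-down test data frame is only for illustration, and tryCatch() is used so the script keeps running after the error.

```r
# Cut-down version of the thread's data, for illustration only.
test <- data.frame(
  subject = factor(c("001-002", "002-003", "003-004",
                     "004-005", "005-006", "006-007")),
  wk1 = c(2L, 7L, 9L, 5L, 2L, 1L)
)

# No column called "place" exists yet, so $ returns NULL.
is.null(test$place)
class(test$place)

# factor(NULL) is a factor of length zero ...
n_values <- length(factor(test$place))

# ... and a zero-length value cannot fill a six-row column.
msg <- tryCatch({
  test$place <- factor(test$place)
  "no error"
}, error = function(e) conditionMessage(e))
msg
```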

Try instead:
test <- mutate(test, place = substr(test$subject, 1, 3))
test$place <- as.factor(test$place) # or factor() if you'd rather

This is why we ask for reproducible examples with data and code.
Look through the following and see if you understand.


test <- structure(list(subject = structure(1:6, .Label = c("001-002",
"002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
    group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
    "girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
    1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
    8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
"group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame", row.names = c(NA,
-6L))

> str(test)
'data.frame': 6 obs. of  6 variables:
 $ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
 $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
 $ wk1    : int  2 7 9 5 2 1
 $ wk2    : int  3 6 4 7 6 4
 $ wk3    : int  4 5 6 8 3 7
 $ wk4    : int  5 4 1 9 8 4
> mutate(test, place = substr(test$subject, 1, 3))
  subject group wk1 wk2 wk3 wk4 place
1 001-002  boys   2   3   4   5   001
2 002-003  boys   7   6   5   4   002
3 003-004  boys   9   4   6   1   003
4 004-005 girls   5   7   8   9   004
5 005-006 girls   2   6   3   8   005
6 006-007 girls   1   4   7   4   006
> str(test)
'data.frame': 6 obs. of  6 variables:
 $ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
 $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
 $ wk1    : int  2 7 9 5 2 1
 $ wk2    : int  3 6 4 7 6 4
 $ wk3    : int  4 5 6 8 3 7
 $ wk4    : int  5 4 1 9 8 4



test <- mutate(test, place = substr(test$subject, 1, 3))
test$place <- as.factor(test$place)

> str(test)
'data.frame': 6 obs. of  7 variables:
 $ subject: Factor w/ 6 levels "001-002","002-003",..: 1 2 3 4 5 6
 $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
 $ wk1    : int  2 7 9 5 2 1
 $ wk2    : int  3 6 4 7 6 4
 $ wk3    : int  4 5 6 8 3 7
 $ wk4    : int  5 4 1 9 8 4
 $ place  : Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
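
If you'd rather skip the separate conversion step, the factor() call can go inside the same expression, as long as the result is assigned back. Here is a minimal sketch that uses base R's transform() as a stand-in for mutate(), so it runs without dplyr; the cut-down test data frame is only for illustration.

```r
# Cut-down version of the thread's data, for illustration only.
test <- data.frame(
  subject = factor(c("001-002", "002-003", "003-004",
                     "004-005", "005-006", "006-007")),
  group = factor(rep(c("boys", "girls"), each = 3))
)

# Not assigned: the new data frame is printed and then lost.
transform(test, place = factor(substr(subject, 1, 3)))
had_place_before <- !is.null(test$place)

# Assigned: the new column is kept, already as a factor.
test <- transform(test, place = factor(substr(subject, 1, 3)))
str(test)
```

With dplyr loaded, the analogous call should be test <- mutate(test, place = factor(substr(subject, 1, 3))).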


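On Jeff's point (quoted below) about factor() versus as.factor(): as.factor() sorts the levels of a character vector, while factor() lets you give the order yourself. A small sketch with made-up values, not taken from your data:

```r
# Made-up values, for illustration only.
place <- c("003", "001", "002", "001")

# as.factor() sorts the unique values to get the levels.
f_default <- as.factor(place)
levels(f_default)

# factor() accepts an explicit level order.
f_custom <- factor(place, levels = c("003", "002", "001"))
levels(f_custom)

# The underlying integer codes follow the chosen order.
as.integer(f_custom)
```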

On Fri, Mar 4, 2016 at 4:13 PM, KMNanus <kmnanus at gmail.com> wrote:
> Here’s where I’m stumped -
>
> when I call mutate(test, place = substr(test$subject, 1,3)) to create a
> place variable, I get this, with place as a character variable.
>
>  subject  group   wk1   wk2   wk3   wk4 place
>    (fctr) (fctr) (int) (int) (int) (int) (chr)
> 1 001-002   boys     2     3     4     5   001
> 2 002-003   boys     7     6     5     4   002
> 3 003-004   boys     9     4     6     1   003
> 4 004-005  girls     5     7     8     9   004
> 5 005-006  girls     2     6     3     8   005
> 6 006-007  girls     1     4     7     4   006
>
> When I call test$place <- factor(test$place), I receive the message "Error in
> `$<-.data.frame`(`*tmp*`, "place", value = integer(0)) :
>   replacement has 0 rows, data has 6."
>
> If I call mutate this way - mutate(test, place =
> factor(substr(test$subject,1,3))), I get the same output as above but when I
> call class(test$place), I get NULL and the variable disappears.
>
> I can’t figure out why.
>
> Ken
> kmnanus at gmail.com
> 914-450-0816 (tel)
> 347-730-4813 (fax)
>
>
> On Mar 4, 2016, at 3:46 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>
> I much prefer the factor function over the as.factor function for converting
> character to factor, since you can set the levels in the order you want them
> to be.
> --
> Sent from my phone. Please excuse my brevity.
>
> On March 4, 2016 10:07:27 AM PST, Sarah Goslee <sarah.goslee at gmail.com>
> wrote:
>>
>> As everyone has been telling you, as.factor().
>> If you like the mutate approach, you can call as.factor(test$subject)
>> to convert it.
>>
>> Here's a one-liner with reproducible data.
>>
>>
>> testdata <- structure(list(subject = structure(1:6, .Label = c("001-002",
>> "002-003", "003-004", "004-005", "005-006", "006-007"), class = "factor"),
>>     group = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("boys",
>>     "girls"), class = "factor"), wk1 = c(2L, 7L, 9L, 5L, 2L,
>>     1L), wk2 = c(3L, 6L, 4L, 7L, 6L, 4L), wk3 = c(4L, 5L, 6L,
>>     8L, 3L, 7L), wk4 = c(5L, 4L, 1L, 9L, 8L, 4L)), .Names = c("subject",
>> "group", "wk1", "wk2", "wk3", "wk4"), class = "data.frame", row.names = c(NA,
>> -6L))
>>
>> testdata$subject <- as.factor(substring(as.character(testdata$subject), 1, 3))
>>
>> > testdata
>>   subject group wk1 wk2 wk3 wk4
>> 1     001  boys   2   3   4   5
>> 2     002  boys   7   6   5   4
>> 3     003  boys   9   4   6   1
>> 4     004 girls   5   7   8   9
>> 5     005 girls   2   6   3   8
>> 6     006 girls   1   4   7   4
>> > str(testdata)
>> 'data.frame': 6 obs. of  6 variables:
>>  $ subject: Factor w/ 6 levels "001","002","003",..: 1 2 3 4 5 6
>>  $ group  : Factor w/ 2 levels "boys","girls": 1 1 1 2 2 2
>>  $ wk1    : int  2 7 9 5 2 1
>>  $ wk2    : int  3 6 4 7 6 4
>>  $ wk3    : int  4 5 6 8 3 7
>>  $ wk4    : int  5 4 1 9 8 4
>>
>> Sarah
>>
>> On Fri, Mar 4, 2016 at 1:00 PM, KMNanus <kmnanus at gmail.com> wrote:
>>>
>>>
>>>  Here’s the dataset I’m working with, called test -
>>>
>>>  subject group wk1 wk2 wk3 wk4 place
>>>  001-002 boys 2 3 4 5
>>>  002-003 boys 7 6 5 4
>>>  003-004 boys 9 4 6 1
>>>  004-005 girls 5 7 8 9
>>>  005-006 girls 2 6 3 8
>>>  006-007 girls 1 4 7 4
>>>
>>>
>>>  If I call mutate(test, place = substr(subject,1,3)), “001” is the first
>>> observation in the place column.
>>>
>>>  But it’s a character and “subject” is a factor.  I need place to be a
>>> factor, too, but I need the observations to be ONLY the first three numbers
>>> of “subject.”
>>>
>>>  Does that make my request more understandable?
>>


