[R] Selecting initial numerals

joris meys jorismeys at gmail.com
Wed Oct 14 17:12:39 CEST 2009


Josh,

One way would be to convert the numeric vector to character and use
the function substr(). The following code returns a numeric vector with
the first two digits of every element.

naics <- c(238321, 624410, 484121, 238911, 811111, 531110, 621399, 541613,
           524210, 236115, 811121, 236115, 236115, 621610, 814110, 812320)

First_Two <- as.numeric(substr(as.character(naics), 1, 2))
First_Two

See also ?substr.
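
If every code has exactly six digits, as in this sample, integer division
is an alternative that skips the character conversion. This is only a
sketch under that assumption; the object names first_two_arith,
naics_factor and first_two_factor are made up for illustration. The second
half assumes the column is stored as a factor, as the earlier warning
suggested.

```r
# Same sample data as above.
naics <- c(238321, 624410, 484121, 238911, 811111, 531110, 621399, 541613,
           524210, 236115, 811121, 236115, 236115, 621610, 814110, 812320)

# Integer division by 10^4 drops the last four digits.
# Assumption: every code has exactly six digits.
first_two_arith <- naics %/% 1e4

# If the column is a factor, convert it to character first;
# substr() then works on the labels rather than the internal codes.
naics_factor <- factor(naics)
first_two_factor <- as.numeric(substr(as.character(naics_factor), 1, 2))
```

The substr() approach does not depend on the number of digits, so it is
the safer choice if the codes can have different lengths.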

Cheers
Joris

On Wed, Oct 14, 2009 at 5:05 PM, Josh Roll <j_r_36 at hotmail.com> wrote:
> Joris,
>     I figured out that was my issue.  Thanks for your insights.  However, I
> need the first two digits of the numeral, not the last two.  How do I change
> the code to get this outcome?
>
> Cheers
>
>> Date: Wed, 14 Oct 2009 13:50:54 +0200
>> Subject: Re: [R] Selecting initial numerals
>> From: jorismeys at gmail.com
>> To: J_R_36 at hotmail.com
>> CC: r-help at r-project.org
>>
>> On Tue, Oct 13, 2009 at 6:48 PM, PDXRugger <J_R_36 at hotmail.com> wrote:
>> >
>> > I just want to create a new object with the first two numerals of the
>> > data.
>> > Not sure why this isn't working; consider the following:
>> >
>> > EmpEst$naics=c(238321, 624410, 484121 ,238911, 811111, 531110, 621399,
>> > 541613,
>> > 524210 ,236115 ,811121 ,236115 ,236115 ,621610 ,814110 ,812320)
>> >
>> >
>> > EmpEst$naics2<-formatC(EmpEst$naics %% 1e2, width=2, flag="", mode
>> > ="integer")
>> > #RESULT:Warning message:
>> > #In Ops.factor(EmpEst$naics, 100) : %% not meaningful for factors
>>
>> Wild guess: you get this warning because EmpEst$naics is a factor? Quite
>> a few errors and warnings mean pretty much what they say. If you see
>> similar errors or warnings, please use the function str() first to check
>> your data structure. For example:
>>
>> str(EmpEst$naics)
>>
>> You should also make sure you provide us with self-contained,
>> reproducible code. As we don't have the data frame EmpEst, I cannot run
>> the code you sent. If I change it to use a plain vector, I don't get the
>> warning.
>>
>> Below are a few code snippets to illustrate how the problem arises and
>> how to get rid of it:
>>
>> > naics=c(238321, 624410, 484121 ,238911, 811111, 531110, 621399,541613,
>> + 524210 ,236115 ,811121 ,236115 ,236115 ,621610 ,814110 ,812320)
>> >
>> > naics2<-formatC(naics %% 1e2, width=2, flag="", mode
>> + ="integer")
>> > naics2
>> [1] "21" "10" "21" "11" "11" "10" "99" "13" "10" "15" "21" "15" "15" "10" "10"
>> [16] "20"
>>
>> No warning, as naics is a numeric vector. Now I make it a factor:
>>
>> > naics=factor(c(238321, 624410, 484121 ,238911, 811111, 531110,
>> + 621399,541613,
>> + 524210 ,236115 ,811121 ,236115 ,236115 ,621610 ,814110 ,812320))
>> >
>> > naics2<-formatC(naics %% 1e2, width=2, flag="", mode
>> + ="integer")
>> Warning message:
>> In Ops.factor(naics, 100) : %% not meaningful for factors
>> > naics2
>> [1] "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA" "NA"
>> [16] "NA"
>>
>> This is what you see. You can convert a factor to a numeric vector
>> with the combination as.numeric(as.character()). The as.character()
>> step is necessary because as.numeric() alone would return the internal
>> codes of the factor levels (i.e. the numbers 1, 2, ..., n, with n being
>> the number of levels).
>>
>> > naics=factor(c(238321, 624410, 484121 ,238911, 811111, 531110,
>> + 621399,541613,
>> + 524210 ,236115 ,811121 ,236115 ,236115 ,621610 ,814110 ,812320))
>> >
>> > naics2<-formatC(as.numeric(as.character(naics)) %% 1e2, width=2,
>> + flag="", mode="integer")
>> > naics2
>> [1] "21" "10" "21" "11" "11" "10" "99" "13" "10" "15" "21" "15" "15" "10" "10"
>> [16] "20"
>
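
To make the point about internal factor codes concrete, here is a small
sketch using four of the codes from the sample. The object names f, codes,
values and values_alt are only illustrative. The last line uses the variant
described in ?factor, which converts each level once instead of each
element.

```r
# Four values from the sample; one of them is repeated.
f <- factor(c(238321, 624410, 484121, 238321))

# as.numeric() on a factor returns the internal level codes,
# not the numbers shown when the factor is printed.
codes <- as.numeric(f)

# Going through character recovers the original numbers.
values <- as.numeric(as.character(f))

# Equivalent result: convert the levels, then index by the factor.
values_alt <- as.numeric(levels(f))[f]
```

Printing codes next to values shows why the as.character() step matters
before any arithmetic or substr() call on the data.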



