[R] Symbol/String comparison in R

Bert Gunter bgunter.4567 at gmail.com
Thu Apr 14 04:00:26 CEST 2022


"I was not able to find answers to my questions (tried Google, Stack
Overflow, etc). Please correct me if anything is wrong here."

R has an extensive Help system. That should always be your first place
to look. In this case, ?"<" (at the R prompt) brings you to the Help
page for comparisons (as would ?Comparison, but only if the "C" is in
upper case, unfortunately). Among lots of other stuff, it says:

"Comparison of strings in character vectors is lexicographic within
the strings using the collating sequence of the locale in use: see
locales." ... (+ lots more).
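
To make that concrete, here is a small sketch (the variable names are
just for illustration). It assumes nothing beyond base R. Note that
charToRaw() prints bytes in hexadecimal, so 41 and 61 are 65 and 97 in
decimal, and that the result of "A" < "a" depends on the collating
sequence of your locale, not on those byte values:

```r
# charToRaw() shows hexadecimal bytes; utf8ToInt() gives decimal code points.
charToRaw("A")                  # 41 (hex)
code_A <- utf8ToInt("A")        # 65
code_a <- utf8ToInt("a")        # 97

# The collating sequence currently in use for string comparison.
old_collate <- Sys.getlocale("LC_COLLATE")
old_collate

# In many non-C locales lower case sorts before upper case,
# so this can be FALSE; the answer is locale-dependent.
"A" < "a"

# In the C locale, strings are compared by their byte values.
invisible(Sys.setlocale("LC_COLLATE", "C"))
c_locale_result <- "A" < "a"    # TRUE
invisible(Sys.setlocale("LC_COLLATE", old_collate))  # restore the setting

# sort(method = "radix") orders character vectors as in the C locale,
# whatever the session locale is.
c_sorted <- sort(c("b", "A", "a", "B"), method = "radix")
c_sorted                        # "A" "B" "a" "b"
```

So a comparison of raw values only predicts the result when the
collation locale is C (or POSIX).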

Incidentally, rseek.org and rdrr.io are another couple of good places
to look for R documentation.
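
For the Help lookups mentioned above, a minimal sketch using only
functions from the utils package that ships with R (the names h_op and
h_topic are just for illustration):

```r
# help() is the functional form of "?"; both topics are documented on
# the Comparison help page. Printing the result displays the page.
h_op    <- help("<")
h_topic <- help("Comparison")

# Each result lists the help file(s) found for the topic.
length(h_op)
length(h_topic)

# Topic names are case-sensitive, but a keyword search ignores case
# by default. Uncomment to run it (same as typing ??comparison):
# help.search("comparison")
```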



Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Apr 13, 2022 at 5:10 PM Kristjan Kure <kristjan.kure.1 using gmail.com> wrote:
>
> Hi!
>
> Sorry, I am a beginner in R.
>
> I was not able to find answers to my questions (tried Google, Stack
> Overflow, etc). Please correct me if anything is wrong here.
>
> When comparing symbols/strings in R - raw numeric values are compared
> symbol by symbol starting from left? If raw numeric values are not used is
> there an ASCII / Unicode table where symbols have values/ranking/order and
> R compares those values?
>
> *2) Comparing symbols*
> Letter "a" raw value is 61, letter "b" raw value is 62? Is this correct?
>
> # Raw value for "a" = 61
> a_raw <- charToRaw("a")
> a_raw
>
> # Raw value for "b" = 62
> b_raw <- charToRaw("b")
> b_raw
>
> # equals TRUE
> "a" < "b"
>
> Ok, so 61 is less than 62 so it's TRUE. Is this correct?
>
> *3) Comparing strings #1*
> "1040" <= "12000"
>
> raw_1040 <- charToRaw("1040")
> raw_1040
> #31 *30* (comparison happens with the second symbol) 34 30
>
> raw_12000 <- charToRaw("12000")
> raw_12000
> #31 *32* (comparison happens with the second symbol) 30 30 30
>
> The symbol in the second position is 30 and it's less than 32. Equals to
> true. Is this correct?
>
> *4) Comparing strings #2*
> "1040" <= "10000"
>
> raw_1040 <- charToRaw("1040")
> raw_1040
> #31 30 *34*  (comparison happens with third symbol) 30
>
> raw_10000 <- charToRaw("10000")
> raw_10000
> #31 30 *30*  (comparison happens with third symbol) 30 30
>
> The symbol in the third position is 34 is greater than 30. Equals to false.
> Is this correct?
>
> *5) Problem - Why does this equal FALSE?*
> *"A" < "a"*
>
> 41 < 61 # FALSE?
>
> # Raw value for "A" = 41
> A_raw <- charToRaw("A")
> A_raw
>
> # Raw value for "a" = 61
> a_raw <- charToRaw("a")
> a_raw
>
> Why is capitalized "A" not less than lowercase "a"? Based on raw values it
> should be. What am I missing here?
>
> Thanks
> Kristjan


