[R] using an array of strings with strsplit, issue when including a space in split criteria

Tony Breyal tony.breyal at googlemail.com
Tue Sep 8 10:47:45 CEST 2009


After further investigation it appears that the problem is specific to
my Vista PC. I am able to get the correct results using R 2.9.2 on a
Windows XP 64-bit machine. However, I do not know why this does not work
on my Vista PC. The following was done after rebooting Vista.

From CMD.exe I ran the following line:
C:\Program Files\R\R-2.9.2\bin>Rgui --vanilla

This opened up R.

### R 2.9.2 START ###
> txt <- c("sales to 23 August 2008 published 29 August",
+ "sales to 6 September 2008 published 11 September")
>
> strsplit(txt, 'published', fixed=TRUE)
[[1]]
[1] "sales to 23 August 2008 " " 29 August"

[[2]]
[1] "sales to 6 September 2008 " " 11 September"

> strsplit(txt, 'published ', fixed=TRUE)
[[1]]
[1] "sales to 23 August 2008 " "29 August"

[[2]]
[1] "sales to 6 September 2008 published 11 September"

> sessionInfo()
R version 2.9.2 (2009-08-24)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

### R 2.9.2 END ###
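
One way to narrow this down, offered as a sketch rather than a confirmed
diagnosis, is to look at the raw bytes that follow 'published' in each
element. A plain ASCII space shows up as 20, whereas a non-breaking space
in Windows-1252 would show up as a0 and would not match the ' ' in the
split pattern. The helper bytes_after() below is hypothetical (it is not
part of base R), and the strings are retyped here with plain spaces, so on
the affected machine the check should be run on the actual txt object.

```r
# Same strings as in the session above, retyped with plain ASCII spaces.
txt <- c("sales to 23 August 2008 published 29 August",
         "sales to 6 September 2008 published 11 September")

# Hypothetical helper (not part of base R): return the n raw bytes that
# immediately follow the first occurrence of `word` in each string.
bytes_after <- function(x, word, n = 1L) {
  # Byte position of the first literal match, or -1 if there is none.
  pos <- regexpr(word, x, fixed = TRUE, useBytes = TRUE)
  lapply(seq_along(x), function(i) {
    if (pos[i] < 0) return(raw(0))
    b <- charToRaw(x[i])
    start <- pos[i] + nchar(word, type = "bytes")
    b[seq(start, length.out = n)]
  })
}

# One raw vector per element of txt; compare them against as.raw(0x20).
after <- bytes_after(txt, "published")
print(after)

# Declared encoding of each element, for comparison between machines.
print(Encoding(txt))
```

If the byte after 'published' differs between the two elements on the
Vista machine, the input text rather than strsplit itself would be the
first thing to look at.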



The exact same thing happened when I used R 2.9.0 and R 2.8.1 on this
same Vista computer.


### R 2.9.0 ###
> sessionInfo()
R version 2.9.0 (2009-04-17)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base

other attached packages:
[1] rcom_2.1-3     rscproxy_1.3-1

loaded via a namespace (and not attached):
[1] tools_2.9.0

### R 2.8.1 ###
> sessionInfo()
R version 2.8.1 (2008-12-22)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base
>



My computer details are:
Windows Vista Ultimate
Service Pack 1
Manufacturer: Dell
Rating: 3.4
Processor: Intel Core 2 Duo CPU E6750 @ 2.66 GHz
Memory (RAM): 4.00 GB
System type: 32-bit Operating System




2009/9/8 Gabor Grothendieck <ggrothendieck at gmail.com>:
> I am using the exact same version of R as you also on Vista
> but can't reproduce your result.  For me it splits properly.
>
> Try starting R like this (modify path if needed) from the
> Windows cmd line:
>
> \Program Files\R\R-2.9.2\bin\Rgui --vanilla
>
> and then try it.
>
> On Mon, Sep 7, 2009 at 11:40 AM, Tony Breyal<tony.breyal at googlemail.com> wrote:
>> Dear all,
>>
>> I'm having a problem understanding why a split does not occur in
>> the 2nd use of the function strsplit below:
>>
>> # text strings
>>> txt <- c("sales to 23 August 2008 published 29 August",
>> + "sales to 6 September 2008 published 11 September")
>>
>> # first use
>>> strsplit(txt, 'published', fixed=TRUE)
>> [[1]]
>> [1] "sales to 23 August 2008 " " 29 August"
>>
>> [[2]]
>> [1] "sales to 6 September 2008 " " 11 September"
>>
>> # second use, but with a space ' ' in the split
>>> strsplit(txt, 'published ', fixed=TRUE)
>> [[1]]
>> [1] "sales to 23 August 2008 " "29 August"
>>
>> [[2]]
>> [1] "sales to 6 September 2008 published 11 September"
>>
>> Thank you kindly for any help in advance.
>> Tony
>>
>> O/S: Win Vista Ultimate
>>> sessionInfo()
>> R version 2.9.2 (2009-08-24)
>> i386-pc-mingw32
>>
>> locale:
>> LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
>> Kingdom.1252;LC_MONETARY=English_United
>> Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>>
>> attached base packages:
>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>
>> other attached packages:
>> [1] RODBC_1.3-0
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>



-- 
Tony Breyal



