[R] Suprising behavior of paste or cat?

Duncan Murdoch murdoch at stats.uwo.ca
Thu Feb 11 19:55:25 CET 2010


I don't think you have said how you are examining the output files.  Is 
it possible that your text editor is assuming that the files are UCS-2 
(Unicode), even though R is writing ASCII?
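
One way to check this without relying on any editor is to read the file
back as raw bytes.  The following is only a minimal sketch: the temporary
file and the short sample string are made up for illustration, and it
assumes nothing beyond base R.

```r
# Write a short comma-terminated string, then inspect the bytes on disk.
# The path and the data are hypothetical stand-ins for the real files.
path <- tempfile(fileext = ".txt")
res  <- paste(paste(c(1, 2, 2, 1), collapse = ","), ",", sep = "")
cat(res, file = path)

# Read back every byte exactly as it was stored.
bytes <- readBin(path, what = "raw", n = file.info(path)$size)
print(bytes)

# Plain ASCII output has one byte per character and no byte above 0x7f.
is_ascii <- all(as.integer(bytes) < 128L)
print(is_ascii)
```

If the bytes alternate between digit codes (31, 32) and the comma code
(2c), R wrote ASCII and the odd character comes from whatever program is
displaying the file.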

Duncan Murdoch

On 11/02/2010 1:44 PM, Russell Pierce wrote:
> Thank you for your input so far r-help denizens.  Neither David nor
> Peter were able to replicate my result.  Has anybody other than me
> been able to generate the failure I'm describing?  So far I've
> experienced it on 3 machines, Windows XP/P4/2.1.10, Windows
> XP/Atom/2.1.10/2.1.11(release), Windows Vista/Centrino/2.1.10, but
> found no problem on linux/2.7.1/x86_64.
>
> Bill's idea is interesting. There may be a mismatch between types
> occurring somewhere, but I haven't found exactly where yet.  To test out his
> idea, I tried changing the order of the values in my vector "task" so
> my output would start off with "2," rather than "1,".  But I did not
> observe a change in behavior.
>
> I've generated further sample code to demonstrate the idiosyncrasy
> of what I'm observing.
>
> This code segment does not create a failure.
> #No failure
> lastcomma <- function(x) {return(paste(x,",",collapse="",sep=""))}
> h.long <- 150
> task1 <- c(rep(1,h.long),rep(2,h.long))
> task2<- c(rep(2,h.long),rep(1,h.long))
> res1 <- lastcomma(paste(task1,collapse=","))
> res2 <- lastcomma(paste(task2,collapse=","))
> write(file="write-okay1.txt",res1)
> cat(file="cat-okay2.txt",res2)
>
> This code segment, where the task vector is reordered using sample as
> an index, creates invalid files.
> #Failure of write and cat
> ord <- sample(1:(h.long*2))
> task1  <- task1[ord]
> task2  <- task2[ord]
> res1.bad <- lastcomma(paste(task1,collapse=","))
> res2.bad <- lastcomma(paste(task2,collapse=","))
> write(file="write-bad1.txt",res1.bad)
> cat(file="cat-bad2.txt",res2.bad)
>
> This code segment, where the task vector is shorter and reordered,
> creates invalid files with cat, but not with write, and only when task
> has been passed through my lastcomma function.
> #Inconsistent; cat fails but write does not, cat only fails when
> #string has been passed through lastcomma
> h.long <- 100
> task1 <- c(rep(1,h.long),rep(2,h.long))
> task2<- c(rep(2,h.long),rep(1,h.long))
> ord <- sample(1:(h.long*2))
> task1  <- task1[ord]
> task2  <- task2[ord]
> res1.no.lastcomma <- paste(task1,collapse=",")
> res2.no.lastcomma <- paste(task2,collapse=",")
> res1.yes.lastcomma <- lastcomma(res1.no.lastcomma)
> res2.yes.lastcomma <- lastcomma(res2.no.lastcomma)
> write(file="write-1-nlc.txt",res1.no.lastcomma) #okay
> write(file="write-2-nlc.txt",res2.no.lastcomma) #okay
> cat(file="cat-1-nlc.txt",res1.no.lastcomma) #okay
> cat(file="cat-2-nlc.txt",res2.no.lastcomma) #okay
> write(file="write-1-lc.txt",res1.yes.lastcomma) #okay
> write(file="write-2-lc.txt",res2.yes.lastcomma) #okay
> cat(file="cat-1-lc.txt",res1.yes.lastcomma) #bad
> cat(file="cat-2-lc.txt",res2.yes.lastcomma) #bad
>
> Thanks,
>
> Russell
>
> On Thu, Feb 11, 2010 at 9:05 AM, William Dunlap <wdunlap at tibco.com> wrote:
> >
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> -----Original Message-----
> >> From: r-help-bounces at r-project.org
> >> [mailto:r-help-bounces at r-project.org] On Behalf Of Russell Pierce
> >> Sent: Wednesday, February 10, 2010 9:21 PM
> >> To: r-help at r-project.org
> >> Subject: [R] Suprising behavior of paste or cat?
> >>
> >> I may be making a simple error, but I've looked at the str() of the
> >> resulting objects and I can't see any obvious reason I'm having the
> >> problem I am having, so I am reaching out to the R-help group.  I am
> >> generating a string in my code.  When I make a slight modification
> >> (add a comma at the end using my "lastcomma" function), I can no
> >> longer successfully write that string to a file.  Specifically, the
> >> resulting file contains only the "ⰱ" character.
> >
> > That character (which prints as an unfilled square when
> > I look at it in Outlook) is (when I copy and paste it
> > to R 2.10.0 on Windows):
> >   > "ⰱ"
> >   [1] "\u2c31"
> > The 2 bytes in it would be comma and one in ascii:
> >   > "\x2c"
> >   [1] ","
> >   > "\x31"
> >   [1] "1"
> > It looks like an ASCII/UTF-8 mismatch.  Is the square Outlook's
> > way of saying it is illegal UTF-8?
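
The byte arithmetic above can be reproduced in R.  This is a sketch under
an assumption the thread does not confirm: that the viewing program
decoded the file as little-endian UTF-16, in which case the two ASCII
bytes for "1," at the start of the file form a single 16-bit code unit.

```r
# Bytes of the first two characters of a file that starts with "1,".
first_two <- charToRaw("1,")
print(first_two)

# Decode the same two bytes as one UTF-16 code unit.
# Assumption: little-endian byte order, so the first byte is the low byte.
misread <- iconv(list(first_two), from = "UTF-16LE", to = "UTF-8")

# Code point of the resulting single character, shown in hexadecimal.
code_point <- utf8ToInt(misread)
print(as.hexmode(code_point))
```

Under that assumption the pair decodes to U+2C31, the character reported
in the original post.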
> >
> > Bill Dunlap
> > Spotfire, TIBCO Software
> > wdunlap tibco.com
> >
> >> This occurs in:
> >> R version 2.10.0 (2009-10-26) & R version 2.10.1 (2009-12-14)
> >> i386-pc-mingw32
> >> locale:
> >> [1] LC_COLLATE=English_United States.1252
> >> [2] LC_CTYPE=English_United States.1252
> >> [3] LC_MONETARY=English_United States.1252
> >> [4] LC_NUMERIC=C
> >> [5] LC_TIME=English_United States.1252
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >> but not in...
> >> R version 2.7.1 (2008-06-23)
> >> x86_64-pc-linux-gnu
> >> locale:
> >> LC_CTYPE=en_US.UTF-8;LC_NUMERIC=C;LC_TIME=en_US.UTF-8;
> >> LC_COLLATE=en_US.UTF-8;LC_MONETARY=C;LC_MESSAGES=en_US.UTF-8;
> >> LC_PAPER=en_US.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;
> >> LC_MEASUREMENT=en_US.UTF-8;LC_IDENTIFICATION=C
> >> attached base packages:
> >> [1] stats     graphics  grDevices utils     datasets  methods   base
> >> Sample code:
> >> h.long <- 150
> >> task <- c(rep(1,h.long),rep(2,h.long))
> >> ord <- sample(1:length(task))
> >> task <- task[ord]
> >> taskout <- paste(task,collapse=",")
> >> write(file="please.txt",taskout)
> >> lastcomma <- function(x) {return(paste(x,",",collapse="",sep=""))}
> >> res <- lastcomma(taskout)
> >> write(file="fail.txt",res)
> >> cat(file="catfail.txt",res)
> >>
> >> Any ideas as to how to avoid this problem would be appreciated as well
> >> as suggestions as to whether this is expected behavior, or whether it
> >> ought to be reported as a bug.
> >>
> >> Best,
> >>
> >> Russell Pierce
> >>
> >> ______________________________________________
> >> R-help at r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
> >>
> >
>


