y with diaeresis (PR#143)

guido@hal.stat.unipd.it
Sun, 14 Mar 1999 11:09:32 +0100


[Not important for the message below: at the moment the link from CRAN
 to the 'R Bug Tracking system (http://blueberry.kubism.ku.dk/R)'
 is down (at least, I get a "The requested URL /R was not
 found on this server." message)]

To display this message properly, your mailer must handle the
latin1 encoding correctly. With other encodings, I suppose that the role
played by 'y diaeresis' would be played by the character
with code 255.

Consider the following example:
----------------------------------------------------------------------------
[~/tmp]% uname -a
Linux tui 2.0.34 #1 Thu Jan 14 18:56:24 CET 1999 i686 unknown
[~/tmp]% cat a.R
x <- "12ÀÁÿ"
cat(x,"\n")
[~/tmp]% ../src/R-release/bin/R --vanilla < a.R

R : Copyright 1999, The R Development Core Team
Version 0.63.3 in progress (February 25, 1999)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type    "?license" or "?licence" for distribution details.

R is a collaborative project with many contributors.
Type    "?contributors" for a list.

Type    "demo()" for some demos, "help()" for on-line help, or
        "help.start()" for a HTML browser interface to help.
Type    "q()" to quit R.

> x <- "12ÀÁÿ"
+ cat(x,"\n")
+ 
[~/tmp]% 
----------------------------------------------------------------------------
Look at the 'continuation' prompt. It seems that the first line isn't
parsed to the end.
[Note: I have also tested this with the final 0.63.3; it is simply
not installed on the machine I am writing from at the moment.]

The problem is caused by 'y diaeresis'.

This is from ESS (without quoting the 'y diaeresis'; to insert it in
an emacs buffer I have to type ^Q377)
----------------------------------------------------------------------------
> ÿ

Save workspace image? [y/n/c]:
----------------------------------------------------------------------------
i.e., 'y diaeresis' seems to be interpreted as an EOF.

I don't know what the right fix is, but the problem seems to be due to the
conversion from int to char (made implicitly by R_ReadConsole)
and then back to int in src/main/gram.y (buffer_getc). The result
is that the 'if (c==EOF)' test in that function is true when c is 'y diaeresis'.

Indeed:
----------------------------------------------------------------------------
[~/tmp]% cat a.c
#include <stdio.h>

int main(int argc,char **argv) {
  int i,j;
  char c;
  while ((i=fgetc(stdin))!=EOF) {
    c = i;
    j = c;
    if (c=='\n') printf("\n");
    else  printf("/%d/%d/%d/%c/%d ",i,j,c,c,j==EOF);
  }
}
[~/tmp]% gcc -o a a.c
[~/tmp]% ./a < a.R
/120/120/120/x/0 /32/32/32/ /0 /60/60/60/</0 /45/45/45/-/0 /32/32/32/ /0 /34/34/34/"/0 /49/49/49/1/0 /50/50/50/2/0 /192/-64/-64/À/0 /193/-63/-63/Á/0 /255/-1/-1/ÿ/1 /34/34/34/"/0 
/99/99/99/c/0 /97/97/97/a/0 /116/116/116/t/0 /40/40/40/(/0 /120/120/120/x/0 /44/44/44/,/0 /34/34/34/"/0 /92/92/92/\/0 /110/110/110/n/0 /34/34/34/"/0 /41/41/41/)/0 
------------------------------------------------------------------------------
Look at the sequence including 'y diaeresis'. It shows that after the
double conversion 'y diaeresis'==EOF is TRUE: the byte 255 becomes -1
once it is stored in a char, and -1 is the value of EOF here.
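
One possible way around the double conversion, given only as a sketch: go
through unsigned char when converting the stored char back to int, so that
byte 255 stays 255 and can no longer be mistaken for EOF (which is negative).
The function names below are hypothetical and are not taken from the R
sources; I have not checked how buffer_getc would best be changed.

```c
#include <stdio.h>

/* Hypothetical helper: reproduces the int -> char -> int round trip
 * described above. Where plain char is signed, byte 255 is expected
 * to come back as -1, which collides with the usual value of EOF. */
int roundtrip_plain(int i)
{
    char c = (char) i;   /* implicit narrowing, as when filling a char buffer */
    return c;            /* widened back to int, sign-extended if char is signed */
}

/* Hypothetical helper: same round trip, but the char is read back
 * through unsigned char, so the result is always in 0..UCHAR_MAX and
 * therefore never equal to EOF. */
int roundtrip_unsigned(int i)
{
    char c = (char) i;
    return (unsigned char) c;
}
```

Calling roundtrip_unsigned(255) gives 255, so a test such as 'c==EOF' stays
false for 'y diaeresis', while a real end of input can still be signalled
separately with EOF. On a machine where plain char is signed,
roundtrip_plain(255) shows the behaviour reported above.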

g.




-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._