source() crashes on long lines (PR#1900)

ripley@stats.ox.ac.uk ripley@stats.ox.ac.uk
Wed, 14 Aug 2002 10:30:49 +0200 (MET DST)



------=_NextPart_000_0006_01C2432C.7ACD6BA0
Content-Type: TEXT/PLAIN; CHARSET=iso-8859-1
Content-Transfer-Encoding: 8BIT
Content-ID: <Pine.GSO.4.44.0208140920312.15226@auk.stats>

This is a separate bug, reproducible on Solaris, so I am re-filing it as a
new report.

There may well need to be a line-width limit, but exceeding it should be
caught rather than crash R.

R --vanilla < TooLongLines.R does work.

R-bugs mangles attachments, but my mailer does not like very long lines.

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

---------- Forwarded message ----------
Date: Wed, 14 Aug 2002 00:49:43 +0200
From: Henrik Bengtsson <hb@maths.lth.se>
To: ripley@stats.ox.ac.uk
Cc: r-devel@stat.math.ethz.ch, R-bugs@biostat.ku.dk
Subject: RE: R CMD check: Too long [R] code line generated (PR#1900)

I did some error tracking myself. I verified that the temporary [R] file
generated by Perl was correct. I narrowed the problem down to the
redirection of standard input, i.e. <, into [R], and I also found that I
could not reproduce the problem if I source()'d the same file. Then I thought
it might be my shell (Cygwin/bash), but no! Doing

 % cat < R_CMD_check.out

is just fine, but when doing

  R --vanilla --quiet < R_CMD_check.out

I run into problems. It looks like every line that is long enough will get
some garbage bytes inserted at column 1023. I have attached the
R_CMD_check.out file with comments for you. As you see, it is not what R CMD
check has generated, but as the problem is no longer with R CMD check itself,
I removed the irrelevant parts.
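
A minimal sketch, assuming a POSIX shell with awk, of how one might list the
lines of a file that reach the column where the garbage appears. The function
name long_lines and its default limit of 1022 are illustrative choices, not
anything taken from R or from R CMD check.

```shell
# long_lines FILE [LIMIT]
# Print "line_number:length" for every line of FILE that is longer than
# LIMIT characters. The default of 1022 (an assumption based on the
# symptom described above) reports every line that reaches column 1023.
long_lines() {
    awk -v limit="${2:-1022}" \
        'length($0) > limit { printf "%d:%d\n", NR, length($0) }' "$1"
}
```

For example, "long_lines R_CMD_check.out" would list the candidate lines; an
empty result means that no line reaches that column.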

Maybe related to this is another problem that makes source() crash if I have
a too long line (>= 2128 characters) inside a function definition. The
attached R script TooLongLines.R is supposed to illustrate this. It also
contains Dr. Mingw's report.
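
A test file of this kind can also be generated rather than written by hand.
The following is only a sketch, assuming a POSIX shell; make_long_line_script
is a hypothetical helper name, and the generated string is plain filler
rather than the position-marked string used in TooLongLines.R.

```shell
# make_long_line_script N
# Write to standard output a small R script in which the body of a
# function contains a single string-literal line of exactly N characters
# (the two double quotes are counted). N must be at least 2.
make_long_line_script() {
    width=$(( $1 - 2 ))
    # Pad an empty string to the wanted width, then turn the padding into digits.
    filler=$(printf "%${width}s" '' | tr ' ' '1')
    printf 'myFcn <- function() {\n"%s"\n}\n' "$filler"
}
```

Running it once with 2127 and once with 2128, redirecting each result to a
file and calling source() on both, would show whether a given build has the
same threshold. The sketch only produces the files and cannot tell how any
particular version of R will react to them.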

FYI: I'll be out of the office from Wednesday noon-Sunday.

Henrik Bengtsson

> -----Original Message-----
> From: ripley@stats.ox.ac.uk [mailto:ripley@stats.ox.ac.uk]
> Sent: Tuesday, August 13, 2002 11:43 AM
> To: hb@maths.lth.se
> Cc: r-devel@stat.math.ethz.ch; R-bugs@biostat.ku.dk
> Subject: Re: R CMD check: Too long [R] code line generated (PR#1900)
>
>
> On Tue, 13 Aug 2002 hb@maths.lth.se wrote:
>
> > Full_Name: Henrik Bengtsson
> > Version: 1.5.1
> > OS: WinMe
> > Submission from: (NULL) (217.210.0.243)
> >
> >
> > In the Perl script $R_HOME/bin/check there is a bug under the section
> > "Check R code for syntax errors" where the 'Rfiles <- c(...)' is built
> > up. If there are too many files in @Rfiles the source code line
> > generated will be too long and weird things will happen, e.g. strange
> > bytes/characters will be inserted.
>
> > SUGGESTION/SOLUTION:
> > Add a new line character after each "join";
> >
> >  $Rcmd .= join("\",\n \"", @Rfiles) . "\")\n";
> >
> > instead of
> >
> >  $Rcmd .= join("\", \"", @Rfiles) . "\")\n";
> >
> >
> > SUGGESTION II:
> > I ran into similar problems in other situations where I autogenerated
> > too long source lines. The errors were hard to reproduce and took time
> > to find. It would be nice if the [R] parser, or some previous engine
> > that reads the source file, could give an explicit error saying that
> > the "line was too long".
> >
> > EXAMPLE:
> > In my case '@Rfiles' in Perl was:
> > "R/LogBook.R R/LogEntry.R R/GenePixData.R R/Layout.R R/ImaGeneData.R
> > R/MicroarrayData.PLOT.R R/QualityData.R R/MicroarrayData.R R/MAData.R
> > R/MAData.NORM.R R/zzz.R R/Replicates.R R/SMA.R R/FieldFilter.R
> > R/Filter.R R/QuantArrayData.R R/RGData.R R/RawData.R R/000.R
> > R/MicroarrayData.IO.R R/LayoutGroups.R R/BFilter.R R/com.braju.sma.R
> > R/Copy of TFilter.R R/Copy of SSMatrix.R R/private.utils.R
> > R/WangetalData.R R/KHessFilter.R R/DfFilter.R R/SpotData.R
> > R/ScanAlyzeData.R R/BMAData.R R/AFilter.R R/MFilter.R
> > R/Layout.obsolete.R R/AndFilter.R R/OrFilter.R R/SEFilter.R R/TFilter.R
> > R/AcceptFilter.R R/SpotPosition.R R/SerialFilter.R R/GSRArray.R
> > R/MultiwayArray.R R/KHessData.R R/GeneAcceptFilter.R R/GeneData.R
> > R/GeneRejectFilter.R R/TMAData.R R/NotFilter.R R/ParallelFilter.R
> > R/RejectFilter.R R/ReplicateOutlierFilter.R R/WorkData.R R/Matrix.R
> > R/MicroarrayData.LOG.R R/MicroarrayData.NORM.R R/SSMatrix.R"
> >
> > but in [R] 'Rfiles' became:
> >  [1] "R/LogBook.R"                   "R/LogEntry.R"
> >  [3] "R/GenePixData.R"               "R/Layout.R"
> >  [5] "R/ImaGeneData.R"               "R/MicroarrayData.PLOT.R"
> >  [7] "R/QualityData.R"               "R/MicroarrayData.R"
> >  [9] "R/MAData.R"                    "R/MAData.NORM.R"
> > [11] "R/zzz.R"                       "R/Replicates.R"
> > [13] "R/SMA.R"                       "R/FieldFilter.R"
> > [15] "R/Filter.R"                    "R/QuantArrayData.R"
> > [17] "R/RGData.R"                    "R/RawData.R"
> > [19] "R/000.R"                       "R/MicroarrayData.IO.R"
> > [21] "R/LayoutGroups.R"              "R/BFilter.R"
> > [23] "R/com.braju.sma.R"             "R/Copy of TFilter.R"
> > [25] "R/Copy of SSMatrix.R"          "R/private.utils.R"
> > [27] "R/WangetalData.R"              "R/KHessFilter.R"
> > [29] "R/DfFilter.R"                  "R/SpotData.R"
> > [31] "R/ScanAlyzeData.R"             "R/BMAData.R"
> > [33] "R/AFilter.R"                   "R/MFilter.R"
> > [35] "R/Layout.obsolete.R"           "R/AndFilter.R"
> > [37] "R/OrFilter.R"                  "R/SEFilter.R"
> > [39] "R/TFilter.R"                   "R/AcceptFilter.R"
> > [41] "R/SpotPosition.R"              "R/SerialFilter.R"
> > [43] "R/GSRArray.R"                  "R/MultiwayArray.R"
> > [45] "R/KHessData.R"                 "R/GeneAcceptFilter.R"
> > [47] "R/GeneData.R"                  "R/GeneRejectFilter.R"
> > [49] "R/TMAData.R"                   "R/NotFilter.R"
> > [51] "R/ParallelFilter.R"            "R/RejectFilter.R"
> > [53] "R/ReplicateOutlierFilter.R"    "R/WorkData.R"
> > [55] "R/Matrix.R"                    "R/Micr\b‘_ÜIoarrayData.LOG.R"
> > [57] "R/MicroarrayData.NORM.R"       "R/SSMatrix.R"
> >
> > Look at the 56th element. Trying to add or remove files, I found that
> > this was always happening at the same column in the Rfiles row and
> > inferred that this is due to the maximum line length in [R] source
> > files. Correct?
>
> There is no such line limit.  If I take your file list, convert it via the
> original join and read it into R, I don't get a problem, under Linux or
> Windows XP. I think you are hitting the limitations of your `operating
> system', not R.  But without a reproducible example it is hard to tell.
>
> Package survival has 92 files with file names at least as long as yours,
> and that works perfectly happily on Windows XP.
>
> Can you make available a sample package which exhibits the problem?
>
> Nevertheless, joining with "\n" should be a palliative measure.
>
> --
> Brian D. Ripley,                  ripley@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272860 (secr)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>
>
>
>
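
The palliative discussed in the quoted messages, joining the file names with
a newline instead of a plain space, can be illustrated without Perl. This is
a sketch only, assuming a POSIX shell with awk; rfiles_cmd and
max_line_length are hypothetical helper names and are not part of the check
script.

```shell
# rfiles_cmd SEP FILE...
# Print the R assignment  Rfiles <- c("FILE1"SEP"FILE2"...)  with SEP
# placed between the quoted file names.
rfiles_cmd() {
    sep=$1
    shift
    out='Rfiles <- c('
    first=yes
    for f in "$@"; do
        # Emit the separator before every name except the first one.
        [ "$first" = yes ] || out="$out$sep"
        first=no
        out="$out\"$f\""
    done
    printf '%s)\n' "$out"
}

# max_line_length: print the length of the longest line read from stdin.
max_line_length() {
    awk 'length($0) > max { max = length($0) } END { print max + 0 }'
}
```

Calling rfiles_cmd with ', ' as the separator gives the single-line form.
Passing a comma, a newline and a space gives the multi-line form. Piping
either result through max_line_length shows the longest generated line: with
the newline separator it no longer grows with the number of files, only with
the longest file name.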

------=_NextPart_000_0006_01C2432C.7ACD6BA0
Content-Type: APPLICATION/OCTET-STREAM; NAME="TooLongLines.R"
Content-Transfer-Encoding: QUOTED-PRINTABLE
Content-ID: <Pine.GSO.4.44.0208140920320.15226@auk.stats>
Content-Description: 
Content-Disposition: ATTACHMENT; FILENAME="TooLongLines.R"

# Doing source("TooLongLines.R") on this file and on [R] v1.5.1 on WinMe will
# make [R] crash. The "reason" is that there is a too long source line *inside*
# a function definition. However, declaring a character variable with a very
# very long string works. See more comments below.


myFcn <- function() {
# I have inserted Roman numerals (L = 50, C = 100, D = 500, M = 1000) in the string below to make it easier to find the position of each character.
# Comments or something else, e.g.
x <- 2
# does not matter, the problem is the next very long line (>=2128 characters) inside a function() definition:
"234567890123456789012345678901234567890123456789L12345678901234567890123=
45678901234567890123456789C1234567890123456789012345678901234567890123456=
789L1234567890123456789012345678901234567890123456789C1234567890123456789=
012345678901234567890123456789L123456789012345678901234567890123456789012=
3456789C1234567890123456789012345678901234567890123456789L123456789012345=
6789012345678901234567890123456789C12345678901234567890123456789012345678=
90123456789L1234567890123456789012345678901234567890123456789D12345678901=
23456789012345678901234567890123456789L1234567890123456789012345678901234=
567890123456789C1234567890123456789012345678901234567890123456789L1234567=
890123456789012345678901234567890123456789C123456789012345678901234567890=
1234567890123456789L1234567890123456789012345678901234567890123456789C123=
4567890123456789012345678901234567890123456789L12345678901234567890123456=
78901234567890123456789C1234567890123456789012345678901234567890123456789=
L1234567890123456789012345678901234567890123456789M1234567890123456789012=
345678901234567890123456789L123456789012345678901234567890123456789012345=
6789C1234567890123456789012345678901234567890123456789L123456789012345678=
9012345678901234567890123456789C12345678901234567890123456789012345678901=
23456789L1234567890123456789012345678901234567890123456789C12345678901234=
56789012345678901234567890123456789L1234567890123456789012345678901234567=
890123456789C1234567890123456789012345678901234567890123456789L1234567890=
123456789012345678901234567890123456789D123456789012345678901234567890123=
4567890123456789L1234567890123456789012345678901234567890123456789C123456=
7890123456789012345678901234567890123456789L12345678901234567890123456789=
01234567890123456789C1234567890123456789012345678901234567890123456789L12=
34567890123456789012345678901234567890123456789C1234567890123456789012345=
678901234567890123456789L123456789012345678901234567890123456789012345678=
9C1234567890123456789012345678901234567890123456789L123456789012345678901=
2345678901234567890123456789M12345678901234567890123456789012345678901234=
56789L1234567890123456789012345678901234567890123456789C12345678901234567=
8901234567"
# Comments or something else, e.g.
x <- 2
# does not matter. Any line shorter than 2128 works fine, i.e. removing
# the last '7' from the string above and source("TooLongLines.R") will
# be just fine.
} # myFcn()


# 'Rterm --vanilla' and 'RGUI.exe' ([R] v1.5.1 (2002-06-17) on WinMe) both
# give 'Signal 127' and Dr. Mingw reports:
#
#  RTERM.EXE caused an Access Violation at location 00486a03 in module R.DLL Writing to location 00486a95.
#
#  Registers:
#  eax=00486a95 ebx=0091e990 ecx=0091ecb0 edx=00484028 esi=00000000 edi=0000007b
#  eip=00486a03 esp=0091e8f4 ebp=0091e900 iopl=0         nv up ei pl nz na po cy
#  cs=018f  ss=0197  ds=0197  es=0197  fs=271f  gs=0000             efl=00010207
#
#  Call stack:
#  00486A03  R.DLL:00486A03  R_Parse
#  004873D1  R.DLL:004873D1  Rf_yyerror
#  00487933  R.DLL:00487933  Rf_isValidName
#  00487ECC  R.DLL:00487ECC  Rf_yylex
#  00484D71  R.DLL:00484D71  Rf_yyparse
#  004866DF  R.DLL:004866DF  Rf_yyparse
#  004869E1  R.DLL:004869E1  R_Parse
#  00486B66  R.DLL:00486B66  R_ParseConn
#  004F6AC3  R.DLL:004F6AC3  do_parse
#  004AA5A7  R.DLL:004AA5A7  do_internal
#  0047F494  R.DLL:0047F494  Rf_eval
#  00480808  R.DLL:00480808  do_begin
#  0047F494  R.DLL:0047F494  Rf_eval
#  0047FAA6  R.DLL:0047FAA6  Rf_applyClosure
#  0047F6DD  R.DLL:0047F6DD  Rf_eval
#  00481065  R.DLL:00481065  do_set
#  0047F494  R.DLL:0047F494  Rf_eval
#  004812D9  R.DLL:004812D9  Rf_evalList
#  0047F5E2  R.DLL:0047F5E2  Rf_eval
#  00481065  R.DLL:00481065  do_set
#  0047F494  R.DLL:0047F494  Rf_eval
#  00480808  R.DLL:00480808  do_begin
#  0047F494  R.DLL:0047F494  Rf_eval
#  0047FAA6  R.DLL:0047FAA6  Rf_applyClosure
#  0047F6DD  R.DLL:0047F6DD  Rf_eval
#  0049D325  R.DLL:0049D325  R_PromptString
#  0049DC74  R.DLL:0049DC74  run_Rmainloop
#  0049DC8C  R.DLL:0049DC8C  Rf_mainloop
#  0040153D  RTERM.EXE:0040153D
#  0040139C  RTERM.EXE:0040139C
#  0040166D  RTERM.EXE:0040166D
#  004010F4  RTERM.EXE:004010F4
#  004011EF  RTERM.EXE:004011EF
#  BFF7B9E4  KERNEL32.DLL:BFF7B9E4  IsDBCSLeadByte
#  BFF7B896  KERNEL32.DLL:BFF7B896  IsDBCSLeadByte
#  BFF7A24F  KERNEL32.DLL:BFF7A24F  MakeCriticalSectionGlobal







------=_NextPart_000_0006_01C2432C.7ACD6BA0--
