[R] Argh! Trouble using string data read from a file

Prof Brian Ripley ripley at stats.ox.ac.uk
Thu Oct 16 06:20:27 CEST 2008


On Wed, 15 Oct 2008, Ted Byers wrote:

> Thanks Jim,
>
> I hadn't seen the distinction between the commandline in RGui and what
> happens within my code.
>
> I have, however, seen other differences I don't understand.  For
> example, looking at the documentation for Rscript, I see:
>
> Rscript [options] [-e expression] file [args]
>
> And the example:
>
> Rscript -e 'date()' -e 'format(Sys.time(), "%a %b %d %X %Y")'
>
>
> So I tried it (Windows XP; R 2.7.2), and this is what I got by
> copying directly from the documentation and pasting into the Windows
> command-line window:

Your problem is the shell quoting: the Windows shell requires double
quotes ("), with any embedded double quotes escaped as \".  E.g.

C:\> d:/R/R-2.7.2/bin/Rscript -e "date()" -e "format(Sys.time(), \"%a %b %d %X %Y\")"
[1] "Thu Oct 16 05:16:46 2008"
[1] "Thu Oct 16 05:16:46 2008"

Other shells (e.g. bash, tcsh) do allow single quotes (''), and indeed
that is the preferred form there.  See ?shQuote.
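
As a rough illustration of what ?shQuote does (a sketch only; the
variable names are made up for this example, and it assumes the "sh" and
"cmd" quoting types described on that help page), the same R expression
can be quoted for either kind of shell:

```r
# The R expression we want to pass to Rscript -e
expr <- 'format(Sys.time(), "%a %b %d %X %Y")'

# POSIX-style shells (sh, bash): wrap the whole thing in single quotes
sh_quoted <- shQuote(expr, type = "sh")

# Windows command interpreter: wrap in double quotes and escape the
# embedded double quotes with a backslash
cmd_quoted <- shQuote(expr, type = "cmd")

cat("sh :", "Rscript -e", sh_quoted, "\n")
cat("cmd:", "Rscript -e", cmd_quoted, "\n")
```

The "cmd" line has the same shape as the command shown above; the "sh"
line is the form given in the Rscript documentation.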

>
> C:\>Rscript -e 'date()' -e 'format(Sys.time(), "%a %b %d %X %Y")'
> [1] "date()"
>
> C:\>Rscript -e 'format(Sys.time(), "%a %b %d %X %Y")'
>
> C:\>
>
> But within RGui, I get:
>
>> date();format(Sys.time(), "%a %b %d %X %Y")
> [1] "Wed Oct 15 20:36:57 2008"
> [1] "Wed Oct 15 8:36:57 PM 2008"
>>
>
> Thanks again
>
> Ted
>
> On Wed, Oct 15, 2008 at 8:09 PM, jim holtman <jholtman at gmail.com> wrote:
>> You have to explicitly 'print' the value of x in the loop:    print(x)
>>
>> 'x' by itself is just its value.  At the command line, typing an
>> object's name is equivalent to printing that object, but that only
>> happens at the command line.  If you want a value printed, then 'print'
>> it.  That also works at the command line if you want to use it there.
>>
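
A minimal sketch of the point about printing inside a loop (the vector
and variable names here are invented for illustration, not taken from
the session quoted below):

```r
# At top level a bare name is auto-printed; inside a loop body it is not.
x <- c(10, 20, 30)

# This loop shows nothing: 'v' is evaluated but never printed.
silent <- capture.output(for (v in x) { v })

# With an explicit print() there is one line of output per iteration.
shown <- capture.output(for (v in x) { print(v) })

length(silent)  # number of output lines captured from the first loop
length(shown)   # number of output lines captured from the second loop

# A variable assigned inside a loop is still visible afterwards; it
# simply holds whatever the last iteration assigned to it.
last <- NULL
for (v in x) { last <- v }
last
```

So there is no C++-style block scoping involved: after a loop that
overwrites x on every pass, x just contains the data from the final
file.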
>> On Wed, Oct 15, 2008 at 5:36 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
>>> Actually, I'd tried single brackets first.  Here is what I got:
>>>
>>>> for (i in 1:length(V4) ) { x = read.csv(V4[i], header = FALSE, na.strings="");x }
>>> Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>  'file' must be a character string or connection
>>>>
>>>
>>>
>>> the advice to use as.character worked, in that progress has been made.
>>>
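
For what it is worth, as.character most likely helped because V4 was
read in as a factor (note the "60 Levels" output further down), and
neither [ ] nor [[ ]] changes that.  A minimal sketch, assuming made-up
file names written to a temporary directory rather than the real data:

```r
# Hypothetical data: two small files in a temporary directory.
tmp <- tempdir()
files <- c("demo.1.dat", "demo.2.dat")
writeLines(c("0", "21", "7"), file.path(tmp, files[1]))
writeLines(c("5", "6"), file.path(tmp, files[2]))

# The column of file names stored as a factor, as in the quoted session.
V4 <- factor(files)
is.character(V4[1])    # FALSE: single brackets still give a factor
is.character(V4[[1]])  # FALSE: so do double brackets

# Convert once, then keep every file's data instead of overwriting x.
paths <- file.path(tmp, as.character(V4))
results <- lapply(paths, read.csv, header = FALSE, na.strings = "")
names(results) <- as.character(V4)

# Report on each file explicitly, since nothing auto-prints in a loop.
for (name in names(results)) {
  cat(name, ":", nrow(results[[name]]), "rows\n")
}
```

Keeping the results in a list is only one option; the point is that
the file argument has to be a character string, not a factor.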
>>> Can you guys explain the following output, though?
>>>
>>>> setwd("K:\\MerchantData\\RiskModel\\AutomatedRiskModel")
>>>> for (i in 1:length(V4) ) { x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="");x }
>>>> x
>>>  V1
>>> 1  0
>>>> x = read.csv(as.character(V4[[1]]), header = FALSE, na.strings="");x
>>>     V1
>>> 1     0
>>> 2     0
>>> 3    21
>>> 4     0
>>> 5     1
>>> 6     7
>>> 7    51
>>> 8    20
>>> 9     3
>>> 10    5
>>> 11    6
>>> 12    8
>>> 13    2
>>> 14    0
>>> 15    2
>>> 16    4
>>> 17   23
>>>
>>> Clearly, if I hand write a line to read the data, getting the file
>>> name from V4 (in this case V4[[1]]), I get the data into 'x', which I
>>> can then display.  I only displayed the first few as some of these
>>> files will have thousands of values.
>>>
>>> But what puzzles me is that I saw virtually no output from my loop.  I
>>> thought what would happen (with the x after the ';') is that the
>>> contents of each file would be displayed after it is read and before
>>> the next is read.  And after the loop finishes, there is nothing in
>>> x.  I don't see why the contents of x would disappear after the loop,
>>> unless R has scoping restrictions as stringent as, say, C++ (e.g. a
>>> variable declared inside a loop is not visible outside the loop).  But
>>> that would beg the question as to how to declare a variable before it
>>> is first used.
>>>
>>> This doesn't bode well for me, or perhaps my ability to learn a new
>>> trick at my age, when such a simple loop should give me such trouble.
>>> :-(
>>>
>>> Getting more grey hair by the minute.  :-(
>>>
>>> Thanks
>>>
>>> ted
>>>
>>> On Wed, Oct 15, 2008 at 5:12 PM, Rolf Turner <r.turner at auckland.ac.nz> wrote:
>>>>
>>>> On 16/10/2008, at 10:03 AM, jim holtman wrote:
>>>>
>>>>> try putting as.character in the call:
>>>>>
>>>>> x = read.csv(as.character(V4[[i]]), header = FALSE, na.strings="")
>>>>
>>>> No.  This won't help.  V4 is a column of the data frame optdata,
>>>> and hence is a vector.  Not a list!  Use single brackets --- V4[i] ---
>>>> and all will be well.
>>>>
>>>>        cheers,
>>>>
>>>>                Rolf
>>>>>
>>>>> On Wed, Oct 15, 2008 at 4:46 PM, Ted Byers <r.ted.byers at gmail.com> wrote:
>>>>>>
>>>>>> Here is what I tried:
>>>>>>
>>>>>> optdata =
>>>>>> read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
>>>>>> header = FALSE, na.strings="")
>>>>>> optdata
>>>>>> attach(optdata)
>>>>>> for (i in 1:length(V4) ) { x = read.csv(V4[[i]], header = FALSE,
>>>>>> na.strings="");x }
>>>>>>
>>>>>> And here  is the outcome (just a few of the 60 records successfully
>>>>>> read):
>>>>>>>
>>>>>>> optdata =
>>>>>>>
>>>>>>> read.csv("K:\\MerchantData\\RiskModel\\AutomatedRiskModel\\soptions.dat",
>>>>>>> header = FALSE, na.strings="")
>>>>>>> optdata
>>>>>>
>>>>>>   V1   V2 V3                        V4
>>>>>> 1  251 2008 18 Plus_Shipping.2008.18.dat
>>>>>> 2  251 2008 19 Plus_Shipping.2008.19.dat
>>>>>> 3  251 2008 20 Plus_Shipping.2008.20.dat
>>>>>> 4  251 2008 22 Plus_Shipping.2008.22.dat
>>>>>> 5  251 2008 23 Plus_Shipping.2008.23.dat
>>>>>> 6  251 2008 24 Plus_Shipping.2008.24.dat
>>>>>>
>>>>>> I can see the data has been correctly read.  But for some reason that
>>>>>> isn't clear, read.csv doesn't like the data in the last column.
>>>>>>
>>>>>>> attach(optdata)
>>>>>>> for (i in 1:length(V4) ) { x = read.csv(V4[[i]], header = FALSE,
>>>>>>> na.strings="");x }
>>>>>>
>>>>>> Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>>>>>>  'file' must be a character string or connection
>>>>>>>
>>>>>>> V4[[1]]
>>>>>>
>>>>>> [1] Plus_Shipping.2008.18.dat
>>>>>> 60 Levels: Easyway.2008.17.dat Easyway.2008.18.dat Easyway.2008.19.dat
>>>>>> Easyway.2008.20.dat ... Secured_Pay.2008.31.dat
>>>>>>
>>>>>>>
>>>>>>
>>>>>> The last column consists of valid Windows filenames (with no
>>>>>> whitespace, so as not to confuse things).
>>>>>>
>>>>>> I see in the documentation "`[[...]]' is the operator used to select a
>>>>>> single element, whereas `[...]' is a general subscripting operator.",
>>>>>> so I assume V4[[i]] is the correct way to get the ith value from V4.
>>>>>> So why does read.csv complain that "'file' must be a character string
>>>>>> or connection"?  It seems obvious that the value in V4[[i]] is a
>>>>>> string.  V4[[1]] does give me the right value, although that is
>>>>>> followed by output I didn't ask for.
>>>>>>
>>>>>> In the loop above, I was going to replace the output obtained by 'x'
>>>>>> with output from fitdistr(x,"exponential"), but I can't proceed with
>>>>>> that until I can get the data in these files read.
>>>>>>
>>>>>> What have I missed?
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Ted
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/Argh%21--Trouble-using-string-data-read-from-a-file-tp20002064p20002064.html
>>>>>> Sent from the R help mailing list archive at Nabble.com.
>>>>>>
>>>>>> ______________________________________________
>>>>>> R-help at r-project.org mailing list
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>> PLEASE do read the posting guide
>>>>>> http://www.R-project.org/posting-guide.html
>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Jim Holtman
>>>>> Cincinnati, OH
>>>>> +1 513 646 9390
>>>>>
>>>>> What is the problem that you are trying to solve?
>>>>>
>>>>
>>>>
>>>>
>>>
>>>
>
>

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


