[R] seek(), Windows and Cygwin (was "a UNIX vs. Windows package question, please")

Mike Miller mbmiller+l at gmail.com
Tue Jan 13 23:05:42 CET 2015


Thanks, everyone.  This is very good news from Henrik because I am 
interested only in binary connections.  It sounds like a function that 
uses seek() is very likely to work well in Windows, so I won't bother to 
warn people.  I should do a little testing just to see that it's working, 
though.
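
A minimal sketch of the kind of test I have in mind, using only base R
(file(), seek(), readBin(), writeBin()).  The temporary file, the byte
pattern, the offsets and the variable names are made up for illustration,
and it only exercises read seeks on a binary connection, so it is a quick
sanity check rather than proof that seek() is reliable everywhere:

```r
# Write 256 known bytes (values 0..255) to a temporary binary file,
# so the byte stored at offset k has value k.
path <- tempfile(fileext = ".bin")
writeBin(as.raw(0:255), path)

# Open in binary read mode; the thread is only about binary connections.
con <- file(path, open = "rb")

# Jump around the file, forwards and backwards, and read one byte each time.
offsets <- c(0, 17, 128, 255, 3)
got <- vapply(offsets, function(k) {
  seek(con, where = k, origin = "start")
  as.integer(readBin(con, what = "raw", n = 1L))
}, integer(1))

# Calling seek() without 'where' reports the current position.
invisible(seek(con, where = 100))
pos <- seek(con)

close(con)
unlink(path)

# Each byte read should equal the offset it was read from,
# and the reported position should match the last seek.
stopifnot(identical(got, as.integer(offsets)), pos == 100)
```

If the stopifnot() line passes silently on a Windows machine, that is at
least consistent with what Henrik reports.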

Henrik -- I think you are saying that your experience has shown that the 
code you wrote for catching a corner case was not needed.  Is that right?

Mike


On Tue, 13 Jan 2015, Henrik Bengtsson wrote:

> I/we've been using seek() for both reading and writing on *binary*
> connections across platforms and file systems, including Windows (at
> least NTFS, but probably also FAT/FAT32 back in the day) in the Aroma
> Framework (e.g. affxparser, R.huge) for ~8 years and counting.  There
> should be thousands and thousands of Windows CPU hours for this by now
> and I have yet to see a case/report where seek() was an issue.
>
> Without further references and pointers, I consider that claim in
> help("seek") mostly anecdotal (e.g. someone at some point in time had
> issues on some version on Windows and gave up on narrowing it down).  It
> did, however, make me add lots of internal sanity checks to catch a
> corner case where seek() on Windows is flaky - those assertions still
> haven't failed.
>
> I have little experience with seek() on *text* connections, so Jeff
> may have a point there.
>
> /Henrik
>
> On Tue, Jan 13, 2015 at 12:20 PM, Jeff Newmiller
> <jdnewmil at dcn.davis.ca.us> wrote:
>> I don't know why the R developers made that comment, and R-devel is probably a better place to follow up, but the usual problem is that Windows treats text files differently than binary files, so seeking in text files is a headache. Binary files ought to be okay, but that is a theoretical opinion, not from experience.
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                       Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
>> ---------------------------------------------------------------------------
>> Sent from my phone. Please excuse my brevity.
>>
>> On January 13, 2015 10:51:18 AM PST, Mike Miller <mbmiller at umn.edu> wrote:
>>> On Fri, 9 Jan 2015, Duncan Murdoch wrote:
>>>
>>>> On 09/01/2015 5:32 PM, Erin Hodgess wrote:
>>>>> Hello again.
>>>>>
>>>>> Here is another question that I am puzzled about:  I had the
>>>>> (incorrect) impression that if I had Rtools on a Windows machine that I
>>>>> could use any tar.gz package.  However, that is not true.
>>>>>
>>>>> In particular, I was looking at the rPython package.  I do indeed have
>>>>> Python on this machine. But when I did R CMD INSTALL rPython, I got an
>>>>> error message that said, "this is a Unix package".  Interesting.
>>>>>
>>>>> Should I just stay with my Ubuntu laptop and behave?
>>>>
>>>> No, but you should not use packages that misbehave.  The ideal R package
>>>> will run on all platforms where R runs.  Some require effort from the
>>>> user to provide prerequisites, but no good R package runs only on one
>>>> platform.
>>>
>>>
>>> That reminds me to ask if anyone here can provide more details about the
>>> limitations of seek().  I'm working on some functions that use seek() and
>>> I may have to tell Windows users not to use these functions.
>>>
>>> From the manual page for seek():
>>>
>>> http://stat.ethz.ch/R-manual/R-devel/library/base/html/seek.html
>>>
>>> "Use of seek on Windows is discouraged. We have found so many errors in
>>> the Windows implementation of file positioning that users are advised to
>>> use it only at their own risk, and asked not to waste the R developers'
>>> time with bug reports on Windows' deficiencies."
>>>
>>> My question is about whether this limitation is caused by the Windows
>>> filesystem, typically NTFS, or if the problem is in the Windows OS.  If
>>> the problem were in the filesystem, maybe the docs would have said so
>>> because NTFS can be used on other platforms.
>>>
>>> Secondly, can this problem be addressed at all by using Cygwin?  I know
>>> that Cygwin is running in Windows, so it's still Windows, but R might be
>>> compiled differently, so I just thought I'd ask!  ;-)
>>>
>>> And it doesn't matter which Windows version is used?
>>>
>>> Finally, if the problem is entirely in Windows, and R cannot possibly
>>> overcome it, I suppose that means that it is impossible to write a program
>>> to run under Windows that can seek (is it fseek in C?) reliably to a
>>> position in a file.  If that is the case, it's going to be hard to develop
>>> good systems for managing bioinformatic data on Windows.
>>>
>>> Thanks in advance.
>>>
>>> Mike
>>
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>


