[R] system() and file names with spaces

Brian D Ripley ripley at stats.ox.ac.uk
Thu Dec 9 09:58:02 CET 2004


The normal way to do this is to quote the string, here filename.  See
?shQuote.
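
For the line-count example quoted below, that would look something like
the following sketch.  It assumes a Unix-alike with wc on the path, and
count_lines is a made-up helper name for illustration, not an R function.

```r
# Sketch only: quote the file name before pasting it into a shell command.
# count_lines() is a hypothetical helper, not part of R.
count_lines <- function(filename) {
    cmd <- paste("wc -l <", shQuote(filename))
    # wc may pad its output with blanks; as.numeric() copes with that.
    as.numeric(system(cmd, intern = TRUE))
}

# Try it on a temporary three-line file whose name contains a space.
path <- file.path(tempdir(), "Foo Bar")
writeLines(c("a b c", "d e", "f"), path)
count_lines(path)
```

On Windows shQuote() defaults to cmd-style quoting, and wc is not normally
available there, so the command itself would need replacing.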

Your comments really are not fair: the R developers (though not users)
have put a lot of work into supporting paths containing spaces on both
Windows and Unix, or into warning where they are not supported.
That includes researching and writing functions like shQuote.

On Thu, 9 Dec 2004, Richard A. O'Keefe wrote:

> Consider the question we had recently:  "how do I count the lines in a file
> without reading it into R?"  The solution I suggested was
>
>     as.numeric(system(paste("wc -l <", filename), TRUE))
>
> Unfortunately, it doesn't work, or at least, not all the time.
> If you already know all about that, and don't care, or already have
> a solution, stop reading now.  Otherwise, let me try to undo any
> harm I may have done by providing a fuller solution.
>
> We've had several reports in this list about problems caused by Windows
> file names with spaces in them.  File names with spaces are also common
> in MacOS X, so common, in fact, that file name completion in a Terminal
> actually works (if you have a file name "Foo Bar", and type F, o, TAB
> you get Foo\ Bar).  File names with spaces are possible in other Unix
> systems too, and always have been, though they are less likely.

That's been a feature of Unix shells with file completion (e.g. tcsh) for
at least a decade -- credit where credit is due, please.

> So suppose there is a file "Foo Bar" you want to find the size of.
> > file.name <- "Foo Bar"
> > system(paste("wc -l <", file.name))
> executes the command
>    wc -l < Foo Bar
> which gives you the line count of Bar if both Foo and Bar exist (Foo is
> opened as standard input but never read), fails otherwise, and of course
> ignores "Foo Bar".
>
> What can we do about it?  Well, we can try this:
>
>     for.system <- function (s) gsub(" ", "\\\\ ", s)
>
>     system(paste("wc -l <", for.system(file.name)), TRUE)
>
> Great.  Works for files with spaces in their names.  Now we try some other
> file names.  (File names like this are abundant in MacOS X.)
>
>     file.name <- "Black & White Minstrels/1972"
>
> 	Whoops.  wc -l < Black\ &\ White\ Minstrels/1972
> 	forks off "wc -l <Black\ " and then tries to run
> 	"\ White\ Minstrels/1972".
>
>     file.name <- "Quake(R)/scores"
>
> 	Whoops.  "Badly placed ()'s".
>
>     file.name <- "Drunkard's walk/log-1"
>
> 	Whoops.  "Unmatched '"
>
> So try again.
>
>     for.system <-
> 	function (s) gsub("([][)(}{'\";&! \t])", "\\\\\\1", s)
>
>     line.count <-
> 	function (s) as.numeric(system(paste("wc -l <", for.system(s)), TRUE))
>
> This _still_ isn't perfect, but it is a whole lot better than the naive
> version.  The major remaining problem is that the set of special characters
> and the quoting mechanism need to be changed for Windows.  I _think_ the
> Windows version should be something like
>
>     for.system <- function (s) {
> 	i <- grep("[^-_:.A-Za-z0-9/\\\\]", s)
> 	s[i] <- sapply(s[i], function (s) paste("\"", s, "\"", sep=""))
> 	s
>     }
>
> But what if a file name contains a double quote?  Until someone tells me,
> I'm just going to hope it doesn't happen.  Putting the pieces together,
>
> f% cat >"Foo Bar"
> a b c
> d e
> f
> <EOF>
>
>
> for.system <-
>     if (.Platform$OS.type == "windows") {
>         function (s) {
>             i <- grep("[^-_:.A-Za-z0-9/\\\\]", s)
>             s[i] <- sapply(s[i], function (s) paste("\"", s, "\"", sep=""))
>             s
>         }
>     } else {
>         function (s) gsub("([][)(}{'\";&! \t\n])", "\\\\\\1", s)
>     }
>
> wc <- function (s) {
>     r <- scan(pipe(paste("wc <", for.system(s)), open="r"), n=3, quiet=TRUE)
>     names(r) <- c("lines", "words", "chars")
>     r
> }
>
> > wc("Foo Bar")
> lines words chars
>     3     6    12
> > system("cp $HOME/.login Drunkard\\'s\\ Walk")
> > wc("Drunkard's Walk")["chars"]
> chars
>  3633
> >
>
> If there's already something like for.system() built into R, I'd be very
> happy to know about it.  (It's a little odd that system() and pipe()
> don't already support something like this; in a multi-element character
> vector the first could be taken literally and the remaining ones could be
> taken quoted with leading spaces.)
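
As a rough check that shQuote covers the awkward names quoted above, the
sketch below passes each one through the shell and back.  It assumes a
Unix-alike where system() runs a Bourne-type shell with printf available;
round_trip and tricky are names made up for this example.

```r
# File names taken from the quoted message, plus one with a double quote.
tricky <- c("Foo Bar",
            "Black & White Minstrels/1972",
            "Quake(R)/scores",
            "Drunkard's walk/log-1",
            "say \"hi\"")

# Hypothetical helper: hand the quoted string to the shell as a single
# argument of printf and read back what the shell actually saw.
round_trip <- function(s) {
    cmd <- paste("printf '%s\\n'", shQuote(s, type = "sh"))
    system(cmd, intern = TRUE)
}

# TRUE for each name that survives the shell unchanged.
ok <- vapply(tricky, function(s) identical(round_trip(s), s), logical(1))
print(ok)

# The Windows form can be inspected with type = "cmd".
print(shQuote(tricky, type = "cmd"))
```

Only the "sh" form is exercised through a real shell here; the "cmd"
strings are printed for inspection and would need checking on Windows.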

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595