[R] dir() and RegEx and gsub()

Gabor Grothendieck ggrothendieck at gmail.com
Thu Jun 9 19:10:09 CEST 2005


On 6/9/05, Hans-Peter <gchappi at gmail.com> wrote:
> Dear R-Users,
> 
> I have two questions:
> 
> a)
> in a directory there are 3 files:
> [1] "Data.~csv"            "Kopie von Data.~csv"  "VorlageTradefile.csv"
> 
> The command "dir( fold, pattern = "\.csv" )" gives back *all* the 3 files
> With dir( fold, pattern = "\\.csv" ) I get back only VorlageTradefile.csv.
> I don't understand this behaviour, IMHO the regex expression "\.csv"
> becomes the string ".csv" and "\\.csv" becomes "\.csv". So the first
> string should catch it. This is also consistent with the result when I
> tried with the TRegExpr Tool. Could somebody explain what's going on
> here?

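In a regular expression the dot (.) is a wildcard that matches any
single character, so .csv will match the ~csv since the . matches the ~.
With "\\.csv" the regex engine sees \.csv, where the escaped dot matches
only a literal dot, which is why only VorlageTradefile.csv comes back.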

By the way, note that

1.  "[.]csv" is one way to specify a literal dot without using backslashes
2.  you probably want "[.]csv$" so that a.csv.txt is not matched.
3.  Some regular expression functions have a fixed= argument that
    causes them to treat all special characters like . and * as ordinary
    characters, but unfortunately dir lacks that argument.
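
As a rough illustration of the points above, here is a small sketch that
uses grepl on a character vector instead of dir, so it runs without any
files on disk. The vector name files and the extra entry "a.csv.txt" are
only assumptions made for the example.

```r
# File names from the question, plus one hypothetical name (a.csv.txt)
files <- c("Data.~csv", "Kopie von Data.~csv",
           "VorlageTradefile.csv", "a.csv.txt")

# Unescaped dot: a wildcard, so "~csv" is matched as well
any_char <- grepl(".csv", files)

# Escaped dot: the regex engine sees \.csv, a literal dot
escaped <- grepl("\\.csv", files)

# Character class plus end anchor: literal dot, and .csv must end the name
anchored <- grepl("[.]csv$", files)

# fixed = TRUE treats the pattern as a plain string (grepl has it, dir does not)
plain <- grepl(".csv", files, fixed = TRUE)

print(data.frame(files, any_char, escaped, anchored, plain))
```

Since dir has no fixed= argument, one workaround is to filter the result
of dir(fold) with grepl afterwards.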

> 
> b)
> I need to handle a copied windows file path. This is certainly often
> asked but I didn't find a solution.
> How can I convert, e.g.
> 
> myfile <- "D:\UebungenNDK\DataMining\DataMiningSeries.r"

Variable myfile, as you have written it above, has no backslashes in it,
so there is no way to know where they are supposed to be.  Maybe
what you mean is that you have a variable that is _stored_ as:

D:\UebungenNDK\...etc..

In that case it is already the same as myfile <- "D:\\UebungenNDK\\...etc.."
Use nchar to check how many characters are stored.

e.g.

nchar("D:\\abc")  # there are 6, not 7, characters in this string
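
To make the difference between what is stored and what is displayed a bit
more concrete, here is a minimal sketch. The variable name x is just for
illustration.

```r
x <- "D:\\abc"   # the doubled backslash in the source is ONE stored character

n <- nchar(x)    # number of characters actually stored
print(n)

print(x)         # print() shows the escaped form, with the backslash doubled
writeLines(x)    # writeLines() shows the characters as they are stored
```

So the doubled backslash you see at the [1] prompt is only how R displays
the string, not what it contains.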

> in either:
> 
> myfile
> [1]  "D:\\UebungenNDK\\DataMining\\DataMiningSeries.r"
> 
> or:
> myfile
> [1]  "D:/UebungenNDK/DataMining/DataMiningSeries.r"
> 
> Would be great to hear about a possibility!

You can convert backslashes to forward slashes using gsub:

gsub("\\", "/", "D:\\abc", fixed = TRUE)

Note that internally Windows understands forward slashes
although many of the Windows commands do not.
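
For the full path from your question this might look as follows. This is
only a sketch: it assumes the string already contains single backslashes
(written doubled in R source), and chartr is shown as an alternative that
I am adding here, not something you need.

```r
# Path as stored in R: each "\\" in the source is one backslash character
p <- "D:\\UebungenNDK\\DataMining\\DataMiningSeries.r"

# fixed = TRUE makes gsub treat "\\" as a plain backslash, not a regex escape
fwd_gsub <- gsub("\\", "/", p, fixed = TRUE)

# chartr translates single characters and needs no regex at all
fwd_chartr <- chartr("\\", "/", p)

writeLines(fwd_gsub)
writeLines(fwd_chartr)
```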

In case I did not understand your question, have a look at ?file.path
and also ?glob2rx in package sfsmisc.  The first one will construct
paths and the second one allows you to specify wildcards using globbing
instead of regular expressions.
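
A short sketch of both functions. It assumes a recent R in which glob2rx
is available from the utils package without loading sfsmisc. If that is
not the case on your installation, load sfsmisc first. The variable names
are made up for the example.

```r
# file.path joins path components with "/" on every platform
p <- file.path("D:", "UebungenNDK", "DataMining", "DataMiningSeries.r")
writeLines(p)

# glob2rx turns a shell-style wildcard into an anchored regular expression
rx <- glob2rx("*.csv")
writeLines(rx)

# The resulting pattern can be passed to dir(fold, pattern = rx);
# here it is applied to the three file names from the question instead
files <- c("Data.~csv", "Kopie von Data.~csv", "VorlageTradefile.csv")
matched <- files[grepl(rx, files)]
print(matched)
```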



