[R] gsub help

Sarah Goslee sarah.goslee at gmail.com
Tue Nov 15 02:12:03 CET 2011


On Mon, Nov 14, 2011 at 7:59 PM, Debs Majumdar <debs_stata at yahoo.com> wrote:
> Hi,
>  I am working with the following list of files:
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info"
> [3] "study_chr1.one.phased.impute2.chunk1_info_by_sample"
> [4] "study_chr1.one.phased.impute2.chunk1_summary"
> [5] "study_chr1.one.phased.impute2.chunk1_warnings"
> The folder has many other files. I am trying to use gsub to give me just this file: study_chr1.one.phased.impute2.chunk1
> With Uwe's help I have tried the following:
> fls <- list.files(pattern="^study") # which gives me the list above.
> ufls <- unique(gsub("(_.*)_.*", "\\1", fls))  # which outputs
> [1] "study_chr1.one.phased.impute2.chunk1"
> [2] "study_chr1.one.phased.impute2.chunk1_info_by"

So you want the file name that starts with "study" and ends in "1"?

I'd use grep() (or grepl()) rather than gsub(), since you only want to
select matching elements from a vector, not modify them. Or is there
more going on than in your example?

You didn't give a reproducible dataset, so here's a fake one. The
pattern matches strings that begin with "a" (instead of "study") and
end with "1", as in your example:

> testdata <- c("abcd1", "abcd1_info", "nota1", "nota1_info")
> testdata[grepl("^a.*1$", testdata)]
[1] "abcd1"
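
As a variation on the same idea, grep() with value = TRUE returns the
matching strings directly, so no logical index is needed. A minimal
sketch on the same fake data:

```r
testdata <- c("abcd1", "abcd1_info", "nota1", "nota1_info")

# grepl() returns a logical vector, used here to subset the input
by_index <- testdata[grepl("^a.*1$", testdata)]

# grep(value = TRUE) returns the matching elements themselves
by_value <- grep("^a.*1$", testdata, value = TRUE)

print(by_value)
```

Both forms select the same element; which one to use is mostly a
matter of taste.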

You might really just need
yourdata[grepl("1$", yourdata)]
to select filenames that end in 1.
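
Applied to the file names from your message, that might look like the
sketch below. The vector is typed in by hand to stand in for
list.files(), and the stricter "chunk[0-9]+$" pattern is only my
assumption about how the other files in the folder are named: it would
keep every chunk file (chunk2, chunk10, ...) rather than only names
that happen to end in 1.

```r
# Stand-in for list.files(pattern = "^study"), copied from the question
fls <- c("study_chr1.one.phased.impute2.chunk1",
         "study_chr1.one.phased.impute2.chunk1_info",
         "study_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study_chr1.one.phased.impute2.chunk1_summary",
         "study_chr1.one.phased.impute2.chunk1_warnings")

# Names that end in "1"
ends_in_1 <- fls[grepl("1$", fls)]

# Assumed naming scheme: names that end in "chunk" followed by digits
chunk_only <- fls[grepl("chunk[0-9]+$", fls)]

print(chunk_only)
```

Since the pattern argument of list.files() is itself a regular
expression, the same filter could probably be applied in one step with
something like list.files(pattern = "^study.*chunk[0-9]+$").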

If that's all you really need, you've made it far too complicated.
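
For what it's worth, the gsub() attempt leaves "_info_by" behind
because ".*" is greedy: in "(_.*)_.*" the group runs from the first
underscore up to the last one, so only the final "_sample" is dropped,
and a name with a single underscore does not match at all. If you do
want to strip suffixes rather than select names, one possible sketch
(assuming every base name ends in "chunk" plus digits, which I am
guessing from your example) is:

```r
fls <- c("study_chr1.one.phased.impute2.chunk1",
         "study_chr1.one.phased.impute2.chunk1_info",
         "study_chr1.one.phased.impute2.chunk1_info_by_sample",
         "study_chr1.one.phased.impute2.chunk1_summary",
         "study_chr1.one.phased.impute2.chunk1_warnings")

# Original attempt: the greedy group keeps everything up to the LAST "_"
original <- unique(gsub("(_.*)_.*", "\\1", fls))

# Keep everything through "chunk<digits>" and drop whatever follows
stripped <- unique(sub("(chunk[0-9]+).*$", "\\1", fls))

print(original)
print(stripped)
```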


Sarah Goslee
