[R] a function more appropriate than 'sapply'?
ligges at statistik.tu-dortmund.de
Sat Jan 26 21:09:00 CET 2013
On 26.01.2013 20:46, Berend Hasselman wrote:
> On 26-01-2013, at 19:43, emorway <emorway at usgs.gov> wrote:
>> I'm wondering if I need to use a function other than sapply as the following
>> line of code runs indefinitely (or > 30 min so far) and uses up all 16Gb of
>> memory on my machine for what seems like a very small dataset (data attached
>> in a txt file wells.txt
>> <http://r.789695.n4.nabble.com/file/n4656723/wells.txt> ). The R code is:
>> The 2nd line of R code above gets bogged down and takes all my RAM with it:
>> I'm simply trying to extract all of the lines of data that have a single "_"
>> in the first column and place them into a dataset called "wells2". If that
>> were to work, I then want to extract the lines of data that have two "_" and
>> put them into a separate dataset, say "wells3". Is there a better way to do
>> this than the one-liner above?
> Read your file with
> wells<-read.table("wells.txt",col.names=c("name","plc_hldr"), stringsAsFactors=FALSE)
> Remove all non underscores with
> w.sub <- gsub("[^_]+","",wells[,1])
> then select elements of w.sub with 2 underscores and a single underscore with
> u.2 <- which(w.sub=="__")
> u.1 <- which(w.sub=="_")
> and use u.1 and u.2 to select the appropriate rows of wells.
wells1 <- wells[grep("^[^_]*_[^_]*$", wells[,1]),]
wells2 <- wells[grep("^[^_]*_[^_]*_[^_]*$", wells[,1]),]
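As a minimal sketch, the two grep() calls can be tried on a small made-up data frame. The well names below are hypothetical and are not taken from wells.txt; the column names follow the read.table() call quoted above. Each pattern is anchored with ^ and $ and spells out exactly how many underscores are allowed, with [^_]* matching any run of non-underscore characters in between. The underscore is not special inside brackets, so it is written without a backslash here.

```r
# Hypothetical sample data standing in for wells.txt
wells <- data.frame(
  name = c("A", "A_1", "B_2", "A_1_x", "B_2_y", "C_3_z_w"),
  plc_hldr = 1:6,
  stringsAsFactors = FALSE
)

# Rows whose name contains exactly one underscore
one_us <- wells[grep("^[^_]*_[^_]*$", wells$name), ]

# Rows whose name contains exactly two underscores
two_us <- wells[grep("^[^_]*_[^_]*_[^_]*$", wells$name), ]
```

Names with no underscore, or with three or more, match neither pattern and are left out of both subsets.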
> I tried to select rows containing 1 or 2 underscores with grep regular expressions but that appeared to be more difficult than I had expected.
> The method above is quick.
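The gsub() approach quoted above can also be written as an underscore count, which may be handy if more than two groups are needed. This is only a sketch under assumptions: the sample names are hypothetical (not from wells.txt), and n_us, wells_one, wells_two and by_count are illustrative names that do not appear in the original posts.

```r
# Hypothetical sample data standing in for wells.txt
wells <- data.frame(
  name = c("A", "A_1", "B_2", "A_1_x", "B_2_y", "C_3_z_w"),
  plc_hldr = 1:6,
  stringsAsFactors = FALSE
)

# Delete every run of non-underscore characters, then count what is left
n_us <- nchar(gsub("[^_]+", "", wells$name))

# Select rows by underscore count
wells_one <- wells[n_us == 1, ]
wells_two <- wells[n_us == 2, ]

# Or split the whole data frame by underscore count in one step
by_count <- split(wells, n_us)
```

Both gsub() and nchar() are vectorised, so the whole column is processed in a single call with no per-row loop.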