[R] extracting quoted text from character string

Corey Moffet cmoffet at nwrc.ars.usda.gov
Mon Oct 13 21:02:02 CEST 2003


Hello all,

I am trying to solve a problem, and my solution is rather ugly and not very
general.  The posts for "[R] help with gsub and grep functions" seemed
relevant and gave me hope for a more refined and more general solution.

The Problem:

line <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"

The desired result of processing either "line" or "bad.line":

> parts <- some.function(line)

> parts
 [1] "this text has spaces"
 [2] "thisNot"
 [3] "3"
 [4] "4"
 [5] "5"
 [6] "6"
 [7] "7"
 [8] "8"
 [9] "9"
[10] "10"

Current function to obtain a solution for "line" but not "bad.line":

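"some.function" <- function(line, quote.char = "'") {
   # Split on the quote character and drop empty pieces
   quoted <- unlist(strsplit(line, quote.char))
   quoted <- quoted[quoted != ""]
   # Assumes exactly two quoted fields; quoted[2] is the blank between them
   first <- quoted[1]
   second <- quoted[3]
   last <- quoted[4]
   # Split the unquoted remainder on spaces
   last.parts <- unlist(strsplit(last, " "))
   last.parts <- last.parts[last.parts != ""]
   out <- c(first, second, last.parts)
   return(out)
}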

This solution is not very good because the text parts of "line" are not
required to be enclosed in quotes unless they contain a space.  All the files
I currently have to process have the first two pieces enclosed in "'", but
it is future files that I worry about.  Is there an existing function that
I have overlooked that splits strings, ignoring the delimiter when it is
enclosed in quotes?  I know that I can do some testing on the length of
"quoted" in function "some.function", but it seems there should be a more
elegant way of doing this type of thing.  Any suggestions?
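
A minimal sketch of one possible approach, assuming an R version whose
scan() accepts a "text" argument (on versions without it, passing a
textConnection() as the file should behave similarly).  scan() splits on
whitespace by default and treats text between matching quote characters as
a single field.  The name "parse.line" is hypothetical and used only for
illustration.

```r
# Hypothetical helper: split a line on whitespace, keeping quoted
# pieces together and dropping the surrounding quote characters.
"parse.line" <- function(line, quote.char = "'") {
   scan(text = line, what = character(), quote = quote.char, quiet = TRUE)
}

line <- "'this text has spaces' 'thisNot' 3 4 5 6 7 8 9 10"
bad.line <- "'this text has spaces' thisNot 3 4 5 6 7 8 9 10"

parts <- parse.line(line)
bad.parts <- parse.line(bad.line)

# Both inputs should give the same ten-element character vector
identical(parts, bad.parts)
```

This avoids hard-coding how many fields are quoted.  Lines containing a
stray or unmatched quote character were not considered here and would need
to be checked separately.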

With best wishes and kind regards I am

Sincerely,

Corey A. Moffet, Ph.D.
Support Scientist

University of Idaho
Northwest Watershed Research Center
800 Park Blvd, Plaza IV, Suite 105
Boise, ID 83712-7716

Voice: (208) 422-0718
FAX:   (208) 334-1502



