[R] substituting dots in the names of the columns (sub, gsub, regexpr)
8rino-Luca Pantani
ottorino-luca.pantani at unifi.it
Thu Jul 26 15:40:40 CEST 2007
Dear R users,
I have the following two problems, related to the functions sub, grep,
regexpr and the like.
The header of the file(s) I have to import is like this.
c("y (m)", "BD (g/cm3)", "PR (Mpa)", "Ks (m/s)", "SP g./g.",
"P (m3/m3)", "theta1 (g/g)", "theta2 (g/g)", "AWC (g/g)")
To get rid of spaces and symbols in the names of the columns,
I use read.table(... check.names=TRUE) and I get:
str <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
"P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")
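For reference, these names can be reproduced without reading a file, since the read.table help page says that check.names=TRUE passes the header through make.names. A minimal sketch (the variable names header and nms are only illustrative):

```r
# Original header strings as they appear in the file
header <- c("y (m)", "BD (g/cm3)", "PR (Mpa)", "Ks (m/s)", "SP g./g.",
            "P (m3/m3)", "theta1 (g/g)", "theta2 (g/g)", "AWC (g/g)")

# make.names turns every character that is not valid in a name into a dot
nms <- make.names(header, unique = TRUE)
print(nms)
```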
Now, my problem is to remove the trailing dots, as well as the double
dots, in order to get names like the following:
c("y.m", "BD.g.cm3", "PR.Mpa", "Ks.m.s", "SP.g.g", "P.m3.m3",
"theta1.g.g", "theta2.g.g", "AWC.g.g")
I've searched the help pages for sub, regexpr and similar functions, and also
searched the help archives.
I understand that the dot is a special character, since
sub("..", ".", str)
[1] "..m." "...g.cm3." "...Mpa." "...m.s." "..g..g."
[6] "..m3.m3." ".eta1..g.g." ".eta2..g.g." ".C..g.g."
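A small sketch of what seems to be happening here, assuming the default (extended) regular expressions: an unescaped dot matches any single character, so ".." matches the first two characters of each name, whatever they are. The toy string "a.c" and the variable names are only illustrative.

```r
s <- "a.c"

any_char    <- sub(".", "X", s)                # unescaped dot matches the first character, "a"
literal_esc <- sub("\\.", "X", s)              # escaped dot matches only the literal "."
literal_cls <- sub("[.]", "X", s)              # a dot inside a character class is literal
literal_fix <- sub(".", "X", s, fixed = TRUE)  # fixed = TRUE switches off regex matching

print(c(any_char, literal_esc, literal_cls, literal_fix))
```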
Therefore I tried
sub("\\..", ".", str)
[1] "y.m." "BD.g.cm3." "PR.Mpa." "Ks.m.s." "SP...g."
[6] "P.m3.m3." "theta1.g.g." "theta2.g.g." "AWC.g.g."
and I was surprised by the (to me) strange behaviour whereby "SP.g..g."
is changed to "SP...g."
And this is the first problem I cannot solve.
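One possible reading of this behaviour, offered as a sketch rather than a definitive answer: in "\\.." only the first dot is escaped, so the pattern means "a literal dot followed by any character", and sub replaces only the first match. In "SP.g..g." the first match is ".g", not the double dot. Collapsing every run of literal dots with gsub and "[.]+" appears to avoid this (the variable names x and collapsed are illustrative):

```r
x <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
       "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

# "[.]+" matches one or more literal dots; gsub replaces every such run
collapsed <- gsub("[.]+", ".", x)
print(collapsed)
```

The trailing dots are still present after this step; only the repeated dots are reduced to one.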
Then there's the problem of trailing dot removal.
In
http://tolstoy.newcastle.edu.au/R/e2/help/07/01/8665.html
I've found a somewhat similar problem, but the solution does not work in this case,
since:
gsub("[.].*", "", str)
[1] "y" "BD" "PR" "Ks" "SP" "P" "theta1" "theta2"
[9] "AWC"
And this is the second problem.
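A hedged sketch of why this happens and of one possible fix: "[.].*" matches from the first literal dot to the end of the string, so everything after the first dot is deleted. Anchoring the pattern to the end of the string with "$" should remove only the trailing dots. The variable names x, collapsed and cleaned are illustrative.

```r
x <- c("y..m.", "BD..g.cm3.", "PR..Mpa.", "Ks..m.s.", "SP.g..g.",
       "P..m3.m3.", "theta1..g.g.", "theta2..g.g.", "AWC..g.g.")

# Step 1: reduce every run of literal dots to a single dot
collapsed <- gsub("[.]+", ".", x)

# Step 2: "[.]+$" matches dots only at the very end of the string
cleaned <- sub("[.]+$", "", collapsed)
print(cleaned)
```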
Apart from these particular problems, I would like to know more about
regexps, sub and so on, since each time I have strings to manipulate,
I must face my ignorance of regular expressions and their syntax.
Is there any page with examples, where I can improve my knowledge and
stop being frustrated each time I have to manipulate strings?
8rino
--
Ottorino-Luca Pantani, Università di Firenze
Dip. Scienza del Suolo e Nutrizione della Pianta
P.zle Cascine 28 50144 Firenze Italia
Tel 39 055 3288 202 (348 lab) Fax 39 055 333 273
OLPantani at unifi.it