[R] Reformatting text inside a data frame

Jon BR jonsleepy at gmail.com
Mon Sep 7 21:27:05 CEST 2015


Hi all,
    I've read in a large data frame that has formatting similar to the one
in the small example below:

df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA))

> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>


I'd like to reformat every non-NA entry in "first", "second", and so on, so
that, for example, "AD=13;BA=49" is replaced by the string "13_13-49" (the
AD value, an underscore, the AD value again, a hyphen, then the BA value).

So applied to df, the output would be the following:

  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>


I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists, and I'd also like to learn how to do this
kind of thing more generally.

Could someone point out an appropriate paradigm or otherwise point me in
the right direction?

Best,
Jonathan
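
For reference, one possible all-R approach to the reformatting asked about above is a regular-expression substitution applied column by column. This is only a minimal sketch: the helper name reformat_entry is hypothetical, it assumes every non-NA entry has exactly the form "AD=x;BA=y", and it assumes "rowNum" is the only column to leave untouched. stringsAsFactors = FALSE is added here so the text columns are plain character vectors.

```r
# Example data from the post. stringsAsFactors = FALSE is an assumption made
# here so that "first" and "second" are character vectors, not factors.
df <- data.frame(rowNum = c(1, 2, 3),
                 first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 second = c("AD=13;BA=49", "AD=1;BA=2", NA),
                 stringsAsFactors = FALSE)

# Hypothetical helper: rewrite "AD=x;BA=y" as "x_x-y".
# sub() leaves NA entries as NA, so no special handling is needed.
reformat_entry <- function(x) {
  sub("^AD=([^;]*);BA=(.*)$", "\\1_\\1-\\2", as.character(x))
}

# Apply the helper to every column except the row counter.
cols <- setdiff(names(df), "rowNum")
df[cols] <- lapply(df[cols], reformat_entry)
df
```

Entries that do not match the assumed pattern are returned unchanged by sub(), so malformed values would pass through silently rather than raise an error.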
