Jon BR jonsleepy at gmail.com
Mon Sep 7 21:27:05 CEST 2015

Hi all,
    I've read in a large data frame that has formatting similar to the one
in the small example below:

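df <- data.frame(1:3, c(NA, "AD=2;BA=8", "AD=9;BA=1"),
                 c("AD=13;BA=49", "AD=1;BA=2", NA), stringsAsFactors = FALSE)
names(df) <- c("rowNum", "first", "second")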

> df
  rowNum     first      second
1      1      <NA> AD=13;BA=49
2      2 AD=2;BA=8   AD=1;BA=2
3      3 AD=9;BA=1        <NA>

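I'd like to reformat all of the non-NA entries in the "first" and "second"
columns (and so on, for any further columns of the same form) so that, for
example, "AD=13;BA=49" is replaced by the string "13_13-49": the AD value,
an underscore, the AD value again, a hyphen, and then the BA value.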

So applied to df, the output would be the following:

  rowNum first   second
1      1  <NA> 13_13-49
2      2 2_2-8    1_1-2
3      3 9_9-1     <NA>

I'm generally a big proponent of shell scripting with awk, but I'd prefer
an all-R solution if one exists (and also to learn how to do this more

Could someone point out an appropriate paradigm or otherwise point me in
the right direction?
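
For reference, one possible all-R approach is a regular-expression
substitution with backreferences, applied column by column. The sketch below
is only an illustration under some assumptions: every non-NA entry has
exactly the form "AD=x;BA=y", "rowNum" is the only column to leave untouched,
and the helper name reformat_entry is made up for this example.

```r
# Example data, rebuilt from the printed data frame in the post
df <- data.frame(
  rowNum = 1:3,
  first  = c(NA, "AD=2;BA=8", "AD=9;BA=1"),
  second = c("AD=13;BA=49", "AD=1;BA=2", NA),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "AD=x;BA=y" into "x_x-y".
# sub() passes NA through unchanged, so missing entries stay NA.
# as.character() guards against columns that were read in as factors.
reformat_entry <- function(x) {
  sub("^AD=([^;]*);BA=(.*)$", "\\1_\\1-\\2", as.character(x))
}

# Apply the helper to every column except the row counter (assumed name)
cols <- setdiff(names(df), "rowNum")
df[cols] <- lapply(df[cols], reformat_entry)

df
```

Because sub() is vectorised, each column is handled in a single call, which
should also hold up reasonably well on a large data frame. Entries that do
not match the assumed pattern are returned unchanged rather than raising an
error, so they would need to be checked separately.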


