[R] paste first row string onto every string in column

Patrick Connolly p_connolly at slingshot.co.nz
Thu Aug 13 08:13:11 CEST 2009


On Wed, 12-Aug-2009 at 09:06AM -0700, Jill Hollenbach wrote:

|> 
|> Thanks so much everybody, this has been incredibly helpful--not only is my
|> immediate issue solved but I've learned a lot in the process. The lapply
|> solution is best for me, as I need flexibility to edit df's with varying
|> numbers of columns. 
|> 
|> Now, one more question: after appending the string from the first line, I am
|> manipulating the df further (recoding the original contents; this I have
|> working fine), and afterwards I will need to strip back off that string. It
|> seems relatively straightforward, except that, as shown in the example above
|> (df2), there is an asterisk involved (I need to remove all characters up to
|> and including the asterisk) which seems problematic.
|> Any suggestions? 

Check out strsplit().  You'll probably first need to convert the
columns to character, since they will be factors as things stand.
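
Something along these lines might do it. This is only a sketch on
made-up data modelled on your dfnew example; the helper names
strip_split and strip_sub are hypothetical, not existing functions.
The asterisk is a regular-expression metacharacter, so it has to be
matched literally, either with fixed = TRUE in strsplit() or by
escaping it in sub().

```r
# Hypothetical data modelled on the dfnew example quoted below;
# stringsAsFactors = TRUE mimics columns that arrive as factors.
dfnew <- data.frame(V1 = c("DPA1*0103", "DPA1*0103"),
                    V2 = c("DPA1*0104", "DPA1*0103"),
                    stringsAsFactors = TRUE)

# Convert every column from factor to character.
dfnew[] <- lapply(dfnew, as.character)

# Option 1: split on a literal "*" and keep the second piece.
# fixed = TRUE stops "*" being read as a regex quantifier.
strip_split <- function(x) {
  vapply(strsplit(x, "*", fixed = TRUE), `[`, character(1), 2)
}

# Option 2: delete everything up to and including the first "*".
# Inside [^*] the asterisk is literal; outside it must be escaped.
strip_sub <- function(x) sub("^[^*]*\\*", "", x)

# Apply to all columns, as with the earlier lapply() approach.
stripped <- dfnew
stripped[] <- lapply(dfnew, strip_sub)
```

The sub() version leaves alone any string that has no asterisk in it,
whereas the strsplit() version returns NA for such a string, so which
one suits depends on what the recoded columns contain.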

HTH



|> Many thanks,
|> Jill
|> 
|> 
|> 
|> Don MacQueen wrote:
|> > 
|> > Let's start with something simple and relatively easy to understand, 
|> > since you're new to this.
|> > 
|> > First, here's an example of the core of the idea:
|> >>  paste('a',1:4)
|> > [1] "a 1" "a 2" "a 3" "a 4"
|> > 
|> > Make it a little closer to your situation:
|> >>  paste('a*',1:4, sep='')
|> > [1] "a*1" "a*2" "a*3" "a*4"
|> > 
|> > Sometimes it helps to save the number of rows in your dataframe in a 
|> > new variable
|> > 
|> > nr <- nrow(df)
|> > 
|> > Then, for your first column, the "a*" in the above example is df$V1[1]
|> > For the 1:4 in the example, you use  df$V1[ 2:nr]
|> > Put it together and you have:
|> > 
|> >     dfnew <- df
|> >     dfnew$V1[ 2:nr] <- paste( dfnew$V1[1], dfnew$V1[ 2:nr], sep='' )
|> > 
|> > But you can use "-1" instead of "2:nr", and you get
|> > 
|> >    dfnew$V1[ -1 ] <- paste( dfnew$V1[1], dfnew$V1[ -1], sep='' )
|> > 
|> > That's how you can do it one column at a time.
|> > Since you have only four columns, just do the same thing to V2, V3, and
|> > V4.
|> > 
|> > But if you want a more general method, one that works no matter how 
|> > many columns you have, and no matter what they are named, then you 
|> > can use lapply() to loop over the columns. This is what Patrick 
|> > Connolly suggested, which is
|> > 
|> >     as.data.frame(lapply(df, function(x) paste(x[1], x[-1], sep = "")))
|> > 
|> > Note, though, that this will do it to all columns, so if you ever 
|> > happen to have a dataframe where you don't want to do all columns, 
|> > you'll have to be a little trickier with the lapply() solution.
|> > 
|> > -Don
|> > 
|> > At 6:48 PM -0700 8/11/09, Jill Hollenbach wrote:
|> >>Hi,
|> >>I am trying to edit a data frame such that the string in the first line is
|> >>appended onto the beginning of each element in the subsequent rows. The data
|> >>looks like this:
|> >>
|> >>>  df
|> >>     V1    V2    V3    V4
|> >>1 DPA1* DPA1* DPB1* DPB1*
|> >>2  0103  0104  0401  0601
|> >>3  0103  0103  0301  0402
|> >>.
|> >>.
|> >>  and what I want is this:
|> >>
|> >>>dfnew
|> >>         V1        V2        V3        V4
|> >>1     DPA1*     DPA1*     DPB1*     DPB1*
|> >>2 DPA1*0103 DPA1*0104 DPB1*0401 DPB1*0601
|> >>3 DPA1*0103 DPA1*0103 DPB1*0301 DPB1*0402
|> >>
|> >>any help is much appreciated, I am new to this and struggling.
|> >>Jill
|> >>
|> >>___
|> >>  Jill Hollenbach, PhD, MPH
|> >>     Assistant Staff Scientist
|> >>     Center for Genetics
|> >>     Children's Hospital Oakland Research Institute
|> >>     jhollenbach at chori.org
|> >>
|> >>--
|> >>View this message in context: 
|> >>http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24928720.html
|> >>Sent from the R help mailing list archive at Nabble.com.
|> >>
|> >>______________________________________________
|> >>R-help at r-project.org mailing list
|> >>https://stat.ethz.ch/mailman/listinfo/r-help
|> >>PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
|> >>and provide commented, minimal, self-contained, reproducible code.
|> > 
|> > 
|> > -- 
|> > --------------------------------------
|> > Don MacQueen
|> > Environmental Protection Department
|> > Lawrence Livermore National Laboratory
|> > Livermore, CA, USA
|> > 925-423-1062
|> > 
|> > 
|> > 
|> 
|> -- 
|> View this message in context: http://www.nabble.com/paste-first-row-string-onto-every-string-in-column-tp24928720p24939755.html
|> Sent from the R help mailing list archive at Nabble.com.
|> 

-- 
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.   
   ___    Patrick Connolly   
 {~._.~}                   Great minds discuss ideas    
 _( Y )_  	         Average minds discuss events 
(:_~*~_:)                  Small minds discuss people  
 (_)-(_)  	                      ..... Eleanor Roosevelt
	  
~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.~.



