[R] i-best, grep function

Peter Dalgaard P.Dalgaard at biostat.ku.dk
Wed May 7 14:13:17 CEST 2008


Richard.Cotton at hsl.gov.uk wrote:
>> T1 <- read.delim(file="S://SEDIM//Yvonne//2_5//T1.txt",col.names= 
>> c("Dye/Sample_Peak", "Sample_File_Name", "Size", "Height", 
>> "Area_in_Point", "Area_in_BP", "Data_Point", "Begin_Point", 
>> "Begin_BP", "End_Point", "End_BP", "Width_in_Point", "Width_in_BP", 
>> "User_Comments", "User_Edit"))
>> T1 <- subset(T1, Size < 1000 & Size > 50)
>> T1.B <- cbind(T1[grep("^B", as.character(T1$Color),perl=T),3],
>> T1[grep("^B", as.character(T1$Color),perl=T),5])
>> T1.B <- cbind(T1.B, T1.B[,2]/sum(T1.B[,2]))
>>
>>
>> It works alright until the last two lines.  I try to grep the 
>> columns 3 and 5, but the outcome is 
>>  T1.B
>>      [,1] [,2]. 
>> I don't quite understand the code of as.character(T1$Color), perl=T.
>>     
>
> T1 is a data frame, and T1$Color is one of the columns.
>   
With all due respect, I think you missed the point: T1$Color is NOT one
of the columns (look at the col.names).

So T1$Color is NULL and things go downhill from there.
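
To make the "downhill" part concrete, here is a minimal sketch on a toy
data frame. The values are invented, and the guess that the dye letter
sits in the first column (which read.delim would most likely rename to
Dye.Sample_Peak, since it runs the names through make.names by default)
is an assumption on my part, not something the original post confirms.

```r
# Toy stand-in for T1; all values are invented for illustration.
T1 <- data.frame(
  Dye.Sample_Peak = c("B,1", "G,1", "B,2"),
  Size            = c(100, 200, 300),
  Area_in_Point   = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# make.names() is what sanitises column names when check.names = TRUE,
# so the "/" in "Dye/Sample_Peak" would become a ".".
make.names("Dye/Sample_Peak")

# There is no column called Color, so $ quietly returns NULL.
col <- T1$Color
is.null(col)

# as.character(NULL) is character(0), so grep() matches nothing ...
idx <- grep("^B", as.character(col), perl = TRUE)
length(idx)

# ... and cbind() of two empty selections is a matrix with no rows,
# which prints as just the "[,1] [,2]" header.
res <- cbind(T1[idx, "Size"], T1[idx, "Area_in_Point"])
nrow(res)

# Possible fix, ASSUMING the dye letter lives in the first column.
keep <- grep("^B", as.character(T1$Dye.Sample_Peak))
T1.B <- cbind(T1[keep, "Size"], T1[keep, "Area_in_Point"])
T1.B <- cbind(T1.B, T1.B[, 2] / sum(T1.B[, 2]))
T1.B
```

Running names(T1) on the real data would show which column actually
holds the dye code.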

But what is "i-best"???? This looks like a plain R issue.

    -p
> as.character converts the column T1$Color from type factor to type 
> character (i.e. a vector of strings).
>
> grep("^B", as.character(T1$Color),perl=T) means 'find all strings in the 
> vector T1$Color that begin with the letter (capital) B'.  "^B" is an 
> example of a regular expression.  There is a very good guide to them here: 
> http://www.regular-expressions.info/quickstart.html
>
> The relevant help pages in R are ?grep and ?regexp.  Don't worry about the 
> parameter perl=TRUE; there are subtle variations on regular expression 
> syntax, and it just means that you follow Perl-style syntax.
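
A small illustration of what that grep call returns, using a made-up
character vector (x and f are hypothetical names, not objects from the
original post):

```r
# Made-up strings; only the 1st and 3rd start with a capital B.
x <- c("B,1", "G,2", "Blue", "b,4", "AB")

# By default grep() returns the positions of the matching elements.
pos <- grep("^B", x, perl = TRUE)
pos

# value = TRUE returns the matching strings instead of their positions.
hits <- grep("^B", x, perl = TRUE, value = TRUE)
hits

# A factor is converted to its labels by as.character() before matching.
f <- factor(x)
pos_f <- grep("^B", as.character(f), perl = TRUE)
pos_f
```

The positions can be used directly as row indices, which is what the
T1[grep(...), 3] construction in the question relies on.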
>
> Regards,
> Richie.
>
> Mathematical Sciences Unit
> HSL
>   


-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark      Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)              FAX: (+45) 35327907


