[R] Substring of a character column

Alain Guillet alain.guillet at uclouvain.be
Wed Aug 4 12:40:52 CEST 2010


  Hi,

a <- c("ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein LOC375690",
       "ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C subfamily F",
       "ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory receptor%2C family 4%2C subfamily F")

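# Split each record on ";" and keep the fields that match `string`.
fonction <- function(data, string) {
     liste <- strsplit(data, ";")
     # value = TRUE returns the matching fields themselves, not their positions
     return(lapply(liste, function(x) grep(string, x, value = TRUE)))
}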

fonction(a,"ID=")
fonction(a,"Alias=")
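
If the goal is to end up with the ID and the Alias in two separate columns, the same strsplit() idea can be pushed one step further. The following is only a minimal sketch: extract_field is a hypothetical helper name (not an existing R function), v9 is a stand-in for the V9 column, and it assumes each record has at most one "key=value" field per key and that the line breaks shown in the question are just e-mail wrapping.

```r
# Hypothetical helper: return the value of one "key=value" field from
# semicolon-separated records, or NA where the key is absent.
extract_field <- function(data, key) {
    prefix <- paste0(key, "=")
    vapply(strsplit(data, ";", fixed = TRUE), function(fields) {
        # keep only the fields that start with "key="
        hit <- fields[substr(fields, 1, nchar(prefix)) == prefix]
        if (length(hit) == 0) {
            NA_character_
        } else {
            # drop the "key=" prefix from the first matching field
            substring(hit[1], nchar(prefix) + 1)
        }
    }, character(1))
}

# Stand-in for the V9 column from the question
v9 <- c("ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein LOC375690",
        "ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C subfamily F",
        "ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory receptor%2C family 4%2C subfamily F")

result <- data.frame(ID    = extract_field(v9, "ID"),
                     Alias = extract_field(v9, "Alias"),
                     stringsAsFactors = FALSE)
print(result)

# For the data frame in the question (assumption: V9 may be stored as a factor):
# test3$ID    <- extract_field(as.character(test3$V9), "ID")
# test3$Alias <- extract_field(as.character(test3$V9), "Alias")
```

Comparing only the start of each field with "key=" avoids regular expressions altogether, so a key such as "ID" cannot accidentally match inside another field.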

HTH,

Alain



On 04-Aug-10 12:00, LogLord wrote:
> Hi,
>
> I have a dataframe with a rather complicated descriptive column (V9):
>
>> test3[(1:3),
>       V1     V4     V5
> 10 1   4559   7173
> 17 1  58954  59871
> 19 1 357522 358458
>
> V9
> 10 ID=NM_182905.1;Name=NM_182905;Alias=FLJ00038;Note=hypothetical protein
> LOC375690
> 17 ID=NM_001005484;Alias=OR4F5;Note=olfactory receptor%2C family 4%2C
> subfamily F
> 19 ID=NM_001005224.1;Name=NM_001005224;Alias=OR4F3;Note=olfactory
> receptor%2C family 4%2C subfamily F
> I am having trouble extracting two strings from this column (V9). First I need
> the "ID=..." and second I need the "Alias=...", both in separate columns. I
> tried substr(), but because the fields differ in length and substr() does not
> accept wildcards, it did not work.
>
> Would be glad for any help!
>
> Thanks in advance.

-- 
Alain Guillet
Statistician and Computer Scientist

SMCS - IMMAQ - Université catholique de Louvain
Bureau c.316
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve
Belgium

tel: +32 10 47 30 50


