[BioC] Re: Fw: problem with GEO parser

Saurin D. Jani jani at musc.edu
Mon Apr 11 20:09:44 CEST 2005


I forgot to tell you that this works only when there is a single soft file in the current directory.
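
The one-file restriction seems to come from `list.files(,"soft")` and `cp *.soft file1.soft`, which both assume a single match. A minimal sketch of a guard for that assumption follows; the helper name `find_single_soft` is hypothetical and not part of the parser, and it matches only names ending in ".soft", which is slightly stricter than the original pattern.

```r
# Hypothetical helper: return the single .soft file in `dir`,
# or stop with a clear message if there are none or several.
find_single_soft <- function(dir = ".") {
  # "\\.soft$" matches file names that end in ".soft"
  hits <- list.files(dir, pattern = "\\.soft$", full.names = TRUE)
  if (length(hits) != 1) {
    stop("expected exactly one .soft file in '", dir, "', found ", length(hits))
  }
  hits
}

# Demo in a scratch directory, so the working directory is not touched
demo_dir <- tempfile("softdemo")
dir.create(demo_dir)
file.create(file.path(demo_dir, "GDS461.soft"))
basename(find_single_soft(demo_dir))
```

Calling the helper before the parser would turn a confusing downstream error into an explicit one.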

Put your soft file in the current directory, start an R session, and cut and
paste the parser code below. It should work: I ran it on my computer two
minutes ago and it works fine.

With GDS461.soft as the soft file, I get:

> eset
Expression Set (exprSet) with
        12625 genes
        10 samples
                 phenoData object with 0 variables and 0 cases
         varLabels
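
For data of this shape (genes in rows, samples in columns), the clustering asked about earlier in the thread could be sketched with base R alone. The matrix below is simulated stand-in data, not GDS461, and correlation distance with average linkage is only one of several reasonable choices. For the `eset` above, the matrix would presumably come from `exprs(eset)`.

```r
set.seed(1)
# Simulated stand-in for an expression matrix: 200 genes x 10 samples
x <- matrix(rnorm(200 * 10), nrow = 200,
            dimnames = list(paste0("gene", 1:200), paste0("sample", 1:10)))
# Raise different gene blocks in samples 1-5 and 6-10 to create two groups
x[1:50, 1:5]    <- x[1:50, 1:5] + 3
x[51:100, 6:10] <- x[51:100, 6:10] + 3

# Distance between samples: 1 - Pearson correlation of their profiles
d <- as.dist(1 - cor(x))
hc <- hclust(d, method = "average")

# Cut the tree into two clusters and tabulate the sample assignments
groups <- cutree(hc, k = 2)
table(groups)
# plot(hc) would draw the dendrogram
```

To cluster genes instead of samples, the same steps apply to `cor(t(x))`, although with thousands of genes the distance matrix becomes large.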

-----------cut starts here-----------

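#-- packages used below
library(Biobase)  # exprSet class
library(limma)    # trimWhiteSpace()

#-- reading soft file
# expects exactly one soft file in the current directory
softFile <- list.files(pattern = "soft")

# the system() calls need the Unix tools cp, grep and rm
system("cp *.soft file1.soft")
# b.txt: one "line-number:ID_REF" entry per match
system("grep -on \"ID_REF\" file1.soft > b.txt")
# d.txt: the dataset_platform line(s)
system("grep \"dataset_platform\" file1.soft > d.txt")

ln <- as.matrix(readLines("b.txt"))
lm <- as.matrix(readLines("d.txt"))

# clean up the temporary files
system("rm b.txt")
system("rm d.txt")
system("rm file1.soft")

# the second ID_REF match is used as the header row of the data table;
# its line number is how many lines read.table() skips
lnX <- as.matrix(unlist(strsplit(ln[2], ":")))
Skpnum <- as.numeric(lnX[1])

# chip type: the text after "=" on the first dataset_platform line
lmX <- as.matrix(unlist(strsplit(lm[1], "=")))
chiptype <- trimWhiteSpace(lmX[2])
GDSN <- softFile

emX <- read.table(softFile, skip = Skpnum, comment.char = "")
Colm <- ncol(emX)

# column V1 holds the row names; columns 3..Colm hold the sample values
Rnames <- as.matrix(emX["V1"])
temp2 <- as.matrix(emX[3:Colm])
rownames(temp2) <- Rnames

#-- making an expression set out of the soft file;
# the soft file has normalized data, so I am assuming here
# that this data is also normalized
eset <- new("exprSet", exprs = temp2)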

-----------cut ends here-----------

Now paste it into your R session.
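
The earlier report in this thread was that `cp` and `grep` were not found. Those are Unix commands, so the `system()` calls fail wherever they are not installed. The sketch below does the same steps with base R only. It is an illustration, not tested against GDS461.soft, and it makes several assumptions: the table header is the first line starting with "ID_REF", the columns are tab-separated, the first two columns are ID_REF and IDENTIFIER, and missing values appear as "null". The function name `read_soft_matrix` and the demo file contents are made up for the example.

```r
# Sketch: read the data table of a GDS SOFT file without shell commands.
read_soft_matrix <- function(path) {
  lines <- readLines(path)

  # Header row of the data table (replaces grep -on "ID_REF")
  header <- grep("^ID_REF", lines)[1]
  if (is.na(header)) stop("no ID_REF header line found in ", path)

  # Platform: text after "=" on the first dataset_platform line
  plat <- grep("dataset_platform", lines, value = TRUE)[1]
  platform <- trimws(strsplit(plat, "=")[[1]][2])

  # Keep the lines after the header, dropping SOFT marker lines (!, ^, #)
  body <- lines[-seq_len(header)]
  body <- body[nzchar(body) & !grepl("^[!^#]", body)]

  tab <- read.table(text = body, sep = "\t", quote = "", comment.char = "",
                    na.strings = c("NA", "null"), stringsAsFactors = FALSE)

  # Column 1 gives the row names; columns 3 onward hold the sample values
  m <- as.matrix(tab[, 3:ncol(tab), drop = FALSE])
  rownames(m) <- tab[[1]]
  colnames(m) <- strsplit(lines[header], "\t")[[1]][-(1:2)]
  list(exprs = m, platform = platform)
}

# Demo on a tiny made-up SOFT file (not real GEO data)
demo <- tempfile(fileext = ".soft")
writeLines(c(
  "^DATASET = GDS0000",
  "!dataset_platform = GPL0000",
  "#ID_REF = Platform reference identifier",
  "!dataset_table_begin",
  "ID_REF\tIDENTIFIER\tGSM1\tGSM2",
  "probe_a\tGENE1\t10.5\t11.0",
  "probe_b\tGENE2\t7.25\tnull",
  "!dataset_table_end"
), demo)
parsed <- read_soft_matrix(demo)
dim(parsed$exprs)
```

The resulting matrix can then be passed to `new("exprSet", exprs = parsed$exprs)` as in the script above.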

Saurin
-- 
|------------------------------------------------
| Saurin Jani,MS
| Statistical and Research Analyst
|
| Department of Cell Biology and Anatomy
| Medical  University of South Carolina (MUSC)
| 173 Ashley Ave
| Charleston,SC - 29407 (US)
| 
| Email: jani at musc.edu
| Phone: (843)792-5483
|------------------------------------------------


Quoting guillaume deplaine <guillaume.deplaine at neuf.fr>:

> Hello,
> 
> In April I wrote to you about my problem with your GEO parser. It's 
> extremely important for me to open and cluster this file in R. I don't know 
> what the problem with your program is.
> Could you help me, please?
> Thanks a lot
> ----- Original Message ----- 
> From: "guillaume deplaine" <guillaume.deplaine at neuf.fr>
> To: "Saurin D. Jani" <jani at musc.edu>
> Sent: Friday, April 01, 2005 1:33 PM
> Subject: problem with GEO parser
> 
> 
> > Dear colleague,
> >
> >    You sent me a GEO parser you wrote some time ago.
> > I have a problem because when I run R, I can read the soft file with the
> > command line softFile<-list.files(,"GDS461.soft"). But afterwards, with the
> > command system("cp *.soft file1.soft") or with
> > system(grep-on\"ID_REF\" file1.soft > b.txt"), the R console said: cp (or
> > grep) was not found.
> >
> > You wrote "#put your GEO file" but I don't know where GDS461.soft must be
> > written.
> > Perhaps it's a version problem (I work with R 2.0.1), or I forgot a space
> > in a command line.
> > I attach the GDS461.soft file to my message.
> >
> > Could you explain the problem to me, and where in your script GDS461.soft
> > must be written?
> > Thanks for your help.
> >
> >
> > Guillaume Deplaine
> >
> > INSERM U36
> > Collège de France
> > 11, place Marcellin Berthelot
> > 75231 Paris Cedex 05
> >
> > Tél. : 01 44 27 16 54
> > Fax. : 01 44 27 16 91
> > Portable : 06 19 94 82 77
> > E-mail : guillaume.deplaine at neuf.fr
> > ----- Original Message ----- 
> > From: "Saurin D. Jani" <jani at musc.edu>
> > To: "Guillaume Deplaine" <guillaume.deplaine at college-de-france.fr>
> > Cc: <bioconductor at stat.math.ethz.ch>
> > Sent: Tuesday, March 29, 2005 4:18 PM
> > Subject: Re: [BioC] problem with GEO site
> >
> >
> >>> I was wondering if it's possible to do a clustering
> >>> analysis of this file with R?
> >>
> >> You need to parse this file and make an expression set in R. For that you
> >> need a GEO parser; below is a GEO parser that I wrote some time ago.
> >>
> >> ##================================================================
> >> ##                               GEO SOFT FILES
> >> ##================================================================
> >> # GEO soft file parser(1.0) - Saurin Jani
> >>
> >> #-- reading soft file
> >>
> >> softFile <- list.files(,"soft"); # from local directory
> >>
> >> system("cp *.soft file1.soft");
> >> system("grep -on \"ID_REF\" file1.soft > b.txt");
> >> # put your GEO soft file , b.txt file will be created on your computer
> >>
> >> system("grep \"dataset_platform\" file1.soft > d.txt");
> >> ln <- as.matrix(readLines("b.txt"));
> >> lm <- as.matrix(readLines("d.txt"));
> >>
> >> system("rm b.txt");
> >> system("rm d.txt");
> >> system("rm file1.soft");
> >>
> >> lnX <- as.matrix(unlist(strsplit(ln[2],":")))
> >> Skpnum <- as.numeric(lnX[1]);
> >>
> >> lmX <- as.matrix(unlist(strsplit(lm[1],"=")))
> >> chiptype <- trimWhiteSpace(lmX[2]);
> >> GDSN <-  softFile;
> >>
> >> emX <- read.table(softFile,skip = Skpnum,comment.char = "");
> >> Colm <- ncol(emX);
> >>
> >> Rnames <- as.matrix(emX["V1"]);
> >> temp_emX <- emX;
> >>
> >> temp2  <- temp_emX[3:Colm];
> >> temp2 <- as.matrix(temp2);
> >> rownames(temp2) <- Rnames;
> >>
> >> #--making expression set out of soft file, soft file has normalized
> >> #---data, so I am assuming here that this data is also normalized
> >>
> >> esetX <- as.matrix(temp2);
> >> eset <- new("exprSet", exprs = esetX);
> >>
> >>
> >> you can use eset for clustering.
> >>
> >>
> >> Saurin
> >>
> >>
> >> Quoting Guillaume Deplaine <guillaume.deplaine at college-de-france.fr>:
> >>
> >>> Hello,
> >>>
> >>>     I found a file on the GEO web site. This file was processed with MASS 4
> >>> up to normalization. I was wondering if it's possible to do a clustering
> >>> analysis of this file with R?
> >>>
> >>>     My second question is whether it's possible to retrieve the raw data
> >>> of this file processed with MASS 4.
> >>>
> >>> Thanks for your answer
> >>>
> >>> _______________________________________________
> >>> Bioconductor mailing list
> >>> Bioconductor at stat.math.ethz.ch
> >>> https://stat.ethz.ch/mailman/listinfo/bioconductor
> >>>
> >>>
> >>
> > 
>


