[BioC] Pointers on importing peptide (protein) expression data
sdavis2 at mail.nih.gov
Tue Apr 26 12:52:43 CEST 2005
On Apr 26, 2005, at 6:06 AM, Jamie Sherman wrote:
> I'm new to BioConductor and have given the faqs a read through as well
> as some of the tutorials but need someone to point me in the right
> direction because the data I have is a little unusual. I'll explain
> what I have and then what I would like to do.
> What I have.
> The data I have looks like this
> Protein_Name Peptide_Mass: T1 T2 T3 T4 T5 T6 T7
> T[1-7] is the expression ratio at time point [1-7] and is a ratio of
> abundances of the sample to a reference.
> What I would like to do.
> This data seems similar to what you might get out of an array
> experiment. I am wondering if I can load the data in a way that would
> allow me to make use of the annotation package to attach GO
> information and then use the applicable array analysis packages.
You will need a standard identifier for the proteins (such as an
Entrez Gene ID or a RefSeq (mRNA) identifier). Since microarrays are
typically built around DNA, most of the annotation tools in
Bioconductor are built around mRNA identifiers, not their protein
counterparts. Look at the AnnBuilder package for how to build an
"annotation" package for your experiment.
> Is BioConductor a suitable tool for this?
> What is the best way to load this data? (where should I be looking)
You can import your data as a tab-separated file using read.table.
Just type ?read.table for help on using the function.
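As a rough sketch of that import step (not a tested recipe for your actual file): the block below writes a few made-up rows in the layout you describe to a temporary file, reads them back with read.table, and pulls the seven time points into a numeric matrix. The protein names, masses, and ratios are invented, and I have assumed the header is tab-separated with no trailing colon after Peptide_Mass.

```r
# Made-up example rows in the described layout (all values are invented).
example_lines <- c(
  "Protein_Name\tPeptide_Mass\tT1\tT2\tT3\tT4\tT5\tT6\tT7",
  "ProtA\t1045.5\t1.0\t1.2\t1.5\t2.1\t2.0\t1.8\t1.1",
  "ProtA\t1320.7\t0.9\t1.1\t1.6\t2.3\t2.2\t1.7\t1.0",
  "ProtB\t987.4\t1.0\t0.8\t0.6\t0.5\t0.5\t0.7\t0.9"
)
infile <- tempfile(fileext = ".txt")
writeLines(example_lines, infile)

# Read the tab-separated file; the first line holds the column names.
pep <- read.table(infile, header = TRUE, sep = "\t",
                  stringsAsFactors = FALSE)

# Keep the seven ratio columns as a numeric matrix, one row per peptide.
ratios <- as.matrix(pep[, paste0("T", 1:7)])
rownames(ratios) <- paste(pep$Protein_Name, pep$Peptide_Mass, sep = "_")

# Ratios are usually analysed on the log2 scale, as with two-colour arrays.
log_ratios <- log2(ratios)
```

For your own data, replace infile with the path to your file and check str(pep) to confirm that the T1 to T7 columns were read as numbers.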
> Can you recommend analysis packages for clustering by GO and for
> clustering on patterns in the protein expression data?
There are many means of clustering data in R and Bioconductor. hclust
is a reasonable place to start. The heatmap function does clustering
of samples and genes. There is also the GoCluster package.
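To make the hclust and heatmap suggestion concrete, here is a minimal sketch on simulated data (random numbers standing in for log ratios, so the clusters mean nothing). The correlation-based distance, average linkage, and the cut into two groups are my own illustrative choices, not the only sensible ones.

```r
# Simulated stand-in for a peptides x time-points matrix of log ratios.
set.seed(1)
log_ratios <- matrix(rnorm(6 * 7), nrow = 6,
                     dimnames = list(paste0("pep", 1:6), paste0("T", 1:7)))

# Distance between peptides: 1 - Pearson correlation of their time courses,
# so peptides with similarly shaped profiles end up close together.
d <- as.dist(1 - cor(t(log_ratios)))

# Hierarchical clustering, then cut the tree into two groups.
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)

# Draw to a temporary PDF; drop pdf()/dev.off() to plot on screen instead.
pdf(tempfile(fileext = ".pdf"))
plot(hc)                                        # dendrogram of peptides
heatmap(log_ratios, Colv = NA, scale = "row")   # Colv = NA keeps time order
dev.off()
```

Setting Colv = NA stops heatmap from reordering the columns, which is usually what you want when the columns are consecutive time points.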
The searchable archives of R and Bioconductor can also be very helpful.
1) Searchable Bioconductor archives
2) R site search (and archive search)