[R] comparing and combining files

Assa Yeroslaviz frymor at gmail.com
Tue May 11 12:36:01 CEST 2010


Hello,

I have two tab-delimited files which I would like to combine.
In the first one I have unique gene IDs in column 1, followed by various
experimental results from a microarray analysis (see attached file list1).

The second file has the same gene IDs (more of them, in a different order,
and some duplicated) together with annotations (see attached file list2).

What I would like to do is search the second list for each gene ID of the
first list, then copy (add) the annotations (all or some) to the matching
rows of the first list.
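
One way this lookup could be sketched in base R is a left join with merge().
The data frames below are toy stand-ins, not the attached files: the column
name ProbeID, the Process column, and the annotation values are assumptions
made up for illustration.

```r
# Toy stand-ins for the two lists. IDs and p-values are taken from list1;
# the annotation rows are invented for illustration only.
list1 <- data.frame(
  ProbeID = c("A_51_P125745", "A_52_P101878", "A_51_P442990"),
  p.Value = c(8.7834e-10, 9.74652e-10, 1.04001e-09),
  stringsAsFactors = FALSE
)
list2 <- data.frame(
  ProbeID = c("A_51_P442990", "A_51_P125745", "A_51_P125745", "A_51_P999999"),
  Process = c("Transport", "Meiosis", "Meiosis", "Signal transduction"),
  stringsAsFactors = FALSE
)

# Drop repeated annotation rows so each probe can match at most once.
list2u <- list2[!duplicated(list2$ProbeID), ]

# Left join: keep every row of list1 and add annotation columns where an
# ID is found; probes without annotation get NA.
combined <- merge(list1, list2u, by = "ProbeID", all.x = TRUE)

# merge() sorts by the key column, so restore the original order of list1.
combined <- combined[match(list1$ProbeID, combined$ProbeID), ]
rownames(combined) <- NULL
```

To copy only some of the annotations, the second list can be subset to the
key plus the wanted columns before merging, for example
list2u[, c("ProbeID", "Process")].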

Unfortunately I don't have a clue how to start. I tried reading both
lists into R with read.table(), but I don't know how to continue.
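
For the reading step, a minimal sketch of read.table() settings that seem to
suit files like the attached ones (tab-separated, commas inside fields,
leading spaces before the IDs in the second list). The file names, the helper
read_tab(), and the annotation values written to the temporary files are
assumptions for illustration, not the real attachments.

```r
# Hypothetical file names; small stand-in files are written to a temporary
# directory so that the sketch runs on its own.
f1 <- file.path(tempdir(), "list1.txt")
f2 <- file.path(tempdir(), "list2.txt")
writeLines(c("Probe Id\tp-Value\tfold_change",
             "A_51_P125745\t8.7834E-10\t1.623632732",
             "A_52_P101878\t9.74652E-10\t5.660957995"), f1)
# The transcript ID and family in the first data row are made up.
writeLines(c("Probe ID\tTranscript ID\tPanther Families",
             " A_51_P125745\t NM_000000.1\tEXAMPLE FAMILY, TYPE 1 ",
             " A_51_P100174\t NM_008613.2\tMEIOSIS-SPECIFIC NUCLEAR STRUCTURAL PROTEIN 1 "),
           f2)

# Hypothetical helper wrapping read.table() for tab-delimited text.
read_tab <- function(path) {
  read.table(path, header = TRUE, sep = "\t",
             quote = "", comment.char = "",  # treat quotes and '#' as plain text
             strip.white = TRUE,             # drop spaces around each field
             check.names = FALSE,            # keep headers such as "p-Value"
             stringsAsFactors = FALSE)       # keep IDs as character strings
}
list1 <- read_tab(f1)
list2 <- read_tab(f2)

# The key column is spelled differently in the two headers
# ("Probe Id" vs "Probe ID"), so give it one common name.
names(list1)[1] <- "ProbeID"
names(list2)[1] <- "ProbeID"
```

Without strip.white = TRUE the IDs in the second list would keep their
leading space and would not match the IDs in the first list.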

I would be happy for any help I can get.

THX

Assa
-------------- next part --------------
Probe Id	p-Value	log(fold_change)	fold_change	Hyb1_Signal	Hyb2_Signal	Hyb3_Signal	Hyb4_Signal	Hyb5_Signal	Hyb6_Signal	Hyb7_Signal	Hyb8_Signal
A_51_P125745	8.7834E-10	0.699225329	1.623632732	9.564188501	9.608302973	9.259850079	9.343987249	15.25372018	15.33048102	15.41274424	15.33793848
A_52_P101878	9.74652E-10	2.501046219	5.660957995	1.954870729	1.99751977	1.955063296	1.957937204	11.01789225	10.97750041	10.97175935	11.55849605
A_51_P442990	1.04001E-09	-0.402812753	-1.322083001	14.00554692	14.05279929	14.00221926	14.04812315	10.56923616	10.48970549	10.64738253	10.73328707
A_52_P184149	5.44301E-09	1.290971	2.446926896	4.911810683	4.498067567	4.58485978	5.039483247	11.52840271	11.819086	11.60780518	11.62005409
A_51_P372912	7.14967E-09	0.231830419	1.174323928	12.45730289	12.4543696	12.44985698	12.36406809	14.54124222	14.51370493	14.67676838	14.66224351
A_52_P315369	9.4116E-09	0.454511523	1.370318755	8.553786225	8.436374041	8.514351708	8.373483894	11.54101201	11.57518213	11.52680987	11.7806491
A_51_P273639	1.39392E-08	0.730389888	1.659087399	6.551572425	6.169019927	6.308868816	6.346543584	10.35759351	10.49588487	10.6468324	10.60069893
A_51_P199352	1.50718E-08	-0.515838137	-1.429824558	8.911575946	8.766722401	8.948142121	8.804777547	6.204620464	6.293980491	6.223970842	6.057543931
A_51_P196997	1.51114E-08	2.099347242	4.285154562	1.897835681	1.915403587	2.479244824	1.886853579	8.624538827	8.786418383	8.658524824	8.980244098
A_51_P259571	2.24213E-08	0.341391303	1.26697785	10.40022757	10.4228105	10.29809919	10.32980299	13.06843771	13.29941031	12.98823948	13.16133567
A_52_P4598	2.51886E-08	0.503978803	1.4181192	9.34248155	9.096537969	9.007279191	9.299140777	12.8720763	13.10863036	12.97146724	13.15723934
-------------- next part --------------
Probe ID	Transcript ID	InterPro Domains	Panther Families	Panther Biological Process Level 1	Panther Biological Process Level 2
 A_51_P100034	 NM_027162.3	MIF4G-likes type 3,MIF4-likes type 1/2/3 	EIF4G DOMAIN PROTEIN,AD023 PROTEIN 	Protein metabolism and modification 	Protein biosynthesis 
 A_51_P100052	 NM_198863.1	Leucine-rich repeats typical subtype,Leucine-rich repeat,Leucine-rich repeats cysteine-rich flanking regions N-terminal,Cysteine-rich flanking regions C-terminal 	LEUCINE-RICH TRANSMEMBRANE PROTEINS 	Biological process unclassified 	- 
 A_51_P100174	 NM_008613.2	- 	MEIOSIS-SPECIFIC NUCLEAR STRUCTURAL PROTEIN 1 	Developmental processes 	Meiosis 
 A_51_P100218	 NM_134198.1	Vomeronasal receptors type 1,GPCRs rhodopsin-like superfamily 	VOMERONASAL PHEROMONE RECEPTOR 	Signal transduction,Sensory perception 	Cell surface receptor mediated signal transduction,Pheromone response 
 A_51_P100227	 NM_023579.4	HEAT,Importin-betas N-terminal,Armadillo-like helical 	IMPORTIN BETA-3,IMPORTIN BETA 	Intracellular protein traffic,Biological process unclassified,Nucleoside, nucleotide and nucleic acid metabolism,Transport 	Nuclear transport,RNA localization 
 A_51_P100227	 NM_023579.4	HEAT,Importin-betas N-terminal,Armadillo-like helical 	IMPORTIN BETA-3,IMPORTIN BETA 	Intracellular protein traffic,Biological process unclassified,Nucleoside, nucleotide and nucleic acid metabolism,Transport 	Nuclear transport,RNA localization 
 A_51_P100238	 NM_146376.2	EGF-like regions conserved site,Olfactory receptor,GPCRs rhodopsin-like superfamily,GPCRs rhodopsin-like 	OLFACTORY RECEPTOR MOR107,OLFACTORY RECEPTOR 	Signal transduction,Biological process unclassified,Sensory perception 	Cell surface receptor mediated signal transduction,Chemosensory perception 
 A_51_P100298	 NM_152220.1	Target SNARE coiled-coil region,Syntaxin/epimorphins conserved site,Syntaxin-3,Syntaxins N-terminal 	SYNTAXIN,SYNTAXIN 3 	Intracellular protein traffic,Neuronal activities,Biological process unclassified,Protein targeting and localization,Transport 	Synaptic transmission,Protein targeting,Small molecule transport,Exocytosis, 
 A_51_P100309	 NM_001039652.1	Opioid receptor,Mu opioid receptor,GPCRs rhodopsin-like superfamily,GPCRs rhodopsin-like 	G-PROTEIN COUPLED RECEPTOR,INTERACTOR PROTEIN FOR CYTOHESIN EXCHANGE FACTORS 1,CONNECTOR ENCHANCER OF KINASE SUPPRESSOR OF RAS,MU-TYPE OPIOID RECEPTOR (MOR-1) 	Neuronal activities,Signal transduction,Sensory perception,Biological process unclassified 	Synaptic transmission,Pain sensation,Cell surface receptor mediated signal transduction 
 A_51_P100327	 NM_013683.1	ABC transporter-like,APOBEC/CMP deaminases zinc-binding,Antigen peptide transporter 2,AAA+ ATPases core,ABC transporters transmembrane region,ABC transporters ABCB2,ABC transporters transmembrane regions type 1 	ATP-BINDING CASSETTE TRANSPORTER,ABC TRANSPORTER TAP1 	Biological process unclassified,Immunity and defense,Transport 	Extracellular transport and import 

