[BioC] R: Script for filtering
James W. MacDonald
jmacdon at med.umich.edu
Thu Apr 23 17:59:10 CEST 2009
OK, still not very detailed, so I will make assumptions.
I assume the file GenomeWideSNP_6.0 is the csv file you can download
I assume the second 'list' of 500 SNPs is actually a vector of the dbSNP
RS IDs for 500 SNPs.
I assume you want to know how many of the 500 SNPs in the vector of IDs
are found on the 6.0 chip.
I assume you read both into R, and called the Affy csv file 'GW6' and
the vector of RS IDs 'rsid'. I also assume there is a column called
'rsid' in the GW6 data.frame.
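In case it helps, here is a rough sketch of how the two objects might get into R. This is only a guess at your setup: the file contents, the '#' comment lines at the top of the csv, and the column names 'probeset', 'rsid' and 'gene' are all made up for illustration, and the real annotation file will have different (and many more) columns.

```r
# Toy stand-in for the Affy annotation csv; the real file is far larger.
# The '#' header line and the column names are assumptions for illustration.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "# header comment line (assumed)",
  "probeset,rsid,gene",
  "SNP_A-0000001,rs1001,GENE1",
  "SNP_A-0000002,rs1002,GENE2",
  "SNP_A-0000003,rs1003,GENE2"
), csv_path)

# Toy stand-in for a text file holding one RS ID per line (hypothetical)
id_path <- tempfile(fileext = ".txt")
writeLines(c("rs1002", "rs1003", "rs9999"), id_path)

# comment.char = "#" makes read.csv skip lines that start with '#'
GW6 <- read.csv(csv_path, comment.char = "#", stringsAsFactors = FALSE)

# readLines returns a character vector with one element per line
rsid <- readLines(id_path)
```

Check names(GW6) after reading the real file, since the RS ID column will probably not be called 'rsid' there.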
intersection <- GW6[GW6$rsid %in% rsid,]
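To get at the 'how many' and 'which ones' parts of your question, something along these lines might do. It is a sketch on made-up data: the toy GW6 and rsid below just stand in for your real objects, and the names n_found, shared and missing are mine.

```r
# Made-up stand-ins for the real chip annotation and the vector of RS IDs
GW6 <- data.frame(
  rsid = c("rs1001", "rs1002", "rs1003", "rs1004"),
  gene = c("GENE1", "GENE2", "GENE2", "GENE3"),
  stringsAsFactors = FALSE
)
rsid <- c("rs1002", "rs1003", "rs9999")

# Rows of the chip annotation whose RS ID is in your vector
intersection <- GW6[GW6$rsid %in% rsid, ]

# How many of your IDs are on the chip
n_found <- sum(rsid %in% GW6$rsid)

# Which IDs are shared, and which of yours are not on the chip
shared <- intersect(rsid, GW6$rsid)
missing <- setdiff(rsid, GW6$rsid)
```

Note that nrow(intersection) can be larger than n_found if the annotation lists the same RS ID on more than one row, so count from your vector rather than from the subset.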
And if there is a column in GW6 called 'gene' that you are interested
in, then you could add
intersection <- GW6[GW6$rsid %in% rsid,"gene"]
to get just that column.
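Since several SNPs can sit in or near the same gene, that column will usually contain repeats. A small sketch of how you might reduce it to distinct genes, again on made-up data and assuming the hypothetical 'gene' column holds one gene name per row:

```r
# Made-up stand-ins; the NA marks a SNP with no gene annotation
GW6 <- data.frame(
  rsid = c("rs1001", "rs1002", "rs1003", "rs1004"),
  gene = c("GENE1", "GENE2", "GENE2", NA),
  stringsAsFactors = FALSE
)
rsid <- c("rs1002", "rs1003", "rs1004")

# Gene annotation for the matching SNPs (may contain repeats and NAs)
genes <- GW6[GW6$rsid %in% rsid, "gene"]

# Drop missing annotations, then keep each gene once
unique_genes <- unique(genes[!is.na(genes)])
n_genes <- length(unique_genes)
```

If the real column packs several genes into one string, you would have to split it before counting.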
Hopefully that helps.
But maybe you see my point about detailed questions. When you want to
know how to do something, you are asking a very _specific_ question. If
you don't give very specific details about what you are trying to do,
preferably with sample code if things aren't working the way you think
they should, then people are left to guess what you want and what you
Alberto Goldoni wrote:
> Sorry, I'll be more detailed.
> In R I need to load the file GenomeWideSNP_6.0, which contains all the SNPs, and then compare this list with a second list containing the SNPs of 500 genes.
> I would like to know how many genes (of the second list: 500) are included in the first list (GenomeWideSNP_6.0 database) and
> which SNPs are the same between the two lists.
> Best regards.
> From: James W. MacDonald [jmacdon at med.umich.edu]
> Sent: Thursday, 23 April 2009 15:35
> To: Alberto Goldoni
> Cc: bioconductor at stat.math.ethz.ch
> Subject: Re: [BioC] Script for filtering
> Hi Alberto,
> Alberto Goldoni wrote:
>> Hi everybody,
>> I have to extract 500 genes from all the genes present in the GenomeWideSNP_6.0 database.
> I'm not familiar with this database. Could you please give more
> information? Also, there are no genes measured on the GenomeWideSNP_6.0
> chip. This chip measures SNPs, which may or may not be in or near genes.
>> If I have the list of these 500 genes, is there a script to extract only these genes from the complete list?
> This question is too vague to be answered. What is the 'complete list'?
> Maybe you are trying to subset a list or data.frame, in which case you
> should look at
> or perhaps
>> Thanks a lot.
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
> James W. MacDonald, M.S.
> Douglas Lab
> University of Michigan
> Department of Human Genetics
> 5912 Buhl
> 1241 E. Catherine St.
> Ann Arbor MI 48109-5618
James W. MacDonald, M.S.
University of Michigan
Department of Human Genetics
1241 E. Catherine St.
Ann Arbor MI 48109-5618