[BioC] Using as.geneSet in GeneticsBase - problems with gene.columns

Johannes Gulmann Madsen johannes at dsr.life.ku.dk
Thu Mar 19 15:03:27 CET 2009


I'm a new user of Bioconductor and therefore also new to the package
GeneticsBase. I have a data frame that looks like this (only a small part of it):

> b[1:10,]
                  SNP.Name Sample.ID Allele1...Top Allele2...Top GC.Score
1      0_A2M_DS066406.1_15 602358305             G             G   0.8607
2       0_A2M_DS068238.1_4 602358305             G             G   0.8311
3     0_AARSL_DS061819.1_2 602358305             G             G   0.6840
4    0_ABCA1_DS062937.1_32 602358305             G             G   0.7499
5  0_ABCA1_DS073864.1_41_2 602358305             A             A   0.7900
6    0_ABCA1_DS073864.1_41 602358305             G             G   0.7383
7    0_ABCA1_DS078984.1_39 602358305             A             A   0.8360
8    0_ABCA1_DS082793.1_19 602358305             G             G   0.8440
9    0_ABCC10_DS066718.1_3 602358305             C             C   0.8859
10   0_ABCC6_DS063353.1_20 602358305             A             C   0.6117

It is of class 'data.frame' and I want to convert it to a 'geneSet' using
as.geneSet.

So I tried:

b2 <- as.geneSet(b, gene.columns=c("Allele1...Top", "Allele2...Top",
format=c("adjacent")))

because "Allele1...Top" and "Allele2...Top" together hold my genotypes (that is
why I want to use format=c("adjacent"); is that wrong?), but it won't work.
Does anyone have experience using 'as.geneSet'? I can't find any examples on
the net that look like mine. My idea for the dataset is that I want to genotype
each 'Sample.ID' like this (a simple example):

Sample.ID 1:
SNP1: AA 0
      AC 1
      CC 2

Sample.ID 2:
SNP1: AA 0
      AC 1
      CC 2

So I want to make sure that the same SNP on different Sample.IDs is genotyped
the same way (as illustrated above).
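
To make the intended coding concrete, here is a minimal base R sketch. It does
not use GeneticsBase at all. The toy data, the helper name code_genotypes, and
the rule of counting copies of the alphabetically later allele per SNP are all
assumptions made for illustration, not anything as.geneSet is known to do:

```r
# Toy data in the same shape as `b` (values are made up for illustration).
geno <- data.frame(
  SNP.Name      = c("SNP1", "SNP1", "SNP1", "SNP2", "SNP2", "SNP2"),
  Sample.ID     = c(1, 2, 3, 1, 2, 3),
  Allele1...Top = c("A", "A", "C", "G", "A", "G"),
  Allele2...Top = c("A", "C", "C", "G", "G", "A"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: for each SNP, pick one allele to count (here the
# alphabetically later one seen across ALL samples), then code every
# genotype as the number of copies of that allele (0, 1 or 2).
code_genotypes <- function(d) {
  # Pool both allele columns and group the alleles by SNP.
  alleles <- split(c(d$Allele1...Top, d$Allele2...Top), rep(d$SNP.Name, 2))
  # One counted allele per SNP, shared by every sample.
  counted <- vapply(alleles, max, character(1))
  ref <- counted[d$SNP.Name]
  as.integer((d$Allele1...Top == ref) + (d$Allele2...Top == ref))
}

geno$Code <- code_genotypes(geno)
# SNP1 (alleles A and C): A/A -> 0, A/C -> 1, C/C -> 2 for every Sample.ID
```

Because the counted allele is chosen once per SNP from all samples pooled
together, the same genotype gets the same code regardless of Sample.ID, and
the order of the two allele columns (A/G versus G/A) does not matter.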

Hope you can help me.

Regards, Johannes

