[BioC] genotyping output files

J.Oosting at lumc.nl J.Oosting at lumc.nl
Mon Jan 16 11:11:07 CET 2012


As I had this requirement just the other week with affy SNP 6 arrays, I made a small script to produce map and ped files. You probably need some changes for your particular situation , like selecting the correct columns from an annotation file. targets is a dataframe containing patient information.

# Create map and ped file, use http://www.gwaspi.org/?page_id=145 as reference
# Annotation from Affymetrix.com
annot<-read.csv("annotationData/GenomeWideSNP_6.na32.annot.csv", skip=21,row.names=1,as.is=TRUE)
# 
snp.index <- which(isSnp(cnSet) & !is.na(chromosome(cnSet)))
calls<-assayData(cnSet)$call[snp.index,]
# Create map file
annot<-annot[rownames(calls),]
# third column of mapfile (genetic position (cM) is not in annotation, take one that is mostly 0)
mapfile<-annot[,c("Chromosome", "dbSNP.RS.ID","ChrX.pseudo.autosomal.region.1", "Physical.Position")]
# Create ped file
alleleA<-alleleB<-matrix("", nrow=nrow(calls), ncol=ncol(calls), dimnames=dimnames(calls))
# take complement of SNPs on reverse strand
allA<-annot$Allele.A
allComplA<-c("A","C","G","T")[match(allA,c("T","G","C","A"))]
forwA<-ifelse(annot$Strand=="+",allA,allComplA)
allB<-annot$Allele.B
allComplB<-c("A","C","G","T")[match(allB,c("T","G","C","A"))]
forwB<-ifelse(annot$Strand=="+",allB,allComplB)
for (r in 1:ncol(calls)){
  alleleA[,r]<-ifelse(calls[,r]<3,forwA,forwB)
  alleleB[,r]<-ifelse(calls[,r]<2,forwA,forwB)
}
ped<-matrix("",ncol=nrow(alleleA)*2+6,nrow=ncol(alleleA))
for (r in 1:ncol(calls)){
  ped[r,(1:nrow(alleleA))*2+5]<-alleleA[,r]
  ped[r,(1:nrow(alleleB))*2+6]<-alleleB[,r]
}
# Family id
ped[,1]<-targets$PatientId
# sample id
ped[,2]<-make.names(targets$DisplayLabel)
# Paternal id, maternal id
ped[,3:4]<-"0"
# Gender (1=male, 2=female)
ped[,5]<-ifelse(targets$Gender=="M","1","2")
# Affection (0=unknown,1=unaffected, 2=affected)
ped[,6]<-ifelse(targets$NorTum=="N","1","2")
#
write.table(mapfile, file="SNPollier.map", quote=FALSE, row.names=FALSE, col.names=FALSE)
write.table(ped, file="SNPollier.ped", quote=FALSE, row.names=FALSE, col.names=FALSE)

> 
> 
> Dear All,
> 
> I have recently attempted a genome wide genotyping experiment on around
> 500 samples using the Illumina 660quad chip.
> Crlmm genotyping was completed and a snpset object crlmmResult was
> created.
> 
> I have used the calls(crlmmResult) and the configuation command to quickly
> browse through the genotypes.
> 
> I was expecting that at the end of the process crlmm would produce a text
> file with all the genotypes which could be transported to other softwares
> like plink for association analysis,
> This hasnt happened and a text file was not generated,  however a
> crlmmResult.Rda file was extracted into SNPchip packages in my rlibrary.
> 
> Could you please direct  me towards what other prrocesses which I might
> need to complete before I can obtain an output text file all the with
> genome wide calls?
> 



More information about the Bioconductor mailing list