[BioC] Mistaken identifiers

Vincent Schulz Vincent.Schulz at yale.edu
Thu Nov 15 17:38:42 CET 2012


You can prevent the identifiers from being converted by using Excel's text import wizard and
changing the gene name columns from General format to Text.

You could also try this:
# xlsx is available from CRAN
library(xlsx)
wb <- createWorkbook()
sheet <- createSheet(wb, sheetName="CellBlock")
# cbind() coerces both columns to character before the data frame is built
datframe <- data.frame(cbind(1:3, c("March1", "Sept1", "Hbb")))
# write the values only, without a header row or row names
addDataFrame(datframe, sheet, col.names=FALSE, row.names=FALSE,
             startRow=1, startColumn=1, colStyle=NULL, colnamesStyle=NULL,
             rownamesStyle=NULL, showNA=FALSE, characterNA="", byrow=FALSE)
saveWorkbook(wb, "junk.xlsx")

The output format for all cells seems to be a string with gene symbols intact.  It should be 
possible to set column formats to text, but I couldn't get it to work.

Vince


=====================================================

Thank you - Tim and Steve.

The link looks interesting.

Correcting the files at one's own end doesn't solve the problem, because
Excel is such a widely used spreadsheet program: "people will turn around
and send it back to you" again (as Tim rightly points out).  Further, it
also seems impossible to effectively disable this specific autocorrection
in Excel.
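
For files that do come back, a minimal sketch of a check for already-converted
symbols might look like the following. The helper name flag_excel_dates and the
example vector are hypothetical, not from this thread or from any package, and
the pattern only covers the "1-Mar" style that Excel displays by default; dates
saved in other formats would need further patterns.

```r
# Hypothetical helper (an assumption, not part of any package): flag values
# that look like Excel's default date display of a gene symbol, e.g. "1-Mar".
flag_excel_dates <- function(x) {
  months <- "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
  pattern <- paste0("^[0-9]{1,2}-(", months, ")$")
  grepl(pattern, x, ignore.case = TRUE)
}

# Made-up example column mixing converted and intact symbols
symbols <- c("1-Mar", "2-Sep", "Hbb", "MARCH1", "TP53")
flagged <- flag_excel_dates(symbols)
symbols[flagged]
```

This only flags suspicious entries. Mapping them back to the original symbols
still needs a lookup table or the original file.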

Cheers,
Chintanu

=====================================================


On Tue, Nov 13, 2012 at 5:11 AM, Tim Triche, Jr. <tim.triche at gmail.com> wrote:

 > Be surprised :-)
 >
 > http://cran.r-project.org/web/packages/HGNChelper/index.html
 >
 > The trouble with anything readable by Excel is that people will turn
 > around and send it back to you.  Including vendors.
 >
 >
 > On Sun, Nov 11, 2012 at 8:50 PM, Steve Lianoglou <
 > mailinglist.honeypot at gmail.com> wrote:
 >
 >> Hi,
 >>
 >> On Sun, Nov 11, 2012 at 11:22 PM, Chintanu <chintanu at gmail.com> wrote:
 >> > Hello all,
 >> >
 >> > Wondering whether there is already a tried and tested solution in R to deal
 >> > with the issue of mistaken identifiers [Zeeberg et al. (2004) Mistaken
 >> > Identifiers: Gene name errors can be introduced inadvertently when using
 >> > Excel in bioinformatics; BMC Bioinformatics]. After analysing data, I often
 >> > do write.csv() and the output file is then shared, often to be looked at
 >> > in MS Excel.
 >>
 >> I'd be surprised if you'll find a better solution than just informing
 >> people downstream to turn off the auto-correct feature.
 >>
 >> Perhaps putting together small tutorials for different
 >> platforms/versions that step people through the process of disabling
 >> this "feature" would be handy -- and perhaps it might be handier to
 >> put on a wiki somewhere for the greater good of humanity.
 >>
 >>
 >> --
 >> Steve Lianoglou
 >> Graduate Student: Computational Systems Biology
 >>  | Memorial Sloan-Kettering Cancer Center
 >>  | Weill Medical College of Cornell University
 >> Contact Info: http://cbio.mskcc.org/~lianos/contact
 >>
 >
 >
 >
 > --
 > *A model is a lie that helps you see the truth.*
 > Howard Skipper <http://cancerres.aacrjournals.org/content/31/9/1173.full.pdf>
 >
 >
