[BioC] scan date information

Mark Cowley m.cowley at garvan.org.au
Tue Sep 22 03:17:20 CEST 2009


Hi,
These scripts work OK for me on OSX (only on TXT CEL files, not the  
latest binary ones). I haven't gotten around to writing a version that  
uses the Fusion SDK.

Mark

celDate.sh
#!/bin/bash
#
# Determine the date that the CEL file was created, from the CEL file header
# eg "06/05/08 12:05:36"
#
# Mark Cowley, 2008-07-28
#
grep -m1 -a '^DatHeader' "$@" | egrep -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
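
When several files are passed at once, the one-liner above prints only the dates, so a file with no DatHeader line silently drops out and the output no longer lines up with the input. A minimal sketch of a per-file loop, assuming text-format CEL files; the function name `cel_date_table` and the `NA` placeholder are hypothetical choices, not part of the original script:

```shell
# Print "<file><TAB><date>" for each text-format CEL file given as an argument.
# cel_date_table is a hypothetical helper name used only for this sketch.
cel_date_table() {
    local f d
    for f in "$@"; do
        # -a treats the file as text; -m1 stops at the first DatHeader line.
        # "|| true" tolerates files where nothing matches.
        d=$(grep -a -m1 '^DatHeader' "$f" |
            grep -E -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}' |
            head -n 1 || true)
        # Fall back to NA so every input file still gets exactly one row.
        printf '%s\t%s\n' "$f" "${d:-NA}"
    done
}
```

Calling `cel_date_table *.CEL` then gives one tab-separated row per file, which should be easy to read into R with read.delim(header=FALSE).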

-- or --

celDate.R
# Extract the CEL file creation date stamp from within the CEL file header.
#
# Mark Cowley, 2008-07-29
celDate <- function(files) {
	stopifnot( all(file.exists(files)) )
	# shQuote() (base R) quotes each path so the shell sees it as one argument
	files <- paste(shQuote(files), collapse=" ")
	cmd <- paste("grep -m1 -a '^DatHeader'", files,
				"| egrep -o '[0-9]{2}/[0-9]{2}/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'")

	dates <- system(cmd, intern=TRUE)
	dates
}
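
For the newer binary CEL files, the thread quoted below suggests the scan date sits after an "affymetrix-scan-date" tag stored as 16-bit characters, i.e. with a NUL byte next to every letter (which would explain why Kate shows the text spaced out and why grep needs dots in "d.a.t.e."). A rough sketch under that assumption, not a substitute for a Fusion SDK reader; `cel_scan_date` is a hypothetical name, and the exact bytes between the tag and the timestamp are a guess based on the grep output quoted below:

```shell
# Print the first ISO-style scan timestamp that follows the
# "affymetrix-scan-date" tag in a binary CEL file.
# cel_scan_date is a hypothetical helper name used only for this sketch.
cel_scan_date() {
    # Drop all NUL bytes so the 16-bit text collapses to plain ASCII,
    # then allow up to 4 stray bytes (assumed length field) before the timestamp.
    tr -d '\000' < "$1" |
        LC_ALL=C grep -a -o -E 'affymetrix-scan-date.{0,4}[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z?' |
        LC_ALL=C grep -a -o -E '[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}Z?' |
        head -n 1
}
```

Usage would be `cel_scan_date SB_20D.CEL`. It prints nothing for text-format files, so the two approaches can be tried one after the other.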

-----------------------------------------------------
Mark Cowley, PhD

Peter Wills Bioinformatics Centre
Garvan Institute of Medical Research, Sydney, Australia
-----------------------------------------------------
On 22/09/2009, at 8:42 AM, Rob Dunne wrote:

> Thanks for that.
>
> I am not using the development version yet but I will look out for  
> the new slot.
>
> Saroj, your method doesn't work for me, perhaps your cel file is  
> ascii?
> strings SB_20D.CEL | grep DatHeader
>
> However, I have found
> $ grep --text d.a.t.e. SB_20D.CEL
> text/plainaffymetrix-scan-date(2008-04-03T04:45:53Z
>
> the "--text" option makes grep read a binary file as though it were  
> text. I am not sure why I need the dots in date.
>
> Bye
> Rob
>
>
>
> Patrick Aboyoun wrote:
>> Robert,
>> The answer depends on which version of R and BioC you are using. If
>> you are using R <= 2.9, BioC <= 2.4, you will need to devise your
>> own method; one such method was given by Saroj. If you are using
>> R-devel and BioC 2.5 (devel), the eSet abstract class and its derived
>> classes such as ExpressionSet contain a new slot called
>> protocolData that contains an AnnotatedDataFrame object. This slot
>> is to be populated by metadata contained in microarray data files.
>> In BioC 2.5 (devel), read.affybatch from affy and read.celfile
>> from affyio add a ScanDate column to the protocolData slot with the
>> metadata you are looking for.
>> Cheers,
>> Patrick
>> Saroj K Mohapatra wrote:
>>> Hi Rob:
>>>
>>> I have a file called _16.CEL. I want to find out the date  
>>> information in its header. The following gives me:
>>>
>>> $ strings _16.CEL | grep DatHeader
>>> DatHeader=[2..65534]  _16:CLS=7365 RWS=7365 XIN=1  YIN=1  
>>> VE=30        2.0 10/27/06 10:57:45 50207590  M10
>>>
>>> I find a date 10/27/06. Is this what you are looking for?
>>>
>>> Best,
>>>
>>> Saroj
>>>
>>>
>>> Robert Dunne wrote:
>>>> Hi List,
>>>>
>>>> I apologise for what may be a very simple question. How can I  
>>>> retrieve
>>>> the scan date information from cel files?
>>>>
>>>> I can find the information using some editors, kate under linux  
>>>> shows
>>>> "a f f y m e t r i x - s c a n - d a t e   ( 2 0 0 8 - 0 4 - 0 3"
>>>> but I can't find it at all using vi or emacs. I suppose this is  
>>>> something to do with encoding.
>>>> Also "strings file.cel | grep "d a t e"" does not work.
>>>>
>>>> I have tried the affxparser library but
>>>> readCelHeader("file.cel")
>>>> does not pick up the date.
>>>>
>>>> Unfortunately in many experiments the scan date turns out to be  
>>>> the major effect.
>>>>
>>>> Bye
>>>> Rob
>>>>
>>>> _______________________________________________
>>>> Bioconductor mailing list
>>>> Bioconductor at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>> Search the archives: http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>
