[BioC] how to open a SNP data file as large as 500M in Windows OR just extract part of data

James W. MacDonald jmacdon at med.umich.edu
Fri Sep 10 15:36:22 CEST 2010


Another solution is to install the Rtools toolset and use grep or sed.

http://www.murdoch-sutherland.com/Rtools/

Something like

grep <your snp name here> <snp file name here>

will get the SNP data without having to open the entire file at one 
time. An alternative is

sed -n '/<snp name here>/p' <snp file name here>

which will do the same. Either is usually faster than opening the entire
file just to find one line.
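
One caveat: grep matches substrings, so a search for rs123 would also
return rs1234. Here is a minimal sketch, assuming a grep that supports
-w and -F (GNU grep does) and a Unix-style shell. The file name snps.txt,
the SNP IDs and the column layout are all made up for illustration:

```shell
# Toy stand-in for the real SNP file (hypothetical IDs and layout)
printf 'rs123\tchr1\t1000\tA\nrs1234\tchr2\t2000\tG\nrs99\tchr1\t3000\tT\n' > snps.txt

# A plain pattern matches substrings, so this returns rs123 AND rs1234
grep 'rs123' snps.txt

# -w matches whole words only, -F treats the name as a literal string
grep -wF 'rs123' snps.txt > one_snp.txt
```

With -w only the rs123 line should end up in one_snp.txt.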

You can of course re-direct the output into a new file by adding a

 > mynewfile.txt

at the end of either of the above.
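
Since the original question was about pulling out one chromosome, the
same idea can be sketched with awk, assuming awk is available alongside
grep and sed and that you run it from a Unix-style shell (sh or bash).
I don't know the layout of your file, so the tab-delimited format, the
header line, the chromosome being in column 2 and the file names below
are all assumptions; adjust them to match your data:

```shell
# Toy stand-in for the real file: header line, then SNP, chromosome, position
printf 'snp\tchr\tpos\nrs1\t1\t100\nrs2\t11\t200\nrs3\t1\t300\nrs4\t2\t400\n' > allsnps.txt

# Keep the header (line 1) plus every row whose 2nd column is exactly "1";
# the exact comparison avoids also picking up chromosome 11
awk -F '\t' 'NR == 1 || $2 == "1"' allsnps.txt > chr1.txt
```

Comparing the whole field is safer here than a bare grep for the
chromosome number, which would also match positions and SNP names
containing that digit.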

Best,

Jim



On 9/10/2010 12:49 AM, Michael Imbeault wrote:
>
> You could try http://www.editpadpro.com/ - it opens arbitrarily large
> files; I have opened 1 GB text files with it before.
>
> Michael
>
> On 09/09/2010 11:26 PM, xiangxue Guo wrote:
>> Hi there,
>>
>> Does anybody know how to open a SNP data file as large as 500M on a
>> Windows computer? The data are SNPs for many chromosomes, and we
>> just need one of them. So if someone knows how to extract the data
>> for just one chromosome, that would also be OK for us.
>>
>> Thanks in advance,
>>
>> Guo
>>
>> _______________________________________________
>> Bioconductor mailing list
>> Bioconductor at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>> Search the archives:
>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>
>>
>

-- 
James W. MacDonald, M.S.
Biostatistician
Douglas Lab
University of Michigan
Department of Human Genetics
5912 Buhl
1241 E. Catherine St.
Ann Arbor MI 48109-5618
734-615-7826