[R] Help: read a proportion of high through-put data

R. Michael Weylandt michael.weylandt at gmail.com
Tue Jan 24 05:33:50 CET 2012


OK, it seems to have worked on my machine as well, except for some
levels you didn't mention before.

If you are having trouble with the header names, I'll take a stab at
it: by default, R requires them to be syntactically valid names (e.g.,
they can't start with a digit or contain a dollar sign or hyphen) and
will modify them as needed, so your "382" column comes in as "X382".
Generally this is helpful for interactive use (if you want to refer to
columns by name directly). If you wish to suppress this behavior, add
the "check.names = FALSE" argument to read.table() and it will keep
them as is. If you ever do need a non-syntactic name again, you can
refer to it by surrounding it in backquotes:

`3s` <- 4

3s # throws a syntax error
identical(`3s`, 4) # works and returns TRUE
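
As a rough sketch of how check.names plays out when reading a file: the
inline text below is made up for illustration (it is not the poster's
file), and "GC-Score" is a hypothetical column name chosen because it
contains a hyphen.

```r
# Hypothetical tab-delimited input with two non-syntactic header names.
txt <- "ID_REF\t382\tGC-Score\n200003\tBB\t0.91"

# Default (check.names = TRUE): names are made syntactically valid.
fixed <- read.table(text = txt, header = TRUE, sep = "\t",
                    stringsAsFactors = FALSE)
names(fixed)  # "ID_REF" "X382" "GC.Score"

# check.names = FALSE keeps the header exactly as written.
raw <- read.table(text = txt, header = TRUE, sep = "\t",
                  check.names = FALSE, stringsAsFactors = FALSE)
names(raw)    # "ID_REF" "382" "GC-Score"

# Non-syntactic names need backquotes with `$`; `[[` takes a plain string.
raw$`382`
raw[["GC-Score"]]
```

With the default you trade the original names for ones you can type
without quoting; with check.names = FALSE you keep the names but have to
quote them.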

Michael

On Mon, Jan 23, 2012 at 11:28 PM, chee chen <chee.chen at yahoo.com> wrote:
> Hi, Michael,
> Please ignore my previous email with the attachment, since I guess I
> resolved it with your suggestions (with "header=TRUE"), except for some
> minor issues with the names of the header.
> Regards,
> Chee
>
> ________________________________
> From: R. Michael Weylandt <michael.weylandt at gmail.com>
> To: Chee Chen <chee.chen at yahoo.com>
> Cc: R-ORG <r-help at r-project.org>
> Sent: Monday, January 23, 2012 10:26 PM
> Subject: Re: [R] Help: read a proportion of high through-put data
>
> It's pretty hard to answer this without the file in hand, but I'd
> guess something like the following is at play:
>
> Columns of a data.frame have to have a single type. So if R sees
> anything it thinks is a character, it will coerce the whole column to
> character. Since you have not set the first row to be a header, R is
> probably treating it as the first data row, so the first element of
> every column is a name and the column is read as character. This is
> sometimes auto-rectified by read.table() or read.csv() when the first
> line has one fewer field than the rest -- as that suggests that we
> have column and row names around rectangular data -- but that doesn't
> seem to be happening here.
>
> What happens if you try
>
> read.table("sample.txt", header = TRUE)
>
> An alternative route, if the header row is still being read in as
> data, would be to drop that row and manually coerce the columns -- if
> everything is to be numeric, apply as.numeric() to each column (it
> works on vectors, not on a whole data frame).
>
> Michael
>
> On Mon, Jan 23, 2012 at 10:18 PM, Chee Chen <chee.chen at yahoo.com> wrote:
>> Dear All,
>> I have a text file, tab delimited, called "sample.txt", as follows:
>> ID_REF    382    GC_Score    Theta    R    B_Allele_Freq    Log_R_Ratio
>> 200003    BB    0.9101527    0.9734979    0.8788951    1    0
>> 200006    AB    0.6003323    0.4385073    2.033364    0.4850979    0.01553433
>>
>> I have tried various options of read.table(), one being:
>> read.table("sample.txt", na.strings = "NA", as.is = TRUE)
>>
>> However, everything that it reads in becomes character.
>>
>> Could you please help me with this?
>> Best regards,
>> Chee
>>
>
>
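
The diagnosis in the quoted thread can be reproduced with a minimal
sketch like the following. It assumes the three lines from the original
post are tab-delimited and inlines them via the text argument so that it
runs without sample.txt; the variable names are only illustrative.

```r
# The header and two data rows from the original post, tab-delimited.
txt <- paste(
  "ID_REF\t382\tGC_Score\tTheta\tR\tB_Allele_Freq\tLog_R_Ratio",
  "200003\tBB\t0.9101527\t0.9734979\t0.8788951\t1\t0",
  "200006\tAB\t0.6003323\t0.4385073\t2.033364\t0.4850979\t0.01553433",
  sep = "\n"
)

# Without header = TRUE the names become the first data row, so every
# column holds at least one string and is read as character.
no_header <- read.table(text = txt, sep = "\t", as.is = TRUE)
sapply(no_header, class)

# With header = TRUE the first row supplies the names and the numeric
# columns are typed as numbers.
with_header <- read.table(text = txt, sep = "\t", header = TRUE,
                          as.is = TRUE)
sapply(with_header, class)

# Manual coercion, one column at a time: drop the stray header row,
# then convert the remaining strings.
gc_score <- as.numeric(no_header[-1, 3])
gc_score
```

Reading with header = TRUE is usually the simpler fix; the manual route
is mostly useful when the file cannot be re-read.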


