[R] read.table truncated data?

jim holtman jholtman at gmail.com
Thu Aug 25 17:57:03 CEST 2011


But did you try the following:

x <- read.table(...., comment.char = '', quote = '')

In most cases the cause is an unmatched quote somewhere in your data.
Use a text editor to search for single and double quotes.
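
As a rough illustration of the point above, here is a minimal sketch that
writes a small tab-separated file containing two stray apostrophes and reads
it back both ways. The file contents and the names tmp, bad, good and
quote_lines are made up for the example; they are not taken from the
original data.

```r
# Hypothetical demo file; the rows are invented for illustration only.
tmp <- tempfile(fileext = ".tsv")
writeLines(c(
  "test_id\tgene\tvalue",
  "g1\tSUP7\t1.0",
  "g2\t5' UTR\t2.0",   # stray single quote starts a quoted string
  "g3\tSUP4\t3.0",
  "g4\t3' end\t4.0",   # the next stray quote ends it two lines later
  "g5\tSUP5\t5.0"
), tmp)

n_lines <- length(readLines(tmp))   # header + 5 data lines

# Default quoting: the text between the two apostrophes is read as a
# single field, so several lines collapse into one row.
bad <- read.table(tmp, header = TRUE, sep = "\t")

# Quoting and comment handling switched off: one row per data line.
good <- read.table(tmp, header = TRUE, sep = "\t",
                   quote = "", comment.char = "")

# Line numbers in the file that contain a single or double quote.
quote_lines <- grep("['\"]", readLines(tmp))

nrow(bad)     # fewer rows than data lines
nrow(good)    # n_lines - 1
quote_lines
```

Comparing nrow(good) with the line count from wc -l (minus the header) is a
quick check that nothing was swallowed.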

On Thu, Aug 25, 2011 at 11:49 AM, zhenjiang xu <zhenjiang.xu at gmail.com> wrote:
> Thanks for your replies. I looked at those lines and didn't spot anything
> unusual.
>
>> tail(a)
>        test_id gene_id gene               locus sample_1 sample_2 status
> 21418 tY(GUA)J1       - SUP7 chr10:354243-354332 air1rrp6 air2rrp6     OK
> 21419 tY(GUA)J2       - SUP4 chr10:542955-543044 air1rrp6 air2rrp6     OK
> 21420 tY(GUA)M1       - SUP5 chr13:168794-168883 air1rrp6 air2rrp6     OK
> 21421 tY(GUA)M2       - SUP8 chr13:837927-838016 air1rrp6 air2rrp6     OK
> 21422  tY(GUA)O       - SUP3 chr15:288191-288280 air1rrp6 air2rrp6     OK
> 21423  tY(GUA)Q       -    -   chrmt:70823-70907 air1rrp6 air2rrp6     OK
>       value_1 value_2 ln.fold_change. test_stat  p_value  q_value significant
> 21418 0.00000  0.0000        0.000000   0.00000 1.000000 1.011650          no
> 21419 0.00000  0.0000        0.000000   0.00000 1.000000 1.011480          no
> 21420 0.00000  0.0000        0.000000   0.00000 1.000000 1.011500          no
> 21421 0.00000  0.0000        0.000000   0.00000 1.000000 1.011520          no
> 21422 0.00000  0.0000        0.000000   0.00000 1.000000 1.011550          no
> 21423 6.68356 10.7397        0.474301  -1.08614 0.277417 0.455917          no
>
>
> tY(GUA)J1       -       SUP7    chr10:354243-354332     rrp6    air1rrp6        OK      0       0       0       0       1    1.00404  no
> tY(GUA)J2       -       SUP4    chr10:542955-543044     rrp6    air1rrp6        OK      0       0       0       0       1    1.00497  no
> tY(GUA)M1       -       SUP5    chr13:168794-168883     rrp6    air1rrp6        OK      0       0       0       0       1    1.00492  no
> tY(GUA)M2       -       SUP8    chr13:837927-838016     rrp6    air1rrp6        OK      0       0       0       0       1    1.00488  no
> tY(GUA)O        -       SUP3    chr15:288191-288280     rrp6    air1rrp6        OK      0       0       0       0       1    1.00485  no
> tY(GUA)Q        -       -       chrmt:70823-70907       rrp6    air1rrp6        OK      4.49644 6.68356 0.396365        -0.766052     0.443645        0.634724        no
> 15S_rRNA        -       15S_RRNA        chrmt:6545-8194 WT      air2rrp6        OK      2288.88 711.697 -1.16817        2.78772       0.00530801      0.0167772       yes
> 21S_rRNA        -       21S_RRNA        chrmt:58008-62447       WT      air2rrp6        OK      4134.59 1927.04 -0.7634 1.58991       0.111855        0.22339 no
> ETS1-1  -       ETS1-1  chr12:457732-458432     WT      air2rrp6        OK      3258.97 1114.76 -1.07277        2.91211 0.00359       0.0121587       yes
> ETS1-2  -       ETS1-2  chr12:466869-467569     WT      air2rrp6        OK      3258.97 1114.76 -1.07277        2.91211 0.00359       0.0121597       yes
>
>
> On Wed, Aug 24, 2011 at 2:34 PM, Sarah Goslee <sarah.goslee at gmail.com> wrote:
>
>> Hi,
>>
>> On Wed, Aug 24, 2011 at 2:18 PM, zhenjiang xu <zhenjiang.xu at gmail.com> wrote:
>> > Hi R users,
>> >
>> > I was using read.table to read a file. The data.frame looked all right, but
>> > I found that not all rows were read by read.table. What's wrong with it? It
>> > didn't give me any warning or error messages. Why are the data truncated?
>> > Thanks.
>> >
>> > $ wc -l all/isoform_exp.diff
>> > 42847 all/isoform_exp.diff
>> >
>> >> a=read.table('all/isoform_exp.diff', header=T, sep='\t')
>> >> nrow(a)
>> > [1] 21423
>>
>> This is a common problem. You need to take a look at the last row that
>> was imported, and the rows around 21423 in the original file.
>>
>> Common causes include stray single or double quotation marks, and
>> other special characters in your file like the default comment.char #
>>
>> Sarah
>> --
>> Sarah Goslee
>> http://www.functionaldiversity.org
>>
>
>
>
> --
> Best,
> Zhenjiang
>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?


