[R] Need fresh eyes to see what I'm missing

Bert Gunter bgunter@4567 @end|ng |rom gm@||@com
Tue Sep 14 18:59:58 CEST 2021


Input problems of this sort are often caused by stray or extra
characters (commas, dashes, etc.) in the input files, which then can
trigger automatic conversion to character. Excel files are somewhat
notorious for this.

A couple of comments, and then I'll quit, as others should have
greater insight (and may correct any of my errors).

1.
> as.numeric("1,")
[1] NA
Warning message:
NAs introduced by coercion

So if a stray character caused your "numeric" input to be read in as
character, and you then converted it with as.numeric() (do not use
as.integer or as.double), you would get that warning.
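
To make this concrete, here is a minimal sketch with made-up data (the
csv_text string and the vel_demo name are just my illustration, not your
file). A single stray comma inside one field is enough to turn the whole
column into character:

```r
# Hypothetical three-row CSV; the quoted "1.76," carries a stray comma.
csv_text <- 'year,fps
2016,1.74
2016,"1.76,"
2016,1.81'

vel_demo <- read.csv(text = csv_text)
str(vel_demo)   # year is int; fps is chr (a factor in R < 4.0)

# as.character() first, so this also works if fps came in as a factor.
fps_num <- suppressWarnings(as.numeric(as.character(vel_demo$fps)))
fps_num         # the row with the stray comma becomes NA
```

Without suppressWarnings() the conversion gives the same "NAs introduced
by coercion" warning shown above.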

2. So I would say that you need to check those columns in your data
frame that were read in as character instead of numeric.  I'd also
check the others with unique() or some such just to make sure they
have the handful of right values.

One way of doing this would be to look for NAs in as.numeric, as
above. But I thought you said you did this already and found none, so
I don't get it. Other approaches would be to examine your .csv file
with ?count.fields or try reading it with ?read.delim. Any
discrepancies or errors you get from these may help you to pinpoint
problems like stray characters, too many fields in a line, etc.
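
A rough sketch of both checks, using a small made-up file and vector
(csv_lines, tmp and fps_chr are hypothetical stand-ins for your data;
substitute your own file name and column):

```r
# Hypothetical file: line 3 has an extra trailing comma.
csv_lines <- c("year,hour,fps",
               "2016,12,1.74",
               "2016,12,1.75,",
               "2016,12,1.76")
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Number of fields on each line, header included; flag the odd ones.
n_fields <- count.fields(tmp, sep = ",")
bad_lines <- which(n_fields != n_fields[1])
bad_lines

# Hypothetical character column: list the entries that are not
# already NA but fail to convert to numeric.
fps_chr <- c("1.74", "1.75", "1 .76", "1.8l", NA)
bad <- !is.na(fps_chr) & is.na(suppressWarnings(as.numeric(fps_chr)))
unique(fps_chr[bad])
```

The line numbers in bad_lines refer to lines of the file, so you can go
straight to them in a text editor.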

3. As for your "fps as factors" question, note that:
> as.numeric(factor("3"))
[1] 1

So it depends on how you read stuff in. The answer should be "no" with
read.csv(..., stringsAsFactors = FALSE), but I'm not sure what all you
did or what kind of junk in your .csv file may be causing R to misread
the numeric data as character.
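
If a column did end up as a factor, a small sketch of the pitfall and the
usual way around it (f is just an illustrative vector, not your data):

```r
f <- factor(c("3", "10", "3"))

# Levels are sorted as strings, so "10" comes before "3".
levels(f)

# This returns the integer level codes, not the values.
codes <- as.numeric(f)
codes

# Convert via character to recover the numbers themselves.
vals <- as.numeric(as.character(f))
vals
```

Note that since R 4.0.0 the default is stringsAsFactors = FALSE, so
read.csv() only gives you factors there if you ask for them.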

As I said, others may be wiser and correct any errors in my "advice."
This is as far as I can go -- and it may already be too far.

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)



On Tue, Sep 14, 2021 at 9:01 AM Rich Shepard <rshepard using appl-ecosys.com> wrote:
>
> On Tue, 14 Sep 2021, Bert Gunter wrote:
>
> > Remove all your as.integer() and as.double() coercions. They are
> > unnecessary (unless you are preparing input for C code; also, all R
> > non-integers are double precision) and may be the source of your problems.
>
> Bert,
>
> When I remove coercions the script produces warnings like this:
> 1: In mean.default(fps, na.rm = TRUE) :
>    argument is not numeric or logical: returning NA
>
> and str(vel) displays this:
> 'data.frame':   565675 obs. of  6 variables:
>   $ year : chr  "2016" "2016" "2016" "2016" ...
>   $ month: int  3 3 3 3 3 3 3 3 3 3 ...
>   $ day  : int  3 3 3 3 3 3 3 3 3 3 ...
>   $ hour : chr  "12" "12" "12" "12" ...
>   $ min  : int  0 10 20 30 40 50 0 10 20 30 ...
>   $ fps  : chr  "1.74" "1.75" "1.76" "1.81" ...
>
> so month, day, and min are recognized as integers but year, hour, and fps
> are seen as characters. I don't understand why.
>
> Regards,
>
> Rich
>
> ______________________________________________
> R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


