[R] question about spaces in r

MacQueen, Don macqueen1 at llnl.gov
Fri Dec 9 17:36:29 CET 2011


First of all, it's R, not r, and on this mailing list people care about
this kind of thing.

Second, you will need to provide more information in order to get better
help. Please read the posting guide.

There are a number of introductory level documents available via CRAN,
please pick one and study the basics.

That said, here are a few basics:

When R reads data that is a mixture of letters and digits, as yours is, it
will interpret all of them as characters. (Possibly converting to
something called a "factor", depending on exactly how the data is being
input.)
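
A small sketch of this behaviour, using made-up inline data passed through the text= argument instead of a real file (the column names and values are assumptions for illustration, not your data):

```r
# Hypothetical data: the Visit column mixes digits and letters.
csv_text <- "ID,Visit
2,1
2,3
2,baseline"

# stringsAsFactors is set explicitly because its default differs between R versions.
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)
as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)

class(as_char$ID)        # "integer": every value looks like a number
class(as_char$Visit)     # "character": one non-numeric entry affects the whole column
class(as_factor$Visit)   # "factor"
```

Checking class() on each column right after reading is usually the quickest way to see how R interpreted the input.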

read.csv() produces data frames, not matrices. In a matrix, all values
must be the same type, numeric or character. In a data frame, different
columns can be different types, but each column must be all the same type.
Your input column apparently has a mixture of types, which a single column
cannot hold, so read.csv() treats the numbers as characters.
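
The difference can be seen in a minimal sketch with invented values (the names m and d and their contents are purely illustrative):

```r
# A matrix holds a single type, so mixing numbers and text coerces everything to character.
m <- cbind(ID = c(2, 5), Visit = c("1", "x"))
typeof(m)             # "character"

# A data frame keeps a separate type for each column.
d <- data.frame(ID = c(2, 5), Visit = c("1", "x"), stringsAsFactors = FALSE)
sapply(d, class)      # ID is "numeric", Visit is "character"

# Within one column the single-type rule still applies:
# c() coerces mixed values to a common type.
class(c(1, 3, "x"))   # "character"
```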

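You can use as.numeric() on this column to convert to numeric. The
elements that look like numbers will become numeric, and the elements that
have letters will be converted to NA for missing. If the column was read
as a factor, use as.numeric(as.character()) instead, because as.numeric()
on a factor returns the internal level codes rather than the values.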
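
A rough sketch of the conversion, assuming a column that came in as text with leading spaces like the " 3" you describe (the vector below is invented for illustration):

```r
# Hypothetical values as read.csv() might return them: leading spaces and one text entry.
visit <- c(" 1", " 3", "41", "n/a")

# as.numeric() ignores surrounding whitespace; "n/a" becomes NA (with a warning,
# suppressed here to keep the output clean).
visit_num <- suppressWarnings(as.numeric(visit))
visit_num                # 1 3 41 NA

# Comparisons now behave numerically; which() drops the NA.
which(visit_num == 3)    # 2

# If the column is a factor, go through character first.
visit_fac <- factor(visit)
as.numeric(visit_fac)                                   # level codes, not the values
suppressWarnings(as.numeric(as.character(visit_fac)))   # 1 3 41 NA
```

Once the column is numeric, a comparison such as Visit == 3 no longer depends on how many spaces were in the input.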

Just guessing, but from what you've shown it looks like maybe your input
data (outside R) is structured in a way that is not conducive to reading
into R. This is quite common when the input data is a spreadsheet.

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 12/9/11 7:44 AM, "Matt Spitzer" <matthewjspitzer at gmail.com> wrote:

>Hello,
>I would like to please ask if someone would explain how r reads characters
>and numbers differently.  Using read.csv, I had a matrix that resembled
>the following, only with many more ids and data:
>
>    ID  Visit  variable
>     2      1         5
>     2      1         3
>     2      3         4
>     2     41         1
>     2     42        34
>     2      5        54
>     2      9         1
>     2     10         3
>     2     12         5
>     5      1        54
>     5      2         9
>     5      3         3
>     5     41        54
>     5     41         2
>     5      5       235
>     5      9         4
>     5     10         2
>     5     12         2
>
>I then tried to subset for Visit==3.  However, subset == was not working
>properly.  This gave me zero rows.  I printed the matrix/dataframe and
>found that this was because r viewed the 3 as " 3" (space three).  So, I
>had to type subset == " 3" to select for the data instead.  I think this
>has to do with character, number and string properties, but I am quite a
>novice.  Would anyone be able to instruct me how one tells a
>dataframe/matrix to convert numbers such as " 3" to "3" so that I do not
>get confused in the future?  I guess another problem I have is that I am
>still learning the differences between matrices and dataframes.
>Thanks so much, Matt
>
>______________________________________________
>R-help at r-project.org mailing list
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.


