[R] question about spaces in r

Petr PIKAL petr.pikal at precheza.cz
Mon Dec 12 08:00:57 CET 2011


Hi
> 
> First of all, it's R, not r, and on this mailing list people care about
> this kind of thing.
> 
> Second, you will need to provide more information in order to get better
> help. Please read the posting guide.
> 
> There are a number of introductory level documents available via CRAN,
> please pick one and study the basics.
> 
> That said, here are a few basics:
> 
> When R reads data that is a mixture of letters and digits, as yours is, it
> will interpret all of them as characters. (Possibly converting to
> something called a "factor", depending on exactly how the data is being
> input.)
> 
> read.csv() produces data frames, not matrices. In a matrix, all values
> must be the same type, numeric or character. In a data frame, different
> columns can be different types, but each column must be all the same type.
> Your input column apparently has a mixture of types, violating the data
> frame rule, so read.csv() treats the numbers as characters.
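
A small sketch of the matrix versus data frame point above. The object names and values are made up for illustration and are not the poster's data.

```r
# A matrix holds a single type: mixing numbers and text
# coerces every element to character
m <- cbind(id = c(2, 5), label = c("a", "b"))

# A data frame keeps one type per column
df <- data.frame(id = c(2, 5), label = c("a", "b"),
                 stringsAsFactors = FALSE)

# One non-numeric entry is enough to make a whole column character
mixed <- data.frame(visit = c("1", "3", "x"), stringsAsFactors = FALSE)

print(typeof(m))         # storage type of the matrix
print(sapply(df, class)) # class of each data frame column
print(class(mixed$visit))
```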
> 
> You can use as.numeric() on this column to convert to numeric. The

Not if it is a factor (which you can check with str(your.object)). With a
factor you need to do

as.numeric(as.character(factor.variable))
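
A minimal sketch of why the extra as.character() step matters, using a made-up factor (the values are hypothetical):

```r
# Hypothetical column: mostly numbers, plus one non-numeric entry
x <- factor(c("41", "3", "10", "abc"))

# as.numeric() on a factor returns the internal integer codes
# (positions in the sorted levels), not the printed values
codes <- as.numeric(x)

# Converting to character first recovers the printed values;
# entries that are not numbers become NA (normally with a warning,
# suppressed here to keep the output short)
values <- suppressWarnings(as.numeric(as.character(x)))

print(levels(x))
print(codes)
print(values)
```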

But it is always better to read the data in properly than to repair them
after reading.
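
As a sketch of what reading properly could look like: the inline CSV text below is invented, and the "n/a" entry is only my guess at the kind of value that would turn a whole column into text. strip.white and na.strings are standard read.csv() arguments.

```r
# Hypothetical CSV with padded fields and one non-numeric entry
csv_text <- "ID,Visit,variable
2, 1, 5
2, 3, 4
5, n/a, 3
5, 3, 54"

# Default read: Visit is not numeric and keeps its leading spaces,
# so Visit == 3 matches nothing
raw <- read.csv(text = csv_text)
print(nrow(subset(raw, Visit == 3)))

# strip.white = TRUE drops the blanks around unquoted fields and
# na.strings marks "n/a" as missing, so Visit is read as a number
dat <- read.csv(text = csv_text, strip.white = TRUE, na.strings = "n/a")

str(dat)
visit3 <- subset(dat, Visit == 3)
print(visit3)
```

If the non-numeric entries carry real information, reading them as NA loses it, so check what they are before choosing na.strings.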

To get more help you should follow Don's advice and provide some more info
about what you have, what you did and what you got.

Useful commands are

?dput
?str
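
A short sketch of what these two give you, on a made-up stand-in for the data (the object and its values are hypothetical):

```r
# Hypothetical stand-in for the poster's data
dat <- data.frame(ID = c(2, 2, 5),
                  Visit = c(" 1", " 3", " 3"),
                  stringsAsFactors = FALSE)

# str() shows the type of each column, so a numeric-looking column
# stored as character (or factor) is easy to spot
str(dat)

# dput() prints R code that recreates the object, leading spaces
# included, ready to paste into an email
dput(dat)

# The dput() output can be parsed back into an equivalent object
rebuilt <- eval(parse(text = paste(capture.output(dput(dat)),
                                   collapse = "\n")))
```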

And it is probably worth spending some of your time reading the R-intro
document.

Regards
Petr

> elements that look like numbers will become numeric, and the elements that
> have letters will be converted to NA for missing.
> 
> Just guessing, but from what you've shown it looks like maybe your input
> data (outside R) is structured in a way that is not conducive to reading
> into R. This is quite common when the input data is a spreadsheet.

> -Don
> 
> 
> -- 
> Don MacQueen
> 
> Lawrence Livermore National Laboratory
> 7000 East Ave., L-627
> Livermore, CA 94550
> 925-423-1062
> 
> 
> 
> 
> 
> On 12/9/11 7:44 AM, "Matt Spitzer" <matthewjspitzer at gmail.com> wrote:
> 
> >Hello,
> >I would like to please ask if someone would explain how r reads characters
> >and numbers differently.  Using read.csv, I had a matrix that resembled
> >the following, only with many more ids and data:
> >
> > ID  Visit  variable
> >  2      1         5
> >  2      1         3
> >  2      3         4
> >  2     41         1
> >  2     42        34
> >  2      5        54
> >  2      9         1
> >  2     10         3
> >  2     12         5
> >  5      1        54
> >  5      2         9
> >  5      3         3
> >  5     41        54
> >  5     41         2
> >  5      5       235
> >  5      9         4
> >  5     10         2
> >  5     12         2
> >
> >I then tried to subset for Visit==3.  However, subset == was not working
> >properly.  This gave me zero rows.  I printed the matrix/dataframe and
> >found that this was because r viewed the 3 as " 3" (space three).  So, I
> >had to type subset == " 3" to select for the data instead.  I think this
> >has to do with character, number and string properties, but I am quite a
> >novice.  Would anyone be able to instruct me how one tells a
> >dataframe/matrix to convert numbers such as " 3" to "3" so that I do not
> >get confused in the future?  I guess another problem I have is that I am
> >still learning the differences between matrices and dataframes.
> >Thanks so much, Matt
> >
> >   [[alternative HTML version deleted]]
> >
> >______________________________________________
> >R-help at r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> >PLEASE do read the posting guide
> >http://www.R-project.org/posting-guide.html
> >and provide commented, minimal, self-contained, reproducible code.
> 