[R] about data problem

lily li chocold12 at gmail.com
Wed Sep 21 00:56:57 CEST 2016


Is there an argument to read.csv that I can use to keep numeric columns
from being converted to factors? Thanks a lot.
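
A minimal sketch of the read.csv arguments suggested further down this
thread (stringsAsFactors, na.strings, colClasses). The inline CSV text, the
stray letter "O" in one value, and the NA entry are made-up sample data for
illustration only, not the real file, so treat this as one possible approach
rather than a diagnosis of the actual data.

```r
# Hypothetical sample data: the second Discharge value has a letter "O"
# instead of a zero, the kind of typo that keeps a column from being numeric.
csv_text <- "month,day,year,Discharge
3,1,2010,6.4
3,2,2010,7.O8
3,3,2010,6.82
3,4,2010,NA
3,5,2010,8.16"

# 1. Read text columns as character instead of factor, and state which
#    markers mean "missing".
df <- read.csv(text = csv_text, stringsAsFactors = FALSE,
               na.strings = c("NA", ""))
str(df)  # Discharge comes back as character because of the bad entry

# 2. Find entries that are not missing but still fail numeric conversion.
converted <- suppressWarnings(as.numeric(df$Discharge))
bad_rows <- which(is.na(converted) & !is.na(df$Discharge))
df[bad_rows, ]  # rows to inspect and correct in the original file

# 3. Once the source is fixed (simulated here with sub()), colClasses can
#    declare each column's type up front.
fixed_text <- sub("7.O8", "7.08", csv_text, fixed = TRUE)
df_fixed <- read.csv(text = fixed_text,
                     colClasses = c("integer", "integer", "integer", "numeric"))
summary(df_fixed$Discharge)
```

Step 2 is the part that shows which rows hold the unexpected values, so it is
worth running before forcing a type with colClasses.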



On Tue, Sep 20, 2016 at 4:42 PM, lily li <chocold12 at gmail.com> wrote:

> Thanks. Then what should I do to solve the problem?
>
> On Tue, Sep 20, 2016 at 4:30 PM, Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
> wrote:
>
>> I suppose you can do what works for your data, but I wouldn't recommend
>> na.rm=TRUE because it hides problems rather than clarifying them.
>>
>> If in fact your data includes true NA values (the letters NA or simply
>> nothing between the commas are typical ways this information may be
>> indicated), then read.csv will NOT change from integer to factor
>> (particularly if you have specified which markers represent NA using the
>> na.strings argument documented under read.table)... so you probably DO have
>> unexpected garbage still in your data which could be obscuring valuable
>> information that could affect your conclusions.
>> --
>> Sent from my phone. Please excuse my brevity.
>>
>> On September 20, 2016 3:11:42 PM PDT, lily li <chocold12 at gmail.com>
>> wrote:
>> >I reread the data and used 'na.rm = T' when reading it. This time there
>> >is no such problem. It seems that the presence of NAs converts the
>> >integer column to factor. Thanks for your help.
>> >
>> >
>> >On Tue, Sep 20, 2016 at 4:09 PM, Jianling Fan <fanjianling at gmail.com>
>> >wrote:
>> >
>> >> Add "stringsAsFactors = FALSE" when you read the data, and then
>> >> convert the column to numeric.
>> >>
>> >> On 20 September 2016 at 16:00, lily li <chocold12 at gmail.com> wrote:
>> >> > Yes, it is stored as a factor. I can't find any problem in the
>> >> > original data, and rereading the data doesn't help either. I use
>> >> > read.csv to read in the data; do you think it would be better to
>> >> > use read.table? Thanks again.
>> >> >
>> >> > On Tue, Sep 20, 2016 at 3:55 PM, Greg Snow <538280 at gmail.com> wrote:
>> >> >
>> >> >> This indicates that your Discharge column has been stored/converted
>> >> >> as a factor (run str(df) to verify and check other columns).  This
>> >> >> usually happens when functions like read.table are left to try to
>> >> >> figure out what each column is and find something in that column
>> >> >> that cannot be converted to a number (possibly an oh instead of a
>> >> >> zero, an el instead of a one, or just a letter or punctuation mark
>> >> >> accidentally in the file).  You can either find the error in your
>> >> >> original data, fix it, and reread the data, or specify that the
>> >> >> column should be numeric using the colClasses argument to read.table
>> >> >> or a related function.
>> >> >>
>> >> >>
>> >> >>
>> >> >> On Tue, Sep 20, 2016 at 3:46 PM, lily li <chocold12 at gmail.com> wrote:
>> >> >> > Hi R users,
>> >> >> >
>> >> >> > I have a problem in reading data.
>> >> >> > For example, part of my dataframe is like this:
>> >> >> >
>> >> >> > df
>> >> >> > month day year          Discharge
>> >> >> >    3        1   2010                6.4
>> >> >> >    3        2   2010               7.58
>> >> >> >    3        3   2010               6.82
>> >> >> >    3        4   2010               8.63
>> >> >> >    3        5   2010               8.16
>> >> >> >    3        6   2010               7.58
>> >> >> >
>> >> >> > Then if I type summary(df), why does it convert the Discharge data
>> >> >> > to levels? I also ran into the same problem when reading some other
>> >> >> > csv files. How can I solve this problem? Thanks.
>> >> >> >
>> >> >> > Discharge
>> >> >> > 7.58     :2
>> >> >> > 6.4       :1
>> >> >> > 6.82     :1
>> >> >> > 8.63     :1
>> >> >> > 8.16     :1
>> >> >> >
>> >> >> >         [[alternative HTML version deleted]]
>> >> >> >
>> >> >> > ______________________________________________
>> >> >> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> >> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>> >> >> > PLEASE do read the posting guide
>> >> >> > http://www.R-project.org/posting-guide.html
>> >> >> > and provide commented, minimal, self-contained, reproducible code.
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Gregory (Greg) L. Snow Ph.D.
>> >> >> 538280 at gmail.com
>> >> >>
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jianling Fan
>> >> 樊建凌
>> >>
>> >
>>
>>
>
