[R] Dput Help in R

David Winsemius dwinsemius at comcast.net
Wed Dec 30 18:59:31 CET 2015


> On Dec 30, 2015, at 2:56 AM, SHIVI BHATIA <shivi.bhatia at safexpress.com> wrote:
> 
> Dear Team, 
> 
> 
> 
> I am facing an error while performing a manipulation using the dplyr package.
> In the code below, I am using mutate to build a new calculated column:
> 
> 
> 
> kp<-read.csv("collection_last.csv",header=TRUE)

Given the material below, I suspect that the columns you expected to be 'numeric' contained some values that could not be converted to that class, and so were read in as factors. Converting such factor values back to their intended numeric values is not as simple as coercing with as.numeric, which returns the internal integer codes rather than the original values. I would instead suggest that you learn how to use the colClasses argument for the `read.*` functions. If all of the columns are numeric then it could be as simple as:

kp<-read.csv("collection_last.csv",  # header=TRUE is the default for read.csv
               colClasses="numeric" )
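
When only some columns are numeric, colClasses also accepts a named vector. The following is a minimal sketch rather than a tested recipe for your file: the inline text stands in for "collection_last.csv", the rows are invented, and the assumption that DOC_TYPE is a text column is mine. If a column declared "numeric" contains a value that cannot be parsed, read.csv typically stops with an error instead of quietly creating a factor, which at least points you to the offending data.

```r
# Inline text stands in for "collection_last.csv"; the two rows are invented.
csv_text <- "DOC_TYPE,DOC_AMOUNT,RECEIPT_AMT,TDS_AMT,REBATE
INV,1000.50,900,50.5,0
CRN,250,250,0,10"

# Named colClasses: each name must match a column header in the file.
# Treating DOC_TYPE as character is an assumption about the real data.
kp_demo <- read.csv(text = csv_text,
                    colClasses = c(DOC_TYPE    = "character",
                                   DOC_AMOUNT  = "numeric",
                                   RECEIPT_AMT = "numeric",
                                   TDS_AMT     = "numeric",
                                   REBATE      = "numeric"))

str(kp_demo)

# With numeric columns the arithmetic from the original post runs cleanly.
kp_demo$dif <- with(kp_demo, DOC_AMOUNT - RECEIPT_AMT + TDS_AMT + REBATE)
```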

If it is not that simple, then this might succeed:

kp[ , c('DOC_AMOUNT', 'RECEIPT_AMT', 'TDS_AMT', 'REBATE')] <- 
         lapply( kp[ , c('DOC_AMOUNT', 'RECEIPT_AMT', 'TDS_AMT', 'REBATE')], 
                 function(x) as.numeric(as.character(x))
                )
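
To see why the detour through as.character is needed, here is a small illustration with invented values (the "n/a" entry is a hypothetical example of the kind of stray value that causes a column to be read as a factor):

```r
# Invented sample values; "n/a" cannot be parsed as a number.
amt <- factor(c("28831", "17504", "n/a", "699"))

# Direct coercion returns the internal level codes, not the amounts.
codes <- as.numeric(amt)

# Going through character recovers the amounts; "n/a" becomes NA.
# The coercion warning is expected here, so it is suppressed.
vals <- suppressWarnings(as.numeric(as.character(amt)))

print(codes)
print(vals)

# Rows that failed to convert can then be located and inspected.
bad_rows <- which(is.na(vals))
```

Inspecting the rows listed in bad_rows of the real data should show which entries prevented the column from being read as numeric in the first place.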

The care and fixing of factors is just one of the items covered in the R-FAQ, which, like "An Introduction to R", should be read by all R-noobs.

-- 
David.

> 
> mutate(kp,dif=DOC_AMOUNT-RECEIPT_AMT+TDS_AMT+REBATE)
> 
> 
> 
> However it gives an error:-
> 
> Warning messages:
> 
> 1: In Ops.factor(c(28831L, 28831L, 17504L, 4184L, 36187L, 25819L, 699L,  :
> 
>  '-' not meaningful for factors
> 
> 2: In Ops.factor(c(28831L, 28831L, 17504L, 4184L, 36187L, 25819L, 699L,  :
> 
>  '+' not meaningful for factors
> 
> 3: In Ops.factor(c(28831L, 28831L, 17504L, 4184L, 36187L, 25819L, 699L,  :
> 
>  '+' not meaningful for factors
> 
> 
> 
> This error occurs when some of my variables are factors, so I tried to
> change these to numeric using the expression:
> 
> kp$DOC_TYPE=as.numeric(kp$DOC_TYPE). 
> 
> 
> 
> This now shows the variable type as "double". To expedite help on this, I
> was trying to create a reproducible example, but I am struggling to create
> one. The data has approximately 1 million rows and 21 columns, so when I use
> dput it does not capture the row-level detail required to share, and even
> dput(head(kp$DOC_TYPE)) does not help.
> 
> I have read many Stack Overflow and R-help posts before composing this email.
> I need help creating a reproducible example to share with the
> experts in the community. Apologies if this is a repeat.
> 
> 
> 
> PLEASE HELP AS I AM HIGHLY STRUGGLING TO BUILD ANY OUTCOME. 
> 
> Regards, Shivi
> 
> 
> 
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA


