[R] Rpart help

peter dalgaard pdalgd at gmail.com
Wed May 24 10:30:16 CEST 2017


> On 24 May 2017, at 04:38 , Bert Gunter <bgunter.4567 at gmail.com> wrote:
> 
> 1. Forget Excel. Erase it from your memory. Banish its paradigms from
> your practices. Failing to do so will only bring misery as you explore
> R. R is a rational programming language primarily for data analysis,
> statistics, and graphics. Excel is, ummm, not.

And, never mind Bert's rant, a simple with(data2, table(single_order, churn)) would give info similar to what you claim to have from Excel, minus the risk of finding that the data are not the same, or that Excel was doing something bizarre.
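
A minimal sketch of that cross-tabulation, using only the ten sample rows quoted further down in the original post (the data frame name "toy" is made up for illustration; the real data2 has 22041 rows and will give different numbers). With method="anova" and a 0/1 response, the yval that rpart prints for a node is the mean of churn in that node, so it should be directly comparable to the group proportions computed here.

```r
# Toy data: the ten rows shown in the original post, NOT the full data2.
toy <- data.frame(
  shopper_key  = 1:10,
  churn        = c(1, 1, 0, 1, 1, 1, 1, 1, 0, 1),
  single_order = c(0, 1, 0, 0, 1, 1, 0, 1, 1, 1)
)

# Cross-tabulate counts: rows are single_order, columns are churn.
counts <- with(toy, table(single_order, churn))
print(counts)

# Row-wise proportions: share of churners within each single_order group.
props <- prop.table(counts, margin = 1)
print(round(props, 3))

# The same churn rates computed as group means of the 0/1 response.
churn_rate <- with(toy, tapply(churn, single_order, mean))
print(churn_rate)
```

Running the same three steps on data2 instead of toy should show whether the 100% and 53% from the tree or the 90% and 70% from the spreadsheet match the data that R actually sees.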

-pd


> 
> 2. Have you read the rpart documents and vignettes? That should be
> your first port of call for questions about how it works.
> 
> 
> Cheers,
> Bert
> 
> 
> Bert Gunter
> 
> "The trouble with having an open mind is that people keep coming along
> and sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
> 
> 
> On Tue, May 23, 2017 at 6:45 PM, kristen wissmar
> <wissmar.kristen at gmail.com> wrote:
>> Hi R users!
>> 
>> I'm new to R, so I'm starting with a basic exercise in rpart.
>> 
>> I'm predicting whether a user will churn based on past order history. I've
>> calculated the probabilities in Excel: if a user is a single-order
>> customer (1), their probability of churn is 90%; if there are multiple
>> orders (0), the probability of churning is 70%. In the R model, the
>> probabilities look like 100% and 53%. In Excel I used the count of
>> shopper_key to calculate probabilities. So I'm wondering if R needs a
>> shopper_key to count?
>> 
>> It would be helpful if someone could suggest where I'm going wrong.
>> 
>> Thank you!
>> 
>> 
>> Code -
>> m1 <- rpart( churn ~ single_order , data = data2, method="anova" )
>> 
>> Output-
>> n= 22041
>> 
>> node), split, n, deviance, yval
>>      * denotes terminal node
>> 
>> 1) root 22041 3229.265 0.8216959
>>  2) single_order< 0.5 8407 2092.852 0.5325324 *
>>  3) single_order>=0.5 13634    0.000 1.0000000 *
>> 
>> 
>> shopper_key churn single_order
>> 1 1 0
>> 2 1 1
>> 3 0 0
>> 4 1 0
>> 5 1 1
>> 6 1 1
>> 7 1 0
>> 8 1 1
>> 9 0 1
>> 10 1 1
>> 
>> ______________________________________________
>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com


