[R] Analyzing large transition matrix

John Kane jrkrideau at yahoo.ca
Thu Jun 24 13:54:56 CEST 2010



--- On Thu, 6/24/10, Jim Lemon <jim at bitwrit.com.au> wrote:

> From: Jim Lemon <jim at bitwrit.com.au>
> Subject: Re: [R] Analyzing large transition matrix
> To: "Bill Harris" <bill_harris at facilitatedsystems.com>
> Cc: "r-help" <r-help at r-project.org>
> Received: Thursday, June 24, 2010, 7:44 AM
> On 06/23/2010 11:30 PM, Bill Harris wrote:
> > Let's say you have a dataframe of car trade-ins.  For example, each row
> > contains
> >
> > oldcar   newcar   qty
> >
> > and a typical entry could be
> >
> > lexus   bmw    1
> >
> > I put the qty column to allow for fleet purchases, where one purchase
> > may convert multiple cars at once.
> >
> > I'd like to show what's going on.  I could do a histogram of newcar to
> > show the frequency each type of car is bought.  If there are 5-10 car
> > types, that works.  If there are 50-100 or more, the legend gets
> > illegible.
> >
> > I could also do a histogram of oldcar to see what people gave up, but
> > that's less interesting.
> >
> > I'm considering a correlogram using the corrgram package, but a heat map
> > might work, too.  Any tips on making the legends useful in any of this?
> > Any better approaches to try?
> >
> > I tried table() and prop.table() to see if I could get transition
> > probabilities as if this were a Markov chain, but dim() comes out 108
> > 78, which is still too big to print or visualize.
> >
> Hi Bill,
> You could use sizetree (plotrix) if you have one car per line, but with
> 50-100 initial categories, you're going to need a long piece of paper.
> 
Subset by manufacturer and do one per page?
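
A rough sketch of what I mean, on made-up data. The data frame name
trades, the maker lookup, and the oldmaker column are all my own
assumptions, since the real data frame only has oldcar, newcar and qty.
xtabs() is used instead of table() so that qty acts as a weight.

```r
# Toy data in the shape Bill described; every value here is invented.
trades <- data.frame(
  oldcar = c("lexus", "lexus", "bmw", "toyota", "toyota"),
  newcar = c("bmw", "lexus", "lexus", "bmw", "toyota"),
  qty    = c(1, 2, 1, 3, 1),
  stringsAsFactors = FALSE
)

# Hypothetical lookup from car type to manufacturer (an assumption;
# the original data has no such column).
maker <- c(lexus = "Toyota", toyota = "Toyota", bmw = "BMW")
trades$oldmaker <- unname(maker[trades$oldcar])

# One qty-weighted transition table per manufacturer of the old car.
by_maker <- lapply(split(trades, trades$oldmaker), function(d) {
  counts <- xtabs(qty ~ oldcar + newcar, data = d)
  prop.table(counts, margin = 1)  # row proportions: each row sums to 1
})

# sizetree() needs one car per line, so repeat each row qty times.
one_per_line <- trades[rep(seq_len(nrow(trades)), trades$qty),
                       c("oldcar", "newcar")]
```

Each element of by_maker is then small enough to print or plot on its
own page, and one_per_line (subset the same way) is in the shape Jim
says sizetree wants.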




