[R] Linear Logistic Regression - Understanding the output (and possibly the test to use!)

David Winsemius dwinsemius at comcast.net
Sun Sep 5 01:17:12 CEST 2010


On Sep 4, 2010, at 6:53 PM, stats at wittongilbert.free-online.co.uk wrote:

> Hi I know asking which test to use is frowned upon on this list...  
> so please do read on for at least a couple of sentences...
>
> I have some multivariate data split as follows
>
> Tumour Site (one of 5 categories) #
> Chemo Schedule (one of 3 cats) ##
> Cycle (one of 3 cats*) ##
> Dose (one of 3 cats*) #
>
> *These are actually integers but for all our other analysis so far  
> we have grouped them into logical bands of categories.
>
> The dependent variable is "Reaction" or "No Reaction"
>
> I have individually analysed each of the independent variables  
> against Reaction/No Reaction using ChiSq and Fisher Tests. Those  
> marked ## produced p values less than 0.05, and those marked #  
> produce p values close to 0.05.
>
> We believe that Cycle is the crucial piece of data - the others just  
> appear to be different because there are more early cycles in  
> certain groups than others.
>
> SO - I believe what I need to do is a Linear Logistic Regression on  
> the 4 independent variables. And I'm expecting it to show that the  
> tumour site, schedule and dose don't matter, only the cycle matters.  
> Done a lot of reading and I'm clueless!!
>
> I think I want to do something like:
>
> glm (reaction ~ site + sched + cycle + dose, data=mydata,  
> family=poisson)
>
>
> I am then expecting to see some very long output with lots of  
> numbers... ...my question is TWO fold -
>
> 1. is glm the right thing to use before I waste my time

Yes, but if your outcome variable is binomial then the family argument  
should be .... "binomial". (And if you thought it should be poisson,  
then why did you use gaussian below???)
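
A minimal sketch of what that call might look like, using the variable
names from your formula. Everything in `mydata` below is simulated, with
made-up level names and reaction rates, purely so the code runs; it is
not your data and the numbers mean nothing.

```r
# Simulated stand-in for the real data; every name and number here is made up.
set.seed(1)
n <- 300
mydata <- data.frame(
  site  = factor(sample(paste0("site", 1:5), n, replace = TRUE)),
  sched = factor(sample(c("A", "B", "C"), n, replace = TRUE)),
  cycle = factor(sample(c("early", "mid", "late"), n, replace = TRUE)),
  dose  = factor(sample(c("low", "medium", "high"), n, replace = TRUE))
)

# Two-level factor outcome: glm() treats the first level as "failure",
# so the model describes the probability of "Reaction".
p_react <- ifelse(mydata$cycle == "early", 0.40, 0.15)  # assumed rates
mydata$reaction <- factor(
  ifelse(runif(n) < p_react, "Reaction", "No Reaction"),
  levels = c("No Reaction", "Reaction")
)

# Logistic regression: binomial family (the logit link is its default).
fit <- glm(reaction ~ site + sched + cycle + dose,
           data = mydata, family = binomial)
summary(fit)
```

With a factor response the order of the levels matters, which is why the
sketch sets `levels` explicitly rather than relying on alphabetical order.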
>
> and 2. how do I interpret the result!

Result? What result? I do not see any description of your data, nor any  
code.
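
For what it is worth, a data description need not be long. A rough
sketch of the sort of thing that helps, again on simulated data (the
data frame, its columns and its values are all hypothetical):

```r
# Made-up data standing in for the real thing; names and values are hypothetical.
set.seed(42)
mydata <- data.frame(
  cycle    = factor(sample(c("early", "mid", "late"), 60, replace = TRUE)),
  reaction = factor(sample(c("No Reaction", "Reaction"), 60, replace = TRUE))
)

str(mydata)                                      # variable types and factor levels
tab <- xtabs(~ cycle + reaction, data = mydata)  # counts in each cell
tab
dput(head(mydata))                               # pasteable snippet for an email
```

Output like this, together with the exact call that was run, gives a
reader something concrete to comment on.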

> (I'm kind of expecting a lecture here as I'm really looking for a nice  
> snappy 'p<0.05 means this variable is the one having the influence'  
> type answer and I suspect I'm going to be told that's not possible...!)

I think you need to consult a statistician or someone who has taken  
the time to read that "statistical mumbo jumbo" you don't want to  
learn. This mailing list is not set up to be a tutorial site.

(Re your request below: Some years ago I saw one of those "programmed  
learning" texts by Kleinbaum on logistic regression. Maybe you could  
read it and see if it makes your consulting sessions go more smoothly.)

http://www.bookfinder.com/search/?author=kleinbaum&title=logistic+regression&lang=en&isbn=&submit=Begin+search&new_used=*&destination=us&currency=USD&mode=basic&st=sr&ac=qr

I have a couple of Kleinbaum's (et al) other texts and find them to be  
well written and reasoned, so I suspect the citation above would be as  
accessible as any.

>
> To be clear the example given in the docs is:
>
>> library(MASS)

<snipped an example that was not relevant to logistic regression>
>
> ---
> Can someone either point me to a decent place that would explain  
> what this means or provide me some pointers? i.e. which of the  
> variables has the influence on the outcome in the anorexia data?
>
> Please don't shout!! happy to be pointed to a reference but would  
> prefer one in common english not some stats mumbo jumbo!
>
> Calum

-- 

David Winsemius, MD
West Hartford, CT


