[BioC] Classification

David martin vilanew at gmail.com
Fri Jun 24 17:39:58 CEST 2011


Agreed, I need cross-validation.
Thanks for your comments.
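
For the archive, here is a minimal sketch of what the cross-validated brute-force comparison suggested in the quoted reply below might look like, using plain LDA from MASS instead of an ordered probit fit. The simulated data, the gene1..gene5 column names, the choice of 5 folds and the cv_error helper are all assumptions made for illustration; only Group and dataf come from the thread. The idea is to fix the folds once and score every non-empty gene subset on the same folds, rather than rerunning a randomised stepwise search.

```r
library(MASS)  # provides lda(); shipped with standard R installations

set.seed(1)

# Simulated stand-in for the real PCR data (assumption): 3 groups x 20 samples,
# 5 genes, of which only gene1 and gene2 carry signal.
n_per_group <- 20
groups <- c("healthy", "moderate", "severe")
shift <- rep(c(0, 1.5, 3), each = n_per_group)
dataf <- data.frame(
  Group = factor(rep(groups, each = n_per_group), levels = groups),
  gene1 = rnorm(60) + shift,
  gene2 = rnorm(60) + 0.8 * shift,
  gene3 = rnorm(60),
  gene4 = rnorm(60),
  gene5 = rnorm(60)
)

genes <- setdiff(names(dataf), "Group")

# All non-empty gene subsets: 2^5 - 1 = 31 candidate signatures.
subsets <- unlist(
  lapply(seq_along(genes), function(k) combn(genes, k, simplify = FALSE)),
  recursive = FALSE
)

# Stratified fold labels, drawn once so every subset is scored on the same folds.
k_folds <- 5
fold <- integer(nrow(dataf))
for (g in levels(dataf$Group)) {
  idx <- which(dataf$Group == g)
  fold[idx] <- sample(rep(seq_len(k_folds), length.out = length(idx)))
}

# Cross-validated misclassification rate of LDA for one gene subset.
cv_error <- function(vars) {
  form <- reformulate(vars, response = "Group")
  wrong <- 0
  for (f in seq_len(k_folds)) {
    fit <- lda(form, data = dataf[fold != f, ])
    pred <- predict(fit, newdata = dataf[fold == f, ])$class
    wrong <- wrong + sum(pred != dataf$Group[fold == f])
  }
  wrong / nrow(dataf)
}

results <- data.frame(
  signature = vapply(subsets, paste, character(1), collapse = "+"),
  cv_error = vapply(subsets, cv_error, numeric(1))
)
results <- results[order(results$cv_error), ]
head(results)  # lowest cross-validated error first
```

The error of the winning subset is optimistic, because the same folds were used to choose it; an honest estimate would need an outer cross-validation loop or a held-out test set.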

On 06/24/2011 05:38 PM, Tim Triche, Jr. wrote:
> You have an ordinal response, so you might consider an ordered probit model
> with interaction terms and a penalized likelihood fit, and determine the
> best penalty by cross-validation.  I don't recall whether CMA supports
> ordered probit models, but it's probably the best approach, and you could
> just brute-force it -- you've only got 120 different models to fit under
> this scheme.  At the very least, CMA would generate the cross-validation
> sets for you.
>
> You might also want to consider recursively fitting a shrunken LDA model
> (diseased/healthy, moderate/severe) and see how that compares to an ordinal
> model.  Regardless, cross-validation is the obvious answer to how to pick
> one.
>
> Hope this helps,
> -t
>
> On Fri, Jun 24, 2011 at 8:24 AM, David martin<vilanew at gmail.com>  wrote:
>
>> thanks.
>> It is not binary, since I have three categories and 5 genes. I have tried
>> LDA and stepclass:
>>
>> # Stepwise LDA (stepclass is from the klaR package)
>> library(klaR)
>> disc <- stepclass(Group ~ ., data = dataf, method = "lda", improvement = 0.001)
>>
>> where Group contains my three categories ("healthy", "moderate disease",
>> "severe disease") and dataf the PCR values for my 5 genes.
>>
>> The problem I have is that stepclass generates a different signature each
>> time (as it randomly picks a gene to start with). This is OK for me, but
>> how many times do you need to run stepclass to find the genes most likely
>> to classify your groups? Do I need to run stepclass in a loop?
>>
>> thanks
>>
>>
>>
>> On 06/24/2011 05:17 PM, Kevin R. Coombes wrote:
>>
>>> .. and probably should ...
>>>
>>> For a binary classification with only a few predictors, you can, for
>>> example, use logistic regression with some standard criterion like AIC,
>>> BIC, or Bayesian model averaging to decide which predictors should be
>>> retained.
>>>
>>> Kevin
>>>
>>> On 6/23/2011 6:10 PM, Moshe Olshansky wrote:
>>>
>>>> If you have just 5 genes and a decent number of samples you can use
>>>> any of
>>>> the "conventional" (i.e. not high throughput) methods like LDA, trees,
>>>> Random Forest, SVM, etc.
>>>>
>>>>> I will have a look at both packages. It's PCR data, by the way.
>>>>> thanks
>>>>>
>>>>> On 06/23/2011 05:56 PM, Tim Triche, Jr. wrote:
>>>>>
>>>>>> or CMA, which is perhaps a more systematic approach for classification.
>>>>>> (the package name stands for Classification of MicroArrays) Very well
>>>>>> thought out.
>>>>>>
>>>>>>
>>>>>> On Thu, Jun 23, 2011 at 8:02 AM, Sean Davis<sdavis2 at mail.nih.gov> wrote:
>>>>>>
>>>>>>> On Thu, Jun 23, 2011 at 10:58 AM, David martin<vilanew at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>> I have 5 genes of interest. I would like to know which combination(s)
>>>>>>>> of genes gives the best disease separation. Which test could I use in
>>>>>>>> my training set to see which combination is the best classifier between
>>>>>>>> my disease and my healthy population?
>>>>>>>>
>>>>>>>> Thanks for any comment or test that could be useful to answer that
>>>>>>>> question.
>>>>>>>
>>>>>>> Check out the MLInterfaces package. It should give you some ideas on
>>>>>>> where to start.
>>>>>>>
>>>>>>> Sean
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Bioconductor mailing list
>>>>>>> Bioconductor at r-project.org
>>>>>>> https://stat.ethz.ch/mailman/listinfo/bioconductor
>>>>>>> Search the archives:
>>>>>>> http://news.gmane.org/gmane.science.biology.informatics.conductor
>>>>>>>
>>>>>>>