[R] Logistic Regression with 200K features in R?

Eik Vettorazzi E.Vettorazzi at uke.de
Thu Dec 12 12:51:10 CET 2013


I thought so (with all the limitations due to collinearity and so on),
but there is actually a limit on the maximum size of an array that is
independent of your memory size and comes from the way arrays are
indexed: you can't create an object with more than 2^31-1 = 2147483647
elements. A 10M by 200K matrix would have 2 * 10^12 elements, far more
than that.

https://stat.ethz.ch/pipermail/r-help/2007-June/133238.html
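
As a rough check of that arithmetic, here is a minimal sketch in R. It
assumes a dense matrix with one cell per observation and predictor; the
variable names are made up for illustration, and the counts are kept as
doubles so the multiplication itself does not overflow.

```r
# Largest value R's 32-bit integer type can hold: 2^31 - 1.
limit <- .Machine$integer.max

# Sizes taken from the question; stored as doubles, not integers.
n_obs  <- 1e7   # 10M observations
n_pred <- 2e5   # 200K predictors

# Number of cells a dense n_obs x n_pred matrix would need.
n_elements <- n_obs * n_pred

limit == 2^31 - 1     # TRUE
n_elements > limit    # TRUE: the matrix exceeds the element limit
n_elements / limit    # how many times over the limit it is
```

Nothing here allocates the matrix; it only compares the element count
with the limit.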

cheers
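
P.S. On the earlier point (quoted below) about having more predictors
than observations, a small sketch on simulated data may help. The sizes
n = 5 and p = 10, the seed, and all names are arbitrary choices for
illustration, not the original data. glm() still returns a fit, but it
cannot estimate more coefficients than there are observations and
reports the rest as NA.

```r
set.seed(1)

n <- 5    # observations (illustrative)
p <- 10   # predictors (illustrative), deliberately more than n

# Simulated predictors and a made-up binary response.
x <- matrix(rnorm(n * p), nrow = n)
y <- c(0, 1, 0, 1, 1)
d <- data.frame(y = y, x)

# Warnings about perfectly separated data are expected here and are
# suppressed, since the point is only the rank of the fit.
fit <- suppressWarnings(glm(y ~ ., data = d, family = binomial))

fit$rank                  # number of estimable coefficients, at most n
sum(is.na(coef(fit)))     # coefficients that could not be estimated
```

With the intercept there are p + 1 coefficients in the model, so at
least p + 1 - n of them come back as NA.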

On 12.12.2013 12:34, Romeo Kienzler wrote:
> ok, so 200K predictors and 10M observations would work?
> 
> 
> On 12/12/2013 12:12 PM, Eik Vettorazzi wrote:
>> it is simply because you can't do a regression with more predictors than
>> observations.
>>
>> Cheers.
>>
>> On 12.12.2013 09:00, Romeo Kienzler wrote:
>>> Dear List,
>>>
>>> I'm quite new to R and want to do logistic regression with a 200K
>>> feature data set (around 150 training examples).
>>>
>>> I'm aware that I should use Naive Bayes but I have a more general
>>> question about the capability of R handling very high dimensional data.
>>>
>>> Please consider the following R code where "mygenestrain.tab" is a 150
>>> by 200000 matrix:
>>>
>>> traindata <- read.table('mygenestrain.tab');
>>> mylogit <- glm(V1 ~ ., data = traindata, family = "binomial");
>>>
>>> When executing this code I get the following error:
>>>
>>> Error in terms.formula(formula, data = data) :
>>>    allocMatrix: too many elements specified
>>> Calls: glm ... model.frame -> model.frame.default -> terms ->
>>> terms.formula
>>> Execution halted
>>>
>>> Is this because R can't handle 200K features or am I doing something
>>> completely wrong here?
>>>
>>> Thanks a lot for your help!
>>>
>>> best Regards,
>>>
>>> Romeo
>>>
> 

-- 
Eik Vettorazzi

Department of Medical Biometry and Epidemiology
University Medical Center Hamburg-Eppendorf

Martinistr. 52
20246 Hamburg

T ++49/40/7410-58243
F ++49/40/7410-57790