[R] How long to wait for process?

Bert Gunter bgunter.4567 at gmail.com
Wed Jul 26 16:49:13 CEST 2017


Dunno. You might wish to email the maintainer (see ?maintainer), who
may not monitor this list, if you do not get a satisfactory reply
here.
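
For reference, a minimal sketch of that lookup. It assumes the package in question is logistf, as named in the quoted message, and that it is installed in the library you are running from; maintainer() reads the Maintainer field of the installed package's DESCRIPTION file.

```r
# maintainer() lives in utils, which is attached by default.
# It returns "Name <email>" for an installed package and NA otherwise.
pkg <- "logistf"   # assumption: the package named in the quoted message

# suppressWarnings() hides the "no package found" warning when pkg is absent
who <- suppressWarnings(maintainer(pkg))

if (is.na(who)) {
  message(pkg, " is not installed here, so no maintainer could be looked up")
} else {
  print(who)
}
```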

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Wed, Jul 26, 2017 at 7:14 AM, john polo <jpolo at mail.usf.edu> wrote:
> UseRs,
>
> I have a dataframe with 2547 rows and several hundred columns in R 3.1.3. I
> am trying to run a small logistic regression with a subset of the data.
>
> know_fin ~ comp_grp2+age+gender+education+employment+income+ideol+home_lot+home+county
>
>     > str(knowf3)
>     'data.frame':   2033 obs. of  18 variables:
>     $ userid    : Factor w/ 2542 levels "FNCNM1639","FNCNM1642",..: 1857 157 965 1967 164 315 849 1017 699 189 ...
>     $ round_id  : Factor w/ 1 level "Round 11": 1 1 1 1 1 1 1 1 1 1 ...
>     $ age       : int  67 66 44 27 32 67 36 76 70 66 ...
>     $ county    : Factor w/ 80 levels "Adair","Alfalfa",..: 75 75 75 75 75 75 64 64 64 64 ...
>     $ gender    : Factor w/ 2 levels "0","1": 1 2 1 1 2 1 2 1 2 2 ...
>     $ education : Factor w/ 8 levels "1","2","3","4",..: 6 7 6 8 2 4 2 4 2 6 ...
>     $ employment: Factor w/ 9 levels "1","2","3","4",..: 8 4 4 4 3 8 5 8 4 4 ...
>     $ income    : num  550000 80000 90000 19000 42000 30000 18000 50000 800000 10000 ...
>     $ home      : num  0 0 0 0 0 0 0 0 0 0 ...
>     $ ideol     : Factor w/ 7 levels "1","2","3","4",..: 2 7 4 3 2 4 2 3 2 6 ...
>     $ home_lot  : Factor w/ 3 levels "1","2","3": 2 2 2 2 2 2 3 3 1 2 ...
>     $ hispanic  : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
>     $ comp_grp2 : Factor w/ 16 levels "Cr_Gr","Cr_Ot",..: 13 13 13 13 13 13 10 10 10 10 ...
>     $ know_fin  : Factor w/ 3 levels "0","1","2": 2 2 2 2 2 2 2 2 2 2 ...
>
>
> With the regular glm() function, I get a warning about "perfect or
> quasi-perfect separation"[1]. I looked for a method to deal with this and a
> penalized GLM is an accepted method[2]. This is implemented in logistf(). I
> used the default settings for the function.
>
> Just before I run the model, memory.size() for my session is ~4500 (MB).
> memory.limit() is ~25500. When I start the model, R immediately becomes
> non-responsive. This is in a Windows environment and in Task Manager, the
> instance of R is, and has been, using ~13% of CPU and ~4997 MB of RAM. It's
> been ~24 hours now in that state and I don't have any idea of how long this
> should take. If I run the same model in the same setting with the base
> glm(), the model runs in about 60 seconds. Is there a way to know if the
> process is going to produce something useful after all this time or if it's
> hanging on some kind of problem?
>
>
>   [1]:
> https://stats.stackexchange.com/questions/11109/how-to-deal-with-perfect-separation-in-logistic-regression#68917
>   [2]:
> https://academic.oup.com/biomet/article-abstract/80/1/27/228364/Bias-reduction-of-maximum-likelihood-estimates
>
>
> --
> Men occasionally stumble
> over the truth, but most of them
> pick themselves up and hurry off
> as if nothing had happened.
> -- Winston Churchill
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
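
On the quoted question of whether a long-running fit will ever finish: one rough way to get a feel for it is to time the same call on growing subsets of the data under a time limit and see how the elapsed time scales. The sketch below is only an illustration. The data are simulated, time_fit() is a hypothetical helper, and glm() stands in for whichever fitting function is actually slow.

```r
# Simulated stand-in data; the data frame from the post is not available.
set.seed(1)
n <- 2000
dat <- data.frame(
  y  = rbinom(n, 1, 0.5),
  x1 = rnorm(n),
  g  = factor(sample(letters[1:10], n, replace = TRUE))
)

# Hypothetical helper: time one fit on the first `rows` rows and give up
# after `limit` seconds. The limit is only checked at points where R can be
# interrupted, so long stretches of compiled code may overrun it.
time_fit <- function(fit_fun, data, rows, limit = 60) {
  setTimeLimit(elapsed = limit, transient = TRUE)
  on.exit(setTimeLimit(elapsed = Inf))   # always lift the limit again
  tryCatch(
    system.time(fit_fun(data[seq_len(rows), , drop = FALSE]))[["elapsed"]],
    error = function(e) NA_real_         # NA: hit the limit or failed
  )
}

# glm() is a placeholder; swap in the call whose run time is in question.
fit_fun <- function(d) glm(y ~ x1 + g, family = binomial, data = d)

sizes   <- c(250, 500, 1000, 2000)
elapsed <- vapply(sizes, function(k) time_fit(fit_fun, dat, k), numeric(1))
print(data.frame(rows = sizes, seconds = elapsed))
```

If the times grow steeply with the number of rows, or the smaller subsets already hit the limit, that at least suggests the full run is unlikely to finish soon. It cannot distinguish a slow fit from a stalled one.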


