[R] Setting log(0) to 0

(Ted Harding) Ted.Harding at nessie.mcc.ac.uk
Wed Feb 16 14:26:53 CET 2005


On 16-Feb-05 Terji Petersen wrote:
> Hi,
> 
> I'm trying to do a regression like this:
> 
> wage.r = lm( log(WAGE) ~ log(EXPER) )
> 
> where EXPER is an integer that goes from 0 to about 50.
> EXPER contains some zeros, so you can't take its log,
> and the above regression therefore fails. I would like
> to make R accept log(0) as 0, is that possible?
> Or do I first have to turn the 0's into 1's to be
> able to do the above regression?

If treating "log(0)" as 0 would do your business, then a
preliminary pass to turn "0" into "1" would be the simplest
method. It only takes 1 line.
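A minimal sketch of that one-line pass, assuming the data sit in a
data frame. The name "wages", the column "EXPER1" and the values are
made up for illustration:

```r
# Hypothetical toy data; 'wages' and its values are illustrative only.
wages <- data.frame(WAGE  = c(5.2, 6.1, 7.4, 9.0, 8.3),
                    EXPER = c(0, 1, 3, 10, 0))

# The one-line pass: recode zero experience as 1, so log(EXPER1) is 0 there.
wages$EXPER1 <- ifelse(wages$EXPER == 0, 1, wages$EXPER)

# The regression from the question now runs without -Inf in the predictor.
wage.r <- lm(log(WAGE) ~ log(EXPER1), data = wages)
```

Keeping the recoded values in a separate column leaves the original
EXPER intact, so the zero cases can still be identified later.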

This is a bit of a Catch-22. It looks like you're trying
to fit a power law

  WAGE = A*(EXPER^B)

(where I guess "EXPER" means experience), which on taking logs
becomes the linear model log(WAGE) = log(A) + B*log(EXPER),
and you've got some cases with no experience. Whether your
work-round is appropriate depends in part on the unit of
"experience". If it's in years, then a case with 3 months
experience would have log(EXPER) = log(0.25) = -1.39, thereby
weighing in with a lesser value than someone with zero
experience, on your proposal.

On the other hand, if it's in days, then

  log(EXPER) = log(91) = 4.51

and even someone with only a week has log(EXPER) = log(7) = 1.95.

But your log(0) = 0 data would be sitting there all the
time, whatever the scale of EXPER, and so would have an
influence on your regression which depended on this scale.
You might have to consider using log(0) --> const
where the "const" is such as to give reasonable results,
given what comes out of the rest of the data (where EXPER>0).
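The scale dependence can be checked with a small sketch. The data
below are invented, and "log0" is a hypothetical helper (not part of
R) that maps log(0) to a chosen constant:

```r
# Invented toy data: experience in years (with two zeros) and wage.
d <- data.frame(years = c(0, 0, 1, 2, 5, 10, 20, 30),
                wage  = c(8, 9, 10, 12, 15, 19, 24, 27))

# Hypothetical helper: log(x), except that log(0) is replaced by 'const'.
log0 <- function(x, const = 0) ifelse(x == 0, const, log(x))

# Same data, experience expressed in years and in days, with log(0) -> 0.
slope_years <- coef(lm(log(wage) ~ log0(years),       data = d))[[2]]
slope_days  <- coef(lm(log(wage) ~ log0(years * 365), data = d))[[2]]

# Using only the cases with EXPER > 0, the slope does not depend on the unit.
pos <- d[d$years > 0, ]
slope_pos_years <- coef(lm(log(wage) ~ log(years),       data = pos))[[2]]
slope_pos_days  <- coef(lm(log(wage) ~ log(years * 365), data = pos))[[2]]

c(slope_years, slope_days, slope_pos_years, slope_pos_days)
```

With the zeros pinned at log(0) = 0 the fitted exponent B changes when
the unit changes, whereas for the EXPER > 0 cases alone a change of
unit only shifts the intercept. A different "const" can be tried via
the second argument of log0.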

The fundamental problem is that your power law (with B > 0)
predicts zero wage for zero experience, which is rarely the case.

You might do better to try a non-linear fit

  WAGE = W0 + A*(EXPER^B)

for which sort of thing there are several resources in R,
perhaps the simplest being 'nls'.
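As a rough sketch of such a fit with 'nls', on simulated data: the
"true" values W0 = 5, A = 2, B = 0.5, the noise level and the
starting values are all assumptions chosen for illustration, and
real data may need more care over starting values.

```r
set.seed(1)

# Simulated data (assumed values, for illustration only).
EXPER <- rep(0:50, times = 4)
WAGE  <- 5 + 2 * EXPER^0.5 + rnorm(length(EXPER), sd = 0.2)
sim   <- data.frame(WAGE, EXPER)

# Non-linear least squares fit of WAGE = W0 + A*(EXPER^B).
# EXPER = 0 is no problem here, since no log of EXPER is taken.
fit <- nls(WAGE ~ W0 + A * EXPER^B, data = sim,
           start = list(W0 = 4, A = 1.5, B = 0.6))

coef(fit)
```

Here W0 is the estimated wage at zero experience, so the zero cases
contribute in the ordinary way instead of needing a special rule.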

For what you have in your installed packages, try a

  help.search("nonlinear")

Once you open this door, you can try perhaps more realistic
non-linear models, including what can be found amongst the
"SS....." (Self-Starting) models that accompany 'nls' (in
current versions of R these live in the "stats" package,
the old "nls" package having been merged into it) -- have
a look at what's listed by

  library(help=stats)

as well as what is allowed according to "?nls".

Such models would allow an initial (zero-experience) wage,
perhaps not changing much for some time, then rising more
rapidly once an "experience threshold" is passed, then
flattening out to a lower slope over a longer time (something
which many of us have experience of). And even, ultimately,
tending to decrease ...
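One self-starting model with roughly this shape is the four-parameter
logistic, SSfpl. The sketch below uses simulated data; the parameter
values and noise level are assumptions for illustration, and the
model does not capture any eventual decrease.

```r
set.seed(2)

# Simulated data (assumed values): initial wage 8, eventual plateau 25,
# steepest rise around 20 units of experience.
EXPER <- 0:50
WAGE  <- 8 + (25 - 8) / (1 + exp((20 - EXPER) / 4)) +
         rnorm(length(EXPER), sd = 0.3)
sim   <- data.frame(WAGE, EXPER)

# SSfpl works out its own starting values, so no 'start' list is given.
# A = lower asymptote, B = upper asymptote, xmid = midpoint, scal = scale.
fit <- nls(WAGE ~ SSfpl(EXPER, A, B, xmid, scal), data = sim)

coef(fit)
```

The lower asymptote A plays the part of the zero-experience wage, and
xmid that of the "experience threshold".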

Hoping this helps,
Ted.


--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at nessie.mcc.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Feb-05                                       Time: 13:26:53
------------------------------ XFMail ------------------------------



