# [R] Long-tail model in R ... anyone?

Dirk Eddelbuettel edd at debian.org
Wed Jul 4 21:15:49 CEST 2007

```I think you simply had your nls() syntax wrong.  Works here:

## first a neat trick to read the data from embedded text
> fmdata <- read.csv(textConnection("
+ rank,cum_value
10,     17396510
32,     31194809
96,     53447300
420,    100379331
1187,   152238166
24234,  432238757
91242,  581332371
294180, 650880870
1242185,665227287"))
>

## then compute cumulative share
> fmdata[,"cumshare"] <- fmdata[,"cum_value"] / fmdata[nrow(fmdata),"cum_value"]
>

## then check the data, just in case
> summary(fmdata)
      rank           cum_value            cumshare
 Min.   :     10   Min.   : 17396510   Min.   :0.02615
 1st Qu.:     96   1st Qu.: 53447300   1st Qu.:0.08034
 Median :   1187   Median :152238166   Median :0.22885
 Mean   : 183732   Mean   :298259489   Mean   :0.44836
 3rd Qu.:  91242   3rd Qu.:581332371   3rd Qu.:0.87389
 Max.   :1242185   Max.   :665227287   Max.   :1.00000
>

## finally estimate the model, using only the first seven rows of data
## using the parametric form from the paper and some wild guesses as
## starting values:
> fit <- nls(cumshare ~ Beta / ((N50 / rank)^Alpha + 1), data=fmdata[1:7,], start=list(Alpha=1, Beta=1, N50=1e4))
> summary(fit)

Formula: cumshare ~ Beta/((N50/rank)^Alpha + 1)

Parameters:
       Estimate Std. Error t value Pr(>|t|)
Alpha 4.829e-01  5.374e-03   89.86 9.20e-08 ***
Beta  1.429e+00  2.745e-02   52.07 8.14e-07 ***
N50   3.560e+04  3.045e+03   11.69 0.000306 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.002193 on 4 degrees of freedom

Number of iterations to convergence: 8
Achieved convergence tolerance: 1.297e-06

>

which is reasonably close to the quoted values of
N50 = 30714, α = 0.49, and β = 1.38.

You can probably play a little with the nls() options (starting values, the
algorithm argument, nls.control() settings) to see what effect this has.

That said, seven observations for three parameters in a non-linear model may be
a little hazardous.  One indication is that the estimated parameter values
are not too stable once you add the eighth and ninth rows of data.

Dirk

--
Hell, there are no rules here - we're trying to accomplish something.
-- Thomas A. Edison

```
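
The stability remark above can be checked with a rough sketch along the following lines. It refits the same model on the first seven, eight and nine rows and lines up the estimates for comparison. The helper `fit_first_n()` is a hypothetical name introduced here for illustration, not something from the original message; the data, the model formula and the starting values are the ones quoted above. A fit that fails to converge is reported as a row of `NA` rather than stopping the script.

```r
# Data as quoted in the message above.
fmdata <- data.frame(
  rank = c(10, 32, 96, 420, 1187, 24234, 91242, 294180, 1242185),
  cum_value = c(17396510, 31194809, 53447300, 100379331, 152238166,
                432238757, 581332371, 650880870, 665227287)
)
fmdata$cumshare <- fmdata$cum_value / fmdata$cum_value[nrow(fmdata)]

# Hypothetical helper (an assumption, not from the original post): fit the
# model on the first n rows and return the coefficients, or NAs if nls()
# signals an error. Extra arguments in ... are passed on to nls().
fit_first_n <- function(n, ...) {
  tryCatch(
    coef(nls(cumshare ~ Beta / ((N50 / rank)^Alpha + 1),
             data = fmdata[seq_len(n), ],
             start = list(Alpha = 1, Beta = 1, N50 = 1e4), ...)),
    error = function(e) c(Alpha = NA_real_, Beta = NA_real_, N50 = NA_real_)
  )
}

# One row of estimates per subset size; compare down the columns.
estimates <- t(sapply(7:9, fit_first_n))
rownames(estimates) <- paste0("rows 1:", 7:9)
print(signif(estimates, 4))

# Trying a different nls() option on the seven-row fit: the "port"
# algorithm with a higher iteration limit.
port_est <- fit_first_n(7, algorithm = "port",
                        control = nls.control(maxiter = 200))
print(signif(port_est, 4))
```

If the estimates in the three rows differ noticeably, that would support the caution about fitting three parameters to so few observations.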