[R] neural network not using all observations

Tue May 12 17:10:46 CEST 2009

I am exploring neural networks (adding non-linearities) to see if I can
get more predictive power than a linear regression model I built. I am
using the function nnet and following the example of Venables and
Ripley, in Modern Applied Statistics with S, on pages 246 to 249. I have
standardized variables (z-scores) such as assets, age and tenure. I have
other variables that are binary (0 or 1). For example,
max_acc_ownr_nwrth_n_med has a value of 1 if the client's net worth is
above the median net worth and a value of 0 otherwise. These are derived
variables I created, and the regression algorithm has found them to be
predictive. A regression on the same variables shown below gives me an
R-squared of about 0.12. I am trying to increase the predictive power of
this regression model with a neural network, being careful to avoid
overfitting.
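
For readers who want to reproduce the two kinds of derived variables
described above, here is a minimal sketch in base R. The data frame
`clients` and its columns are made up for illustration and are not the
actual data. Note that a missing value in a raw variable stays missing
after standardizing.

```r
# Hypothetical client data; names and values are illustrative only.
clients <- data.frame(
  age       = c(34, 51, NA, 45, 62),
  net_worth = c(120, 900, 300, 80, 450)
)

# z-score: subtract the mean and divide by the standard deviation.
# scale() ignores NAs when computing mean and sd, but the NA itself remains.
clients$age_z <- as.numeric(scale(clients$age))

# Binary flag: 1 if net worth is above the median, 0 otherwise.
clients$nwrth_above_med <-
  as.integer(clients$net_worth > median(clients$net_worth, na.rm = TRUE))

print(clients)
colSums(is.na(clients))   # count of missing values per column
```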

Similar to Venables and Ripley, I used the following code:

> library(nnet)

> dim(coreaff.trn.nn)

[1] 5088    8

> head(coreaff.trn.nn)

  hh.iast.y WC_Total_Assets all_assets_per_hh         age      tenure
1   3059448      -0.4692186        -0.4173532 -0.06599001 -1.04747935
2   4899746       3.4854334         4.0111164 -0.06599001 -0.72540200
3    727333      -0.2677357        -0.4177944 -0.30136473 -0.40332465
4    443138      -0.5295170        -0.6999646 -0.14444825 -1.04747935
5    484253      -0.6112205        -0.7306664  0.64013414  0.07979137
6    799054       0.6580506         1.1763114  0.24784295  0.07979137
  max_acc_ownr_liq_asts_n_med max_acc_ownr_nwrth_n_med
1                           0                        1
2                           1                        1
3                           1                        1
4                           0                        0
5                           1                        0
6                           0                        1
  max_acc_ownr_ann_incm_n_med
1                           0
2                           1
3                           1
4                           0
5                           0
6                           1

> coreaff.nn1 <- nnet(hh.iast.y ~ WC_Total_Assets + all_assets_per_hh +
+                     age + tenure + max_acc_ownr_liq_asts_n_med +
+                     max_acc_ownr_nwrth_n_med +
+                     max_acc_ownr_ann_incm_n_med,
+                     coreaff.trn.nn, size = 2, decay = 1e-3,
+                     linout = T, skip = T, maxit = 1000, Hess = T)

# weights:  26

initial  value 12893652845419998.000000 

iter  10 value 6352515847944854.000000

final  value 6287104424549762.000000 

converged

> summary(coreaff.nn1)

a 7-2-1 network with 26 weights

options were - skip-layer connections  linear output units  decay=0.001

     b->h1     i1->h1     i2->h1     i3->h1     i4->h1     i5->h1
 -21604.84   -2675.80   -5001.90   -1240.16    -335.44  -12462.51
    i6->h1     i7->h1
 -13293.80   -9032.34

     b->h2     i1->h2     i2->h2     i3->h2     i4->h2     i5->h2
 210841.52   47296.92   58100.43  -13819.10   -9195.80  117088.99
    i6->h2     i7->h2
 131939.57  106994.47

      b->o      h1->o      h2->o      i1->o      i2->o      i3->o
1115190.67  894123.33 -417269.57   89621.84  170268.12   44833.63
     i4->o      i5->o      i6->o      i7->o
  59585.05  112405.30  437581.05  244201.69

> sum((hh.iast.y - predict(coreaff.nn1))^2)  

Error: object "hh.iast.y" not found

So I try:

> sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

Error: dims [product 5053] do not match the length of object [5088]

In addition: Warning message:

In coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1) :

  longer object length is not a multiple of shorter object length

Doing a little debugging:

> pred <- predict(coreaff.nn1)

> dim(pred)

[1] 5053    1

> dim(coreaff.trn.nn)

[1] 5088    8

So it looks like pred has 5,053 rows, while the input dataset has 5,088
records.

It looks like the neural network is dropping 35 records. Does anyone
have any idea of why it would do this? It is most probably because those
35 records are "bad" data, a pretty common occurrence in the real world.
Does anyone know how I can identify the dropped records? If I can do
this I can get the dimensions of the input dataset to be 5,053 and then:

> sum((coreaff.trn.nn$hh.iast.y - predict(coreaff.nn1))^2)

would work.
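
One way to find the dropped records can be sketched as follows. This
assumes the rows are dropped because of missing values (the summary()
output in this post lists 35 NA's under age) and that the formula
interface of nnet removes incomplete rows before fitting, as R's default
na.action does. The data frame `toy` and its columns are hypothetical
stand-ins, not the real data.

```r
library(nnet)

# Hypothetical data with two missing predictor values (rows 3 and 11).
set.seed(1)
toy <- data.frame(y  = rnorm(20),
                  x1 = rnorm(20),
                  x2 = rbinom(20, 1, 0.5))
toy$x1[c(3, 11)] <- NA

f   <- y ~ x1 + x2
fit <- nnet(f, data = toy, size = 2, decay = 1e-3,
            linout = TRUE, trace = FALSE)

# Rows with no NA in any variable used by the formula.
used    <- complete.cases(toy[, all.vars(f)])
dropped <- which(!used)      # row numbers of the incomplete records
print(dropped)

# Compare fitted values only against the rows that were actually used.
sse <- sum((toy$y[used] - drop(predict(fit)))^2)
print(sse)
```

The same mask can be used to build a cleaned training set, for example
`toy[used, ]`, so that the response and the predictions have the same
length.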

A summary of my dataset is:

> summary(coreaff.trn.nn)

   hh.iast.y        WC_Total_Assets      all_assets_per_hh
 Min.   :       0   Min.   :-6.970e-01   Min.   :-8.918e-01
 1st Qu.:  565520   1st Qu.:-5.387e-01   1st Qu.:-6.147e-01
 Median :  834164   Median :-3.160e-01   Median :-3.718e-01
 Mean   : 1060244   Mean   : 2.948e-13   Mean   : 3.204e-12
 3rd Qu.: 1207181   3rd Qu.: 1.127e-01   3rd Qu.: 1.891e-01
 Max.   :45003160   Max.   : 1.332e+01   Max.   : 4.011e+00

      age                 tenure           max_acc_ownr_liq_asts_n_med
 Min.   :-4.617e+00   Min.   :-1.209e+00   Min.   :0.0000
 1st Qu.:-4.583e-01   1st Qu.:-7.254e-01   1st Qu.:0.0000
 Median : 9.093e-02   Median :-2.423e-01   Median :0.0000
 Mean   :-1.884e-11   Mean   :-3.302e-12   Mean   :0.4951
 3rd Qu.: 5.617e-01   3rd Qu.: 5.629e-01   3rd Qu.:1.0000
 Max.   : 5.818e+00   Max.   : 4.267e+00   Max.   :1.0000
 NA's   : 3.500e+01

 max_acc_ownr_nwrth_n_med max_acc_ownr_ann_incm_n_med
 Min.   :0.0              Min.   :0.0000
 1st Qu.:0.0              1st Qu.:0.0000
 Median :0.5              Median :0.0000
 Mean   :0.5              Mean   :0.3634
 3rd Qu.:1.0              3rd Qu.:1.0000
 Max.   :1.0              Max.   :1.0000

Since I am writing this post, I have a few other questions.

I know I can compare 2 regression models using:

anova(model1, model2)

Will this work if one of the models is a regression model and the other
model is a neural network? I have not reached the point in building a
neural network to try this yet. If not, is there any other way I can
compare the performance of a regression model and neural network? If not
I may have to resort to programming to do this. I can probably use
predict() to get one vector for the regression model and another for the
neural network and then compare these predictions against the actual
value.
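
The predict()-based comparison described above could look roughly like
this. It is only a sketch under stated assumptions: the data are
simulated, the names `sim`, `train`, `test`, `rmse` and `r2` are
hypothetical, and the comparison is made on a held-out set so that an
overfitted network does not look better than it is.

```r
library(nnet)

# Simulated data with a mild non-linearity; illustrative only.
set.seed(42)
n   <- 200
sim <- data.frame(x1 = rnorm(n), x2 = rbinom(n, 1, 0.5))
sim$y <- 1 + 2 * sim$x1 + sin(2 * sim$x1) * sim$x2 + rnorm(n, sd = 0.3)

train <- sim[1:150, ]
test  <- sim[151:200, ]

fit.lm <- lm(y ~ x1 + x2, data = train)
fit.nn <- nnet(y ~ x1 + x2, data = train, size = 2, decay = 1e-3,
               linout = TRUE, skip = TRUE, maxit = 1000, trace = FALSE)

# Error measures computed from observed and predicted values.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
r2   <- function(obs, pred) 1 - sum((obs - pred)^2) / sum((obs - mean(obs))^2)

pred.lm <- predict(fit.lm, newdata = test)
pred.nn <- drop(predict(fit.nn, newdata = test))   # matrix -> vector

res <- data.frame(model = c("lm", "nnet"),
                  rmse  = c(rmse(test$y, pred.lm), rmse(test$y, pred.nn)),
                  r2    = c(r2(test$y, pred.lm),   r2(test$y, pred.nn)))
print(res)
```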

Is there any R package that can produce lift charts (ROC curves, gains
tables, etc.), K-S statistic, etc., that can be used to quantify the
performance of a predictive model (as done in database marketing)? If
so, such a package can be used to compare a regression model and a
neural network.
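
CRAN packages such as ROCR are aimed at this kind of chart. As a
package-free illustration of the quantities themselves, a decile gains
table, cumulative lift and the K-S statistic can be sketched in base R.
This assumes a binary target and a model score; `score` and `target`
below are simulated and purely hypothetical.

```r
# Simulated scores and a binary outcome that is more likely when the score is high.
set.seed(7)
n      <- 1000
score  <- runif(n)
target <- rbinom(n, 1, score)

# Rank records from highest to lowest score and cut into 10 equal groups.
ord    <- order(score, decreasing = TRUE)
decile <- ceiling(seq_len(n) * 10 / n)      # 1 = top-scored 10%

gains <- aggregate(target[ord], by = list(decile = decile), FUN = sum)
names(gains)[2] <- "responders"
gains$cum_pct_responders <- cumsum(gains$responders) / sum(target)
# Cumulative lift: share of responders captured relative to share of records.
gains$lift <- gains$cum_pct_responders / (gains$decile / 10)
print(gains)

# K-S statistic: largest gap between the score distributions of the two classes.
ks <- max(abs(ecdf(score[target == 1])(score) -
              ecdf(score[target == 0])(score)))
print(ks)
```

The same table can be built once per model from each model's scores and
the results compared side by side.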

Another question: can any of the neural network packages in R (nnet,
AMORE, neural, neuralnet, or others I do not know about) do variable
selection (the way the regression methods do)? Or must I do this
manually, looking at the weights and pruning the network by eliminating
weights close to zero (at all the layers in the network)?
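
A manual alternative to inspecting weights is to refit the network with
each input left out and compare the error on held-out data. The sketch
below is not a feature of nnet itself; the data are simulated, and the
helper `valid_rmse` and the column `noise` (an input unrelated to the
response) are hypothetical names used for illustration.

```r
library(nnet)

# Simulated data: y depends on x1 and x2 but not on noise.
set.seed(3)
n   <- 300
sim <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
sim$y <- sin(sim$x1) + 0.5 * sim$x2 + rnorm(n, sd = 0.2)

train  <- sim[1:200, ]
valid  <- sim[201:300, ]
inputs <- c("x1", "x2", "noise")

# Fit a network on the given inputs and return its validation RMSE.
valid_rmse <- function(vars) {
  set.seed(10)   # reproducible starting weights for every fit
  fit <- nnet(reformulate(vars, response = "y"), data = train,
              size = 2, decay = 1e-3, linout = TRUE,
              maxit = 500, trace = FALSE)
  sqrt(mean((valid$y - drop(predict(fit, newdata = valid)))^2))
}

full     <- valid_rmse(inputs)
drop_one <- sapply(inputs, function(v) valid_rmse(setdiff(inputs, v)))

# An input whose removal barely changes the error is a candidate to drop.
print(round(c(full = full, drop_one), 3))
```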

Thanks in advance,

Jude 

___________________________________________
Jude Ryan
Director, Client Analytical Services
Strategy & Business Development
UBS Financial Services Inc.
1200 Harbor Boulevard, 4th Floor
Weehawken, NJ 07086-6791
Tel. 201-352-1935
Fax 201-272-2914
Email: jude.ryan at ubs.com
