[R] Weighted Ridge Regression with GCV Optimization

Preetam Pal lordpreetam at gmail.com
Tue Sep 22 22:25:39 CEST 2015


Hi R-users,

I am having problems while implementing the following model:

   1. I have numerical regressors (GDP, HPA and FX observed quarterly) and
   need to predict the numerical variable Y.
   2. I have to run *weighted Ridge Regression* where the weights of the
   squared residuals are decreasing at 5% with every quarter into the past.
   3. Before estimating beta, I need to select the *optimal Ridge parameter*
   (lambda) wrt the GCV criterion:
      a> For any lambda, divide the data into, say, blocks B1, B2, B3, B4 and
      B5, each of size k = 20% of the data size. For each i, remove B_i,
      estimate the beta vector over the remaining data set and find the
      unweighted SSE (or any other deviation metric) using this beta vector
      on the block B_i. Iterate over all five B_i's (i = 1,2,3,4,5) and get
      the average of the 5 SSE values.
      b> Allow lambda to vary from 0 to 1 in steps of size 0.01 and choose
      the lambda which minimizes the average SSE computed in step a>.
   4. With this choice of lambda, my final beta estimate would be [X'W'WX +
   lambda * Identity Matrix]^(-1)  * X'W'WY.
   5. Here W'W is a diagonal matrix whose diagonals are decreasing from the
   last entry upwards at 5% decay rate and trace(W'W) = 1 (i.e. sum of weights
   = 1)
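
For reference, the weights in point 5 and the estimator in point 4 might be
sketched in R as follows. This is only a sketch under assumptions that are not
stated in the post: the rows are in time order with the most recent quarter
last, "5% decay" means each earlier quarter gets 95% of the weight of the
quarter after it, and X has no intercept column (a column of ones added to X
would be penalised like any other coefficient). The function names
decay_weights() and weighted_ridge() are made up for illustration.

```r
# Geometric decay weights: the last (most recent) row gets the largest weight,
# each earlier row gets (1 - rate) times the weight of the row after it.
decay_weights <- function(n, rate = 0.05) {
  w <- (1 - rate)^((n - 1):0)   # ordered oldest ... newest
  w / sum(w)                    # normalise so the weights sum to 1
}

# Closed-form weighted ridge estimate (X' D X + lambda * I)^(-1) X' D y,
# where D = diag(w) plays the role of W'W in the post.
weighted_ridge <- function(X, y, w, lambda) {
  X <- as.matrix(X)
  XtD <- t(X * w)               # X' D, without building the n x n matrix D
  drop(solve(XtD %*% X + lambda * diag(ncol(X)), XtD %*% y))
}

# Illustrative call on simulated data (not the attached data set)
set.seed(1)
X <- matrix(rnorm(75), nrow = 25, ncol = 3)
y <- rnorm(25)
w <- decay_weights(nrow(X))
beta_hat <- weighted_ridge(X, y, w, lambda = 0.1)
```

With lambda = 0 this reduces to ordinary weighted least squares, which gives a
simple sanity check against lm(y ~ X - 1, weights = w).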

I know lm.ridge() can do Ridge Regression, but I don't know how to write the
code with these weights, the GCV criterion, etc.
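
The lambda search in point 3 could be sketched by hand along these lines. Note
that the procedure described there is five-fold cross-validation over
contiguous blocks, which, as far as I can tell, is not the same quantity as
the GCV values that lm.ridge() reports. Assumptions that are not in the post:
the blocks are consecutive runs of rows in time order, the weights are
renormalised to sum to 1 on each training set, and X has no intercept column.
The function name cv_weighted_ridge() is made up for illustration.

```r
# Five-fold cross-validation over a lambda grid for weighted ridge regression.
cv_weighted_ridge <- function(X, y, w, lambdas = seq(0, 1, by = 0.01), k = 5) {
  X <- as.matrix(X)
  n <- nrow(X)
  fold <- ceiling(seq_len(n) * k / n)          # block id (1..k) for each row
  ridge <- function(X, y, w, lambda) {
    XtD <- t(X * w)                            # X' diag(w)
    solve(XtD %*% X + lambda * diag(ncol(X)), XtD %*% y)
  }
  avg_sse <- sapply(lambdas, function(lambda) {
    sse <- sapply(seq_len(k), function(i) {
      test <- fold == i
      w_tr <- w[!test] / sum(w[!test])         # assumption: renormalise weights
      beta <- ridge(X[!test, , drop = FALSE], y[!test], w_tr, lambda)
      sum((y[test] - X[test, , drop = FALSE] %*% beta)^2)   # unweighted SSE
    })
    mean(sse)                                  # average over the k blocks
  })
  list(lambda = lambdas[which.min(avg_sse)], avg_sse = avg_sse)
}

# Illustrative call on simulated data (not the attached data set)
set.seed(1)
X <- matrix(rnorm(75), nrow = 25, ncol = 3)
y <- rnorm(25)
w <- 0.95^(24:0)
w <- w / sum(w)
res <- cv_weighted_ridge(X, y, w)
res$lambda   # grid value with the smallest average held-out SSE
```

The final beta would then be computed on the full data with res$lambda and the
full weight vector, as in point 4.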

Can you please help me with this? I have attached the exact data in .txt
format (should be readable with read.table()). Please let me know in case I
need to provide any more clarifications.
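
A minimal sketch of reading the attachment, assuming it is tab-delimited as it
appears in the archive. The header "GDP Rate" contains a space, so the default
whitespace separator would split it into two fields; passing sep = "\t" avoids
that. The first three rows are inlined here so the snippet runs on its own;
for the real file one would pass its path (whatever it is called) as the first
argument instead of text =.

```r
# Header plus the first three data rows of the attachment, tab-separated
txt <- paste(
  "T\tGDP Rate\tHPA\tFX\tY",
  "1\t0.806660537\t2.177803167\t1.14980573\t2.733594304",
  "2\t0.997724655\t1.585686087\t0.814496976\t3.193948056",
  "3\t0.99032353\t0.569843997\t0.464488882\t3.065751781",
  sep = "\n"
)

dat <- read.table(text = txt, header = TRUE, sep = "\t")

# check.names = TRUE (the default) turns "GDP Rate" into "GDP.Rate"
X <- as.matrix(dat[, c("GDP.Rate", "HPA", "FX")])
y <- dat$Y
```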

Thanks,
Preetam
-------------- next part --------------
T	GDP Rate	HPA	FX	Y
1	0.806660537	2.177803167	1.14980573	2.733594304
2	0.997724655	1.585686087	0.814496976	3.193948056
3	0.99032353	0.569843997	0.464488882	3.065751781
4	0.606121306	3.037648988	0.565322084	4.537399052
5	0.858131141	4.816423605	1.924534222	7.871730873
6	0.052909178	2.048591352	1.470221953	2.580646078
7	0.081400487	1.152495559	1.128828557	7.200336313
8	0.840972911	3.848225962	1.004272646	1.211124673
9	0.965868218	1.039679934	0.231408747	7.566968
10	0.952626722	4.455565591	0.483541015	9.412639513
11	0.067691757	0.038417569	0.69744243	8.055369029
12	0.985658841	1.143481763	1.65850909	6.962599601
13	0.177186946	3.762691635	0.44379572	9.904367023
14	0.490066697	0.655629739	1.281478696	1.796422139
15	0.223740666	1.393201062	1.235291827	5.237943945
16	0.782873809	1.485727273	0.224511215	6.399036418
17	0.947492758	0.318485005	1.158911495	8.183470692
18	0.49692711	2.169601457	1.777618832	8.830805294
19	0.956704273	1.546827505	0.241838792	7.554654431
20	0.404624372	3.041530693	1.66039172	6.709330773
21	0.98557461	2.45656369	1.695179666	8.638707974
22	0.494102398	4.527230971	0.993352283	7.958872374
23	0.893182943	3.429112971	0.675541115	5.665249801
24	0.669680459	0.459919029	1.011872328	8.883120607
25	0.017296599	2.184045646	1.575891106	2.585709635

