[R] Adjusting length of series

David Winsemius dwinsemius at comcast.net
Mon Jul 2 16:26:50 CEST 2012


On Jul 2, 2012, at 5:13 AM, Lekgatlhamang, lexi Setlhare wrote:

> Hi David and AK,
> I have been trying to implement your suggestions since yesterday,  
> but I encountered some challenges.
>
> As for David's suggestions, I could only implement them after some
> modifications. Using an abridged version of my data, I dput my
> dataset and then show my steps below.

Well, your initial question (why the $ referencing did not work) is  
now answered. This is not a dataframe but rather a 'ts' classed object  
and there is no `$` method for such objects. They are really matrices  
with some extra attributes.

 > ydata$BoBCL1
Error in ydata$BoBCL1 : $ operator is invalid for atomic vectors

As I understood it you were able to get useful analyses using the  
formula methods for lm on these objects, but were just having  
difficulty with the "$" operator. So the answer is ..... don't do that.
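
For what it is worth, a column of such an object can still be reached with matrix-style indexing, or with `$` after converting to a data frame. A minimal sketch on a small made-up series (`toy` and its column names `a` and `b` are hypothetical, not the poster's data):

```r
# Toy monthly multivariate series; the values are made up for illustration.
toy <- ts(cbind(a = c(1, 2, 4, 7, 11, 16),
                b = c(10, 9, 7, 4, 0, -5)),
          start = c(2001, 2), frequency = 12)

inherits(toy, "mts")   # a multivariate time series, i.e. a matrix with attributes

# Matrix-style indexing works on 'mts' objects and keeps the 'ts' attributes.
col_b <- toy[, "b"]

# Converting to a data frame makes `$` available, at the cost of
# dropping the time-series attributes.
toy_df <- as.data.frame(toy)
toy_df$b
```

The same pattern would presumably apply to the object above as `ydata[, "BoBCL1"]`.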
-- 
David.

>
>> dput(ydata)
> structure(c(68.1000000000004, -34.8000000000002, 90.3999999999996,
> 54.6000000000004, -172.3, 51.8000000000002, 175, 79.8000000000002,
> -35.7000000000007, 130.5, 116.8, -67.5, 164.5, 514.8, -326.1,
> 98.4000000000005, 160.2, 53.1999999999998, 283.6, -111.6, 127.8,
> -17.3000000000002, 286.3, NA, NA, -102.900000000001, 125.2,  
> -35.7999999999993,
> -226.900000000001, 224.1, 123.2, -95.1999999999998, -115.500000000001,
> 166.200000000001, -13.6999999999998, -184.3, 232, 350.3,  
> -840.900000000001,
> 424.500000000001, 61.7999999999993, -107, 230.400000000001,  
> -395.200000000001,
> 239.400000000001, -145.1, 303.6, NA, NA, NA, 228.1, -160.999999999999,
> -191.100000000001, 451.000000000001, -100.900000000001, -218.4,
> -20.3000000000011, 281.700000000002, -179.900000000001, -170.6,
> 416.3, 118.3, -1191.2, 1265.4, -362.700000000002, -168.799999999999,
> 337.400000000001, -625.600000000001, 634.600000000001,  
> -384.500000000001,
> 448.700000000001, NA, NA, -164.457840999999, 17.0793539999995,
> 95.9767880000009, 680.238166999999, -491.348690999999, -274.694009,
> -256.332907, 469.62296, -146.431891, -41.0772019999995, -106.970104,
> 757.688263999999, -1689.214533, 2320.098952, -1446.97942, 516.384521,
> -375.277650999999, 293.867029999999, 417.845195, 278.198807,
> -968.592033999999, -314.195986, NA, NA, NA, 181.537194999999,
> 78.8974340000013, 584.261378999998, -1171.586858, 216.654681999999,
> 18.3611019999998, 725.955867, -616.054851, 105.354689000001,
> -65.8929020000005, 864.658367999999, -2446.902797, 4009.313485,
> -3767.078372, 1963.363941, -891.662171999999, 669.144680999999,
> 123.978165, -139.646388, -1246.790841, 654.396048, NA, 4937,
> 5005.1, 4970.3, 5060.7, 5115.3, 4943, 4994.8, 5169.8, 5249.6,
> 5213.9, 5344.4, 5461.2, 5393.7, 5558.2, 6073, 5746.9, 5845.3,
> 6005.5, 6058.7, 6342.3, 6230.7, 6358.5, 6341.2, 6627.5, 4187.5,
> 4296.004835, 4240.051829, 4201.178177, 4258.281313, 4995.622616,
> 5241.615228, 5212.913831, 4927.879527, 5112.468183, 5150.624948,
> 5147.704511, 5037.81397, 5685.611693, 4644.194883, 5922.877025,
> 5754.579747, 6102.66699, 6075.476582, 6342.153204, 7026.675021,
> 7989.395645, 7983.524235, 7663.456839), .Dim = c(24L, 7L), .Dimnames  
> = list(
>     NULL, c("DCred1", "DCred2", "DCred3", "DBoBC2", "DBoBC3",
>     "CredL1", "BoBCL1")), .Tsp = c(2001.08333333333, 2003, 12
> ), class = c("mts", "ts"))
>
> NB: the NAs in the dataset emanated from lagging or differencing the  
> series
>
> David's suggestion
>  df<-data.frame(DCred1,DCred2,DCred3,DBoBC2,DBoBC3,CredL1,BoBCL1)
> Error in data.frame(DCred1, DCred2, DCred3, DBoBC2, DBoBC3, CredL1,  
> BoBCL1) :
>   arguments imply differing number of rows: 23, 22, 21, 24
>
> So I modified as follows:
> length(DCred3)  # finding the minimum length of various series
> [1] 21
>
> # Then dataframe construction
> dframe <- data.frame(Dcre1=DCred1[1:21], Dcre2=DCred2[1:21], Dcre3=DCred3[1:21],
> +     Dbobc2=DBoBC2[1:21], Dbobc3=DBoBC3[1:21], CredL=CredL1[1:21], BoBCL=BoBCL1[1:21])
> # Then estimated regression
>> regCred<- lm(Dcre1~Dcre2+Dcre3+Dbobc2+Dbobc3+CredL+BoBCL,  
>> data=dframe)
>> summary(regCred)
> # Worked well as shown by results below
> Call:
> lm(formula = Dcre1 ~ Dcre2 + Dcre3 + Dbobc2 + Dbobc3 + CredL +
>     BoBCL, data = dframe)
> Residuals:
>     Min      1Q  Median      3Q     Max
> -69.516 -27.695  -8.085  13.851 107.276
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept) 159.32304  157.15209   1.014 0.327873
> Dcre2        -0.75527    0.17262  -4.375 0.000634 ***
> Dcre3        -0.21006    0.08656  -2.427 0.029329 *
> Dbobc2        0.05111    0.06565   0.779 0.449197
> Dbobc3        0.03106    0.03510   0.885 0.391108
> CredL        -0.10967    0.04933  -2.223 0.043177 *
> BoBCL         0.09756    0.03097   3.150 0.007087 **
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> Residual standard error: 52.3 on 14 degrees of freedom
> Multiple R-squared: 0.9331,     Adjusted R-squared: 0.9044
> F-statistic: 32.55 on 6 and 14 DF,  p-value: 1.911e-07
>
> This is good, but couldn't I code the process for my 15-variable
> model?
> Perhaps that is where the use of
> Dcr<- lapply(..., function(x) ...)
> comes in?
>
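
One way to avoid trimming each series by hand is to let `cbind()` align the 'ts' objects by time and let `lm()` drop the incomplete rows. This is a sketch under assumptions, not a drop-in solution: `x` is a made-up series, and the names `d1`, `d2` and `xL1` are hypothetical. Note that if the shorter series start in later months, taking `[1:21]` of each would pair values from different dates, which the time-based alignment below avoids.

```r
# Made-up monthly series standing in for one of the real variables.
set.seed(1)
x <- ts(cumsum(rnorm(24)), start = c(2001, 1), frequency = 12)

d1 <- diff(x)                    # first difference, one observation shorter
d2 <- diff(x, differences = 2)   # second difference, two observations shorter

# cbind() on 'ts' objects aligns them by time and pads with NA,
# so no manual [1:n] truncation is needed.
aligned <- cbind(d1 = d1, d2 = d2, xL1 = lag(x, -1))

# lm() drops rows containing NA by default (na.action = na.omit), and
# `d1 ~ .` uses every other column as a regressor, however many there are.
fit <- lm(d1 ~ ., data = as.data.frame(aligned))
summary(fit)
```

With 15 variables the only part that grows is the `cbind()` call; the `lm()` line stays the same.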
> AK, if you can spare a few minutes, please use my dput data to illustrate
> the suggestion you made. I looked up the lapply function (using
> ??lapply) but could not get a handle on how to use it in my case. The
> data from the dput above are printed below.
>
>          DCred1 DCred2  DCred3      DBoBC2      DBoBC3 CredL1   BoBCL1
> Feb 2001   68.1     NA      NA          NA          NA 4937.0 4187.500
> Mar 2001  -34.8 -102.9      NA  -164.45784          NA 5005.1 4296.005
> Apr 2001   90.4  125.2   228.1    17.07935   181.53719 4970.3 4240.052
> May 2001   54.6  -35.8  -161.0    95.97679    78.89743 5060.7 4201.178
> Jun 2001 -172.3 -226.9  -191.1   680.23817   584.26138 5115.3 4258.281
> Jul 2001   51.8  224.1   451.0  -491.34869 -1171.58686 4943.0 4995.623
> Aug 2001  175.0  123.2  -100.9  -274.69401   216.65468 4994.8 5241.615
> Sep 2001   79.8  -95.2  -218.4  -256.33291    18.36110 5169.8 5212.914
> Oct 2001  -35.7 -115.5   -20.3   469.62296   725.95587 5249.6 4927.880
> Nov 2001  130.5  166.2   281.7  -146.43189  -616.05485 5213.9 5112.468
> Dec 2001  116.8  -13.7  -179.9   -41.07720   105.35469 5344.4 5150.625
> Jan 2002  -67.5 -184.3  -170.6  -106.97010   -65.89290 5461.2 5147.705
> Feb 2002  164.5  232.0   416.3   757.68826   864.65837 5393.7 5037.814
> Mar 2002  514.8  350.3   118.3 -1689.21453 -2446.90280 5558.2 5685.612
> Apr 2002 -326.1 -840.9 -1191.2  2320.09895  4009.31348 6073.0 4644.195
> May 2002   98.4  424.5  1265.4 -1446.97942 -3767.07837 5746.9 5922.877
> Jun 2002  160.2   61.8  -362.7   516.38452  1963.36394 5845.3 5754.580
> Jul 2002   53.2 -107.0  -168.8  -375.27765  -891.66217 6005.5 6102.667
> Aug 2002  283.6  230.4   337.4   293.86703   669.14468 6058.7 6075.477
> Sep 2002 -111.6 -395.2  -625.6   417.84519   123.97817 6342.3 6342.153
> Oct 2002  127.8  239.4   634.6   278.19881  -139.64639 6230.7 7026.675
> Nov 2002  -17.3 -145.1  -384.5  -968.59203 -1246.79084 6358.5 7989.396
> Dec 2002  286.3  303.6   448.7  -314.19599   654.39605 6341.2 7983.524
> Jan 2003     NA     NA      NA          NA          NA 6627.5 7663.457
>
> Thanks kindly. Lexi

David Winsemius, MD
West Hartford, CT
