[R] ERROR: length of 'center' must equal the number of columns of 'x'

Barry King barry.king at qlx.com
Sat May 9 16:38:46 CEST 2015


I am attempting to predict tomorrow's rainfall, RISK_MM, with LASSO using a
data set that I have partitioned into a train data set and a test data set.
The structures of the two data sets are shown below and appear to be
identical except for the number of observations:

str(train)
'data.frame': 262 obs. of  24 variables:
 $ Date         : Factor w/ 366 levels "1/1/2008","1/10/2008",..: 146 312 160 345 58 69 202 52 236 176 ...
 $ Location     : Factor w/ 1 level "Canberra": 1 1 1 1 1 1 1 1 1 1 ...
 $ MinTemp      : num  17.1 4.6 11.3 0.7 10.3 10.1 3.8 7.1 0.5 4.2 ...
 $ MaxTemp      : num  29.6 14.7 32.3 14.1 21.3 31.2 21.7 28.4 17.1 18.9 ...
 $ Rainfall     : num  0 0 0 0 3 0 0.2 0 0 0 ...
 $ Evaporation  : num  5.8 4.4 9.4 5.6 4.2 8.8 2.8 11.6 4 6.4 ...
 $ Sunshine     : num  9.2 8.4 11.4 9 6.7 13.1 6.5 12.7 9.4 10.8 ...
 $ WindGustDir  : Factor w/ 16 levels "E","ENE","ESE",..: 1 15 5 2 7 8 8 4 8 15 ...
 $ WindGustSpeed: int  48 52 28 20 43 41 44 48 31 50 ...
 $ WindDir9am   : Factor w/ 16 levels "E","ENE","ESE",..: 10 15 2 12 2 9 3 7 3 16 ...
 $ WindDir3pm   : Factor w/ 16 levels "E","ENE","ESE",..: 3 8 15 7 4 14 15 7 14 15 ...
 $ WindSpeed9am : int  9 28 4 6 7 6 2 2 6 6 ...
 $ WindSpeed3pm : int  17 33 6 7 19 20 20 19 13 31 ...
 $ Humidity9am  : int  67 54 44 69 79 45 99 45 74 60 ...
 $ Humidity3pm  : int  38 51 17 43 46 16 34 22 42 34 ...
 $ Pressure9am  : num  1017 1015 1024 1027 1018 ...
 $ Pressure3pm  : num  1013 1012 1021 1022 1014 ...
 $ Cloud9am     : int  6 1 5 7 8 0 7 0 1 3 ...
 $ Cloud3pm     : int  7 3 2 1 1 1 7 1 1 2 ...
 $ Temp9am      : num  21.7 9.2 18.2 7.4 11.7 18.7 7.9 17.2 7.4 11.2 ...
 $ Temp3pm      : num  29.1 12 30.5 13.7 19.8 30.4 20.2 28.2 16.2 18.1 ...
 $ RainToday    : Factor w/ 2 levels "No","Yes": 1 1 1 1 2 1 1 1 1 1 ...
 $ RISK_MM      : num  1.8 0 0 0 0 0 0 0 0 0 ...
 $ RainTomorrow : Factor w/ 2 levels "No","Yes": 2 1 1 1 1 1 1 1 1 1 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:38] 114 119 128 139 141 175 177 181 190 194 ...
  .. ..- attr(*, "names")= chr [1:38] "114" "119" "128" "139" ...

str(test)
'data.frame': 66 obs. of  24 variables:
 $ Date         : Factor w/ 366 levels "1/1/2008","1/10/2008",..: 85 87 88 90 92 64 65 66 70 71 ...
 $ Location     : Factor w/ 1 level "Canberra": 1 1 1 1 1 1 1 1 1 1 ...
 $ MinTemp      : num  13.7 13.3 7.6 6.1 8.8 8.4 9.1 8.5 12.4 13.8 ...
 $ MaxTemp      : num  23.4 15.5 16.1 18.2 19.5 22.8 25.2 27.3 32.1 31.2 ...
 $ Rainfall     : num  3.6 39.8 2.8 0.2 0 16.2 0 0.2 0 0 ...
 $ Evaporation  : num  5.8 7.2 5.6 4.2 4 5.4 4.2 7.2 8.4 7.2 ...
 $ Sunshine     : num  3.3 9.1 10.6 8.4 4.1 7.7 11.9 12.5 11.1 8.4 ...
 $ WindGustDir  : Factor w/ 16 levels "E","ENE","ESE",..: 8 8 11 10 9 1 4 1 1 3 ...
 $ WindGustSpeed: int  85 54 50 43 48 31 30 41 46 44 ...
 $ WindDir9am   : Factor w/ 16 levels "E","ENE","ESE",..: 4 15 11 10 1 9 10 1 10 16 ...
 $ WindDir3pm   : Factor w/ 16 levels "E","ENE","ESE",..: 6 14 3 3 2 3 8 8 16 14 ...
 $ WindSpeed9am : int  6 30 20 19 19 7 6 2 7 6 ...
 $ WindSpeed3pm : int  6 24 28 26 17 6 9 15 9 19 ...
 $ Humidity9am  : int  82 62 68 63 70 82 74 54 70 72 ...
 $ Humidity3pm  : int  69 56 49 47 48 32 34 35 22 23 ...
 $ Pressure9am  : num  1010 1006 1018 1025 1026 ...
 $ Pressure3pm  : num  1007 1007 1018 1022 1023 ...
 $ Cloud9am     : int  8 2 7 4 7 7 1 0 0 7 ...
 $ Cloud3pm     : int  7 7 7 6 7 1 2 3 3 6 ...
 $ Temp9am      : num  15.4 13.5 11.1 12.4 14.1 13.3 14.6 16.8 19.1 20.2 ...
 $ Temp3pm      : num  20.2 14.1 15.4 17.3 18.9 21.7 24 26 30.7 29.8 ...
 $ RainToday    : Factor w/ 2 levels "No","Yes": 2 2 2 1 1 2 1 1 1 1 ...
 $ RISK_MM      : num  39.8 2.8 0 0 16.2 0 0.2 0 0 1.2 ...
 $ RainTomorrow : Factor w/ 2 levels "No","Yes": 2 2 1 1 2 1 1 1 1 2 ...
 - attr(*, "na.action")=Class 'omit'  Named int [1:38] 114 119 128 139 141 175 177 181 190 194 ...
  .. ..- attr(*, "names")= chr [1:38] "114" "119" "128" "139" ...

x <- model.matrix(RISK_MM~MinTemp + MaxTemp + Rainfall + Evaporation
                  + Sunshine  + WindGustSpeed + WindGustDir + WindDir9am
                  + WindDir3pm + WindSpeed9am + WindSpeed3pm
                  + Humidity9am + Humidity3pm + Pressure9am
                  + Pressure3pm + Cloud9am + Cloud3pm + Temp9am + Temp3pm
                  + RainToday, data=train)

x <- x[,-1]
library(lars)

lasso <- lars(x=x,y=train$RISK_MM,trace=TRUE,type="lasso")

fits <- predict.lars(lasso, test, type="fit")

This last statement generates the error:
Error in scale.default(newx, object$meanx, FALSE) :
  length of 'center' must equal the number of columns of 'x'

I do not know how to interpret this error message or how to resolve the
error. Any guidance you can provide is appreciated.

Thank you,
Barry E. King Ph.D.
Butler University
College of Business
Indianapolis, Indiana
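
One possible reading of the error, offered as a sketch and not a confirmed
diagnosis: the message is raised by scale() when the vector of column means
has a different length than the number of columns of the matrix it is asked
to center. In the code above, the model was fitted on the output of
model.matrix(), where each factor is expanded into dummy columns, but
predict.lars() was given the raw 24-column test data frame. The toy data
below (toy, train_toy, test_toy, f and the other names are all hypothetical
stand-ins, not the rainfall data) reproduces the same message in base R and
shows that building the test matrix with the same formula makes the column
counts agree. The lars call at the end runs only if the lars package is
installed.

```r
# Toy stand-in for the rainfall data (hypothetical, not the poster's data):
# one numeric response, one numeric predictor, one 4-level factor.
set.seed(1)
toy <- data.frame(
  y   = rnorm(30),
  num = rnorm(30),
  dir = factor(rep(c("E", "N", "S", "W"), length.out = 30))
)
train_toy <- toy[1:20, ]
test_toy  <- toy[21:30, ]

# model.matrix() expands the factor into dummy columns, so the design
# matrix does not have the same number of columns as the data frame.
f <- y ~ num + dir
x_train <- model.matrix(f, data = train_toy)[, -1]  # drop the intercept
ncol(train_toy)  # columns in the data frame
ncol(x_train)    # columns in the design matrix

# Column means of the training matrix, standing in for the means that the
# fitted model stores and later hands to scale().
center <- colMeans(x_train)

# Centering the raw data frame with those means fails, because the lengths
# do not match. tryCatch() captures the message instead of stopping.
err <- tryCatch(
  scale(test_toy, center, FALSE),
  error = function(e) conditionMessage(e)
)
err

# Building the test matrix with the same formula gives matching columns,
# and the same centering step then succeeds.
x_test <- model.matrix(f, data = test_toy)[, -1]
identical(colnames(x_train), colnames(x_test))
centered <- scale(x_test, center, FALSE)
dim(centered)

# Optional: the same idea with lars, only if the package is available.
if (requireNamespace("lars", quietly = TRUE)) {
  fit_toy  <- lars::lars(x = x_train, y = train_toy$y, type = "lasso")
  pred_toy <- predict(fit_toy, newx = x_test, type = "fit")
  # pred_toy$fit holds the fitted values along the lasso path
}
```

Applied to the code above, this would mean calling model.matrix() on test
with the same formula, dropping the intercept column, and passing that
matrix as newx. This assumes the factor levels in test match those in train,
which the str() output suggests but does not prove.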
