[R] help needed using t.test with factors

Thomas Adams Thomas.Adams at noaa.gov
Thu Feb 4 21:11:22 CET 2010


Peter,

Thank you very much! That did the trick…

Regards,
Tom
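
For anyone finding this thread in the archive: the error quoted below arises because `type` has five levels (avn, eta, hpc, ngm, rfc), while the formula method of t.test needs a grouping factor with exactly two. Here is a minimal sketch of Peter's fix. The data frame name `dat` and the reduced column set are assumptions for illustration only; the MAE values are copied from the 2001-06-15 rows quoted further down (intervals 0.01LT0.10 through 0.50LT1.00).

```r
# Hypothetical mini data set: MAE by type, copied from the quoted rows.
dat <- data.frame(
  type = factor(rep(c("avn", "eta", "hpc", "ngm", "rfc"), each = 4)),
  MAE  = c(0.086, 0.129, 0.238, 0.517,   # avn
           0.079, 0.139, 0.258, 0.532,   # eta
           0.085, 0.137, 0.237, 0.480,   # hpc
           0.073, 0.137, 0.263, 0.551,   # ngm
           0.084, 0.134, 0.232, 0.476)   # rfc
)

# Five levels, so t.test(MAE ~ type, data = dat) stops with
# "grouping factor must have exactly 2 levels".
nlevels(dat$type)

# Peter's fix: keep only the two types of interest.
res <- t.test(MAE ~ type, data = dat, subset = type %in% c("hpc", "rfc"))
res

# Equivalent: subset first and drop the unused levels explicitly.
two  <- droplevels(subset(dat, type %in% c("hpc", "rfc")))
res2 <- t.test(MAE ~ type, data = two)
```

Both calls should give the same test, since only the hpc and rfc rows enter the comparison in either case.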


Peter Ehlers wrote:
> Tom,
>
> t.test(MAE ~ type, data=data, subset=type %in% c('hpc','rfc'))
>
> -Peter Ehlers
>
> Thomas Adams wrote:
>> Dennis,
>>
>> Thank you for the suggestion, but I get this error:
>>
>> > t.test(MAE ~ type,data=data)
>> Error in t.test.formula(MAE ~ type, data = data) :
>> grouping factor must have exactly 2 levels
>>
>> Tom
>>
>>
>>
>> Dennis Murphy wrote:
>>> Hi:
>>>
>>> On Thu, Feb 4, 2010 at 11:07 AM, Thomas Adams <Thomas.Adams at noaa.gov> wrote:
>>>
>>> I am trying to use t.test on the following data:
>>>
>>> date type INTERVAL nCASES MTF SDF MTO SDO nFST MF nOBS MO MB BIASCV BIASEV ME MAE RMSE CRCF
>>> 2001-06-15 avn GE1.00 4385 0.246 0.300 1.502 0.556 1367 1.373 4385 1.502 1.471 0.285 0.164 -1.256 1.266 1.399 0.056
>>> 2001-06-15 avn 0.00LT0.01 852225 0.018 0.066 0.000 0.001 708406 0.001 852225 0.000 0.000 1.663 71.664 0.018 0.018 0.068 0.176
>>> 2001-06-15 avn 0.01LT0.10 77643 0.097 0.151 0.039 0.025 176129 0.040 77643 0.039 0.040 2.331 2.486 0.058 0.086 0.162 0.096
>>> 2001-06-15 avn 0.10LT0.25 29388 0.145 0.186 0.162 0.043 74164 0.160 29388 0.162 0.160 2.493 0.897 -0.017 0.129 0.189 0.056
>>> 2001-06-15 avn 0.25LT0.50 17592 0.177 0.208 0.353 0.070 25189 0.336 17592 0.353 0.343 1.365 0.503 -0.175 0.238 0.279 0.033
>>> 2001-06-15 avn 0.50LT1.00 10503 0.208 0.245 0.693 0.138 6481 0.666 10503 0.693 0.683 0.593 0.300 -0.485 0.517 0.560 0.017
>>> 2001-06-15 avn GE1.00 4385 0.246 0.300 1.502 0.556 1367 1.373 4385 1.502 1.471 0.285 0.164 -1.256 1.266 1.399 0.056
>>> 2001-06-15 eta GE1.00 4385 0.242 0.308 1.502 0.556 577 1.338 4385 1.502 1.483 0.117 0.161 -1.261 1.272 1.398 0.111
>>> 2001-06-15 eta 0.00LT0.01 852225 0.013 0.055 0.000 0.001 799424 0.000 852225 0.000 0.000 1.368 50.193 0.013 0.013 0.057 0.175
>>> 2001-06-15 eta 0.01LT0.10 77643 0.079 0.139 0.039 0.025 113987 0.043 77643 0.039 0.041 1.617 2.013 0.040 0.079 0.144 0.083
>>> 2001-06-15 eta 0.10LT0.25 29388 0.116 0.169 0.162 0.043 47461 0.160 29388 0.162 0.161 1.596 0.719 -0.045 0.139 0.178 0.055
>>> 2001-06-15 eta 0.25LT0.50 17592 0.147 0.197 0.353 0.070 23284 0.345 17592 0.353 0.348 1.296 0.417 -0.205 0.258 0.291 0.040
>>> 2001-06-15 eta 0.50LT1.00 10503 0.180 0.230 0.693 0.138 7003 0.643 10503 0.693 0.673 0.619 0.260 -0.513 0.532 0.576 0.041
>>> 2001-06-15 eta GE1.00 4385 0.242 0.308 1.502 0.556 577 1.338 4385 1.502 1.483 0.117 0.161 -1.261 1.272 1.398 0.111
>>> 2001-06-15 hpc GE1.00 4385 0.339 0.345 1.502 0.556 1326 1.265 4385 1.502 1.447 0.255 0.225 -1.163 1.172 1.314 0.144
>>> 2001-06-15 hpc 0.00LT0.01 852225 0.014 0.057 0.000 0.001 777147 0.000 852225 0.000 0.000 0.823 54.824 0.014 0.014 0.059 0.195
>>> 2001-06-15 hpc 0.01LT0.10 77643 0.092 0.148 0.039 0.025 123342 0.048 77643 0.039 0.045 1.967 2.346 0.053 0.085 0.156 0.109
>>> 2001-06-15 hpc 0.10LT0.25 29388 0.147 0.190 0.162 0.043 56107 0.161 29388 0.162 0.161 1.896 0.908 -0.015 0.137 0.192 0.077
>>> 2001-06-15 hpc 0.25LT0.50 17592 0.195 0.219 0.353 0.070 25677 0.344 17592 0.353 0.348 1.424 0.552 -0.158 0.237 0.276 0.057
>>> 2001-06-15 hpc 0.50LT1.00 10503 0.251 0.265 0.693 0.138 8137 0.659 10503 0.693 0.678 0.737 0.362 -0.442 0.480 0.529 0.066
>>> 2001-06-15 hpc GE1.00 4385 0.339 0.345 1.502 0.556 1326 1.265 4385 1.502 1.447 0.255 0.225 -1.163 1.172 1.314 0.144
>>> 2001-06-15 ngm GE1.00 4385 0.157 0.199 1.502 0.556 297 1.119 4385 1.502 1.478 0.050 0.105 -1.345 1.345 1.474 -0.062
>>> 2001-06-15 ngm 0.00LT0.01 852225 0.017 0.063 0.000 0.001 771901 0.000 852225 0.000 0.000 0.703 65.457 0.017 0.017 0.065 0.132
>>> 2001-06-15 ngm 0.01LT0.10 77643 0.070 0.127 0.039 0.025 133779 0.041 77643 0.039 0.040 1.803 1.784 0.031 0.073 0.131 0.073
>>> 2001-06-15 ngm 0.10LT0.25 29388 0.100 0.152 0.162 0.043 54850 0.161 29388 0.162 0.161 1.859 0.620 -0.061 0.137 0.168 0.050
>>> 2001-06-15 ngm 0.25LT0.50 17592 0.130 0.177 0.353 0.070 24526 0.344 17592 0.353 0.348 1.360 0.369 -0.222 0.263 0.291 0.047
>>> 2001-06-15 ngm 0.50LT1.00 10503 0.152 0.196 0.693 0.138 6383 0.643 10503 0.693 0.674 0.564 0.219 -0.541 0.551 0.591 0.025
>>> 2001-06-15 ngm GE1.00 4385 0.157 0.199 1.502 0.556 297 1.119 4385 1.502 1.478 0.050 0.105 -1.345 1.345 1.474 -0.062
>>> 2001-06-15 rfc GE1.00 4385 0.343 0.349 1.502 0.556 1192 1.239 4385 1.502 1.446 0.224 0.228 -1.159 1.168 1.310 0.157
>>> 2001-06-15 rfc 0.00LT0.01 852225 0.014 0.055 0.000 0.001 773777 0.000 852225 0.000 0.000 0.719 53.984 0.014 0.014 0.056 0.200
>>> 2001-06-15 rfc 0.01LT0.10 77643 0.091 0.141 0.039 0.025 123689 0.047 77643 0.039 0.044 1.899 2.333 0.052 0.084 0.150 0.114
>>> 2001-06-15 rfc 0.10LT0.25 29388 0.148 0.184 0.162 0.043 58569 0.159 29388 0.162 0.160 1.957 0.913 -0.014 0.134 0.186 0.081
>>> 2001-06-15 rfc 0.25LT0.50 17592 0.197 0.214 0.353 0.070 26386 0.340 17592 0.353 0.345 1.448 0.558 -0.156 0.232 0.271 0.055
>>> 2001-06-15 rfc 0.50LT1.00 10503 0.253 0.262 0.693 0.138 8123 0.643 10503 0.693 0.671 0.718 0.365 -0.440 0.476 0.525 0.074
>>> 2001-06-15 rfc GE1.00 4385 0.343 0.349 1.502 0.556 1192 1.239 4385 1.502 1.446 0.224 0.228 -1.159 1.168 1.310 0.157
>>> 2001-07-15 avn GE1.00 3258 0.194 0.233 1.399 0.400 1323 1.440 3258 1.399 1.410 0.418 0.139 -1.204 1.209 1.287 0.039
>>> 2001-07-15 avn 0.00LT0.01 879285 0.021 0.073 0.000 0.001 736915 0.001 879285 0.000 0.000 1.541 73.048 0.020 0.020 0.075 0.137
>>> 2001-07-15 avn 0.01LT0.10 84628 0.081 0.139 0.039 0.025 179228 0.040 84628 0.039 0.040 2.200 2.104 0.043 0.078 0.146 0.079
>>>
>>>
>>> This wouldn't read for me:
>>>
>>> Error: unexpected string constant in:
>>> "79285 0.000 0.000 1.541 73.048 0.020 0.020 0.075 0.137
>>> 2001-07-15 avn 0.01LT0.10 84628 0.081 0.139 0.039 0.025
>>>
>>>
>>>
>>> of which this is just a small portion of the data. What I want to
>>> do is to test the difference between the MAE values for those that
>>> are, for example, 'hpc' vs those that are 'rfc', that is, by
>>> 'type' in the header.
>>>
>>> t.test(MAE ~ type, data = yourdf, ...)
>>>
>>> By default, t.test uses var.equal = FALSE and paired = FALSE. If you want to
>>> assume equal population variances, set var.equal = TRUE. Since the sample
>>> sizes are going to be large, this is essentially a Z-test.
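
To make the var.equal remark concrete, here is a small sketch comparing the default Welch test with the pooled-variance test. The data frame `two` is a hypothetical eight-row stand-in built from the hpc and rfc MAE values quoted above (2001-06-15, intervals 0.01LT0.10 through 0.50LT1.00); with so few rows the numbers only illustrate the mechanics, not the result on the full data.

```r
# Hypothetical mini data set: hpc vs rfc MAE values from the quoted rows.
two <- data.frame(
  type = factor(rep(c("hpc", "rfc"), each = 4)),
  MAE  = c(0.085, 0.137, 0.237, 0.480,   # hpc
           0.084, 0.134, 0.232, 0.476)   # rfc
)

# Default: Welch test, no equal-variance assumption.
welch <- t.test(MAE ~ type, data = two)

# Pooled-variance (classical) two-sample t-test.
pooled <- t.test(MAE ~ type, data = two, var.equal = TRUE)

# Compare which method was used and the degrees of freedom.
welch$method
pooled$method
c(welch = unname(welch$parameter), pooled = unname(pooled$parameter))
```

The pooled test uses n1 + n2 - 2 degrees of freedom, while the Welch test uses an approximation that is never larger. With large samples the two tests tend to agree closely, which is consistent with the Z-test remark above.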
>>>
>>>
>>> I have looked for many examples and have tried to construct the
>>> correct syntax, but no luck so far. If possible, I would further
>>> like to break down the test, not only by type, but type and INTERVAL.
>>>
>>>
>>> If you want this type of breakdown, you're going to be doing a two-way ANOVA.
>>> Individual t-tests in this case would be an extremely inefficient use of
>>> the data.
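
A rough sketch of what the two-way ANOVA could look like follows. The data frame `dat` is a hypothetical 20-row extract (five types by four intervals, MAE values copied from the 2001-06-15 rows quoted above), so each type/INTERVAL cell holds a single observation and only the additive model can be fitted here. With several dates per cell, as in the full data, `MAE ~ type * INTERVAL` would presumably be the natural choice to include the interaction.

```r
# Hypothetical extract: one MAE value per type/INTERVAL cell.
# expand.grid varies its first argument fastest, so the MAE vector
# below lists the four intervals within each type in turn.
dat <- expand.grid(
  INTERVAL = c("0.01LT0.10", "0.10LT0.25", "0.25LT0.50", "0.50LT1.00"),
  type     = c("avn", "eta", "hpc", "ngm", "rfc"),
  stringsAsFactors = TRUE
)
dat$MAE <- c(0.086, 0.129, 0.238, 0.517,   # avn
             0.079, 0.139, 0.258, 0.532,   # eta
             0.085, 0.137, 0.237, 0.480,   # hpc
             0.073, 0.137, 0.263, 0.551,   # ngm
             0.084, 0.134, 0.232, 0.476)   # rfc

# Additive two-way ANOVA (no interaction term: one observation per cell).
fit <- aov(MAE ~ type + INTERVAL, data = dat)
summary(fit)

# Pairwise comparisons between types, adjusted for multiplicity.
tuk <- TukeyHSD(fit, which = "type")
tuk
```

The TukeyHSD table includes the rfc-hpc contrast alongside the other nine pairs, which is one way to get the hpc vs rfc comparison without running separate t-tests for each interval.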
>>>
>>> HTH,
>>> Dennis
>>>
>>>
>>> -- Thomas E Adams
>>> National Weather Service
>>> Ohio River Forecast Center
>>> 1901 South State Route 134
>>> Wilmington, OH 45177
>>>
>>> EMAIL: thomas.adams at noaa.gov
>>>
>>> VOICE: 937-383-0528
>>> FAX: 937-383-0033
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>


