[R] Simulations of GAM and MARS models : sample size ; Y-outliers and missing X-data

varin sacha v@r|n@@ch@ @end|ng |rom y@hoo@|r
Thu Aug 8 14:11:05 CEST 2019


Dear Abby,

Many thanks for your response.

To answer your question: I would prefer all the x variables (collectively) to have m% missing values.

You wrote: "Modify your code so that a single function say sim.test() computes your simulated statistics, for n sample size and m missing values, and returns the results, say as a two-element list".
I trust you and guess it is a really good idea, but I don't know how to do that... :=(







On Thursday, 8 August 2019 at 05:29:55 UTC+2, Abby Spurdle <spurdle.a using gmail.com> wrote:





> How can I modify my R codes to simulate the sample size, the presence of Y-outliers and the presence of missing data ?


I don't know what it means for data to have 50% Y-outliers.
That's new to me...

As for the rest of your question.
Modify your code so that a single function, say sim.test(), computes your simulated statistics for sample size n and proportion m of missing values, and returns the results, say as a two-element list.
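
As a rough sketch only, such a function might look something like the following. Everything inside it is an assumption for illustration: the data-generating model, the seed argument, the choice of in-sample mean squared error as the statistic, and the two lm() fits, which are stand-ins for your GAM and MARS fits (for example mgcv::gam and earth::earth). Missing values are spread over all the x cells collectively, and incomplete rows are dropped before fitting.

```r
# Hypothetical sketch of sim.test(): the simulated model, the statistic
# and the two lm() fits are placeholders, not the original poster's code.
sim.test = function (n, m, seed=NULL)
{   if (!is.null (seed) ) set.seed (seed)

    # simulate n observations from an assumed non-linear model
    X = cbind (x1 = runif (n), x2 = runif (n) )
    y = sin (2 * pi * X [, "x1"]) + X [, "x2"]^2 + rnorm (n, sd=0.3)

    # set a proportion m of all x cells (collectively) to NA
    k = round (m * length (X) )
    X [sample (length (X), k)] = NA

    # keep complete rows only, so both fits see the same data
    d = data.frame (y, X)
    d = d [complete.cases (d),]

    # stand-in fits: replace these with your GAM and MARS calls
    fit1 = lm (y ~ poly (x1, 3) + poly (x2, 2), data=d)
    fit2 = lm (y ~ x1 + x2, data=d)

    # two-element list, one statistic per model
    list (GAM.stat = mean (residuals (fit1)^2),
          MARS.stat = mean (residuals (fit2)^2) )
}

str (sim.test (100, 0.2, seed=1) )
```

The element names GAM.stat and MARS.stat are chosen so that results$GAM.stat and results$MARS.stat can be read off directly by a calling script.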

Then write a top level script (or function), something like:

# sample sizes and proportions of missing values to try
ns = c (50, 100, 200, 300, 500)
ms = (1:5) * 0.1

# every combination of n and m (5 x 5 = 25 runs)
n = rep (ns, each=5)
m = rep (ms, times=5)
GAM.stat = MARS.stat = numeric (25)

# one call to sim.test() per combination
for (i in 1:25)
{   results = sim.test (n [i], m [i], ...other.args...)
    GAM.stat [i] = results$GAM.stat
    MARS.stat [i] = results$MARS.stat
}

# one row per (n, m) combination
cbind (n, m, GAM.stat, MARS.stat)

Note that, from past experience, what you are doing may produce misleading results, because your results depend on your simulated data.
(Different simulated data will produce different results, and possibly different conclusions.)
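
One way to get a feel for this is to repeat a single simulation many times and look at the spread of the statistic, rather than relying on one run. The sketch below is only an illustration: one.run() is a hypothetical toy function (a straight-line fit to simulated data, with in-sample mean squared error as the statistic), not your GAM or MARS code.

```r
# hypothetical one-run statistic: in-sample MSE of a straight-line fit
one.run = function (n)
{   x = runif (n)
    y = sin (2 * pi * x) + rnorm (n, sd=0.3)
    mean (residuals (lm (y ~ x) )^2)
}

# repeat the same simulation 200 times, each with fresh random data
set.seed (1)
stats = replicate (200, one.run (50) )

# the standard deviation shows how much a single run can vary
c (mean = mean (stats), sd = sd (stats) )
```

If the standard deviation is large relative to the differences you want to detect between models, a comparison based on one simulated data set per setting is unlikely to be reliable.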

I haven't checked how the functions you've used to fit models handle missing values.
But assuming that missing values are NAs, this should be easy to do.

Do you want *each* x variable to have m% missing values, or *all* the x variables (collectively), to have m% missing values?
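
To make the difference concrete, here is a minimal sketch of both options, assuming the x variables are held in a numeric matrix X and m is a proportion between 0 and 1. The function names na.each() and na.all() are made up for this example.

```r
# *each* column gets round (m * number of rows) NAs
na.each = function (X, m)
{   k = round (m * nrow (X) )
    for (j in seq_len (ncol (X) ) )
        X [sample (nrow (X), k), j] = NA
    X
}

# *all* columns collectively: round (m * number of cells) NAs,
# placed anywhere in the matrix
na.all = function (X, m)
{   k = round (m * length (X) )
    X [sample (length (X), k)] = NA
    X
}

set.seed (1)
X = matrix (runif (300), nrow=100, ncol=3)

colMeans (is.na (na.each (X, 0.2) ) )   # proportion missing per column
mean (is.na (na.all (X, 0.2) ) )        # proportion missing overall
```

With na.each() every column has the same proportion missing, while with na.all() only the overall proportion is fixed and individual columns may have more or fewer NAs.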


