[R] Odd Results when generating predictions with nnet function

Paul Bernal paulbernal07 using gmail.com
Wed Sep 2 18:45:36 CEST 2020


You are right, Jeff, that was a mistake. I was focused on the square-root
transformation and wrote about taking the square root when I meant raising
the predictions to the 2nd power to back-transform them.
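
For the record, here is a minimal sketch of what that back-transformation
looks like. It is only an illustration under assumptions, not code taken from
the video or from Dr. Maechler's message: the toy data frame `a` is a made-up
stand-in for my data, and the values of size, decay and maxit are arbitrary.
The model is fitted to sqrt(y), so predictions come out on the square-root
scale and are squared to get back to the scale of y; x is never transformed.

```r
library(nnet)  # recommended package that ships with R

set.seed(1)
# Toy stand-in for the real data (hypothetical values, not the covid series)
a <- data.frame(x = 1:50, y = (1:50)^2)

# Regression on the square-root scale; linout = TRUE requests a linear output unit
fit <- nnet(sqrt(y) ~ x, data = a, size = 3, linout = TRUE,
            decay = 0.001, maxit = 500, trace = FALSE)

# Predictions are returned on the sqrt scale ...
pred_sqrt <- predict(fit, newdata = data.frame(x = a$x))
# ... so raise them to the 2nd power to return to the scale of y
pred_y <- pred_sqrt^2
```

Only the response is squared here, because only the response was transformed
when the model was fitted.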

This is the example I was following (
https://www.youtube.com/watch?v=SaQgA6V8UA4). Of course, I tried fitting
the nnet model to my own data, to see what kind of results I'd get (the
data that I used, I provided in the very first e-mail).

The question I was asking is why do I get a bunch of 1s for the
predictions, given that the expected results would have to be somewhere
close to the latest observations.
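
One possible explanation, offered as a sketch under assumptions rather than a
confirmed diagnosis: the nnet() argument is spelled `linout`, and a misspelled
name such as `lineout` appears to be silently absorbed by `...`, so the default
linout = FALSE applies and the output unit is logistic. Its predictions are
then confined to [0, 1], and with a response in the thousands they would
saturate at 1. The toy data frame below is a hypothetical stand-in for my data.
The sketch also uses y ~ x with data = a, as Peter suggested, so that predict()
looks up x in newdata.

```r
library(nnet)  # recommended package that ships with R

set.seed(1)
a <- data.frame(x = 1:50, y = (1:50)^2)  # hypothetical toy data
xnew <- data.frame(x = 1:50)

# Misspelled argument: "lineout" is not "linout", so the logistic default is used
fit_logistic <- nnet(y ~ x, data = a, size = 5, maxit = 1000,
                     lineout = TRUE, decay = 0.001, trace = FALSE)
p_logistic <- predict(fit_logistic, newdata = xnew, type = "raw")
range(p_logistic)  # stays within [0, 1]

# Correct spelling: linear output unit, as needed for regression
fit_linear <- nnet(y ~ x, data = a, size = 5, maxit = 1000,
                   linout = TRUE, decay = 0.001, trace = FALSE)
p_linear <- predict(fit_linear, newdata = xnew, type = "raw")
range(p_linear)  # no longer confined to [0, 1]
```

Whether the linear-output fit is any good is a separate matter; the point of
the comparison is only the range of values each output unit can produce.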

The code and the data from the example I was following are provided in the
YouTube link above.

Paul



On Wed, Sep 2, 2020 at 10:01, Jeff Newmiller (<jdnewmil using dcn.davis.ca.us>)
wrote:

> Why would you expect raising y_pred to the 0.5 to "backtransform" a model
> sqrt(y)~x? Wouldn't you raise to the 2?
>
> Why would you "backtransform" x in such a model if it were never
> transformed in the first place? Dr Maechler did not suggest that.
>
> And why are you mentioning some random unspecified video on Youtube? That
> does not enlighten anyone here, apparently including you. Please reference
> package documentation, and/or reproduce the analysis discussed in that
> video to provide a contrasting (or supporting) point with the example you
> gave.
>
>
> On September 2, 2020 7:21:58 AM PDT, Paul Bernal <paulbernal07 using gmail.com>
> wrote:
> >Dear Dr. Martin and Dr. Peter,
> >
> >Hope you are doing well. Thank you for your kind feedback. I also tried
> >fitting the nnet using y ~ x, but the model kept on generating odd
> >predictions. If I understand correctly, from what Dr. Martin said, it
> >would be a good idea to try modeling sqrt(y) ~ x and then backtransform
> >raising both y and x to 0.5?
> >
> >I was looking at a video where the guy modeled count data without doing
> >any kind of transformation and didn't get odd results, which is rather
> >strange.
> >
> >Cheers,
> >
> >Paul
> >
> >
> >
> >On Wed, Sep 2, 2020 at 2:37, Martin Maechler
> >(<maechler using stat.math.ethz.ch>) wrote:
> >
> >> >>>>> peter dalgaard
> >> >>>>>     on Wed, 2 Sep 2020 08:41:09 +0200 writes:
> >>
> >>     > Generically, nnet(a$y ~ a$x, a ...) should be nnet(y ~ x,
> >>     > data=a, ...) otherwise predict will go looking for a$x, no
> >>     > matter what is in xnew.
> >>
> >>     > But more importantly, nnet() is a _classifier_,
> >>     > so the LHS should be a class, not a numeric variable.
> >>
> >>     > -pd
> >>
> >> Well, nnet() can be used for both classification *and* regression,
> >> which is quite clear from the MASS book, but indeed, not from
> >> its help page, which indeed mentions one formula  'class ~ ...'
> >> and then only has classification examples.
> >>
> >> So, indeed, the  ?nnet  help page could be improved.
> >>
> >> In his case, y are counts,  so  John Tukey's good old
> >> "first aid transformation" principle would suggest to model
> >>
> >> sqrt(y) ~ ..   in a *regression* model which nnet() can do.
> >>
> >> Martin Maechler
> >> ETH Zurich  and  R Core team
> >>
> >>
> >>
> >>     >> On 1 Sep 2020, at 22:19 , Paul Bernal
> >>     >> <paulbernal07 using gmail.com> wrote:
> >>     >>
> >>     >> Dear friends,
> >>     >>
> >>     >> Hope you are all doing well. I am currently using R
> >>     >> version 4.0.2 and working with the nnet package.
> >>     >>
> >>     >> My dataframe consists of three columns, FECHA which is
> >>     >> the date, x, which is a sequence from 1 to 159, and y,
> >>     >> which is the number of covid cases (I am also providing
> >>     >> the dput for this data frame below).
> >>     >>
> >>     >> I tried fitting a neural net model using the following
> >>     >> code:
> >>     >>
> >>     >> xnew = 1:159
> >>     >> Fit <- nnet(a$y ~ a$x, a, size = 5, maxit = 1000,
> >>     >>             lineout = T, decay = 0.001)
> >>     >>
> >>     >> Finally, I attempted to generate predictions with the
> >>     >> following code:
> >>     >>
> >>     >> predictions <- predict(Fit, newdata = list(x = xnew),
> >>     >> type = "raw")
> >>     >>
> >>     >> But obtained extremely odd results: As you can see,
> >>     >> instead of obtaining numbers, more or less in the range
> >>     >> of the last observations of a$y, I end up getting a bunch
> >>     >> of 1s, which doesn't make any sense (if anyone could help
> >>     >> me understand what could be causing this):
> >>     >>
> >>     >> dput(predictions)
> >>     >> structure(c(1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
> >>     >> 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1), .Dim
> >>     >> = c(159L, 1L), .Dimnames = list(c("1", "2", "3", "4",
> >>     >> "5", "6", "7", "8", "9", "10", "11", "12", "13", "14",
> >>     >> "15", "16", "17", "18", "19", "20", "21", "22", "23",
> >>     >> "24", "25", "26", "27", "28", "29", "30", "31", "32",
> >>     >> "33", "34", "35", "36", "37", "38", "39", "40", "41",
> >>     >> "42", "43", "44", "45", "46", "47", "48", "49", "50",
> >>     >> "51", "52", "53", "54", "55", "56", "57", "58", "59",
> >>     >> "60", "61", "62", "63", "64", "65", "66", "67", "68",
> >>     >> "69", "70", "71", "72", "73", "74", "75", "76", "77",
> >>     >> "78", "79", "80", "81", "82", "83", "84", "85", "86",
> >>     >> "87", "88", "89", "90", "91", "92", "93", "94", "95",
> >>     >> "96", "97", "98", "99", "100", "101", "102", "103",
> >>     >> "104", "105", "106", "107", "108", "109", "110", "111",
> >>     >> "112", "113", "114", "115", "116", "117", "118", "119",
> >>     >> "120", "121", "122", "123", "124", "125", "126", "127",
> >>     >> "128", "129", "130", "131", "132", "133", "134", "135",
> >>     >> "136", "137", "138", "139", "140", "141", "142", "143",
> >>     >> "144", "145", "146", "147", "148", "149", "150", "151",
> >>     >> "152", "153", "154", "155", "156", "157", "158", "159"),
> >>     >> NULL))
> >>     >>
> >>     >> head(a)
> >>     >>        FECHA x  y
> >>     >> 1 2020-03-09 1  1
> >>     >> 2 2020-03-10 2  8
> >>     >> 3 2020-03-11 3 14
> >>     >> 4 2020-03-12 4 27
> >>     >> 5 2020-03-13 5 36
> >>     >> 6 2020-03-14 6 43
> >>     >>
> >>     >> dput(a)
> >>     >> structure(list(FECHA = structure(c(18330, 18331,
> >>     >> 18332, 18333, 18334, 18335, 18336, 18337, 18338, 18339,
> >>     >> 18340, 18341, 18342, 18343, 18344, 18345, 18346, 18347,
> >>     >> 18348, 18349, 18350, 18351, 18352, 18353, 18354, 18355,
> >>     >> 18356, 18357, 18358, 18359, 18360, 18361, 18362, 18363,
> >>     >> 18364, 18365, 18366, 18367, 18368, 18369, 18370, 18371,
> >>     >> 18372, 18373, 18374, 18375, 18376, 18377, 18378, 18379,
> >>     >> 18380, 18381, 18382, 18383, 18384, 18385, 18386, 18387,
> >>     >> 18388, 18389, 18390, 18391, 18392, 18393, 18394, 18395,
> >>     >> 18396, 18397, 18398, 18399, 18400, 18401, 18402, 18403,
> >>     >> 18404, 18405, 18406, 18407, 18408, 18409, 18410, 18411,
> >>     >> 18412, 18413, 18414, 18415, 18416, 18417, 18418, 18419,
> >>     >> 18420, 18421, 18422, 18423, 18424, 18425, 18426, 18427,
> >>     >> 18428, 18429, 18430, 18431, 18432, 18433, 18434, 18435,
> >>     >> 18436, 18437, 18438, 18439, 18440, 18441, 18442, 18443,
> >>     >> 18444, 18445, 18446, 18447, 18448, 18449, 18450, 18451,
> >>     >> 18452, 18453, 18454, 18455, 18456, 18457, 18458, 18459,
> >>     >> 18460, 18461, 18462, 18463, 18464, 18465, 18466, 18467,
> >>     >> 18468, 18469, 18470, 18471, 18472, 18473, 18474, 18475,
> >>     >> 18476, 18477, 18478, 18479, 18480, 18481, 18482, 18483,
> >>     >> 18484, 18485, 18486, 18487, 18488), class = "Date"), x =
> >>     >> 1:159, y = c(1, 8, 14, 27, 36, 43, 55, 69, 86, 109, 137,
> >>     >> 200, 245, 313, 345, 443, 558, 674, 786, 901, 989, 1075,
> >>     >> 1181, 1317, 1475, 1673, 1801, 1988, 2100, 2249, 2528,
> >>     >> 2752, 2974, 3234, 3400, 3472, 3574, 3751, 4016, 4210,
> >>     >> 4273, 4467, 4658, 4821, 4992, 5166, 5338, 5538, 5779,
> >>     >> 6021, 6200, 6378, 6532, 6720, 7090, 7197, 7387, 7523,
> >>     >> 7731, 7868, 8070, 8282, 8448, 8616, 8783, 8944, 9118,
> >>     >> 9268, 9449, 9606, 9726, 9867, 9977, 10116, 10267, 10577,
> >>     >> 10926, 11183, 11447, 11728, 12131, 12531, 13015, 13463,
> >>     >> 13837, 14095, 14609, 15044, 15463, 16004, 16425, 16854,
> >>     >> 17233, 17889, 18586, 19211, 20059, 20686, 21422, 21962,
> >>     >> 22597, 23351, 24274, 25222, 26030, 26752, 27314, 28030,
> >>     >> 29037, 29905, 30658, 31686, 32785, 33550, 34463, 35237,
> >>     >> 35995, 36983, 38149, 39334, 40291, 41251, 42216, 43257,
> >>     >> 44352, 45633, 47177, 48096, 49243, 50373, 51408, 52261,
> >>     >> 53468, 54426, 55153, 55906, 56817, 57993, 58864, 60296,
> >>     >> 61442, 62223, 63269, 64191, 65256, 66383, 67453, 68456,
> >>     >> 69424, 70231, 71418, 72560, 73651, 74492, 75394, 76464,
> >>     >> 77377, 78446, 79402)), row.names = c(NA, 159L), class =
> >>     >> "data.frame")
> >>     >>
> >>     >> Any help and/or guidance will be greatly appreciated,
> >>     >>
> >>     >> Cheers,
> >>     >>
> >>     >> Paul
> >>     >>
> >>     >> [[alternative HTML version deleted]]
> >>     >>
> >>     >> ______________________________________________
> >>     >> R-help using r-project.org mailing list -- To UNSUBSCRIBE and
> >>     >> more, see https://stat.ethz.ch/mailman/listinfo/r-help
> >>     >> PLEASE do read the posting guide
> >>     >> http://www.R-project.org/posting-guide.html and provide
> >>     >> commented, minimal, self-contained, reproducible code.
> >>
> >>     > --
> >>     > Peter Dalgaard, Professor, Center for Statistics,
> >>     > Copenhagen Business School Solbjerg Plads 3, 2000
> >>     > Frederiksberg, Denmark Phone: (+45)38153501 Office: A 4.23
> >>     > Email: pd.mes using cbs.dk Priv: PDalgd using gmail.com
> >>
>
> --
> Sent from my phone. Please excuse my brevity.
>
