[R] Help with plotting and date-times for climate data

Richard O'Keefe r@oknz @end|ng |rom gm@||@com
Thu Sep 14 08:25:13 CEST 2023


I think we all figured out what TMIN and TMAX were.
That's *precisely* why there is a problem.
The "average or median" daily maxima mean practically
nothing.
Let's take where I live, for example.
I'm looking at an official map.  I see temperatures
7.3C, 9.8C, 14C, 14.2C, 14.7C, and 15C.  According to what
is supposed to be the same source on my phone, it's 12C.
All for the same time, and all within a few kilometres.
The temperature that is recorded in the national data set
is taken at the local airport, about 30km away, and generally
about 2 or 3C hotter than other measurements.  (Airport
weather stations notoriously run hot.)  The local weather
station I trust most, run by the Physics department at the
University, says that today's maximum was 20C, but if it
went any higher than 17C where I live (10 minutes away by
car) I shall be very much surprised.

What explains all this?
(1) Altitude.  We're not unusually hilly, but the variation
with altitude is much more than you'd expect from the usual
adiabatic lapse rate.
(2) Proximity to the coast and sea breezes.
(3) Which side of a hill/valley you're on.
(4) Variation in solar irradiance, which is *astonishingly*
patchy (even after averaging out clouds).
Weather is just unspeakably variable.
Don't look at extremes if you can look at distributions.

And all this means that unless you fit some kind of predictive model
that takes into account things like expected cloud cover for the
duration of your stay, the data you have are going to be a much
cruder guide than you might expect.  Your best guide is probably
"how do the people who live there dress at this time of year?"

Oh, for what it's worth, here are daily TMAX values for this
time of year a couple of years ago.
 10.9 14.2 14.3 19.4 21.3 18.1 18.4 16.0 13.1 13.3 15.4 17.6 20.3 18.2 14.8
17.0 17.2 11.7  6.0 10.6
What does that tell you?  "Welcome to changeable SPRING!"
If you take the average, which happens to be 15.4, you'll miss the
fact that you need to be prepared for anything between 6C and 21C.
So you need winter wear and summer wear.  To cope with TMAX.
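
To make that arithmetic concrete, here is a minimal R sketch that uses
only the twenty TMAX values listed above.  The variable name tmax is
just for illustration; nothing else is assumed.

```r
# The twenty daily TMAX values quoted above (degrees C)
tmax <- c(10.9, 14.2, 14.3, 19.4, 21.3, 18.1, 18.4, 16.0, 13.1, 13.3,
          15.4, 17.6, 20.3, 18.2, 14.8, 17.0, 17.2, 11.7,  6.0, 10.6)

mean(tmax)         # 15.39, the single number that hides everything
range(tmax)        # 6.0 21.3, what you actually have to pack for
diff(range(tmax))  # spread between the coldest and warmest daily maximum
quantile(tmax)     # a quick look at the distribution, not just one point

# One dot per day: the spread is obvious at a glance
stripchart(tmax, xlab = "Daily TMAX (C)")
```

Even this crude strip chart shows more of what matters for packing than
the mean does.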

The big mistakes in statistics are generally not in programming but
in formulating the problem.  In your case, you're overthinking the
problem.  Just plot the raw (or in your case, pre-cooked) data.
Plot TMAX and TMIN and look at their range and how much they vary.
Taking means or medians will destroy information you need for your
purposes.  Take the mean of 15.4 and you'll completely fail to notice
the day that got above 21 and the day that never got above 6 and you'll
be wrongly dressed for both of them.
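
A rough sketch of "just plot the data" in R might look like the
following.  It assumes a data frame with the DATE, TMAX and TMIN columns
used in the code quoted further down; the helper name plot_raw_extremes,
its start/end arguments, and the choice of 2024 as a common plotting
year are my own inventions for illustration, not anything from the
original post.

```r
library(ggplot2)

# Hypothetical helper: plot every daily TMAX and TMIN reading, unsummarised.
# Assumes `d` has columns DATE ("YYYY-MM-DD" text or Date), TMAX and TMIN.
plot_raw_extremes <- function(d, start = "09-22", end = "10-15") {
  d$md <- format(as.Date(d$DATE), "%m-%d")
  # Map every year onto one arbitrary leap year so that x is a real date
  # (a continuous scale) rather than a factor of "MM-DD" labels.
  d$day <- as.Date(paste0("2024-", d$md), format = "%Y-%m-%d")
  # "MM-DD" strings sort chronologically, so plain comparison selects the window
  d <- d[which(d$md >= start & d$md <= end), ]
  ggplot(d, aes(x = day)) +
    # colours are set outside aes() so they are taken literally
    geom_point(aes(y = TMAX), colour = "red", alpha = 0.5) +
    geom_point(aes(y = TMIN), colour = "blue", alpha = 0.5) +
    scale_x_date(date_labels = "%b %d") +
    labs(x = NULL, y = "Temperature")
}
```

With the file from the original post this would be called as
plot_raw_extremes(read.csv("Ely_MN_Weather.csv")), assuming that file
really does have those three columns.  Every year's readings land on the
same axis, so the vertical scatter on each day is the thing to look at.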



On Thu, 14 Sept 2023 at 05:22, Kevin Zembower <kevin using zembower.org> wrote:

> Tim, Richard, y'all are reading too much into this. I believe that TMAX
> is the high temperature of the day, and TMIN is the low. I'm trying to
> compute the average or median high and low temperatures for the data I
> have (2011 to present). I'm going on a trip to this area, and want to
> know how to pack.
>
> Thanks for your interest.
>
> -Kevin
>
> On Thu, 2023-09-14 at 03:07 +1200, Richard O'Keefe wrote:
> > I am well aware of the physiological implications
> > of temperature, and that is *why* I view recorded
> > TMIN and TMAX at a single point with an extremely
> > jaundiced eye.  TMAX at shoulder height has very
> > little relevance to an insect living in grass, for
> > example.  And if TMAX is sustained for one second,
> > that has very different consequences from if TMAX
> > is sustained for five minutes.  I can see the usefulness
> > of "proportion of day above Thi/below Tlo", but that
> > is quite different.
> >
> > OK, so my interest in weather data was mainly based
> > around water management: precipitation, evaporation,
> > herd and crop water needs, that kind of thing.  And
> > the first thing you learn from that experience is
> > that ANY kind of single-point summary is seriously
> > misleading.
> >
> > Let's end this digression.
> >
> >
> > On Thu, 14 Sept 2023 at 02:18, Ebert,Timothy Aaron <tebert using ufl.edu>
> > wrote:
> > > I had the same question.
> > > However, I can partly answer the off-topic question. Min and max
> > > can be important as lower and upper development thresholds. Below
> > > the min, no growth or development occurs because reaction rates are
> > > too slow to enable such. Above max, temperatures are too hot.
> > > Protein function is impaired, and systems stop functioning. There
> > > is a considerable range between where systems shut down (but
> > > recover) and tissue death.
> > > In a simple form the growth and physiological stage of plants,
> > > insects, and many others, can be modeled as a function of
> > > temperature. These are often called growing degree day models (or
> > > some version of that). This is the number of thermal units needed for
> > > the organism to develop to the next stage (e.g. instar for an
> > > insect, or fruit/flower formation for a plant). However, better
> > > accuracy is obtained if the model includes both min and max
> > > thresholds.
> > >
> > > All I have done is provide an example where min and max could have
> > > a real world use. I use max(temp) over some interval and then
> > > update an accumulated thermal units variable based on the outcome.
> > > That detail is not evident in the original request.
> > >
> > > Tim
> > >
> > > -----Original Message-----
> > > From: R-help <r-help-bounces using r-project.org> On Behalf Of Richard
> > > O'Keefe
> > > Sent: Wednesday, September 13, 2023 9:58 AM
> > > To: Kevin Zembower <kevin using zembower.org>
> > > Cc: r-help using r-project.org
> > > Subject: Re: [R] Help with plotting and date-times for climate data
> > >
> > > > Off-topic, but what is a "mean temperature max"
> > > > and what good would it do you if you knew it?
> > > I've been looking at a lot of weather station data and for no
> > > question I've ever had (except "would the newspapers get excited
> > > about this") was "max" (or min) the answer.  Considering the way
> > > that temperature can change by several degrees in a few minutes, or
> > > a few metres -- I meant horizontally when I wrote that, but as you
> > > know your head and feet don't experience the same temperature,
> > > again by more than one degree -- I am at something of a loss to
> > > ascribe much practical significance to TMAX.  Are you sure this is
> > > the analysis you want to do?  Is this the most informative data you
> > > can get?
> > >
> > > On Wed, 13 Sept 2023 at 08:51, Kevin Zembower via R-help <
> > > r-help using r-project.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > > I'm trying to calculate the mean temperature max from a file of
> > > > > climate data, and plot it over a range of days in the year. I've
> > > > > downloaded the data, and cleaned it up the way I think it should
> > > > > be.
> > > > > However, when I plot it, the geom_smooth line doesn't show up. I
> > > > > think that's because my x axis is characters or factors. Here's
> > > > > what I have so far:
> > > > ========================================
> > > > library(tidyverse)
> > > >
> > > > data <- read_csv("Ely_MN_Weather.csv")
> > > >
> > > > > start_day = yday(as_date("2023-09-22"))
> > > > > end_day = yday(as_date("2023-10-15"))
> > > >
> > > > d <- as_tibble(data) %>%
> > > >      select(DATE,TMAX,TMIN) %>%
> > > >      mutate(DATE = as_date(DATE),
> > > >             yday = yday(DATE),
> > > >             md = sprintf("%02d-%02d", month(DATE), mday(DATE))
> > > >             ) %>%
> > > >      filter(yday >= start_day & yday <= end_day) %>%
> > > >      mutate(md = as.factor(md))
> > > >
> > > > d_sum <- d %>%
> > > >      group_by(md) %>%
> > > >      summarize(tmax_mean = mean(TMAX, na.rm=TRUE))
> > > >
> > > > ## Here's the filtered data:
> > > > dput(d_sum)
> > > >
> > > > > > structure(list(md = structure(1:25, levels = c("09-21", "09-22",
> > > > "09-23", "09-24", "09-25", "09-26", "09-27", "09-28", "09-29",
> > > > "09-30", "10-01", "10-02", "10-03", "10-04", "10-05", "10-06",
> > > > "10-07", "10-08", "10-09", "10-10", "10-11", "10-12", "10-13",
> > > > "10-14", "10-15"), class = "factor"), tmax_mean = c(65,
> > > > 62.2222222222222, 61.3, 63.8888888888889, 64.3, 60.1111111111111,
> > > > 62.3, 60.5, 61.9, 61.2, 63.6666666666667, 59.5, 59.5555555555556,
> > > > 61.5555555555556, 59.4444444444444, 58.7777777777778,
> > > > 55.8888888888889, 58.125, 58, 55.6666666666667, 57,
> > > > 55.4444444444444,
> > > > 49.7777777777778, 48.75, 43.6666666666667)), class = c("tbl_df",
> > > > "tbl", "data.frame"
> > > > ), row.names = c(NA, -25L))
> > > > >
> > > > ggplot(data = d_sum, aes(x = md)) +
> > > >      geom_point(aes(y = tmax_mean, color = "blue")) +
> > > >      geom_smooth(aes(y = tmax_mean, color = "blue"))
> > > > =====================================
> > > > My questions are:
> > > > 1. Why isn't my geom_smooth plotting? How can I fix it?
> > > > > 2. I don't think I'm handling the month and day combination
> > > > > correctly. Is there a way to encode month and day (but not year)
> > > > > as a date?
> > > > > 3. (Minor point) Why does my graph of tmax_mean come out red when
> > > > > I specify "blue"?
> > > >
> > > > > Thanks for any advice or guidance you can offer. I really
> > > > > appreciate the expertise of this group.
> > > >
> > > > -Kevin
> > > >
> > > > ______________________________________________
> > > > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > PLEASE do read the posting guide
> > > > > http://www.r-project.org/posting-guide.html
> > > > > and provide commented, minimal, self-contained, reproducible
> > > > > code.
> > > >
> > >
> > >         [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.r-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.
>
>
>

	[[alternative HTML version deleted]]


