[R] create an empty data frame and then fill in it (and then evaluate the mean of semi-hourly data for each day)

William Dunlap wdunlap at tibco.com
Fri Jun 10 17:01:00 CEST 2016


> Finally I applied lapply in this way:
> df_snow_day$snow <- lapply(df_snow_day$day, function(x)
> round(mean(df_snow$snow[df_snow$day == x], na.rm=T)))
>
> This does not work. I do not understand why the class of df_snow_day$snow
> is of type list either:


lapply()'s output is always a list.
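
If a plain numeric column is wanted instead of a list, one option is
vapply(), which returns an atomic vector of a declared type. A minimal
sketch, assuming 'day' is already a Date column; the small df_snow below
is made up so that the example runs on its own:

```r
# Made-up example data (assumption: 'day' is already of class Date)
df_snow <- data.frame(
  day  = as.Date(c("2004-11-01", "2004-11-01", "2004-11-01", "2004-11-02")),
  snow = c(50, 55, 60, 70)
)
df_snow_day <- data.frame(day = unique(df_snow$day))

# vapply() returns a numeric vector, so the new column is numeric, not a list
df_snow_day$snow <- vapply(
  df_snow_day$day,
  function(x) round(mean(df_snow$snow[df_snow$day == x], na.rm = TRUE)),
  numeric(1)
)
str(df_snow_day)
```

sapply() would also simplify the result here, but vapply() fails loudly
if the function ever returns something other than a single number.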

> I first created a new column of type "Date"
> df_snow$day <- as.Date(df_snow$data_POSIX,"%Y-%m-%d")

If 'data_POSIX' is of class "POSIXct", that line gives a warning because
the second argument to as.Date.POSIXct is the time zone ('tz'), not a format.
Perhaps your data_POSIX column is really character.  I made my df_snow
as follows:

txt <- c("data_POSIX\tsnow",
         "2004-11-01 00:00:00\t50",
         "2004-11-01 00:30:00\t55",
         "2004-11-01 01:00:00\t60")
df_snow <- read.table(sep="\t", text=txt, header=TRUE,
                      colClasses=c("POSIXct","numeric"))
str(df_snow)
'data.frame':   3 obs. of  2 variables:
 $ data_POSIX: POSIXct, format: "2004-11-01 00:00:00" ...
 $ snow      : num  50 55 60

and as.Date gave:
   > as.Date(df_snow$data_POSIX,"%Y-%m-%d")
   [1] "2004-11-01" "2004-11-01" "2004-11-01"
   Warning message:
   In as.POSIXlt.POSIXct(x, tz = tz) : unknown timezone '%Y-%m-%d'
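
One way to avoid the warning is to pass the arguments by name, so that a
format string cannot be mistaken for a time zone. A sketch, assuming the
timestamps were recorded in a known zone; "Europe/Rome" is only a guess
used for illustration, and the variable names are made up:

```r
# Character input: pass the format by name
chr <- c("2004-11-01 00:00:00", "2004-11-01 00:30:00")
d_chr <- as.Date(chr, format = "%Y-%m-%d")

# POSIXct input: no format is needed; pass the time zone by name instead
stamps <- as.POSIXct(chr, tz = "Europe/Rome")  # assumed zone, for illustration
d_ct <- as.Date(stamps, tz = "Europe/Rome")

print(d_chr)
print(d_ct)
```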

Also, converting POSIXct objects to Date objects with the default arguments
is usually the wrong thing to do, as the time zone in the POSIXct object is
ignored (I think UTC is assumed):
  > ct <- as.POSIXct(sprintf("2016-%02d-%02d %02d:%02d", 2:5, 22:25, 15:18,
  +                          45:48), tz="US/Pacific")
  > data.frame(ct,as.Date(ct)) # note day-of-month mismatches
                     ct as.Date.ct.
  1 2016-02-22 15:45:00  2016-02-22
  2 2016-03-23 16:46:00  2016-03-23
  3 2016-04-24 17:47:00  2016-04-25
  4 2016-05-25 18:48:00  2016-05-26
You can convert to a POSIXlt object and pull out the day-of-month
or day-of-year:
  > as.POSIXlt(ct)$mday
  [1] 22 23 24 25
  > as.POSIXlt(ct)$yday
  [1]  52  82 114 145
I can never remember which helper functions are available
for this sort of thing.  Many people like the ones in the lubridate
package.
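
For the original question (a daily mean of half-hourly data), one possible
base-R sketch is to derive the day in an explicit time zone and let
aggregate() do the grouping. The zone "Europe/Rome" and the sample values
are assumptions for illustration only:

```r
# Made-up half-hourly readings spanning two local days
df_snow <- data.frame(
  data_POSIX = as.POSIXct(c("2004-11-01 00:00:00", "2004-11-01 00:30:00",
                            "2004-11-01 01:00:00", "2004-11-02 00:00:00"),
                          tz = "Europe/Rome"),
  snow = c(50, 55, 60, 70)
)

# Derive the calendar day in the same zone the timestamps were recorded in
df_snow$day <- as.Date(df_snow$data_POSIX, tz = "Europe/Rome")

# One row per day with the mean of 'snow'; the result is a regular data frame
df_snow_day <- aggregate(snow ~ day, data = df_snow, FUN = mean, na.rm = TRUE)
df_snow_day$snow <- round(df_snow_day$snow)
df_snow_day
```

This avoids both the empty data frame and the lapply() call, since
aggregate() builds the per-day data frame directly.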

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Jun 10, 2016 at 3:45 AM, Stefano Sofia <
stefano.sofia at regione.marche.it> wrote:

> Thank you for your answer. Very clear.
> (I don't like the second solution either.)
> Let me then ask a final question.
> From an initial data frame with semi-hourly data (df_snow, with two
> columns, data_POSIX of type "POSIXct" "POSIXt" and snow of type "numeric"),
> I need to evaluate the mean of snow for each day.
>
> data_POSIX snow
> 2004-11-01 00:00:00 50
> 2004-11-01 00:30:00 55
> 2004-11-01 01:00:00 60
> ...
>
> I first created a new column of type "Date"
> df_snow$day <- as.Date(df_snow$data_POSIX,"%Y-%m-%d")
>
> then I created a new data frame called df_snow_day to store the mean of
> data grouped by day:
> list_days <- unique(df_snow$day)
> df_snow_day <- data.frame(day=list_days)
>
> Finally I applied lapply in this way:
> df_snow_day$snow <- lapply(df_snow_day$day, function(x)
> round(mean(df_snow$snow[df_snow$day == x], na.rm=T)))
>
> This does not work. I do not understand why the class of df_snow_day$snow
> is of type list either:
>
>        day snow
> NA    <NA>       NULL
> NA.1  <NA>       NULL
> NA.2  <NA>       NULL
>
> Where is my mistake?
>
> Thank you for all your help
> Stefano
>
>
> _____________________________________________
>
> From: Duncan Murdoch [murdoch.duncan at gmail.com]
> Sent: Thursday, 9 June 2016 12:36
> To: Stefano Sofia; r-help at r-project.org
> Subject: Re: [R] create an empty data frame and then fill in it
>
> On 09/06/2016 6:22 AM, Stefano Sofia wrote:
> > Dear R list users,
> > sorry for this simple question, but I have already spent a lot of effort
> > trying to solve it.
> >
> > I create an empty data frame called df_year like
> >
> > df_year <- data.frame(day=as.Date(character()), hs_MteBove=integer(),
> >     hs_MtePrata=integer(), hs_Pintura=integer(), hs_Pizzo=integer(),
> >     hs_Sassotetto=integer(), hs_Sibilla=integer(), stringsAsFactors=FALSE)
> >
> > and then I start to fill it in with
> >
> > df_year$day <- seq(as.Date("2004-11-01-00-00","%Y-%m-%d"),
> >     as.Date("2005-05-01-00-00","%Y-%m-%d"), by="day")
> >
> > but I get the following error:
> > "replacement has 182 rows, data has 0"
> >
> > Where is my silly mistake?
>
> Your dataframe has 0 rows, so you can't put a 182 row vector into the
> first column.
>
> Unlike vectors, dataframes won't grow if you make assignments beyond the
> end of the rows.
>
> There are at least a couple of solutions:
>
> 1.  Don't create columns until you have data ready for them.
>
> You can wait to create the dataframe until your "day" column is ready:
>
> df_year <- data.frame(day = seq(...))
>
> As you compute other columns of the same length, you can add them, e.g.
>
> df_year$hs_MteBove <- ...
>
> 2.  Create your columns with the right length from the beginning:
>
> df_year <- data.frame(day = rep(as.Date(NA), 182), ...)
>
> I don't like this solution as much.
>
> Duncan Murdoch
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
