[R] Output of tapply function as data frame: Problem Fixed

Ogbos Okike
Fri Mar 29 07:04:28 CET 2024


This is great!
Many thanks to all for helping to further resolve the problem.
Best wishes
Ogbos

On Fri, Mar 29, 2024 at 6:39 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:

> At 01:43 on 29/03/2024, Ogbos Okike wrote:
> > Dear Rui,
> > Thanks again for resolving this. I have already started using the version
> > that works for me.
> >
> > But to clarify the second part, please let me paste what I did and the
> > error message:
> >
> >> set.seed(2024)
> >> data <- data.frame(
> > +    Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
> > + TRUE),
> > +    count = sample(10L, 100L, TRUE)
> > + )
> >>
> >> # coerce tapply's result to class "data.frame"
> >> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
> > Error: unexpected '>' in "res <- with(data, tapply(count, Date, mean)) |>"
> >> # assign a dates column from the row names
> >> res$Date <- row.names(res)
> > Error in row.names(res) : object 'res' not found
> >> # cosmetics
> >> names(res)[2:1] <- names(data)
> > Error in names(res)[2:1] <- names(data) : object 'res' not found
> >> # note that the row names are still tapply's names vector
> >> # and that the columns order is not Date/count. Both are fixed
> >> # after the calculations.
> >> res
> >
> > You can see that the error message is on the pipe. Please let me know
> > what I am missing.
> > Thanks.
> >
> > On Wed, Mar 27, 2024 at 10:45 PM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >
> >> At 08:58 on 27/03/2024, Ogbos Okike wrote:
> >>> Dear Rui,
> >>> Nice to hear from you!
> >>>
> >>> I am sorry for the omission and I have taken note.
> >>>
> >>> Many thanks for responding. The second solution looks elegant, as it
> >>> quickly resolved the problem.
> >>>
> >>> Please take a second look at the first solution. It refused to run. It
> >>> looks as if the pipe is not properly positioned. My efforts to correct it
> >>> and get it to run failed. If you can look further, it would be great. If
> >>> time does not permit, I am fine too.
> >>>
> >>> But having the two solutions will certainly make the subject more
> >>> interesting.
> >>> Thank you so much.
> >>> With warmest regards from
> >>> Ogbos
> >>>
> >>> On Wed, Mar 27, 2024 at 8:44 AM Rui Barradas <ruipbarradas using sapo.pt> wrote:
> >>>
> >>>> At 04:30 on 27/03/2024, Ogbos Okike wrote:
> >>>>> Warm greetings to you all.
> >>>>>
> >>>>> Using the tapply function below:
> >>>>> data <- read.table("FD1month", col.names = c("Dates", "count"))
> >>>>> x <- data$count
> >>>>> f <- factor(data$Dates)
> >>>>> AB <- tapply(x, f, mean)
> >>>>>
> >>>>>
> >>>>> I made a simple calculation. The result, stored in AB, is of the form
> >>>>> below. But an effort to write AB to a file as a data frame fails. When I
> >>>>> use write.table, it only produces the count column and strips off the
> >>>>> first column (date).
> >>>>>
> >>>>> 2005-11-01 2005-12-01 2006-01-01 2006-02-01 2006-03-01 2006-04-01 2006-05-01
> >>>>>  -4.106887  -4.259154  -5.836090  -4.756757  -4.118011  -4.487942  -4.430705
> >>>>> 2006-06-01 2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01 2006-12-01
> >>>>>  -3.856727  -6.067103  -6.418767  -4.383031  -3.985805  -4.768196 -10.072579
> >>>>> 2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01 2007-07-01
> >>>>>  -5.342338  -4.653128  -4.325094  -4.525373  -4.574783  -3.915600  -4.127980
> >>>>> 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01 2008-01-01 2008-02-01
> >>>>>  -3.952150  -4.033518  -4.532878  -4.522941  -4.485693  -3.922155  -4.183578
> >>>>> 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 2008-09-01
> >>>>>  -4.336969  -3.813306  -4.296579  -4.575095  -4.036036  -4.727994  -4.347428
> >>>>> 2008-10-01 2008-11-01 2008-12-01
> >>>>>  -4.029918  -4.260326  -4.454224
> >>>>>
> >>>>> But the format I want only appears on the terminal, leading me to copy
> >>>>> it and paste it into a text file. That is, when I enter AB on the
> >>>>> terminal, it returns output of the form:
> >>>>>
> >>>>> 2008-02-01  -4.183578
> >>>>> 2008-03-01  -4.336969
> >>>>> 2008-04-01  -3.813306
> >>>>> 2008-05-01  -4.296579
> >>>>> 2008-06-01  -4.575095
> >>>>> 2008-07-01  -4.036036
> >>>>> 2008-08-01  -4.727994
> >>>>> 2008-09-01  -4.347428
> >>>>> 2008-10-01  -4.029918
> >>>>> 2008-11-01  -4.260326
> >>>>> 2008-12-01  -4.454224
> >>>>>
> >>>>> Now, my question: How do I write out the two columns displayed by AB on
> >>>>> the terminal to a file?
> >>>>>
> >>>>> I have tried using AB<-data.frame(AB) but it doesn't work either.
> >>>>>
> >>>>> Many thanks for your time.
> >>>>> Ogbos
> >>>>>
> >>>> Hello,
> >>>>
> >>>> The main trick is to pipe to as.data.frame. But the result will have
> >>>> only one column, so you must assign the dates from the data frame's row
> >>>> names. I also include an aggregate solution.
> >>>>
> >>>>
> >>>>
> >>>> # create a test data set
> >>>> set.seed(2024)
> >>>> data <- data.frame(
> >>>>      Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"),
> >>>>                    100L, TRUE),
> >>>>      count = sample(10L, 100L, TRUE)
> >>>> )
> >>>>
> >>>> # coerce tapply's result to class "data.frame"
> >>>> res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
> >>>> # assign a dates column from the row names
> >>>> res$Date <- row.names(res)
> >>>> # cosmetics
> >>>> names(res)[2:1] <- names(data)
> >>>> # note that the row names are still tapply's names vector
> >>>> # and that the columns order is not Date/count. Both are fixed
> >>>> # after the calculations.
> >>>> res
> >>>> #>               count       Date
> >>>> #> 2024-03-22 5.416667 2024-03-22
> >>>> #> 2024-03-23 5.500000 2024-03-23
> >>>> #> 2024-03-24 6.000000 2024-03-24
> >>>> #> 2024-03-25 4.476190 2024-03-25
> >>>> #> 2024-03-26 6.538462 2024-03-26
> >>>> #> 2024-03-27 5.200000 2024-03-27
> >>>>
> >>>> # fix the columns' order
> >>>> res <- res[2:1]
> >>>>
> >>>>
> >>>>
> >>>> # better all in one instruction
> >>>> aggregate(count ~ Date, data, mean)
> >>>> #>         Date    count
> >>>> #> 1 2024-03-22 5.416667
> >>>> #> 2 2024-03-23 5.500000
> >>>> #> 3 2024-03-24 6.000000
> >>>> #> 4 2024-03-25 4.476190
> >>>> #> 5 2024-03-26 6.538462
> >>>> #> 6 2024-03-27 5.200000
> >>>>
> >>>>
> >>>>
> >>>> Also,
> >>>> I'm glad to help as always but Ogbos, you have been an R-Help
> >>>> contributor for quite a while, please post data in dput format. Given
> >>>> the problem the output of the following is more than enough.
> >>>>
> >>>>
> >>>> dput(head(data, 20L))
> >>>>
> >>>>
> >>>> Hope this helps,
> >>>>
> >>>> Rui Barradas
> >>>>
> >>>>
> >>>>
> >>>
> >> Hello,
> >>
> >> This pipe?
> >>
> >>
> >> with(data, tapply(count, Date, mean)) |> as.data.frame()
> >>
> >>
> >> I don't see anything wrong with it. I tried it again just now and it
> >> runs with no problems, as it did before.
> >> One solution is to drop the pipe and separate the instructions.
> >>
> >>
> >> res <- with(data, tapply(count, Date, mean))
> >> res <- as.data.frame(res)
> >>
> >>
> >> But this should be equivalent to the pipe; I cannot think of a way for
> >> these separate instructions to run but not the pipe.
> >>
> >> Hope this helps,
> >>
> >> Rui Barradas
> >>
> >>
> >>
> >
> Hello,
>
> Yes, the problem seems to be the pipe, but there is nothing wrong with
> the code.
> The native pipe operator |> was introduced in R 4.1.0. What is your
> version of R?
>
> You can always avoid the pipe:
>
>
> res <- as.data.frame(with(data, tapply(count, Date, mean)))
>
>
> Hope this helps,
>
> Rui Barradas
>
>
>
>
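
For anyone who reaches this thread with the same parse error: the points above
can be put together in a minimal sketch that avoids the pipe entirely and also
writes both columns to a file, which was the original question. It assumes the
same simulated data used earlier in the thread; the names has_native_pipe,
means, out_file and back are illustrative only, and tempfile() merely stands in
for whatever output path you actually want.

```r
# Simulated data, mirroring the example earlier in the thread
set.seed(2024)
data <- data.frame(
  Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE),
  count = sample(10L, 100L, TRUE)
)

# The native pipe |> needs R >= 4.1.0; older versions stop with a parse error
has_native_pipe <- getRversion() >= "4.1.0"

# Pipe-free version: runs on old and new versions of R alike
means <- with(data, tapply(count, Date, mean))
res <- data.frame(Date = names(means), count = as.vector(means))

# Write both columns to a text file (tempfile() is only a stand-in path)
out_file <- tempfile(fileext = ".txt")
write.table(res, out_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm that both columns were written
back <- read.table(out_file, header = TRUE)
```

Building the data frame from names(means) and as.vector(means) puts the dates in
an ordinary column rather than in the row names, so write.table no longer needs
the row names to carry them. The aggregate() call shown earlier in the thread
should work just as well as a starting point.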
