[R] Divide all rows of a data frame by the first row.

Jeff Newmiller jdnewmil at dcn.davis.CA.us
Fri Jun 15 17:36:02 CEST 2012


Since RExcel imports date/time columns as POSIXt, sticking with that is probably simpler in this case.

The key I have found that makes the use of POSIXt types relatively straightforward is to set the TZ environment variable using Sys.setenv() to a value that is compatible with the timezone of the data you are planning to read in and work with. In particular, the "Etc/GMT+n" zones (- for Eastern hemisphere) that avoid daylight savings complications are usually quite straightforward for Excel data.

Also, avoid the use of POSIXlt unless you have a specific need for its capabilities. In particular I try to avoid putting POSIXlt into data frames.
---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
--------------------------------------------------------------------------- 
Sent from my phone. Please excuse my brevity.

"R. Michael Weylandt" <michael.weylandt at gmail.com> wrote:

>Honestly, given what a pain dates are in Excel, I might simply import
>them as strings and do the conversion on the R side of things. Also,
>I'd recommend you use Date objects rather than POSIXct to cut out the
>unnecessary complexity of timezones, DST, etc. [See Gabor's article in
>RNews 4/1 for a fuller treatment]
>
>Best,
>Michael
>
>On Fri, Jun 15, 2012 at 9:08 AM, Keith Weintraub <kw1958 at gmail.com>
>wrote:
>> Jeff,
>>  Thanks. I was able to easily get rid of the do.call code and use the
>t( t(...)) construct.
>>
>> In terms of the dates I have tried your suggestion and I am having an
>odd problem.
>>
>> The dates I included in my previous message were only a subset of the
>actual dates I am using.
>>
>> My dates were imported from Excel via RExcel. Here are 3 of them in
>various formats:
>>
>> Displayed in Excel via one type of formatting: 12/1/2023, 1/1/2024,
>2/1/2024
>>
>> As integers from the "epoch" (which is only day based): 45261, 45292,
>45323
>>
>> In R imported from Excel via RExcel (a little more context here):
>>  "2023-11-30 23:00:00 EST"
>>  "2023-12-31 23:00:00 EST"
>>  "2024-01-31 23:00:00 EST"
>>  "2024-02-29 23:00:00 EST"
>>  "2024-04-01 00:00:00 EDT"
>>  "2024-05-01 00:00:00 EDT"
>>  "2024-06-01 00:00:00 EDT"
>>  "2024-07-01 00:00:00 EDT"
>>  "2024-08-01 00:00:00 EDT"
>>  "2024-09-01 00:00:00 EDT"
>>
>> Notice how March 2024 is skipped. Also notice how the conversion
>process gives some dates an EDT and some an EST tag. I assume that is
>what you were warning me about timezones.
>>
>> In addition when I convert using just the function year (lubridate)
>the result is:
>>   2023 2023 2024 2024 2024 2024 2024 2024 2024 2024
>>
>> Where I was expecting (based on the Excel dates):
>>   2023 2024 2024 2024 2024 2024 2024 2024 2024 2024
>>
>> Any help on how to do this the "right" way would be greatly
>appreciated.
>>
>> I know that Excel has "issues" with regards to dates and there may
>even be issues using RExcel for this type of conversion.
>>
>> Any help would is appreciated.
>>
>> Thanks again,
>> KW
>>
>> PS Here is a dput of the last 10 dates:
>>
>> someDates<-structure(c(1701403200, 1704081600, 1706760000,
>1709265600, 1711944000,
>> 1714536000, 1717214400, 1719806400, 1722484800, 1725163200), class =
>c("POSIXct",
>> "POSIXt"))
>>
>>
>> On Jun 15, 2012, at 1:48 AM, Jeff Newmiller wrote:
>>
>>> a) Avoid mixing Date objects and POSIXt objects. The timezones will
>mess you up in the conversions. Just eliminate the as.Date conversion
>entirely.
>>>
>>> b) By the time you reach the do.call function call, amortsByYears is
>a matrix. While a data.frame is a special kind of list, a matrix is
>not. Hence, the error message.
>>>
>>> You can replace the do.call with
>>>
>>> t( t( amortsByYears )/amortsByYears[1,] )
>>>
>>> which is rather more efficient than do.call anyway.
>>>
>---------------------------------------------------------------------------
>>> Jeff Newmiller                        The     .....       .....  Go
>Live...
>>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.
> Live Go...
>>>                                      Live:   OO#.. Dead: OO#..
> Playing
>>> Research Engineer (Solar/Batteries            O.O#.       #.O#.
> with
>>> /Software/Embedded Controllers)               .OO#.       .OO#.
> rocks...1k
>>>
>---------------------------------------------------------------------------
>>> Sent from my phone. Please excuse my brevity.
>>>
>>> Keith Weintraub <kw1958 at gmail.com> wrote:
>>>
>>>> Folks,
>>>>
>>>> I call the function calcAmorts like so:
>>>>      calcAmorts(prevAm, amort, myDates)
>>>>
>>>> Note that I use the package lubridate.
>>>>
>>>> The last line where do.call is called to first divide all the rows
>by
>>>> the first row and then rbind gives the following error:
>>>> Error in do.call("rbind", apply(amortsByYears, 1, "/",
>amortsByYears[,
>>>> :
>>>>   second argument must be a list
>>>>
>>>> By contrast if I run
>>>> do.call('rbind', apply(amortsByYears, 1, "/", amortsByYears[,1]))
>>>>
>>>> On its own with any kind of numeric data.frame the call works fine.
>>>> That is it divides every row in the data.frame by the first row.
>>>>
>>>> Thanks so much for your time,
>>>> KW
>>>>
>>>> _____________
>>>> My data:
>>>>
>>>> myDates<-structure(c(1338523200, 1341115200, 1343793600,
>1346472000,
>>>> 1349064000,
>>>> 1351742400, 1354334400, 1357012800, 1359691200, 1362110400,
>1364788800,
>>>>
>>>> 1367380800, 1370059200, 1372651200, 1375329600, 1378008000,
>1380600000,
>>>>
>>>> 1383278400, 1385870400), class = c("POSIXct", "POSIXt"))
>>>>
>>>>
>>>>
>>>> amort<-structure(c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>0.0333,
>>>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0333,
>>>> 0, 0, 0.0333, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>>>> 0, 0, 0.0333, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>>>> 0, 0, 0.0333, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>>>> 0.0357142857142857,
>>>> 0, 0, 0, 0, 0.0333, 0, 0, 0.0333, 0, 0, 0.0357142857142857, 0,
>>>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0333, 0, 0, 0, 0,
>>>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0333, 0, 0, 0, 0,
>>>> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0357142857142857, 0, 0, 0, 0, 0.0333,
>>>> 0, 0, 0.0333, 0, 0, 0.0357142857142857, 0, 0, 0, 0, 0, 0, 0,
>>>> 0, 0, 0, 0, 0, 0, 0, 0, 0.0333, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
>>>> 0, 0, 0, 0), .Dim = c(10L, 19L))
>>>>
>>>> prevAm<-c(0, 0, 0, 0.0666, 0, 0.0666, 0.0666, 0, 0, 0)
>>>>
>>>> calcAmorts<-function(prevAmort,  currAmorts, dates) {
>>>>  yrs<-year(as.Date(dates))
>>>>  currAmorts<-data.frame(yrs,t(currAmorts))
>>>> amortsByYears<-aggregate(x = currAmorts, by = list(yrs), FUN =
>>>> "sum")[,-c(1,2)]
>>>>  amortsByYears<-rbind(prevAmort, amortsByYears)
>>>>  amortsByYears<-(1-apply(amortsByYears, 2, cumsum))[-1,]
>>>>
>>>> do.call('rbind', apply(amortsByYears, 1, "/", amortsByYears[,1]))
>>>>
>>>> }
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> ______________________________________________
>> R-help at r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



More information about the R-help mailing list