[R] Is there a funct to sum differences?

arthur brogard abrogard at yahoo.com
Sat Dec 24 21:20:25 CET 2016


Thanks a lot everyone for the help.  I will fall silent for quite some time now as I try to understand what you've put before me.

Yes, my first problem is a lack of understanding of vector and matrix arithmetic.  So I have a bit of a learning curve.

Another mistake I made was not giving you sample input with a variety of numbers in it.  I can see that quite clearly now: it is hard to tell whether a routine is working properly when everything is zeroes. That was unthinking - my usual habit - I simply took the first items from the list because they included the column headers.  I could easily have cut and pasted something else in there.

And I forgot to 'reply everyone' last time, too. Will I never learn..

Jeff, yes your last didn't work for me. I entirely believe that was my implementation of it. With this 'sophisticated' or 'grownup' code I'm largely blundering around in the dark.

Okay, I'll go looking for illumination..

Have a good Xmas.


:)

ab

----- Original Message -----

From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
To: "Fox, John" <jfox at mcmaster.ca>
Cc: arthur brogard <abrogard at yahoo.com>; "r-help at r-project.org" <r-help at r-project.org>
Sent: Sunday, 25 December 2016, 6:06
Subject: RE: [R] Is there a funct to sum differences?

Assuming John's understanding is correct, you can also do this without for 
loops. It takes getting used to vector and matrix arithmetic, which you
can read more about in the "An Introduction to R" document that comes with R, 
or on the R Exercises website [1].

You indicated having a problem with my last reproducible example... it did 
work, if you went through it one step at a time. If you skipped steps, you 
would have problems like you encountered. For completeness, I will give 
the whole reproducible example again here... don't mix in your own steps 
until you have worked through all the steps in this example... or at least 
if you do, go back and step through these steps one at a time if you 
change something that breaks it.

[1] http://r-exercises.com/2015/11/28/matrix-exercises/
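
As a minimal sketch of what "vector arithmetic instead of for loops" means for lagged differences: the vector x and the lag k below are illustrative values, not data from this thread. The loop and the vectorised subtraction should give the same answer as the built-in diff().

```r
# Illustrative values only; x and k are not taken from the thread's data.
x <- c(3, 7, 4, 10, 12)
k <- 2

# Loop version: one subtraction per element.
loop_diff <- numeric(length(x) - k)
for (i in seq_along(loop_diff)) {
  loop_diff[i] <- x[i + k] - x[i]
}

# Vectorised version: drop the first k values, drop the last k values,
# and subtract the two shortened vectors element by element.
vec_diff <- tail(x, -k) - head(x, -k)

# Both agree with each other and with the built-in diff().
stopifnot(isTRUE(all.equal(loop_diff, vec_diff)))
stopifnot(isTRUE(all.equal(vec_diff, diff(x, lag = k))))
```

The vectorised form is shorter, and the same shifting idea carries over to matrices, which is what embed() is used for in the example that follows.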

#########------ begin
rates <- read.table( text =
"Date          Int
Jan-1959        5
Feb-1959        5
Mar-1959        5
Apr-1959        5
May-1959        5
Jun-1959        5
Jul-1959        5
Aug-1959        5
Sep-1959        5
Oct-1959        5
Nov-1959        5
", header = TRUE, colClasses = c( "character", "numeric" ) )

rates$thisone <- c(diff(rates$Int), NA)
rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000

rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
rates$Int2 <- (1:11)^2
rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )

# lag
N <- 5
# see ?embed, or https://en.wikipedia.org/wiki/Embedding
embed( c( rates$Int2, rep( NA, N ) ), N+1 )
# make a matrix of the same size as the embed result
matrix( rep( rates$Int2, N+1 ), ncol=N+1 )
# subtract the first values, using the explicitly replicated matrix
embed( c( rates$Int2, rep( NA, N ) ), N+1 ) -
   matrix( rep( rates$Int2, N+1 ), ncol=N+1 )
# or can rely on automatic replication ... depends on the
# fact that the embed result is a matrix which is really just
# a vector displayed in folded up form
embed( c( rates$Int2, rep( NA, N ) ), N+1 ) - rates$Int2
# anyway, the result can be computed in one line (wrapped for readability)
rates$experiment3 <- rowSums(   embed( c( rates$Int2
                                         , rep( NA, N )
                                         )
                                      , N+1
                                      )
                               - rates$Int2
                             , na.rm=TRUE
                             )
> rates
        Date Int thisone nextone lastone experiment1 Int2 experiment2 experiment3
1  Jan-1959   5       0       0       0          10    1           5          85
2  Feb-1959   5       0       0       0          10    4          13         115
3  Mar-1959   5       0       0       0          10    9          25         145
4  Apr-1959   5       0       0       0          10   16          41         175
5  May-1959   5       0       0       0          10   25          61         205
6  Jun-1959   5       0       0       0          10   36          85         235
7  Jul-1959   5       0       0       0          10   49         113         170
8  Aug-1959   5       0       0       0          10   64         145         110
9  Sep-1959   5       0       0       0          10   81         181          59
10 Oct-1959   5       0      NA      NA          10  100         221          21
11 Nov-1959   5      NA      NA      NA          NA  121          NA           0

#dput(rates)
result <- structure(list(Date = c("Jan-1959", "Feb-1959", "Mar-1959", 
"Apr-1959", "May-1959", "Jun-1959", "Jul-1959", "Aug-1959", "Sep-1959", 
"Oct-1959", "Nov-1959"), Int = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5), thisone 
= c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA), nextone = c(0, 0, 0, 0, 0, 0,
0, 0, 0, NA, NA), lastone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA,
NA), experiment1 = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10,
NA), Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121), experiment2 = 
c(5, 13, 25, 41, 61, 85, 113, 145, 181, 221, NA), experiment3 = c(85,
115, 145, 175, 205, 235, 170, 110, 59, 21, 0)), .Names = c("Date",
"Int", "thisone", "nextone", "lastone", "experiment1", "Int2",
"experiment2", "experiment3"), row.names = c(NA, -11L), class = 
"data.frame")

#########------ end
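
If this needs to be repeated for other columns or other maximum lags, the one-liner can be wrapped in a function. This is only a sketch under stated assumptions: the name sum_lagged_diffs and its argument names are made up for illustration, and it assumes the input itself contains no NA values, since na.rm = TRUE would silently drop them too.

```r
# Hypothetical helper (the name is illustrative, not from any package).
# For each element of x, sum the differences between the next 1..maxlag
# elements and that element. Differences that would run past the end of
# x are dropped (na.rm = TRUE), so the last maxlag results are partial sums.
sum_lagged_diffs <- function(x, maxlag) {
  stopifnot(is.numeric(x), maxlag >= 1)
  padded <- c(x, rep(NA, maxlag))
  # Row i of the embed() result holds x[i + maxlag], ..., x[i + 1], x[i].
  # Subtracting x recycles it down every column of that matrix.
  rowSums(embed(padded, maxlag + 1) - x, na.rm = TRUE)
}

# Same inputs as the experiment3 column above and as John's example.
sum_lagged_diffs((1:11)^2, 5)
sum_lagged_diffs((1:25)^2, 5)
```

The first call should reproduce the experiment3 column, and the second should match the rowSums(diffs) result in John's script quoted below. For a 24-month span, pass maxlag = 24.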


On Sat, 24 Dec 2016, Fox, John wrote:

> Dear Arthur,
>
> Here's a simple script to do what I think you want. I've applied it to a contrived example, a vector of the squares of the integers 1 to 25, and have summed the first 5 differences, but the script is adaptable to any numeric vector and any maximum lag. You'll have to decide what to do with the last maximum-lag (in my case, 5) entries:
>
> -------------- snip ------------
>> (x <- (1:25)^2)
> [1]   1   4   9  16  25  36  49  64  81 100 121 144 169 196 225 256 289 324 361 400 441 484 529 576
> [25] 625
>> len <- length(x)
>> maxlag <- 5
>> diffs <- matrix(0, len, maxlag)
>> for (lag in 1:maxlag){
> +     diffs[1:(len - lag), lag] <- diff(x, lag=lag)
> + }
>> head(diffs)
>     [,1] [,2] [,3] [,4] [,5]
> [1,]    3    8   15   24   35
> [2,]    5   12   21   32   45
> [3,]    7   16   27   40   55
> [4,]    9   20   33   48   65
> [5,]   11   24   39   56   75
> [6,]   13   28   45   64   85
>> tail(diffs)
>      [,1] [,2] [,3] [,4] [,5]
> [20,]   41   84  129  176  225
> [21,]   43   88  135  184    0
> [22,]   45   92  141    0    0
> [23,]   47   96    0    0    0
> [24,]   49    0    0    0    0
> [25,]    0    0    0    0    0
>> rowSums(diffs)
> [1]  85 115 145 175 205 235 265 295 325 355 385 415 445 475 505 535 565 595 625 655 450 278 143  49
> [25]   0
> -------------- snip ------------
>
> The script could very simply be converted into a function if this is a repetitive task with variable inputs.
>
> I hope this helps,
> John
>
> -----------------------------
> John Fox, Professor
> McMaster University
> Hamilton, Ontario
> Canada L8S 4M4
> Web: socserv.mcmaster.ca/jfox
>
>
>
>> -----Original Message-----
>> From: R-help [mailto:r-help-bounces at r-project.org] On Behalf Of arthur
>> brogard via R-help
>> Sent: December 24, 2016 12:29 AM
>> To: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> Cc: r-help at r-project.org
>> Subject: Re: [R] Is there a funct to sum differences?
>>
>> Yes, sorry about that.  I keep making mistakes I shouldn't make.
>>
>> Thanks for the tip about 'reply all', I had no idea.
>>
>> You can ignore the finalone. I have been doing other work on this and it comes
>> from there. I took the example from the R screen after it had run one of these
>> other things that created the finalone.
>>
>> I guess I was thinking that just seeing the data mentioned in the code would be
>> enough.
>>
>> I don't want a function to do the division and multiplication.
>>
>> It's a function that will ".. automatically sum the difference between the first
>> and subsequent to the end of a list? "  that I am looking for.
>>
>> I will try to explain, I know I often don't make myself clear:
>>
>> I'm using this diff() function.
>>
>> This 'diff()' function finds the difference between two adjoining entries and it
>> applies itself to the whole list so that in an instant I can have a list of
>> differences between any two adjoining.
>>
>> Then I can have a list of differences between any two with any specified gap -
>> 'lag' it is called.
>> Using the same function.
>>
>> Now I have them and do that.  Then I add them together to find the 'lastone'
>> which is the total difference for the period.
>>
>>
>> Now here's the point:  that covers a period of two timespans, months, they are.
>>
>>  if I want to cover a span of 24 months, say, then I would have to write this
>> diff() function 24 times.
>>
>>  what I'm doing is finding the difference between the starting point and every
>> other point and then adding them all together.  bit like finding the area
>> beneath the curve maybe.
>>
>>  And that's what I want to do.
>>
>>  :)
>>
>> ----- Original Message -----
>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>> To: arthur brogard <abrogard at yahoo.com>
>> Cc: r-help at r-project.org
>> Sent: Saturday, 24 December 2016, 15:34
>> Subject: Re: [R] Is there a funct to sum differences?
>>
>> You need to "reply all" so other people can help as well, and others can learn
>> from your questions.
>>
>> I am still puzzled by how you expect to compute "finalone". If you had supplied
>> numbers other than all 5's it might have been easier to figure out what is going
>> on.
>>
>> What is your purpose in performing this calculation?
>>
>> #### reproducible code
>> rates <- read.table( text =
>> "Date          Int
>> Jan-1959        5
>> Feb-1959        5
>> Mar-1959        5
>> Apr-1959        5
>> May-1959        5
>> Jun-1959        5
>> Jul-1959        5
>> Aug-1959        5
>> Sep-1959        5
>> Oct-1959        5
>> Nov-1959        5
>> ", header = TRUE, colClasses = c( "character", "numeric" ) )
>>
>> #your code
>> rates$thisone <- c(diff(rates$Int), NA)
>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>> rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>> # I doubt there is a ready-built function that knows you want to
>> # divide by 6.5 or multiply by 1000
>>
>> # form a vector from positions 2:11 and append NA
>> rates$experiment1 <- rates$Int + c( rates$Int[ -1 ], NA )
>> # numbers that are not all the same
>> rates$Int2 <- (1:11)^2
>> rates$experiment2 <- rates$Int2 + c( rates$Int2[ -1 ], NA )
>>
>> # dput(rates)
>> result <- structure(list(Date = c("Jan-1959", "Feb-1959", "Mar-1959",
>> "Apr-1959", "May-1959", "Jun-1959", "Jul-1959", "Aug-1959", "Sep-1959",
>> "Oct-1959", "Nov-1959"), Int = c(5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5),
>> thisone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0, NA),
>> nextone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA),
>> lastone = c(0, 0, 0, 0, 0, 0, 0, 0, 0, NA, NA),
>> Int2 = c(1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121),
>> experiment1 = c(10, 10, 10, 10, 10, 10, 10, 10, 10, 10, NA),
>> experiment2 = c(5, 13, 25, 41, 61, 85, 113, 145, 181, 221, NA)),
>> .Names = c("Date", "Int", "thisone", "nextone", "lastone", "Int2",
>> "experiment1", "experiment2"), row.names = c(NA, -11L),
>> class = "data.frame")
>>
>> On Sat, 24 Dec 2016, arthur brogard wrote:
>>
>>>
>>>
>>> Yes, sure, thanks for your interest.  I apologise for not submitting in the
>>> correct manner.  I'll learn (I hope).
>>>
>>> Here's the source - a spreadsheet with just two columns, date and 'Int'.
>>>
>>>
>>> Date    Int
>>> Jan-1959    5
>>> Feb-1959    5
>>> Mar-1959    5
>>> Apr-1959    5
>>> May-1959    5
>>> Jun-1959    5
>>> Jul-1959    5
>>> Aug-1959    5
>>> Sep-1959    5
>>> Oct-1959    5
>>> Nov-1959    5
>>>
>>>
>>> After processing it becomes this:
>>>
>>>
>>> > rates
>>>         Date  Int thisone nextone  lastone finalone
>>> 1 1959-01-01 5.00    0.00    0.00 0.000000       10
>>> 2 1959-02-01 5.00    0.00    0.00 0.000000       10
>>> 3 1959-03-01 5.00    0.00    0.00 0.000000       10
>>> 4 1959-04-01 5.00    0.00    0.00 0.000000       10
>>> 5 1959-05-01 5.00    0.00    0.00 0.000000       10
>>> 6 1959-06-01 5.00    0.00    0.00 0.000000       10
>>>
>>> The one long column I'm referring to is the 'Int' column which R has imported.
>>>
>>> The actual code is:
>>>
>>>
>>> rates <- read.csv("Rates2.csv",header =
>>> TRUE,colClasses=c("character","numeric"))
>>>
>>> sapply(rates,class)
>>>
>>> rates$Date <- strptime(paste0("1-", rates$Date), format="%d-%b-%Y",
>>> tz="UTC")
>>>
>>>
>>> rates$thisone <- c(diff(rates$Int), NA)
>>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>>> rates$lastone <- (rates$thisone + rates$nextone)/6.5*1000
>>>
>>>
>>> rates
>>>
>>>
>>>
>>> ab
>>>
>>>
>>>
>>> ----- Original Message -----
>>> From: Jeff Newmiller <jdnewmil at dcn.davis.ca.us>
>>> To: arthur brogard <abrogard at yahoo.com>; arthur brogard via R-help
>>> <r-help at r-project.org>; "r-help at r-project.org" <r-help at r-project.org>
>>> Sent: Saturday, 24 December 2016, 13:25
>>> Subject: Re: [R] Is there a funct to sum differences?
>>>
>>> Could you make your example reproducible? That is, include some sample
>>> input and output. You talk about a column of numbers and then you seem to
>>> work with named lists and I can't reconcile your words with the code I see.
>>> --
>>> Sent from my phone. Please excuse my brevity.
>>>
>>>
>>> On December 23, 2016 3:40:18 PM PST, arthur brogard via R-help
>>> <r-help at r-project.org> wrote:
>>>> I've been looking but I can't find a function to sum difference.
>>>>
>>>> I have this code:
>>>>
>>>>
>>>> rates$thisone <- c(diff(rates$Int), NA)
>>>> rates$nextone <- c(diff(rates$Int, lag=2), NA, NA)
>>>> rates$lastone <- (rates$thisone + rates$nextone)
>>>>
>>>>
>>>> It is looking down one long column of numbers.
>>>>
>>>> It sums the difference between the first two and then between the
>>>> first and third and so on.
>>>>
>>>> Can it be made to automatically sum the difference between the first
>>>> and subsequent to the end of a list?
>>>>
>>>> ______________________________________________
>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>> PLEASE do read the posting guide
>>>> http://www.R-project.org/posting-guide.html
>>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> ---------------------------------------------------------------------------
>> Jeff Newmiller                        The     .....       .....  Go Live...
>> DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
>>                                        Live:   OO#.. Dead: OO#..  Playing
>> Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
>> /Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

>

---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnewmil at dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                       Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k


