[R] dplyr - add/expand rows

Bert Gunter bgunter.4567 at gmail.com
Tue Nov 28 23:08:23 CET 2017


Bill et al.:

Yes, I see it now. Thank you for the correction.

-- Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Tue, Nov 28, 2017 at 1:42 PM, William Dunlap <wdunlap at tibco.com> wrote:

> Bert wrote
>   ... Bill's solution seems to be for only one station.
>
> No, it works for any number of stations.
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Sun, Nov 26, 2017 at 11:10 AM, Bert Gunter <bgunter.4567 at gmail.com>
> wrote:
>
>> To David W.'s point about lack of a suitable reprex ("reproducible
>> example"), Bill's solution seems to be for only one station.
>>
>> Here is a reprex and modification that I think does what was requested
>> for multiple stations, again using base R and data frames, not dplyr and
>> tibbles.
>>
>> First the reprex with **two** stations:
>>
>> > d <- data.frame( station = rep(c("one","two"),c(5,4)),
>>                from = c(60,61,71,72,76,60,65,82,83),
>>                 to = c(60,70,71,76,83,64, 81, 82,83),
>>                 record = c("A","B","C","B","D","B","B","D","E"))
>>
>> > d
>>   station from to record
>> 1     one   60 60      A
>> 2     one   61 70      B
>> 3     one   71 71      C
>> 4     one   72 76      B
>> 5     one   76 83      D
>> 6     two   60 64      B
>> 7     two   65 81      B
>> 8     two   82 82      D
>> 9     two   83 83      E
>>
>> ## Now the conversion code using base R, especially by():
>>
>> > out <- by(d, d$station, function(x) with(x, {
>> +    i <- to - from +1
>> +    data.frame(YEAR =sequence(i) -1 +rep(from,i), RECORD =rep(record,i))
>> + }))
>>
>>
>> > out <- data.frame(station = rep(names(out),sapply(out,nrow)),do.call(rbind,out),
>> row.names = NULL)
>>
>>
>> > out
>>    station YEAR RECORD
>> 1      one   60      A
>> 2      one   61      B
>> 3      one   62      B
>> 4      one   63      B
>> 5      one   64      B
>> 6      one   65      B
>> 7      one   66      B
>> 8      one   67      B
>> 9      one   68      B
>> 10     one   69      B
>> 11     one   70      B
>> 12     one   71      C
>> 13     one   72      B
>> 14     one   73      B
>> 15     one   74      B
>> 16     one   75      B
>> 17     one   76      B
>> 18     one   76      D
>> 19     one   77      D
>> 20     one   78      D
>> 21     one   79      D
>> 22     one   80      D
>> 23     one   81      D
>> 24     one   82      D
>> 25     one   83      D
>> 26     two   60      B
>> 27     two   61      B
>> 28     two   62      B
>> 29     two   63      B
>> 30     two   64      B
>> 31     two   65      B
>> 32     two   66      B
>> 33     two   67      B
>> 34     two   68      B
>> 35     two   69      B
>> 36     two   70      B
>> 37     two   71      B
>> 38     two   72      B
>> 39     two   73      B
>> 40     two   74      B
>> 41     two   75      B
>> 42     two   76      B
>> 43     two   77      B
>> 44     two   78      B
>> 45     two   79      B
>> 46     two   80      B
>> 47     two   81      B
>> 48     two   82      D
>> 49     two   83      E
>>
>> Cheers,
>> Bert
>>
>>
>>
>>
>> Bert Gunter
>>
>> "The trouble with having an open mind is that people keep coming along
>> and sticking things into it."
>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>>
>> On Sat, Nov 25, 2017 at 4:49 PM, William Dunlap via R-help <
>> r-help at r-project.org> wrote:
>>
>>> dplyr may have something for this, but in base R I think the following
>>> does
>>> what you want.  I've shortened the name of your data set to 'd'.
>>>
>>> i <- rep(seq_len(nrow(d)), d$YEAR_TO-d$YEAR_FROM+1)
>>> j <- sequence(d$YEAR_TO-d$YEAR_FROM+1)
>>> transform(d[i,], YEAR=YEAR_FROM+j-1, YEAR_FROM=NULL, YEAR_TO=NULL)
>>>
>>>
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>>
>>> On Sat, Nov 25, 2017 at 11:18 AM, Hutchinson, David (EC) <
>>> david.hutchinson at canada.ca> wrote:
>>>
>>> > I have a returned tibble of station operational record similar to the
>>> > following:
>>> >
>>> > > data.collection
>>> > # A tibble: 5 x 4
>>> >   STATION_NUMBER YEAR_FROM YEAR_TO RECORD
>>> >            <chr>     <int>   <int>  <chr>
>>> > 1        07EA001      1960    1960    QMS
>>> > 2        07EA001      1961    1970    QMC
>>> > 3        07EA001      1971    1971    QMM
>>> > 4        07EA001      1972    1976    QMC
>>> > 5        07EA001      1977    1983    QRC
>>> >
>>> > I would like to reshape this to one operational record (row) per year
>>> per
>>> > station. Something like:
>>> >
>>> > 07EA001              1960      QMS
>>> > 07EA001              1961      QMC
>>> > 07EA001              1962      QMC
>>> > 07EA001              1963      QMC
>>> > ...
>>> > 07EA001              1971      QMM
>>> >
>>> > Can this be done in dplyr easily?
>>> >
>>> > Thanks in advance,
>>> >
>>> > David
>>> >
>>> >         [[alternative HTML version deleted]]
>>> >
>>> > ______________________________________________
>>> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> > PLEASE do read the posting guide http://www.R-project.org/
>>> > posting-guide.html
>>> > and provide commented, minimal, self-contained, reproducible code.
>>> >
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posti
>>> ng-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>

	[[alternative HTML version deleted]]



More information about the R-help mailing list