[R] Reduce woes

Stefan Kruger stefan.kruger at gmail.com
Fri Jul 29 17:43:16 CEST 2016


>> I still don't understand why you want Reduce to do lapply's
>> job.   Reduce maps many to one and lapply maps many to
>> many.

Say you want to map a function over a subset of a vector or list. With the
generalised version of Reduce you map many-to-one, but the 'one' can be a
complex structure. lapply() and friends map not only many-to-many but
X-to-X: the resulting list has the same length as the source. The
generalised reduce is frequently used in Elixir, Erlang, Haskell etc. as a
means of processing a pipeline or stream: start with a vector, select a
subset based on some predicate, and turn this subset into an entirely different
object/list/
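
For comparison, the select-then-transform pipeline described above could be
sketched in base R with Filter() followed by vapply(). This is only a rough
illustration: the variable names and the "itemN" naming scheme are made up
for the sketch and do not come from the thread.

```r
# Same input as the pseudo code in this message; called `src` here so it
# does not shadow base R's source() function.
src <- list(c(1, 2, 3, 4), c(8, 7, 6, 5, 4, 3, 7), c(5, 4))

# Step 1: keep only the elements that satisfy the predicate.
long_ones <- Filter(function(item) length(item) > 2, src)

# Step 2: turn the surviving subset into a different structure,
# here a named integer vector of lengths (names are hypothetical).
lens <- vapply(long_ones, length, integer(1))
names(lens) <- paste0("item", seq_along(lens))

print(lens)
```

Note that this takes two passes and an intermediate list, whereas a single
Reduce() call folds the filter and the transformation into one step.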

In iterative-fashion pseudo code

source = list(c(1,2,3,4), c(8,7,6,5,4,3,7), c(5,4))
result = { }
foreach (item in source) {
    if (length(item) > 2) {
        result[generate_some_name()] = length(item)
    }
}

That's an example of what I want to do. It maps many (a subset of the
vectors in source) to one (the named list result). It's a map-filter, but
more general than a typical map-filter in that you can change the
data structure: e.g. map a function over a vector, use a subset of the
results, and turn those into a list or S3 object.
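
A minimal sketch of how the loop above might be written with Reduce() and a
named-list accumulator. It makes two assumptions that are not in the pseudo
code: the input list is given names (a, b, c) so there is something to key
the result on, and the reduction runs over names(src) rather than over the
values, because Reduce() extracts elements with [[ and so never sees their
names (as pointed out further down the thread).

```r
# Input from the pseudo code, with illustrative names added.
src <- list(a = c(1, 2, 3, 4), b = c(8, 7, 6, 5, 4, 3, 7), c = c(5, 4))

# Fold over the names; look each vector up inside the function.
result <- Reduce(function(acc, name) {
    item <- src[[name]]
    if (length(item) > 2) {
        acc[[name]] <- length(item)  # grow the named-list accumulator
    }
    acc  # always return the accumulator, also when the item is skipped
}, names(src), list())

str(result)
```

Returning acc unconditionally is the important part: if the function returns
nothing for a skipped item, the accumulator becomes NULL and everything
collected so far is lost.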


Stefan



On 29 July 2016 at 15:54, William Dunlap <wdunlap at tibco.com> wrote:

> Reduce (like lapply) apparently uses the [[ operator to
> extract components from the list given to it. X[[i]] does
> not attach names(X)[i] to its output (where would it put it?).
> Hence your se
>
> To help understand what these functions are doing try
> putting print statements in your test functions:
> > data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
> > r <- Reduce(function(acc, item) { cat("acc="); str(acc); cat("item="); str(item); length(item) }, data, init=list())
> acc= list()
> item= num [1:2] 1 1
> acc= int 2
> item= num 3
> acc= int 1
> item= num [1:2] 2 2
> > data2 <- list(one = c(oneA=1, onB=1), three = c(threeA=3), two = c(twoA=2, twoB=2))
> > r <- Reduce(function(acc, item) { cat("acc="); str(acc); cat("item="); str(item); length(item) }, data2, init=list())
> acc= list()
> item= Named num [1:2] 1 1
>  - attr(*, "names")= chr [1:2] "oneA" "onB"
> acc= int 2
> item= Named num 3
>  - attr(*, "names")= chr "threeA"
> acc= int 1
> item= Named num [1:2] 2 2
>  - attr(*, "names")= chr [1:2] "twoA" "twoB"
>
>
> I still don't understand why you want Reduce to do lapply's
> job.   Reduce maps many to one and lapply maps many to
> many.
>
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Jul 29, 2016 at 1:37 AM, Stefan Kruger <stefan.kruger at gmail.com>
> wrote:
>
>> Jeremiah -
>>
>> neat - that's one step closer, but one small thing I still don't
>> understand:
>>
>> > data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
>> > r = Reduce(function(acc, item) { append(acc, setNames(length(item), names(item))) }, data, list())
>> > str(r)
>> List of 3
>>  $ : int 2
>>  $ : int 1
>>  $ : int 2
>>
>> I wanted the names to remain, but it seems like the "data" parameter loses
>> its names when consumed by the Reduce()? If I print "item" inside the
>> reducing function, it's not got the names. I'm probably missing some
>> central tenet of R here.
>>
>> As to your comment of this being lapply() implemented by Reduce() - as I
>> understand lapply() (or map() in other functional languages), it's limited
>> to returning a list/vector of the same length as the original. Consider
>> this contrived example:
>>
>> > r = Reduce(function(acc, item) { if (length(item) > 1) {append(acc, setNames(length(item), names(item)))} }, data, list())
>> > str(r)
>>  int 2
>> > r
>> [1] 2
>>
>> I don't think you could achieve that with lapply()?
>>
>> Thanks
>>
>> Stefan
>>
>>
>> On 28 July 2016 at 20:19, jeremiah rounds <roundsjeremiah at gmail.com>
>> wrote:
>>
>> > Basically using Reduce as an lapply in that example, but I think that was
>> > caused by how people started talking about things in the first place =) But
>> > the point is the accumulator can be anything as far as I can tell.
>> >
>> > On Thu, Jul 28, 2016 at 12:14 PM, jeremiah rounds <
>> > roundsjeremiah at gmail.com> wrote:
>> >
>> >> Re:
>> >> "What I'm trying to
>> >> work out is how to have the accumulator in Reduce not be the same type as
>> >> the elements of the vector/list being reduced - ideally it could be an S3
>> >> instance, list, vector, or data frame."
>> >>
>> >> Pretty sure that is not true.  See code that follows.  I would never
>> >> solve this task in this way though so no comment on the use of Reduce for
>> >> what you described.  (Note the accumulation of "functions" in a list is
>> >> just a demo of possibilities).  You could accumulate in an environment too
>> >> and potentially gain a lot of copy efficiency.
>> >>
>> >>
>> >> lookup = list()
>> >> lookup[[as.character(1)]] = function() print("1")
>> >> lookup[[as.character(2)]] = function() print("2")
>> >> lookup[[as.character(3)]] = function() print("3")
>> >>
>> >> data = list(c(1,2), c(1,4), c(3,3), c(2,30))
>> >>
>> >>
>> >> r = Reduce(function(acc, item) {
>> >>     append(acc, list(lookup[[as.character(min(item))]]))
>> >> }, data, list())
>> >> r
>> >> for(f in r) f()
>> >>
>> >>
>> >> On Thu, Jul 28, 2016 at 5:09 AM, Stefan Kruger <
>> stefan.kruger at gmail.com>
>> >> wrote:
>> >>
>> >>> Ulrik - many thanks for your reply.
>> >>>
>> >>> I'm aware of many simple solutions such as the one you suggest, both iterative
>> >>> and functional style - but I'm trying to learn how to bend Reduce() for the
>> >>> purpose of using it in more complex processing tasks. What I'm trying to
>> >>> work out is how to have the accumulator in Reduce not be the same type as
>> >>> the elements of the vector/list being reduced - ideally it could be an S3
>> >>> instance, list, vector, or data frame.
>> >>>
>> >>> Here's a more realistic example (in Elixir, sorry)
>> >>>
>> >>> Given two lists:
>> >>>
>> >>> 1. data: maps an id string to a vector of revision strings
>> >>> 2. dict: maps known id/revision pairs as a string to true (or 1)
>> >>>
>> >>> find the items in data not already in dict, returned as a named list.
>> >>>
>> >>> ```elixir
>> >>> data = %{
>> >>>     "id1" => ["rev1.1", "rev1.2"],
>> >>>     "id2" => ["rev2.1"],
>> >>>     "id3" => ["rev3.1", "rev3.2", "rev3.3"]
>> >>> }
>> >>>
>> >>> dict = %{
>> >>>     "id1/rev1.1" => 1,
>> >>>     "id1/rev1.2" => 1,
>> >>>     "id3/rev3.1" => 1
>> >>> }
>> >>>
>> >>> # Find the items in data not already in dict. Return as a grouped map
>> >>>
>> >>> Map.keys(data)
>> >>>     |> Enum.flat_map(fn id -> Enum.map(data[id], fn rev -> {id, rev} end) end)
>> >>>     |> Enum.filter(fn {id, rev} -> !Dict.has_key?(dict, "#{id}/#{rev}") end)
>> >>>     |> Enum.reduce(%{}, fn ({k, v}, d) -> Map.update(d, k, [v], &[v|&1]) end)
>> >>> ```
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On 28 July 2016 at 12:03, Ulrik Stervbo <ulrik.stervbo at gmail.com>
>> wrote:
>> >>>
>> >>> > Hi Stefan,
>> >>> >
>> >>> > in that case, lapply(data, length) should do the trick.
>> >>> >
>> >>> > Best wishes,
>> >>> > Ulrik
>> >>> >
>> >>> > On Thu, 28 Jul 2016 at 12:57 Stefan Kruger <stefan.kruger at gmail.com
>> >
>> >>> > wrote:
>> >>> >
>> >>> >> David - many thanks for your response.
>> >>> >>
>> >>> >> What I tried to do was to turn
>> >>> >>
>> >>> >> data <- list(one = c(1, 1), three = c(3), two = c(2, 2))
>> >>> >>
>> >>> >> into
>> >>> >>
>> >>> >> result <- list(one = 2, three = 1, two = 2)
>> >>> >>
>> >>> >> that is creating a new list which has the same names as the first, but
>> >>> >> where the values are the vector lengths.
>> >>> >>
>> >>> >> I know there are many other (and better) trivial ways of achieving this -
>> >>> >> my aim is less the task itself, and more figuring out if this can be done
>> >>> >> using Reduce() in the fashion I showed in the other examples I gave. It's a
>> >>> >> building block of doing map-filter-reduce type pipelines that I'd like to
>> >>> >> understand how to do in R.
>> >>> >>
>> >>> >> Fumbling in the dark, I tried:
>> >>> >>
>> >>> >> Reduce(function(acc, item) { setNames(c(acc, length(data[item])), item) },
>> >>> >> names(data), accumulate=TRUE)
>> >>> >>
>> >>> >> but setNames sets all the names, not adding one - and acc is still
>> a
>> >>> >> vector, not a list.
>> >>> >>
>> >>> >> It looks like 'lambda.tools::fold()' and possibly 'purrr::reduce()' aim at
>> >>> >> doing what I'd like to do - but I've not been able to figure out quite how.
>> >>> >>
>> >>> >> Thanks
>> >>> >>
>> >>> >> Stefan
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On 27 July 2016 at 20:35, David Winsemius <dwinsemius at comcast.net>
>> >>> wrote:
>> >>> >>
>> >>> >> >
>> >>> >> > > On Jul 27, 2016, at 8:20 AM, Stefan Kruger <
>> >>> stefan.kruger at gmail.com>
>> >>> >> > wrote:
>> >>> >> > >
>> >>> >> > > Hi -
>> >>> >> > >
>> >>> >> > > I'm new to R.
>> >>> >> > >
>> >>> >> > > In other functional languages I'm familiar with you can often seed a call
>> >>> >> > > to reduce() with a custom accumulator. Here's an example in Elixir:
>> >>> >> > >
>> >>> >> > > map = %{"one" => [1, 1], "three" => [3], "two" => [2, 2]}
>> >>> >> > > map |> Enum.reduce(%{}, fn ({k,v}, acc) -> Map.update(acc, k, Enum.count(v), nil) end)
>> >>> >> > > # %{"one" => 2, "three" => 1, "two" => 2}
>> >>> >> > >
>> >>> >> > > In R-terms that's reducing a list of vectors to become a new list mapping
>> >>> >> > > the names to the vector lengths.
>> >>> >> > >
>> >>> >> > > Even in JavaScript, you can do similar things:
>> >>> >> > >
>> >>> >> > > list = { one: [1, 1], three: [3], two: [2, 2] };
>> >>> >> > > var result = Object.keys(list).reduceRight(function (acc, item) {
>> >>> >> > >  acc[item] = list[item].length;
>> >>> >> > >  return acc;
>> >>> >> > > }, {});
>> >>> >> > > // result == { two: 2, three: 1, one: 2 }
>> >>> >> > >
>> >>> >> > > In R, from what I can gather, Reduce() is restricted such that any init
>> >>> >> > > value you feed it is required to be of the same type as the elements of the
>> >>> >> > > vector you're reducing -- so I can't build up. So whilst I can do, say
>> >>> >> > >
>> >>> >> > >> Reduce(function(acc, item) { acc + item }, c(1,2,3,4,5), 96)
>> >>> >> > > [1] 111
>> >>> >> > >
>> >>> >> > > I can't use Reduce to build up a list, vector or data frame?
>> >>> >> > >
>> >>> >> > > What am I missing?
>> >>> >> > >
>> >>> >> > > Many thanks for any pointers,
>> >>> >> >
>> >>> >> > This builds a list:
>> >>> >> >
>> >>> >> > > Reduce(function(acc, item) { c(acc , item) }, c(1,2,3,4,5), 96, accumulate=TRUE)
>> >>> >> > [[1]]
>> >>> >> > [1] 96
>> >>> >> >
>> >>> >> > [[2]]
>> >>> >> > [1] 96  1
>> >>> >> >
>> >>> >> > [[3]]
>> >>> >> > [1] 96  1  2
>> >>> >> >
>> >>> >> > [[4]]
>> >>> >> > [1] 96  1  2  3
>> >>> >> >
>> >>> >> > [[5]]
>> >>> >> > [1] 96  1  2  3  4
>> >>> >> >
>> >>> >> > [[6]]
>> >>> >> > [1] 96  1  2  3  4  5
>> >>> >> >
>> >>> >> > But you are not saying what you want. The other examples were doing
>> >>> >> > something with names but you provided no names for the R example.
>> >>> >> >
>> >>> >> > This would return a list of named vectors:
>> >>> >> >
>> >>> >> > > Reduce(function(acc, item) { setNames( c(acc,item), 1:(item+1)) }, c(1,2,3,4,5), 96, accumulate=TRUE)
>> >>> >> > [[1]]
>> >>> >> > [1] 96
>> >>> >> >
>> >>> >> > [[2]]
>> >>> >> >  1  2
>> >>> >> > 96  1
>> >>> >> >
>> >>> >> > [[3]]
>> >>> >> >  1  2  3
>> >>> >> > 96  1  2
>> >>> >> >
>> >>> >> > [[4]]
>> >>> >> >  1  2  3  4
>> >>> >> > 96  1  2  3
>> >>> >> >
>> >>> >> > [[5]]
>> >>> >> >  1  2  3  4  5
>> >>> >> > 96  1  2  3  4
>> >>> >> >
>> >>> >> > [[6]]
>> >>> >> >  1  2  3  4  5  6
>> >>> >> > 96  1  2  3  4  5
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> >
>> >>> >> > > Stefan
>> >>> >> > >
>> >>> >> > >
>> >>> >> > >
>> >>> >> > > --
>> >>> >> > > Stefan Kruger <stefan.kruger at gmail.com>
>> >>> >> > >
>> >>> >> > >       [[alternative HTML version deleted]]
>> >>> >> > >
>> >>> >> > > ______________________________________________
>> >>> >> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
>> see
>> >>> >> > > https://stat.ethz.ch/mailman/listinfo/r-help
>> >>> >> > > PLEASE do read the posting guide
>> >>> >> > http://www.R-project.org/posting-guide.html
>> >>> >> > > and provide commented, minimal, self-contained, reproducible
>> code.
>> >>> >> >
>> >>> >> > David Winsemius
>> >>> >> > Alameda, CA, USA
>> >>> >> >
>> >>> >> >
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >>>
>> >>
>> >>
>> >
>>
>>
>>
>
>


-- 
Stefan Kruger <stefan.kruger at gmail.com>
