I'm trying to convert an array of strings into an object for which each member uses the string for a key, and initializes the value to 0. (Classic accumulator for Word Count, right?)

Here's the style of the input data:

%dw 2.0
output application/dw
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]

To get the accumulator, I need a structure in this style:

var histogram_acc = {
    "t.me" : 1, 
    "thewholeshebang.com" : 1, 
    "thegothicparty.com" : 2, 
    "windowdressing.com" : 1
}

My thought was that this is a slam-dunk case for reduce(), right?

So to get the de-duplicated list of hosts, we can use this phrase:

hosts distinctBy $

Happy so far. But now for me, it turns wicked.

I thought this might be the gold:

hosts distinctBy $ reduce (ep,acc={}) -> acc ++ {ep: 0}

But this didn't work out so well. The first argument to the lambda for reduce() is the element being iterated over, in this case the endpoint or address. The lambda appends a new single-key object to the accumulator.

Well, that's how I hoped it would happen, but I got this instead:

{
  ep: 0,
  ep: 0,
  ep: 0,
  ep: 0
}

I kind of need it to do better than that.

agentv

3 Answers

As you said, reduce is a good fit for this problem. Alternatively, you can use the "dynamic elements" feature of objects to flatten an array of objects into a single object:

%dw 2.0
output application/dw
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
---
{(
    hosts 
        distinctBy $ 
        map (ep) -> {"$ep": 0}
)}

See https://docs.mulesoft.com/mule-runtime/4.3/dataweave-types#dynamic_elements
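
To see the dynamic elements feature in isolation, here is a minimal sketch. The variable pairs and its contents are hypothetical sample data, not part of the question:

```dataweave
%dw 2.0
output application/json
// Hypothetical sample data, used only to illustrate the feature.
var pairs = [{a: 1}, {b: 2}]
---
// Wrapping an array of objects in {( ... )} merges their key-value
// pairs into the enclosing object instead of nesting the array.
{ (pairs) }
```

This should evaluate to a single object with the keys a and b, which is the same mechanism the script above relies on after the map step.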

Shoki


Scenario 1: I think the trick here is to enclose the distinctBy ... map expression in {}.

Example:

Input:

%dw 2.0
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
output application/json
---
{ // This enclosing brace pair does the trick.
  (hosts distinctBy $ map {($):0})
} // See Scenario 2 for what happens if you remove or comment out this pair of braces

Output:

{
    "t.me": 0,
    "thewholeshebang.com": 0,
    "thegothicparty.com": 0,
    "windowdressing.com": 0
}

Scenario 2: If you remove the enclosing {} from the expression { (hosts distinctBy ... map ...) }, the output will be an Array of single-key objects.

Example:

Input:

%dw 2.0
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
output application/json
---
//{ // This is now commented
  (hosts distinctBy $ map {($):0})
//} // This is now commented

Output:

[
    {
      "t.me": 0
    },
    {
      "thewholeshebang.com": 0
    },
    {
      "thegothicparty.com": 0
    },
    {
      "windowdressing.com": 0
    }
]

Scenario 3: If you want to count how many times each item occurs, you can use groupBy and sizeOf.

Example:

Input:

%dw 2.0
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
output application/json
---
hosts groupBy $ mapObject (value,key) -> {
    (key): sizeOf(value)
}

Output:

{
  "t.me": 1,
  "thewholeshebang.com": 1,
  "thegothicparty.com": 2,
  "windowdressing.com": 1
}
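
If you would rather stay with reduce for the counts, here is a minimal sketch. It is not benchmarked and is only one of several ways to write it. It counts occurrences with filter and sizeOf for each distinct host, so it rescans hosts once per distinct value:

```dataweave
%dw 2.0
output application/json
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
---
// For each distinct host, count how many entries in hosts equal it
// and append that key-value pair to the accumulator object.
hosts distinctBy $ reduce (ep, acc = {}) ->
    acc ++ {(ep): sizeOf(hosts filter ($ == ep))}
```

Because of the repeated filter pass, I would expect the groupBy version above to be the better choice for large inputs, but that is an assumption worth measuring.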
Ray A

Okay, that third example looks outlandishly efficient. I'd love to compare it with reduce at scale to see if there's a performance difference. – agentv Sep 09 '20 at 19:49

Hilariously (but perhaps only to me), I discovered the answer to this while I was writing my question. In the hope that someone else will have this same question, here is what I found.

In order to use the lambda argument in my example (ep) as the key in an object, I must quote and interpolate it. A bare ep in key position is taken as the literal key name, which is why every key came out as ep.

"$ep"

Once I did that, it was a quick passage to:

hosts distinctBy $ reduce (ep,acc={}) -> acc ++ {"$ep": 0}

...and then of course this:

{
  "t.me": 0,
  "thewholeshebang.com": 0,
  "thegothicparty.com": 0,
  "windowdressing.com": 0
}
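
For reference, here is a sketch of the complete script. It uses the parenthesized key form (ep) that appears in the other answers instead of string interpolation; as far as I can tell, both forms give the same object here:

```dataweave
%dw 2.0
output application/json
var hosts = [
  "t.me",
  "thewholeshebang.com",
  "thegothicparty.com",
  "windowdressing.com",
  "thegothicparty.com"
]
---
// (ep) evaluates the expression and uses its value as the key;
// a bare ep would be taken as the literal key name "ep".
hosts distinctBy $ reduce (ep, acc = {}) -> acc ++ {(ep): 0}
```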
agentv

Looking at this again, I realize that I didn't need to use reduce in this context. I guess I was just on a roll. Looking at responses above, I realize that it should probably have been map instead of reduce for this job. – agentv Sep 09 '20 at 19:55