5

Given this array containing JavaScript objects (JSON): each object has a `b` property and a `u` property (each also contains additional properties I am not concerned with for this exercise).

[
    { "b": "A", "u": "F", ... },
    { "b": "M", "u": "T", ... },
    { "b": "A", "u": "F", ... },
    { "b": "M", "u": "T", ... },
    { "b": "M", "u": "T", ... },
    { "b": "X", "u": "Y", ... },
    { "b": "X", "u": "G", ... },
]

I would like to use Ramda to find a set of all the duplicates. The result should look something like this:

[ 
    { "b": "A", "u":"F" },
    { "b": "M", "u":"T" } 
]

These two entries have duplicates: they are repeated 2 and 3 times, respectively, in the original list.

Edit

I have found a solution using Underscore that keeps the original array elements and splits them perfectly into singles and duplicates. I prefer Ramda.js, though, and the Underscore approach doesn't just give a set of duplicates as the question asks, so I am leaving the question open until someone can answer using Ramda. I am moving on with Underscore in the meantime.
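Something along these lines is possible with Underscore (a sketch only, not the exact solution I found; `data` stands for the array above, and the JSON.stringify key assumes consistent property order):

const _ = require('underscore');

const key = o => JSON.stringify(_.pick(o, 'b', 'u'));  // build a comparable key from "b" and "u"
const counts = _.countBy(data, key);                   // how many times each pair occurs
const [duplicates, singles] = _.partition(data, o => counts[key(o)] > 1);
// duplicates keeps every original element whose pair occurs more than once; singles keeps the rest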

I have a repl that finds the unique values... as a start...
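That starting point presumably looks something like this (a sketch, assuming the array is called `data` and Ramda is loaded as `R`):

const { project, uniq } = R;

// Reduce each object to its "b"/"u" pair, then drop exact repeats
const uniquePairs = uniq(project(['b', 'u'], data));
//=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}, {b: 'X', u: 'Y'}, {b: 'X', u: 'G'}]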

Jim
  • Do you want complete duplicates or only those that match on `b` and `u`? – Scott Sauyet Sep 05 '17 at 19:25
  • no, only the matching fields, "b" and "u" - though for interest - it would be nice to know. I suspect that R.equals might cater for all being equal. – Jim Sep 05 '17 at 19:55
  • 1
    I have spent a whole lot more time trying to resolve this using R.head, and R.tail - have come to the conclusion that this is a m*** of a tricksy question... Trying to somehow iterate over the unique list, and remove one match from the data for each match in unique seems like the right approach... but I haven't managed to get the composition correct yet. – Jim Sep 06 '17 at 12:47

4 Answers

3

This seems overcomplicated and unlikely to be performant, but one option would be this:

const { pipe, project, reduce, contains, append, prop } = R; // pull functions from R if not in the Ramda REPL

const foo = pipe(
  project(['b', 'u']),
  reduce(
    ({results, foundOnce}, item) => contains(item, results)
      ? {results, foundOnce}                                 // already recorded as a duplicate
      : contains(item, foundOnce)
        ? {results: append(item, results), foundOnce}        // second sighting: record it
        : {results, foundOnce: append(item, foundOnce)},     // first sighting
    {results: [], foundOnce: []}
  ),
  prop('results')
)

foo(xs); //=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}]

Perhaps this version is easier to understand, but it takes an extra iteration through the data:

const { pipe, project, reduce, contains, append, prop, uniq } = R;

const foo = pipe(
  project(['b', 'u']),
  reduce(
    ({results, foundOnce}, item) => contains(item, foundOnce)
        ? {results: append(item, results), foundOnce}        // seen before: it's a duplicate
        : {results, foundOnce: append(item, foundOnce)},     // first sighting
    {results: [], foundOnce: []}
  ),
  prop('results'),
  uniq
)

repl here

Scott Sauyet
  • 1
    Sorry but I can't check your results right now - I've been out celebrating - but I have to completely admire the kind of twisted mind that came up with this... I'll check it in the morning and upvote as best I can. You rock! - FYI I am more interested in the solution than performance - the list of matches is very short. – Jim Sep 06 '17 at 20:56
1

If you don't care about looping over your data multiple times, you could do something like this:

  • Create partial copies that contain only the relevant props, using pick (your own idea)
  • Use groupBy with a hash function to group similar objects. (Alternatively: sort first and use groupWith(equals); a sketch of that variant follows the demo below.)
  • Get the grouped arrays using values
  • Filter out arrays with only 1 item (those are not duped...) using filter
  • Map over the results and return the first element of each array using map(head)

In code:

const { compose, lt, length, pipe, map, pick, groupBy, values, filter, head } = R;

const containsMoreThanOne = compose(lt(1), length); // length > 1
const hash = JSON.stringify; // Naive.. watch out for key-order!

const getDups = pipe(
  map(pick(["b", "u"])),
  groupBy(hash),
  values,
  filter(containsMoreThanOne),
  map(head)
);

getDups(data);

Working demo in Ramda REPL.
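For completeness, here is a sketch of the groupWith(equals) variant mentioned in the list above; it is my own reading of that suggestion, and the JSON.stringify sort key shares the key-order caveat:

const { pipe, map, pick, sortBy, groupWith, equals, filter, head } = R;

const getDupsSorted = pipe(
  map(pick(["b", "u"])),
  sortBy(JSON.stringify),        // bring equal pairs next to each other
  groupWith(equals),             // group adjacent equal objects
  filter(g => g.length > 1),     // keep groups with more than one member
  map(head)
);

getDupsSorted(data); //=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}]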

A more hybrid approach would be to cram all this logic into one reducer, but it looks kind of messy to me...

const { pick, map, reduce } = R;

const clean = pick(["b", "u"]);
const hash = JSON.stringify;

const dupReducer = hash => (acc, o) => {
  const h = hash(o);
  // Mutate internal state
  acc.done[h] = (acc.done[h] || 0) + 1;
  if (acc.done[h] === 2) acc.result.push(o); // push only on the second sighting

  return acc;
};

const getDups = (clean, hash, data) =>
  reduce(dupReducer(hash), { result: [], done: { } }, map(clean, data)).result;

getDups(clean, hash, data);

REPL

user3297291
0
const { values, pick, indexOf, lastIndexOf, zipObj } = R;

const arr = [];
const duplicates = [];
const values1 = [
  { b: 'A', u: 'F', a: 'q' },
  { b: 'M', u: 'T', a: 'q' },
  { b: 'A', u: 'F', a: 'q' },
  { b: 'M', u: 'T', a: 'q' },
  { b: 'M', u: 'T', a: 'q' },
  { b: 'X', u: 'Y', a: 'q' },
  { b: 'X', u: 'G', a: 'q' },
];

// Collect the ["b", "u"] value pairs of every object
values1.forEach(eachValue => {
  arr.push(values(pick(['b', 'u'], eachValue)));
});

// A pair is duplicated when its first and last positions differ
// (note: this pushes one entry per occurrence, not a deduplicated set)
arr.forEach(fish => {
  if (indexOf(fish, arr) !== lastIndexOf(fish, arr)) {
    duplicates.push(zipObj(['b', 'u'], fish));
  }
});

<https://ramdafunctionsexamples.com/>
-1

Not an expert with Ramda, but I think the following should work:

var p = [
    { "b": "A", "u": "F" },
    { "b": "A", "u": "F" },
    { "b": "A", "u": "F" },
    { "b": "A", "u": "F" },
    { "b": "A", "u": "F" },
    { "b": "M", "u": "T" }
];
var dupl = n => n > 1;
R.compose(
    R.map(JSON.parse),
    R.keys,
    R.filter(dupl),
    R.countBy(String),
    R.map(JSON.stringify)
)(p)

Please let me know if it does.

trk
  • updated answer to call the right R method (`uniqBy`) – trk Sep 05 '17 at 18:00
  • 1
    I believe the question is asking to find the set of elements that have a duplicate. – Steven Goodman Sep 05 '17 at 18:05
  • 2
    @StevenGoodman indeed . Let me fix the answer. Thanks for pointing out. – trk Sep 05 '17 at 18:08
  • @82Tuskers nice try, but I only want to compare on "b" and "u", as I mentioned, there are other fields. – Jim Sep 05 '17 at 20:01
  • If you only want to compare on these fields, and also only want to return these ones, then you could fix this by starting the pipeline with `project(['b', 'u'])`. I'm not fond of the `JSON.parse`/`JSON.stringify` solution, though. They may not be appropriate to your data. – Scott Sauyet Sep 05 '17 at 20:38
  • I spent an hour trying with R.pick, and got close, but I have to move on, I have also edited the question for clarity. – Jim Sep 06 '17 at 04:47
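A sketch of the fix suggested in the comments above, starting the pipeline with project(['b', 'u']) so only those fields are compared and returned (this is not from the original answer; `data` is the question's array, including the extra fields):

const dupl = n => n > 1;

const findDups = R.compose(
    R.map(JSON.parse),
    R.keys,
    R.filter(dupl),
    R.countBy(String),
    R.map(JSON.stringify),
    R.project(['b', 'u'])   // compare only the "b" and "u" fields
);

findDups(data); //=> [{b: 'A', u: 'F'}, {b: 'M', u: 'T'}]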