5

I am trying to write a function that iterates through a list of strings and returns the top 10 most frequent strings in the list. I would like to come up with multiple solutions to this problem.

Here is my first solution

const list = [
    "this",
    "is",
    "a",
    "test",
    "which",
    "word",
    "wins",
    "top",
    "i",
    "don't",
    "know",
    "off",
    "hand",
    "do",
    "you",
    "this",
    "a",
    "a",
    "this",
    "test",
    "a",
    "a",
    "do",
    "hand",
    "hand",
    "a",
    "whatever",
    "what",
    "do",
    "do"
  ];

function fn1(strArr) {
    // count how many times each string occurs
    const map = new Map()
    for(const str of strArr) {
        if(map.has(str)) {
            map.set(str, map.get(str) + 1)
        } else {
            map.set(str, 1)
        }
    }
    // sort the [string, count] pairs by count, descending, and keep the first 10 strings
    const sortedEntries = [...map.entries()].sort(([, a], [, b]) => b - a)
    return sortedEntries.slice(0, 10).map(([str]) => str)
}

But I cannot seem to come up with any other solutions to this question. Can anyone suggest an alternative approach?

Also, one thing to note is that the list can be really large, perhaps containing 1 million strings, so we need to try to minimize the runtime complexity.

Joji
  • 4,703
  • 7
  • 41
  • 86
  • I saw this problem somewhere in the coding challenge platforms. – jithil Dec 24 '20 at 06:05
  • 1
    `.sort()` is kind of unnecessary and might have an upper bound of `O(n log n)` but since you only need the top 10, why not loop through it once to keep it at `O(n)`. I think your initial for loop is already `O(n)`. If this is in some production application maybe also launch a web worker to halve the array. – t348575 Dec 24 '20 at 06:10
  • I think this (the map / dict way) is the best approach. You can do it much simpler with `cnt={}; for(const s of strArr) cnt[s] ? ++cnt[s] : cnt[s]=1;` A different but probably not better approach could be to sort the array and then count consecutive identical entries and update a list of the 10 highest frequencies. You may find many related entries through a google search for "javascript count occurrences ..." – Max Dec 24 '20 at 06:13
  • @t348575 I don't think your suggestion is O(n) because for each item the lookup of the counter for that string in the dictionary of counters is O(log n) on average. I think all these methods are O(n log n). – Max Dec 24 '20 at 06:21
  • @Max I think ecmascript mandates `.has` and other map functions to be at or under `O(n)` – t348575 Dec 24 '20 at 06:41
  • @t348575 Yes of course .has is o(n), as I said, it is O(log n) (^), but doing it for each member gives O(n log n). (^: Here n is the number of entries already stored, but this doesn't change asymptotics since you can forget about the first half and only consider the second half of the array processing (from indices n/2 to n) and put n/2 everywhere to get a lower bound. Also, given the repetitions the total number of entries will be smaller than n by some factor, but this also shouldn't change the order of complexity.) – Max Dec 24 '20 at 07:11

12 Answers

7

I believe the following solution is the fastest in practice:

  1. Map the n strings to their frequencies, as you have already done. O(n)
  2. Convert the map into an array with string/frequency pairs. O(n)
  3. Convert the array into a Max-heap based on the frequency using Floyd's method (i.e., by calling max-heapify for all indices from ⌊n/2⌋ - 1 down to 0). O(n)
  4. Extract the top element k times. (In your case, k=10.) O(k log n)

(I don't provide any code here because it basically consists of calling a binary heap library (see here for a highly optimized implementation).)
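For illustration only, a minimal hand-rolled version of Steps 2–4 might look like the sketch below. The names topKByHeap and siftDown are made up for this sketch, and freqMap is assumed to be the Map of word → count from Step 1; a tuned heap library will likely be faster in practice.

// Sketch of Steps 2-4: build a max-heap of [word, count] pairs, then extract k maxima.
function topKByHeap(freqMap, k) {
  // Step 2: Map -> array of [word, count] pairs.
  const heap = [...freqMap.entries()];

  // Restore the max-heap property at index i, comparing by count.
  const siftDown = (i, size) => {
    while (true) {
      const left = 2 * i + 1, right = 2 * i + 2;
      let largest = i;
      if (left < size && heap[left][1] > heap[largest][1]) largest = left;
      if (right < size && heap[right][1] > heap[largest][1]) largest = right;
      if (largest === i) return;
      [heap[i], heap[largest]] = [heap[largest], heap[i]];
      i = largest;
    }
  };

  // Step 3: Floyd's heap construction, O(n).
  for (let i = Math.floor(heap.length / 2) - 1; i >= 0; i--) siftDown(i, heap.length);

  // Step 4: extract the maximum k times, O(k log n).
  const result = [];
  let size = heap.length;
  while (result.length < k && size > 0) {
    result.push(heap[0][0]);
    heap[0] = heap[size - 1];
    size--;
    siftDown(0, size);
  }
  return result;
}

Calling topKByHeap(freqMap, 10) then returns the 10 most frequent words.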

Analysis

The asymptotic complexity is O(n + k log n) which is less than O(n log k) from the solution provided by Chaos Monkey. For small k (e.g. 10), there probably won't be any significant difference. The difference becomes more apparent for larger k (as long as k ≪ n; for much larger k, see "Alternative" below). Also note that the constant factor for Steps 1 and 2 is 1, and the constant factor in Step 3 is also small on average (1.8814), so the overall constant factor is less than 4.

Alternative

There is a solution that solves the problem in O(n) on average, also with a small constant factor, which is much more efficient for larger k (i.e. when k approaches n/2). The downside is that the (very unlikely) worst-case complexity is O(n²):

  1. Map the n strings to their frequencies, as you have already done. O(n)
  2. Convert the map into an array with string/frequency pairs. O(n)
  3. Apply Quickselect (exactly like Quicksort, but just recurse on one side of the partition) to find the kth largest frequency. All elements to the left are even larger, so the result is the first k elements of the Quickselected array. O(n) average, O(n²) worst-case.

It is possible to implement a variant of Quickselect with guaranteed O(n) complexity using median of medians pivot selection, but this is not that great in practice because the constant factor is quite high. But from an academic standpoint, this would be the asymptotically optimal solution.

(Here is a JavaScript library for Quickselect, though from a quick glance, the implementation doesn't look like it's ideal for this case: a good implementation should do a Dijkstra-style 3-way partitioning.)
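For reference, here is a rough JavaScript sketch of Step 3 with a random pivot and a plain two-way Hoare partition (so it does not include the 3-way partitioning recommended above). quickselectTopK is an illustrative name, and the function reorders the pairs array in place:

// Rearrange `pairs` ([word, count] entries) so the k pairs with the largest counts
// occupy indices 0..k-1 in some order. Average O(n), worst case O(n^2).
function quickselectTopK(pairs, k) {
  let lo = 0, hi = pairs.length - 1;
  while (lo < hi) {
    // Random pivot; a 3-way partition would handle many equal counts better.
    const pivot = pairs[lo + Math.floor(Math.random() * (hi - lo + 1))][1];
    let i = lo, j = hi;
    while (i <= j) {
      while (pairs[i][1] > pivot) i++;   // scan for an element not larger than the pivot
      while (pairs[j][1] < pivot) j--;   // scan for an element not smaller than the pivot
      if (i <= j) {
        [pairs[i], pairs[j]] = [pairs[j], pairs[i]];
        i++; j--;
      }
    }
    if (k - 1 <= j) hi = j;        // the k-th largest count lies in the left part,
    else if (k - 1 >= i) lo = i;   // or in the right part,
    else break;                    // or in the middle band of pivot-equal counts
  }
  return pairs.slice(0, k).map(([word]) => word);
}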

Benchmark

A quick and dirty benchmark with n = 10^6, k = 10, measuring only the runtimes after Step 2 (since Steps 1/2 are shared amongst all 5 methods):

Average time for Sort: 7.5 ms
Average time for ChaosMonkey: 10.25 ms
Average time for CountingSort: 5.25 ms
Average time for Mo B. Max-Heap: 4 ms
Average time for Mo B. Alternative (Quickselect): 3.25 ms

https://dotnetfiddle.net/oHRMsp

My conclusion is that for the given parameters there is not much of a difference between the different methods. For simplicity, I would just stick to the sorting method, which also scales well for both n and k.

(Lots of caveats: it's written (sloppily) in C#, not JavaScript; it hasn't been tested for correctness; individual methods are not optimized; runtimes may also depend on the distribution of frequencies; the implemented Quickselect is naive in that it's not optimized for the (here) common case where lots of frequencies are equal; etc.)

Final note on space

All the benchmarked methods use an additional O(n) space because they first create the frequency map (and the counting sort method uses an additional worst-case O(n) space on top of the frequency map for the count). It is possible to solve the problem with only O(1) additional space, but at the expense of a time complexity of O(kn²).
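To make that trade-off concrete, such an O(1)-extra-space method could look roughly like the sketch below (treating k as a constant, so the k result slots are not counted as extra space). It is purely illustrative and far too slow for n = 10^6:

// Illustrative only: O(k * n^2) time, O(k) extra space (O(1) if k is a constant).
function topKConstantSpace(strArr, k) {
  const result = [];
  for (let round = 0; round < k; round++) {
    let bestWord = null;
    let bestCount = 0;
    for (let i = 0; i < strArr.length; i++) {
      const word = strArr[i];
      if (result.includes(word)) continue;        // already selected in an earlier round
      if (strArr.indexOf(word) !== i) continue;   // only count at the first occurrence
      let count = 0;
      for (let j = i; j < strArr.length; j++) {
        if (strArr[j] === word) count++;
      }
      if (count > bestCount) {
        bestCount = count;
        bestWord = word;
      }
    }
    if (bestWord === null) break; // fewer than k distinct words
    result.push(bestWord);
  }
  return result;
}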

Mo B.
  • 5,307
  • 3
  • 25
  • 42
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/226758/discussion-on-answer-by-mo-b-js-writing-a-function-that-iterates-through-a-lis). – Samuel Liew Jan 03 '21 at 04:11
3

There are two approaches to this, with the first (simple sorting) already implemented by yourself. Time complexity: O(N log N), as you iterate over the entire array of N items once and then sort up to N entries, which takes O(N log N) time. Space complexity: O(N)

The second approach is to use a heap of the counts (a min priority queue, which pops the smallest counts first). After each insert we check whether the size of the heap is larger than 10 and, if so, pop the smallest element; at the end, instead of sorting, we just pop the remaining 10 items. Time complexity: O(N log K), with N being the length of the list and K being the number of top occurring strings to return, in your case 10. This is because we make sure the heap size never grows beyond K, and every insert into a heap of size at most K costs O(log K), giving O(N log K) total time complexity. Space complexity: O(N)

Example:

const list = [
  "this",
  "is",
  "a",
  ...
  "do"
];

const counts = {};

list.forEach(item=>{
   counts[item]? counts[item]++: counts[item] = 1;
});

// Now that counts is a map of {word -> count}, we need to add these into a priority queue and just pop.
// I will not be implementing a priority queue in this example, you can probably just find an npm package that implements it in a much more efficient way.

const priorityQueue = new PriorityQueue((a,b) => a.count - b.count); // you usually need to pass a comparator function

Object.entries(counts).forEach(([word, count]) => {
  priorityQueue.offer({word, count});
  if(priorityQueue.size() > 10)
    priorityQueue.pop(); // evict the current smallest count
});

// now, just pop the remaining K (10) elements from the PriorityQueue;
// note that they come out in ascending order of count:
for(let i=0; i<10; i++) {
  console.log(priorityQueue.pop())
}

Chaos Monkey
  • 964
  • 1
  • 6
  • 18
  • "This is because with every insert to the heap it sorts itself with O(log K) time." I don't think this is right. By looking at the code, I can say that the size of the heap can go as large as N, and an insertion to a heap tree with N nodes costs O(logN) time. – yemre Dec 27 '20 at 16:44
  • @yemre you are right, I forgot an important part of the approach was to actually use a min priority queue instead of a max. I edited the code example and the explanation, it should work now with actual O(N log K) – Chaos Monkey Dec 28 '20 at 03:59
  • (When and where referring to the same thing, please don't alternate between upper and lower case like kK above. (If at all, use different case same letters for sufficiently related quantities.)) – greybeard Dec 28 '20 at 11:38
  • I think the time complexity for the first approach, i.e. the approach I wrote in the description is not O(nlogn) but O(n + klogk) where n is the total number of strings and k is the number of *unique* strings since you are only sorting the keys of the hash map. – Joji Mar 27 '21 at 00:17
3

Here is a solution that avoids sorting by collecting the entries of the word count Map() in an object by count value. The result is then the last 10 items in the Object.entries() of this object. It should be noted that this relies on the ordering of the object keys which has been historically underspecified but, especially for integer keys as used here, offers predictable, ascending ordering in line with updates to the specification. – see: Does ES6 introduce a well-defined order of enumeration for object properties?

This solution has the added benefit of returning arrays of words with equal counts rather than the arbitrary cutoff that the solutions using sort() introduce.

const input = [ "this", "is", "a", "test", "which", "word", "wins", "top", "i", "don't", "know", "off", "hand", "do", "you", "this", "a", "a", "this", "test", "a", "a", "do", "hand", "hand", "a", "whatever", "what", "do", "do"];

let counts = new Map(), i;
for (i = 0; i < input.length; i++) {
  counts.set(input[i], (counts.get(input[i]) ?? 0) + 1);
}

let countHash = {};
counts.forEach((count, word) => (countHash[count] ??= []).push(word));

let result = Object.entries(countHash);
if (result.length > 10) result = result.slice(result.length - 10);

// output
for (let j = result.length - 1; j >= 0; j--) {
  console.log(JSON.stringify(result[j]));
}
.as-console-wrapper { max-height: 100% !important; top: 0; }

To avoid the uncertainties of ordering/sort complexity of the object used to hash the counts Map using the count value as key, one can use a sparse array instead by simply replacing the object assignment with an array assignment and using the count value as index.

let countArr = [];
counts.forEach((count, word) => (countArr[count] ??= []).push(word));

Object.entries() called on a sparse array respects holes and returns only existing [key, value] pairs. (This is true of Object.keys() and also iterating using for...in).

const input = [ "this", "is", "a", "test", "which", "word", "wins", "top", "i", "don't", "know", "off", "hand", "do", "you", "this", "a", "a", "this", "test", "a", "a", "do", "hand", "hand", "a", "whatever", "what", "do", "do"];

let counts = new Map(), i;
for (i = 0; i < input.length; i++) {
  counts.set(input[i], (counts.get(input[i]) ?? 0) + 1);
}

let countArr = [];
counts.forEach((count, word) => (countArr[count] ??= []).push(word));

let result = Object.entries(countArr);
if (result.length > 10) result = result.slice(result.length - 10);

// output
for (let j = result.length - 1; j >= 0; j--) {
  console.log(JSON.stringify(result[j]));
}
// Output of this answer
 [
   [6, ['a']],
   [4, ['do']],
   [3, ['this', 'hand']],
   [2, ['test']],
   [1, ['is', 'which', 'word', 'wins', 'top', 'i', "don't", 'know', 'off', 'you', 'whatever', 'what']]
 ]

// Output using sort
 [
   [6, 'a'],
   [4, 'do'],
   [3, 'this'],
   [3, 'hand'],
   [2, 'test'],
   [1, 'is'],
   [1, 'which'],
   [1, 'word'],
   [1, 'wins'],
   [1, 'top']
 ]
// [ 1, 'i' ],
// [ 1, "don't" ],
// [ 1, 'know' ], 
// [ 1, 'off' ],
// [ 1, 'you' ], 
// [ 1, 'whatever' ],
// [ 1, 'what' ]
pilchard
  • 12,414
  • 5
  • 11
  • 23
  • this is interesting. So it relies on the fact that for integer key JS objects would sort them ascending automatically... – Joji Dec 29 '20 at 22:25
  • Exactly, (one could also use a sparse array with the counts as indexes to same the end, though I found it less performant). Overall it leaves you with an O(n) solution (the initial mapping for sure, and at worst if all words are unique, the count hashing and Object.entries call, but they are most likely smaller due to grouping). – pilchard Dec 29 '20 at 23:10
  • sorry how is this still O(n)? sorting the integer key still has a O(nlogn) runtime complexity right? Also "a sparse array with the counts as indexes to same the end" sounds pretty interesting to me. could you add that as an alternative to the question? – Joji Dec 30 '20 at 00:42
  • From what I can gather Object.keys() is O(n) [Object.keys() complexity?](https://stackoverflow.com/questions/7716812/object-keys-complexity) and integer keys aren't handled the same way as named keys (https://v8.dev/blog/fast-properties) and is highly optimized, so while there may be some cost I don't believe it approaches O(nlogn). I'll look into the sparse array option again. – pilchard Dec 30 '20 at 00:53
  • Added a sparse array snippet. It's really just a matter of declaring an array instead of an object. Ran some more tests and it is more performant than the object snippet I posted first. – pilchard Dec 30 '20 at 02:30
  • I understand that `Object.keys` is a linear complexity since it should just iterate through the keys. However since the object sorts the integer keys automatically my intuition is that sorting is a O(nlogn) process. Also I wonder if you can implement `countHash` to be a `Map` instead of a plain object? I guess `Map` also behave like Object? – Joji Dec 30 '20 at 17:05
  • `Map()` is ordered by insertion order and maintains this through conversion, so doesn't work. I think the sparse array is the cleanest (and in testing, most performant) with implicit ordering built in. – pilchard Dec 30 '20 at 17:12
  • thanks. btw where did you find an online editor that supports the latest js syntax like `??=` – Joji Dec 30 '20 at 17:16
  • I mostly edit locally (running node v15.4.0 in vscode), but the firefox/safari/chrome consoles all support it (and the snippets here). Just tried jsfiddle, jsbin, and stackblitz and they all do too, though codesandbox throws an error. – pilchard Dec 30 '20 at 18:04
  • @pilchard Upvoted, I really tried my best to find a better solution than yours, respecting performance but I couldn't. I benchmarked all the solutions in all answers here, and your solution was by far the best. The only improvement you can make is that in `for (i = 0; i < input.length; i++)` For each iteration of the loop, JavaScript is retrieving the `input.length`, a key-lookup costing operations on each cycle. There is no reason why this shouldn't be: `for (i = 0, n = input.length; i < n; i++)` – Abbas Hosseini Jan 02 '21 at 22:05
  • Thanks for the feedback @Abbas, and I agree with your tweak. Oversight on my part. – pilchard Jan 02 '21 at 22:08
  • @Joji There is no guarantee that inserting a key-value pair into an object is O(1) and getting `keys()` is O(n). In fact, since no JavaScript engine can perform magic, it has to be O(n log n) at some point when you can use it for sorting. The details depend on the specific engine and the number and distribution of keys. See this answer and the discussion below it: https://stackoverflow.com/a/64912755/689923 – Mo B. Jan 03 '21 at 09:05
3

Deriving a map of strings and their occurrences from a list of strings seems like a no-brainer to me, but since you asked for an alternative solution, here's one. I don't know whether it is a good one though, so caveat emptor.


Assuming list is the same as in your example, turn it into a massive string:

const search = list.join(' ');

Why? Because then you can count occurrences with RegExp#match e.g.,

search.match(/\bhand\b/g).length;
//=> 3

Of course the regular expression would have to be constructed dynamically. We'll do that later.

Next turn list into a unique list:

const uniq_list = [...new Set(list)];

Then you can sort uniq_list by counting how many times each item appears in search:

const ordered_by_occurrence =
  uniq_list.sort((a, b) => {
    const count_a = search.match(new RegExp(`\\b${a}\\b`, 'g')).length;
    const count_b = search.match(new RegExp(`\\b${b}\\b`, 'g')).length;
    return count_b - count_a; // descending by count
  });

(Returning the first n items of the ordered list is trivial.)

customcommander
  • 17,580
  • 5
  • 58
  • 84
  • 1
    I was tempted to downvote, but you're right, the OP just asked for *alternative*, not *good* solutions. Your solution is definitely creative, and it would be hard to come up with an even more inefficient one. ;) – Mo B. Dec 28 '20 at 21:55
2

Since our range (word frequency) is bounded by the length of the list, we can have an O(n) algorithm with no log factor using counting sort. Moreover, if the frequencies are significantly smaller than n, our iterations will be that much shorter.

JavaScript code:

function getTopK(freq, min, max, k){
  let i = min;
  let j = 0;
  let counts = [];
  let result = new Array(k);

  // bucket the indices of the freq entries by their count (counting sort)
  for (; i<=max; i++)
    counts[i] = [];

  for (i=0; i<freq.length; i++)
    counts[freq[i][1]].push(i);

  // walk the buckets from the highest count down until k words are collected
  for (i=max; i>=min && j<k; i--){
    for (let m=0; m<counts[i].length; m++){
      result[j] = freq[counts[i][m]][0];
      j++;
      if (j == k)
        break;
    }
  }
  return result;
};

function f(list, k){
  let max = 0;

  const freq = list.reduce(function(acc, s){
    acc[s] = -~acc[s]; // -~x increments x, treating undefined as 0
    max = Math.max(max, acc[s]);
    return acc;
  }, {})

  return getTopK(Object.entries(freq), 1, max, k);
}

var list = [
    "this",
    "is",
    "a",
    "test",
    "which",
    "word",
    "wins",
    "top",
    "i",
    "don't",
    "know",
    "off",
    "hand",
    "do",
    "you",
    "this",
    "a",
    "a",
    "this",
    "test",
    "a",
    "a",
    "do",
    "hand",
    "hand",
    "a",
    "whatever",
    "what",
    "do",
    "do"
];

console.log(JSON.stringify(f(list, 10)));
גלעד ברקן
  • 23,602
  • 3
  • 25
  • 61
0
const list = [
  "this",
  "is",
  "a",
  "test",
  "which",
  "word",
  "wins",
  "top",
.
.
.
  "do",
  "do"
];

const newObjectArray = {};

list.forEach(item=>{
   newObjectArray[item]? newObjectArray[item]++: newObjectArray[item] = 1
});

// now sort based on the newObjectArray  values and return the first 10
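One possible way to finish that last step (a sketch, sorting the entries by count in descending order):

const top10 = Object.entries(newObjectArray)
  .sort((a, b) => b[1] - a[1])
  .slice(0, 10)
  .map(([word]) => word);

console.log(top10);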
jithil
  • 1,098
  • 11
  • 11
0

I think we can do the following steps for large data:

  1. Map the raw list into a Map of word → count;
  2. Traverse the map to find the currently most frequent string;
  3. Delete that string from the map;
  4. Repeat step 2.

Here is the code:

const list = [
    "this",
    "is",
    "a",
    "test",
    "which",
    "word",
    "wins",
    "top",
    "i",
    "don't",
    "know",
    "off",
    "hand",
    "do",
    "you",
    "this",
    "a",
    "a",
    "this",
    "test",
    "a",
    "a",
    "do",
    "hand",
    "hand",
    "a",
    "whatever",
    "what",
    "do",
    "do"
];

// mock 1 million data;
for (let i = 0; i < 1000000; i++) {
    list.push('a');
}

console.clear();
let resultsmap = new Map();

// Build the map of word -> frequency
function toMap() {
    for (let i = 0; i < list.length; i++) {
        let item = list[i];
        if (resultsmap.has(item)) {
            let count = resultsmap.get(item);
            resultsmap.set(item, count + 1);
        } else {
            resultsmap.set(item, 1);
        }
    }
}

// Find the most frequent entry in the current map
function top1() {
    let maxCount = 0;
    let keyOfMaxCount = '';
    resultsmap.forEach((value, key) => {
        if (maxCount < value) {
            maxCount = value;
            keyOfMaxCount = key;
        }
    });

    return { key: keyOfMaxCount, value: maxCount };
}

toMap();

for(let i=0; i< 10; i++) {
    let topObj = top1();
    console.log(topObj);
    resultsmap.delete(topObj.key);
}
Kevin Zhang
  • 1,012
  • 6
  • 14
  • Can I ask why is that you are using some ES6 feature like `map` and `forEach` but still using `var` instead of `const` or `let`? – Joji Dec 24 '20 at 16:48
  • I was trying not to use ES6, and to use object key-value pairs instead of Map, but it was very tedious, so I replaced it with Map and some ES6 features and forgot to revise the var keyword. Sorry about that. – Kevin Zhang Dec 25 '20 at 01:05
  • Just replace all `var` with `let`. – Kevin Zhang Dec 25 '20 at 01:09
0

const list = [
    "this",
    "is",
    "a",
    "test",
    "which",
    "word",
    "wins",
    "top",
    "i",
    "don't",
    "know",
    "off",
    "hand",
    "do",
    "you",
    "this",
    "a",
    "a",
    "this",
    "test",
    "a",
    "a",
    "do",
    "hand",
    "hand",
    "a",
    "whatever",
    "what",
    "do",
    "do"
  ];

function fn2(list) {
    // sort a copy so identical words end up adjacent (and the input isn't mutated)
    const sorted = [...list].sort();
    let counter = 1;
    let word = sorted[0];
    const counted = {};
    for (let i = 1; i < sorted.length; ++i) {
        if (sorted[i] === word) counter++;
        else {
            counted[word] = counter;
            counter = 1;
            word = sorted[i];
        }
    }
    counted[word] = counter; // record the count of the final run of words
    const most = Object.entries(counted)
        .sort((a, b) => b[1] - a[1])
        .slice(0, 10)
        .map(e => e[0]);
    return most;
}

console.log(fn2(list))
Daniil Loban
  • 4,165
  • 1
  • 14
  • 20
0

You can do that this way...

const list = 
  [ "this", "is", "a", "test", "which", "word", "wins", "top"
  , "i", "don't", "know", "off", "hand", "do", "you", "this"
  , "a", "a", "this", "test", "a", "a", "do", "hand"
  , "hand", "a", "whatever", "what", "do", "do"
  ];

const getTopTen = arr =>
  [...new Set(arr)]
    .map(txt=>({txt, n:arr.filter(t=>t===txt).length}))
    .sort((a,b)=>b.n-a.n)
    .slice(0,10)
    .map(({txt,n})=>txt);

console.log( getTopTen(list) );
.as-console-wrapper { max-height: 100% !important; top: 0; }

Second solution, without any sort:

const list = 
  [ "this", "is", "a", "test", "which", "word", "wins", "top"
  , "i", "don't", "know", "off", "hand", "do", "you", "this"
  , "a", "a", "this", "test", "a", "a", "do", "hand"
  , "hand", "a", "whatever", "what", "do", "do"
  ];

function getTopTen(arr)
  {
  let res = []
    , max = 10
    , cnt = [...new Set(arr)]
      .reduce((occ,txt) => {                   // build { n: occurrences, t: [ ...terms ] } entries, kept sorted by n descending
        let n = arr.filter(t=>t===txt).length  // count of identical terms
          , p = occ.find(x=>x.n===n)
          ;
        if (!p) {                              // new occurrence count
          p = { n, t:[] }
          let i = occ.findIndex(x=>x.n<n)      // find its place
          if (i<0) occ.push(p)                 // append at the bottom, or
          else     occ.splice(i,0,p)           // insert at the right place
          }
        p.t.push(txt)                          // add the term to its occurrence list
        return occ
      },[])
    ;
  for (let c of cnt) {        // keep only the first 10 elements
    for (let t of c.t)
      { res.push(t); if(!--max) break }  // stop once the
    if (!max) break                      // quota reaches zero
    }
  return res
  }

console.log( getTopTen(list) );
.as-console-wrapper { max-height: 100% !important; top: 0; }
Mister Jojo
  • 20,093
  • 6
  • 21
  • 40
  • This has been said before ([Daniil Loban](https://stackoverflow.com/a/65535865), for one), albeit not by everyone (yet). – greybeard Jan 02 '21 at 06:22
  • @greybeard not really : my solution is currently the only one that uses an array.filter (), By the way, I added a second solution that doesn't use sorting – Mister Jojo Jan 02 '21 at 06:50
0

If I got it right, the time complexity of this one should be O(n + k), where n is the length of the array and k is the number of elements you want back.

const data = ["this", "is", "a", "test", "which", "word", "wins", "top", "i", "don't", "know", "off", "hand", "do", "you", "this", "a", "a", "this", "test", "a", "a", "do", "hand", "hand", "a", "whatever", "what", "do", "do"];

function getTopOccurrences(n, list) { 
    const [_, arr] = list.reduce(([obj, arr], current) => {
        const oldOccurrences = obj[current] || 0;
        const newOccurrences = oldOccurrences + 1;
        if (!arr[oldOccurrences]) { arr[oldOccurrences] = new Set(); }
        if (!arr[newOccurrences]) { arr[newOccurrences] = new Set(); }
        
        if (current in obj) {
            arr[oldOccurrences].delete(current);
        }
        arr[newOccurrences].add(current);
        obj[current] = newOccurrences;

        return [obj, arr];
    }, [{},[]]);

    const result = [];
    outer: for(let i = arr.length - 1, s = arr[i]; i >= 0; i--, s= arr[i]) {
        for (let word of s){
            result.push(word);
            n--;
            if (n <= 0) {
                break outer;
            }
        }
    }
    return result;
}

console.log(getTopOccurrences(10, data));
rmiguelrivero
  • 926
  • 8
  • 8
-1

This is a good problem for map-reduce. :-)

const list = [
  "this",
  "is",
  "a",
  "test",
  "which",
  "word",
  "wins",
  "top",
  "i",
  "don't",
  "know",
  "off",
  "hand",
  "do",
  "you",
  "this",
  "a",
  "a",
  "this",
  "test",
  "a",
  "a",
  "do",
  "hand",
  "hand",
  "a",
  "whatever",
  "what",
  "do",
  "do"
];

const counts = list.reduce((a, c) => {
  a[c] = (a[c] || 0) + 1;
  return a;
}, {})
const items = Object.keys(counts).map(k => {
  return {
    word: k,
    count: counts[k]
  };
});
items.sort((a, b) => b.count - a.count); // descending by count
const result = items.slice(0, 10).map(item => item.word);
console.log(result);
D. Seah
  • 4,472
  • 1
  • 12
  • 20
  • I honestly don't see how this is different than the solution I presented in the description. Sure you are using reduce. But both solutions all sorted the array which brings the runtime complexity to nlogn. – Joji Dec 24 '20 at 06:28
-1

Try this:

const list = [
    "this",
    "is",
    "a",
    "test",
    "which",
    "word",
    "wins",
    "top",
    "i",
    "don't",
    "know",
    "off",
    "hand",
    "do",
    "you",
    "this",
    "a",
    "a",
    "this",
    "test",
    "a",
    "a",
    "do",
    "hand",
    "hand",
    "a",
    "whatever",
    "what",
    "do",
    "do"
  ];


let counter = {}
for (let item of list){
    counter[item] = 1 + (counter[item] || 0)
}
console.log(counter);

let result = [];
let entries = Object.entries(counter);
let sorted = entries.sort((a, b) => b[1] - a[1]);

for(let i = 0; i < Math.min(10, sorted.length); i++){
    result.push(sorted[i][0])
}

console.log(result)
sonEtLumiere
  • 4,461
  • 3
  • 8
  • 35
  • I honestly don't see how this is different than the solution I presented in the description. – Joji Dec 24 '20 at 06:31