3

I am trying to implement a custom sort that uses the following character order:

  1. special characters (- first, _ last)
  2. digits
  3. letters

For example, if I sort the array below

var words = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'];

the result should be

MBC-PEP-1,MBC-PEP_1,MBC-PEP01,MBC-PEP91,MBC-PEPA1,MBC-PEPZ1

but my code produces the following result instead

"MBC-PEP-1", "MBC-PEP01", "MBC-PEP91", "MBC-PEP_1", "MBC-PEPA1", "MBC-PEPZ1"

I need the expected order shown earlier (with MBC-PEP_1 in second place), but I am not sure how to achieve it.

function MySort(alphabet)
{
    return function(a, b) {
        var lowerA = a.toLowerCase()
        var lowerB = b.toLowerCase()
        var index_a = alphabet.indexOf(lowerA[0]),
        index_b = alphabet.indexOf(lowerB[0]);

        if (index_a === index_b) {
            // same first character, sort regular
            if (a < b) {
                return -1;
            } else if (a > b) {
                return 1;
            }
            return 0;
        } else {
            return index_a - index_b;
        }
    }
}

var items = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'],
sorter = MySort('-_0123456789abcdefghijklmnopqrstuvwxyz');

console.log(items.sort(sorter));
Cᴏʀʏ

3 Answers

1

I ported an answer from here to JavaScript, which does what you want without using recursion or anything overly complicated:

function MySort(alphabet) {
    return function (a, b) {
       a = a.toLowerCase();
       b = b.toLowerCase();
       var pos1 = 0;
       var pos2 = 0;
       for (var i = 0; i < Math.min(a.length, b.length) && pos1 == pos2; i++) {
          pos1 = alphabet.indexOf(a[i]);
          pos2 = alphabet.indexOf(b[i]);
       }

       // All compared characters matched, so the shorter string sorts first
       if (pos1 == pos2 && a.length != b.length) {
           return a.length - b.length;
       }

       return pos1 - pos2;
    };
}
    
var items = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'],
sorter = MySort('-_0123456789abcdefghijklmnopqrstuvwxyz');

console.log(items.sort(sorter));
Cᴏʀʏ
1

As Narigo said in their answer, you're only comparing the first character. Here's a different idea that's probably simpler:

function MySort(a, b) {
  // Replace every underscore, not just the first one
  a = a.replace(/_/g, ".");
  b = b.replace(/_/g, ".");
  return a.localeCompare(b);
}

var items = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'];

console.log(items.sort(MySort));

This is ordinary string comparison, except that each underscore is replaced with a dot before comparing. A dot sorts after the hyphen and before the digits, which matches the order you're trying to achieve.

Aioros
  • 1
    I wouldn't recommend this answer because what happens when the inputs all of a sudden start containing a `.` and it needs to be specified in a different order? Maintenance on this is hard because you don't get to actually define the required order. – Cᴏʀʏ Dec 14 '18 at 02:38
  • @Cᴏʀʏ if the string format is expected to be more general then it should be stated in the question ;) I think this is the best answer for the given information. – Patrick Roberts Dec 14 '18 at 02:54
  • I have to agree with Cᴏʀʏ here. For me, it's a creative hack to be able to use native functions but it doesn't really sound like a maintainable solution. – Narigo Dec 14 '18 at 03:11
  • Definitely some good points. Though I don't think having a dot in a string would be a problem, it would just be sorted as normal. Other solutions posted do the same or ignore that case altogether. – Aioros Dec 14 '18 at 03:50
  • 1
    @Aioros I disagree with their sentiments about this being "a hack" but I do give merit to [this benchmark](https://jsperf.com/sort-by-alphabet/6) that demonstrates your solution as the slowest. This is actually due to the internal complexity of [`localeCompare()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare) and not to your approach. I suggest using `return -(a < b) || +(a > b);` which I explain [in this answer](https://stackoverflow.com/a/51398944/1541563), and as the benchmark demonstrates, blows all these answers out of the water. – Patrick Roberts Dec 14 '18 at 06:15
  • Oh and as it turns out, [replacing the strings before and after sorting large arrays makes it even faster](https://jsperf.com/sort-by-alphabet/7). – Patrick Roberts Dec 14 '18 at 06:25
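
For reference, the approach suggested in the comments above (swap the underscore out once, sort with plain inequality operators, then swap it back) could look roughly like the sketch below. The function name `sortWithUnderscoreAsDot` is made up for illustration. The sketch assumes the inputs never contain a literal `.`, because the last step would turn it into `_`, and unlike the alphabet-based answers it is case-sensitive.

```javascript
// Hypothetical helper, not taken from any of the answers.
// Assumption: no input string contains a literal "." character.
function sortWithUnderscoreAsDot(items) {
  return items
    // "_" (code 95) becomes "." (code 46), which sits between "-" (45) and "0" (48)
    .map(function (s) { return s.replace(/_/g, '.'); })
    // Plain code-unit comparison: -1 if a < b, 1 if a > b, otherwise 0
    .sort(function (a, b) { return -(a < b) || +(a > b); })
    // Restore the underscores after sorting
    .map(function (s) { return s.replace(/\./g, '_'); });
}

var words = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'];
var sortedWords = sortWithUnderscoreAsDot(words);
console.log(sortedWords.join(','));
// MBC-PEP-1,MBC-PEP_1,MBC-PEP01,MBC-PEP91,MBC-PEPA1,MBC-PEPZ1
```

Doing the replacement once per string, rather than inside the comparator, avoids repeating it on every comparison.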
-1

Your algorithm only looks at the first character of each string. You need to compare the following characters as well. Here is a quick solution using recursion:

function MySort(alphabet)
{
    return function recSorter(a, b) {
        var lowerA = a.toLowerCase()
        var lowerB = b.toLowerCase()
        var index_a = alphabet.indexOf(lowerA[0]),
        index_b = alphabet.indexOf(lowerB[0]);

        if (index_a === index_b && index_a >= 0) {
            return recSorter(a.slice(1), b.slice(1));
        } else {
            return index_a - index_b;
        }
    }
}

var items = ['MBC-PEP-1', 'MBC-PEP01', 'MBC-PEP91', 'MBC-PEPA1', 'MBC-PEPZ1', 'MBC-PEP_1'],
sorter = MySort('-_0123456789abcdefghijklmnopqrstuvwxyz');

console.log(items.sort(sorter));

I'm not sure what you want to happen when the strings have different lengths, contain characters outside the alphabet, or hit other edge cases. For the posted example, this results in the expected order.
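
If those edge cases matter, one possible way to pin them down is sketched below. It rests on two assumptions that the question does not confirm: a string that is a prefix of another sorts first, and characters missing from the alphabet sort after all known ones. The name `makeAlphabetComparator` is made up for illustration.

```javascript
// Hypothetical helper: builds a comparator from an ordered alphabet string.
function makeAlphabetComparator(alphabet) {
  // Precompute each character's rank so lookups avoid repeated indexOf scans
  var rank = new Map();
  for (var i = 0; i < alphabet.length; i++) {
    rank.set(alphabet[i], i);
  }
  // Assumption: characters outside the alphabet sort after all known ones
  function rankOf(ch) {
    return rank.has(ch) ? rank.get(ch) : alphabet.length;
  }

  return function (a, b) {
    var x = a.toLowerCase();
    var y = b.toLowerCase();
    var n = Math.min(x.length, y.length);
    for (var i = 0; i < n; i++) {
      var diff = rankOf(x[i]) - rankOf(y[i]);
      if (diff !== 0) return diff;
    }
    // Assumption: when one string is a prefix of the other, the shorter sorts first
    return x.length - y.length;
  };
}

var compare = makeAlphabetComparator('-_0123456789abcdefghijklmnopqrstuvwxyz');
var mixed = ['MBC-PEP91', 'MBC-PEP', 'MBC-PEP_1', 'MBC-PEPA1', 'MBC-PEP-1', 'MBC-PEP01'];
var sortedMixed = mixed.slice().sort(compare);
console.log(sortedMixed.join(','));
// MBC-PEP,MBC-PEP-1,MBC-PEP_1,MBC-PEP01,MBC-PEP91,MBC-PEPA1
```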

Narigo
  • 2
    A recursive, scoped sorting method? That seems rather expensive for a simple string compare algorithm... – Patrick Roberts Dec 14 '18 at 02:20
  • I wouldn't recommend this for a performance award. If you want to sort by your own alphabet, for example sort by occurrence on a keyboard (`qwertyuiopasdfghjklzxcvbnm` instead of `a-z`), you still need to map to your own index and compare it. You might be faster writing an imperative loop, but since this is tail recursive, readable and probably less of a hack than replacing chars in the source string, I'd keep this answer... ;) – Narigo Dec 14 '18 at 02:32
  • I'd be impressed if you could refer to an implementation of JavaScript that actually has tail call recursion optimization, so I'm not really sure why pointing out that it's tail recursive is relevant. I wouldn't exactly call this readable or "less of a hack" either. ASCII ordering is fairly well-known so taking advantage of it with a simple string replace is quite sufficient as long as the intention is documented with a brief comment. – Patrick Roberts Dec 14 '18 at 02:36
  • For me, tail recursion is easier to wrap my head around usually. The thing is, replacing a string and then comparing the replaced strings does not sound like a good strategy to me. If you want the most performance out of it, use the solution by Cᴏʀʏ – Narigo Dec 14 '18 at 02:48
  • Agree to disagree. Delegating the string comparison to the implementation will almost always be faster than explicitly iterating the string in userland. Cᴏʀʏ's answer also manipulates the initial string with `toLowerCase()` so there's not even time saved by attempting to use the original strings. – Patrick Roberts Dec 14 '18 at 02:50
  • Seems like it depends on which browser you use and its version. Here you can play around with it: https://jsperf.com/sort-by-alphabet/1 There are a lot of things which could be micro-optimized, but still, I guess everybody has to decide for themselves what solution they need. There are obviously enough out there. – Narigo Dec 14 '18 at 03:48
  • Thank you for pointing that out. After a lot of testing, I've discovered the issue is that [`localeCompare()`](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/localeCompare) actually does _way more_ than just comparing ASCII indices under the hood. It performs international normalization as well which incurs a lot of overhead. Here's [another test](https://jsperf.com/sort-by-alphabet/6) using just the inequality operators instead and it blows everything else out of the water. – Patrick Roberts Dec 14 '18 at 06:11
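
To illustrate the difference described in the last comment, here is a small sketch that compares the same pair of strings both ways. It assumes a runtime with standard Intl support (current browsers and Node.js builds). The variable names are only for illustration.

```javascript
// Code-unit comparison with the inequality operators:
// "a" (code 97) is greater than "B" (code 66), so this evaluates to 1.
var byOperator = -('a' < 'B') || +('a' > 'B');

// Locale-aware comparison: English collation orders "a" before "B",
// so the result is negative.
var byLocale = 'a'.localeCompare('B', 'en');

console.log(byOperator, byLocale);
```

The two methods disagree on mixed-case input, so switching from `localeCompare()` to the operators changes behavior as well as speed unless the strings are normalized first (for example with `toLowerCase()`).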