0

Basically, I need to get the words from the array which are contained in the main string

I have a looping code here but I think there is a one-liner to do the trick. I need the code to be optimized not only in code length but also in performance.

Thanks

var aValidWords = ["ex", "exes", "expert", 
                   "experts", "expertise", "sex", "sexes", 
                   "exchange", "change", "changes"];
var sMainWord = "expertsExchange";
var aPossibleWords = new Array();

var sMainWordLower = sMainWord.toLowerCase();
for(i=0; i < aValidWords.length; i++){
    var sCurrentWord = aValidWords[i].toLowerCase();
    if(sMainWordLower.indexOf(sCurrentWord) != -1){
        aPossibleWords.push(aValidWords[i]);
    }
}

document.write(aPossibleWords.join("<br />"));
Jronny
  • 2,164
  • 4
  • 30
  • 41
  • My first thought was to convert the array into a regex. But that will just find the matching substrings, not tell you which valid words were found. It would be OK if there weren't overlapping substrings in the valid words, but there are lots of them. – Barmar Dec 23 '12 at 08:55
  • You could create a kind of mapping table. Since "experts" contains "ex", "expert" and "experts" itself, you would only have to process "experts" to know three words are valid. – asgoth Dec 23 '12 at 09:46
  • @asgoth thing is, the sMainWord will be changed from time to time. – Jronny Dec 23 '12 at 10:01
  • I know, but it is independent from the main word. Let's say you have such mapping table (pseudo): (experts -> expert, expert, ex), (expertise -> expert, ex), (exchange -> change), ... Then you would only have to traverse the largest words first and if any of these largest words match, you can remove the smaller words from the mapping table too. – asgoth Dec 23 '12 at 10:13
  • @asgoth I get your point, but does mapping the make the code less optimized? I have some other plans with my current algorithm though, so I guess I'll stick to the plain array of strings. :-) What I need now is maybe an optimization on the searching of the valid substrings which belong to the main word; :-) – Jronny Dec 23 '12 at 10:26
  • 2
    Unless that mapping table comes from a server, no ;) Otherwise one small optimization: replace new Array() by [] – asgoth Dec 23 '12 at 10:32

3 Answers3

3

I compared three possible implementations:

Alternative 1 - Using for loop:

function alternative1(aValidWords, sMainWordLower) {
    var aPossibleWords1 = [];
    for(i=0; i < aValidWords.length; i++){
        if(sMainWordLower.indexOf(aValidWords[i]) != -1){
            aPossibleWords1.push(aValidWords[i]);
        }
    }
    return aPossibleWords1;
}

Alternative 2 - Using jQuery grep function:

function alternative2(aValidWords, sMainWordLower) {
    return $.grep(aValidWords, function(word) {
              return sMainWordLower.indexOf(word) != -1;                    
         }
    )
}

Alternative 3 - Using JavaScript native filter method (IE9, Chrome, Firefox, Opera, Safari):

function alternative3(aValidWords, sMainWordLower) {
    return aValidWords.filter(function(word) {
              return sMainWordLower.indexOf(word) != -1;                    
         }
    )
}

I measured the execution times with Chrome Profile tool. Each alternative was executed 10 times with array of random milion words. The results are:

  • Alternative 1: 20 executions -> 21,68s
  • Alternative 2: 20 executions -> 26,31s
  • Alternative 3: 20 executions -> 34,66s

I was surprised by the fact, that native JavaScript Filter function was so slow.

If you wan't to measure execution times by yourself, there is jsFiddle - it will take some time for script to finish.

In general those three alternatives are easiest. If those execution times suits you, use one of them, otherwise the answer of @Pumbaa80 is the right one.

[UPDATE]

For the explanation of the results (why JQuery grep funcion is faster then native JavaScript filter function) please take a look at this question/answer. jsFiddle code was also ported to jsPerf thanks to @Alexander.

Community
  • 1
  • 1
PrimosK
  • 13,848
  • 10
  • 60
  • 78
  • I once thought of that but i was afraid the performance would not be optimal. +1 for shorter code though; – Jronny Dec 24 '12 at 02:25
1

This loop is definitely more concise than what you have, but it'd be worth running some tests in various browsers to find out which is faster. I imagine Regex matching is probably faster, but I'm curious whether there's any performance penalty for compiling the Regex.

for(var i=0; i<aValidWords.length; i++) {
    if (new RegExp(aValidWords[i], 'i').test(sMainWord))
        aPossibleWords.push(aValidWords[i]);
}
Randy Hunt
  • 435
  • 3
  • 10
1

I think your code is decent.

If you're really worried about performance, you may want to try some sophisticated method like Boyer-Moore. But, if you just have a handful of patterns and relatively short strings, then the initialization overhead is higher than the benefit, so you should stick with the simple approach. See Wikipedia's comparison table http://en.wikipedia.org/wiki/String_searching_algorithm#Single_pattern_algorithms

user123444555621
  • 148,182
  • 27
  • 114
  • 126