Currently I search within a div within an html file and remove the hideMe class if a result is found inside it, to reveal the found hymn. I'm wondering if I can search the hymn without punctuation (removing punctuation from both input and output), while also excluding the info class from the search.

<div id="himnario">
   <div id="1" class="song hideMe">
      <div class="info">I don't want this info to be searched</div>
      <div class="tuneName">This tune should be searched</div>
      <ol>
         <li>Verse 1</li>
         <li>Verse 2</li>
      </ol>
   </div>
   <div id="2" class="song hideMe">...</div>
</div>

My search code presently is:

$("#himnario div.song:Contains("+item+")").removeClass('hideMe').highlight(item);
isHighlighted = true; //check if highlighted later and unhighlight, for better performance

(extending jquery with "Contains" as follows)

return jQuery(a).text().toUpperCase().indexOf(m[3].toUpperCase()) >= 0; 

Also, I am using a jQuery plugin for highlighting the results, so I suppose this would complicate things. If need be, the highlighting could simply not work in those places where punctuation gets in the way.

Of course, the more efficient the better since this will be part of a mobile app... If removing the info class from the search takes a lot of time, I will have to just delete it from the file because it isn't absolutely essential.

I found the following code from here that might help. It is supposed to strip invalid characters, but with my limited coding ability I'm not sure how to incorporate it properly into the custom Contains function.

Return Regex.Replace(strIn, "[^\w\.@-]", "")

Thanks so much in advance for your help.

Edit: Here is the preferred solution thanks to @Nick:

$('#himnario').children().addClass('hideMe'); // hide all hymns
//http://stackoverflow.com/questions/12152098/jquery-search-contains-without-punctuation-excluding-specific-class
// Get rid of punctuation in your search item - this only allows alphanumeric
item2 = item.toUpperCase().replace(/<(.|\n)*?>|[^a-z0-9\s]/gi, ''); 
// Loop through each song
$('#himnario').children().each(function() {
    var $this_song = $(this);
    // Examine the song title & the ordered list, but not the hidden info (first child)
    $this_song.children('.tuneName, ol').each(function() {
        // Get the html, strip the punctuation and check if it contains the item
        if ($(this).html().toUpperCase().replace(/<(.|\n)*?>|[^a-z0-9\s]/gi, '').indexOf(item2) !== -1) {
            // If item is contained, change song class
            $this_song.removeClass('hideMe').highlight(item); //original search phrase
            isHighlighted = true; //check later, for better performance
            return false;   // Prevents examination of song lines if the title contains the item
        } 
    });            
});

Highlight function:

/*
highlight v3
Highlights arbitrary terms.
<http://johannburkard.de/blog/programming/javascript/highlight-javascript-text-higlighting-jquery-plugin.html>
MIT license.
Johann Burkard
<http://johannburkard.de>
<mailto:jb@eaio.com>
*/
jQuery.fn.highlight = function(pat) {
 function innerHighlight(node, pat) {
  var skip = 0;
  if (node.nodeType == 3) {
   var pos = node.data.toUpperCase().indexOf(pat);
   if (pos >= 0) {
    var spannode = document.createElement('span');
    spannode.className = 'highlight';
    var middlebit = node.splitText(pos);
    var endbit = middlebit.splitText(pat.length);
    var middleclone = middlebit.cloneNode(true);
    spannode.appendChild(middleclone);
    middlebit.parentNode.replaceChild(spannode, middlebit);
    skip = 1;
   }
  }
  else if (node.nodeType == 1 && node.childNodes && !/(script|style)/i.test(node.tagName)) {
   for (var i = 0; i < node.childNodes.length; ++i) {
    i += innerHighlight(node.childNodes[i], pat);
   }
  }
  return skip;
 }
 return this.each(function() {
  innerHighlight(this, pat.toUpperCase());
 });
};

jQuery.fn.removeHighlight = function() {
 return this.find("span.highlight").each(function() {
  this.parentNode.firstChild.nodeName;
  with (this.parentNode) {
   replaceChild(this.firstChild, this);
   normalize();
  }
 }).end();
};
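
The plugin above looks for the literal search phrase in each text node, which is why a phrase typed without punctuation is found by the search but not highlighted. One possible workaround, shown only as a rough sketch, is to turn the search phrase into a regular expression that tolerates punctuation between characters. The helper name `buildTolerantPattern` is hypothetical and is not part of the plugin; wiring it into `innerHighlight` (in place of `indexOf` and `pat.length`) is left out here.

```javascript
// Hypothetical helper (not part of the highlight plugin): builds a
// case-insensitive RegExp that matches the search phrase even when the
// text has punctuation between its characters.
function buildTolerantPattern(item) {
  // Keep only letters, digits and single spaces, as the search itself does.
  var cleaned = item.replace(/[^a-z0-9\s]/gi, '').replace(/\s+/g, ' ').trim();
  if (cleaned === '') {
    return null; // nothing left to search for
  }
  var parts = cleaned.split('').map(function (ch) {
    // A space in the phrase must match at least one whitespace character,
    // optionally surrounded by punctuation.
    return ch === ' ' ? '[^a-z0-9]*\\s[^a-z0-9]*' : ch;
  });
  // Allow any run of punctuation between neighbouring characters.
  return new RegExp(parts.join('[^a-z0-9\\s]*'), 'i');
}

var pattern = buildTolerantPattern('Davids harp');
var match = pattern.exec("King David's harp");
console.log(match[0]);    // "David's harp"
console.log(match.index); // 5
```

The match gives both the start position and the real length of the text to wrap, which is what the plugin would need instead of the length of the typed phrase.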
Nathan
  • What do you mean by exclude the .info class? Do you not want to search that element at all, or just not show it on the page? – adeneo Aug 28 '12 at 03:29
  • @adeneo, I currently have .info hidden with display:none; It is extra tune information that I don't want to pick up in the search, but of course it currently is picked up because it is just hidden. – Nathan Aug 28 '12 at 13:46

2 Answers

Why not just use straight Javascript to accomplish this? A simple regex should do the trick:

str.replace(/[^a-z0-9\s]/gi, '')

This will take a string str and remove any character that isn't a letter, a number, or whitespace. I wouldn't overwrite the original HTML if I were you (unless that is, of course, the point); rather, I'd store the value of the HTML in a string, str, and do your nasty regex to it there. That way the original HTML stays intact and you still have your new string to play with and output if you choose. No jQuery required really; :contains will only slow you down.
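
As a minimal sketch of that idea, the regex can be wrapped in a small helper and applied to copies of both the text and the search phrase. The names `stripPunctuation` and `containsIgnoringPunctuation` are made up for illustration, and the uppercasing is an assumption carried over from the question's own Contains extension.

```javascript
// Hypothetical helper: returns an uppercased copy of str with everything
// except letters, digits and whitespace removed. The input is not modified.
function stripPunctuation(str) {
  return str.toUpperCase().replace(/[^A-Z0-9\s]/g, '');
}

// Hypothetical helper: true when needle occurs in haystack once both have
// been stripped of punctuation.
function containsIgnoringPunctuation(haystack, needle) {
  return stripPunctuation(haystack).indexOf(stripPunctuation(needle)) !== -1;
}

var verse = "Jesus loves me, this I know; for the Bible tells me so.";
console.log(containsIgnoringPunctuation(verse, "me this i know")); // true
console.log(containsIgnoringPunctuation(verse, "me that i know")); // false
console.log(verse); // the original string is unchanged
```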

cereallarceny

jQuery works quickest if you go straight to an element by its id, and then filter from there. So, I'll assume your HTML is like this:

<div id="himnario">
    <div id="1" class="song hideMe">
        <div class="info">Hidden text</div>
        <div class="tuneName">Search me!</div>
        <ol>
            <li>Verse 1</li> 
            <li>Verse 2</li>
        </ol>
    </div>
    <div id="2" class="song hideMe">
        ...
    </div>
</div>

To find the songs most efficiently, you do this:

$('#himnario').children()...

Note: children() is much better than find() because it only searches to a depth of one level. Not specifying .song will speed things up if there are only songs as children. If so, you are going much faster already.

Once you've got the children, you can use each(), which is not the absolute fastest way, but it's OK. So this examines each song/child:

$('#himnario').children().each(function(index) {...});

For your case:

// Get rid of punctuation in your search item - this only allows alphanumeric
item = item.replace(/[\W]/gi, '');

// Loop through each song
$('#himnario').children().each(function() {
    var $this_song = $(this);

    // Loop through each line in this song [EDIT: this doesn't account for the title]
    $this_song.find('li').each(function() {

        // Get the html from the line, strip the punctuation and check if it contains the item
        if ($(this).html().replace(/[\W]/gi, '').indexOf(item) !== -1) {
            // If item is contained, change song class
            $this_song.removeClass('hideMe');
            return false;   // Stops each_line loop once found one instance of item
        }
    });
});

I haven't done anything with the highlighting. I also haven't tested this, but it should work fine once you get any small bugs out :)

EDIT: In light of your "song title" field, you can do the following:

// Get rid of punctuation in your search item - this only allows alphanumeric
item = item.replace(/[\W]/gi, '');

// Loop through each song
$('#himnario').children().each(function() {
    var $this_song = $(this);

    // Examine the song title & the ordered list, but not the hidden info (first child)
    $this_song.children().not(':first').each(function() {

        // Get the html, strip the punctuation and check if it contains the item
        if ($(this).html().replace(/[\W]/gi, '').indexOf(item) !== -1) {
            // If item is contained, change song class
            $this_song.removeClass('hideMe');
            return false;   // Prevents examination of song lines if the title contains the item
        }
    });
});

This version should be quicker than looping through each individual line. Note, too, that I've removed the index and index2 vars from the .each calls as you don't use them.
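
Since this is headed for a mobile app, a different approach (not what the code above does) would be to strip the punctuation once up front and search the stored strings on each keystroke, rather than re-reading and re-stripping the HTML every time. The following is only a sketch under assumptions: the `songs` array is sample data standing in for whatever would be read once from each song's `.tuneName` and `li` elements (the `.info` text is simply never put in), and `normalize`, `buildIndex` and `search` are hypothetical names.

```javascript
// Sample data (assumption): in the page this would be collected once from
// .tuneName and the <li> verses of each song, leaving .info out entirely.
var songs = [
  { id: "1", tuneName: "Amazing Grace",
    verses: ["Amazing grace! How sweet the sound,",
             "'Twas grace that taught my heart to fear,"] },
  { id: "2", tuneName: "Jesus Loves Me",
    verses: ["Jesus loves me! This I know,"] }
];

// Uppercase and keep only letters, digits and whitespace.
function normalize(str) {
  return str.toUpperCase().replace(/[^A-Z0-9\s]/g, '');
}

// Build the searchable text for every song once.
function buildIndex(songList) {
  return songList.map(function (song) {
    var text = [song.tuneName].concat(song.verses).join('\n');
    return { id: song.id, text: normalize(text) };
  });
}

// Return the ids of the songs whose stored text contains the phrase.
function search(index, item) {
  var needle = normalize(item);
  if (needle.trim() === '') {
    return []; // an empty phrase matches nothing
  }
  return index.filter(function (entry) {
    return entry.text.indexOf(needle) !== -1;
  }).map(function (entry) {
    return entry.id;
  });
}

var index = buildIndex(songs);
console.log(search(index, "grace how sweet")); // [ '1' ]
console.log(search(index, "loves me, this"));  // [ '2' ]
```

The returned ids could then be used to remove `hideMe` from the matching songs, for example by looking each one up by id.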

Nick
  • Thanks for all the help and effort! I suppose with my edited addition of the tuneName class to the hymn structure I would have to use your first example and search within li or .tuneName? – Nathan Aug 28 '12 at 14:37
  • @Nathan I've edited the code to allow for `tuneName`. I realised, too, that my original second version (which I've now removed) didn't allow for the fact that you don't want to search the hidden field. The above code should work fine for changing the class of the song. You shouldn't really need the plugin, unless I've misunderstood what you're trying to do. If I have, feel free to keep enquiring :) – Nick Aug 28 '12 at 22:02
  • Amazing! This works great. I made a few minor tweaks like adding .toUpperCase, highlighting code and syntax but it's working super. You can see the working result at [link](http://gospelriver.com/favhymns/). The only thing that still could be improved I think is that the highlighting doesn't work for searches where the punctuation makes a difference, but that's not a big deal. I'll try to paste the solution code in edit above... – Nathan Aug 30 '12 at 04:09
  • Should I do something different with the first line `$('#himnario').children().addClass('hideMe');` to make it faster? I've wondered if it would be faster to add :visible so it only checks the visible elements but I'm not sure that helps. Something like `('#himnario').children().addClass('hideMe').each(function() {`? Thanks again, you've been a big help. – Nathan Aug 30 '12 at 04:39
  • On second thought, I don't even want to worry about the highlight deal, so don't bother your head about it either. I'm sure it would take more regex and crunch time and they'll just be aware that it wasn't an exact match because it wasn't highlighted. – Nathan Aug 30 '12 at 05:26
  • Strangely enough, both regex solutions return results for "Davids" or "David's" when only David is in the code... Also, the same thing happens for "loves" where results for "love" are given also. not entirely a bad thing since people may actually be interested in plurals, but I'm curious why... – Nathan Aug 30 '12 at 23:59
  • @Nathan It shouldn't work that way. The regex will convert `David's` to `Davids` (because it replaces the apostrophe with nothing). Unless... do you have `gi, ''` or `gi, ' '`? If there's a gap between the quote marks, `David's` will become `David s` and that could explain the match. Anyway, try putting some `console.log` commands in to show what is in the key variables at certain places. – Nick Aug 31 '12 at 03:26
  • Thanks for the console.log tip, that's really useful. It turns out it's removing spaces also (of course) so "love sent" becomes lovesent and the rest is obvious. I also note that it is registering the `<li>` tag as LI as the first two and last two characters, but I don't think that that would be a real issue except in rare occasions where eg the verse ended with "came" and they searched for "camel". It also registers "BUTTON" etc. but most hidden are inconsequential except for "listen". It might help to remove anything between <> if possible and if it doesn't add lots of time? Will investigate. – Nathan Aug 31 '12 at 04:50
  • I got rid of most of the issues like "listen" by specifying selectors in children('.lookInMe') (I had a listen button as a child). Still doesn't solve every issue but maybe close enough. I solved the space issue by using @cereallarceny 's regex. I see there are ways to strip html tags at [link](http://www.pagecolumn.com/tool/all_about_html_tags.htm) but not sure if I should do that... – Nathan Aug 31 '12 at 05:22
  • Do you think using /<(.|\n)*?>/g would slow the search down much, and if not, how would I incorporate it into the present regex? Thanks! – Nathan Aug 31 '12 at 19:36
  • @Nathan It really shouldn't slow it down perceptibly at all. Why not use this: `if ($(this).html().replace( /<(.|\n)*?>/g, '').replace(/[^a-z0-9\s]/gi, '').indexOf(item) !== -1) {...`? – Nick Sep 01 '12 at 09:16
  • @Nathan Actually, you can combine the two like this, I think - never liked regex much, I confess :) `replace(/<(.|\n)*?>|[^a-z0-9\s]/gi, '')` I think that's right - haven't checked it... – Nick Sep 01 '12 at 10:35
  • Nick I tried your last post and that works splendidly and cleans everything up nicely. I'm with you on the regex, that's why I asked :) Find it confusing. Thanks as always. – Nathan Sep 01 '12 at 12:02
  • It just occurred to me that jQuery's `.text()` returns the innerHTML stripped of the tags for you! So at the very least, `$(this).text().replace...` will start with less material for your replace to work with. See http://api.jquery.com/text/ – Nick Sep 22 '12 at 05:22
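
The false matches discussed in these comments can be reproduced on a plain string. This sketch compares the `\W` version, which also removes spaces and leaves the tag name behind, with the combined regex from the comments, which drops the tags and keeps the spaces. The sample line and the function names `stripAll` and `stripKeepSpaces` are made up for illustration.

```javascript
var html = "<li>Love sent a Saviour, and He came</li>";

// First version: \W removes every non-word character, spaces included,
// and the tag name LI stays in the string.
function stripAll(s) {
  return s.toUpperCase().replace(/\W/g, '');
}

// Combined version from the comments: remove tags first, then anything
// that is not a letter, digit or whitespace.
function stripKeepSpaces(s) {
  return s.toUpperCase().replace(/<(.|\n)*?>|[^a-z0-9\s]/gi, '');
}

console.log(stripAll(html));        // "LILOVESENTASAVIOURANDHECAMELI"
console.log(stripKeepSpaces(html)); // "LOVE SENT A SAVIOUR AND HE CAME"

// "loves" and "camel" are false matches only in the first version.
console.log(stripAll(html).indexOf("LOVES") !== -1);        // true
console.log(stripAll(html).indexOf("CAMEL") !== -1);        // true
console.log(stripKeepSpaces(html).indexOf("LOVES") !== -1); // false
console.log(stripKeepSpaces(html).indexOf("CAMEL") !== -1); // false
```

If the text is read with `.text()` instead of `.html()`, as the last comment suggests, there are no tags to strip, so the `[^a-z0-9\s]` part of the regex should be enough on its own.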