
A website I am working on uses Adobe Search&Promote (SP) as its internal website indexing and searching tool.

I need to exclude common parts of each web page from being indexed by SP (such as the header, nav, footer) because they are the same on every single page.

SP's documentation states the following:

"To prevent parts of individual web pages from being searched, you can exclude portions of a page from indexing. Surround the text with <noindex> and </noindex> tags. This method is useful if you want to exclude navigation text from searches."

Of course, <noindex> is not a standard HTML element.

Is there JavaScript or something else I should use to register or create this fake tag in browsers, so that I don't have to worry about strange behavior from a non-standard HTML tag sitting in my code? Or should I just not care, because browsers will ignore this non-existent element?

Note: There is absolutely no styling that needs to be done on this <noindex> element. It simply needs to wrap around content in the HTML.

keithwyland

2 Answers


There is nothing you need to do. Browsers are expected to ignore unknown tags, and they do, so they see <noindex>foo</noindex> just as foo. Well, not quite. Technically, modern browsers construct an element node (of type HTMLUnknownElement) in the DOM. But the element has no associated default styling and no associated action, so it’s really a dummy element and represents its content only.
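
As a minimal sketch of how to check this in a browser, the helper below creates an element for a given tag name and tests whether the resulting node is an HTMLUnknownElement. The name `isUnknownElement` and its `win` parameter are hypothetical and introduced only for illustration; `win` defaults to the browser's global window.

```javascript
// Sketch: report whether a tag name produces an HTMLUnknownElement node.
// `isUnknownElement` is a hypothetical helper name, not part of any API.
// `win` defaults to the browser's global window; it is a parameter only
// so the function can be exercised outside a browser.
function isUnknownElement(tagName, win = window) {
  // createElement accepts any valid tag name, known or not.
  const el = win.document.createElement(tagName);
  return el instanceof win.HTMLUnknownElement;
}
```

Run in a modern browser console, `isUnknownElement("noindex")` should return true, while `isUnknownElement("div")` should return false.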

It would be possible to remove such element nodes using client-side JavaScript, but that would be quite unnecessary.
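
For completeness, here is a rough sketch of what such a removal could look like, assuming the goal is to drop the <noindex> wrappers while keeping their content in place. The function name `unwrapNoindex` is hypothetical; it relies only on the standard `querySelectorAll` and `replaceWith` DOM methods.

```javascript
// Sketch: replace every <noindex> element under `root` with its own children,
// so the wrapper disappears from the DOM but its content stays in place.
// `unwrapNoindex` is a hypothetical helper name used for illustration.
function unwrapNoindex(root) {
  // Copy the matches into an array before modifying the tree.
  const wrappers = Array.from(root.querySelectorAll("noindex"));
  for (const el of wrappers) {
    // Spread the child nodes into an array first, then swap them in
    // for the wrapper element itself.
    el.replaceWith(...el.childNodes);
  }
  // Return how many wrappers were removed.
  return wrappers.length;
}
```

In a page this would be called as `unwrapNoindex(document)` once the DOM has loaded. Since it only changes the DOM in the browser, it has no effect on the HTML source that an indexer fetches.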

The only real risk is that some day some specification or some browser or some web-wide indexing robot might start treating noindex as a real element with some defined meaning, possibly with default rendering and default functionality. Then you would be in trouble if these differ from what you expected. But it’s a rather small risk, and it seems that you don’t have a choice.

Jukka K. Korpela

Although it's not in the documentation, our team asked an Adobe consultant about this. He told us that we can use a 'noindex' class instead of the <noindex> element. He even recommended using the class rather than the tag.

A warning, though: the 'noindex' class works only on <div> elements, not on other elements such as <ul>, <header>, or <footer>.

So usage would look something like this:

<div class="noindex">
   <p>This should not be indexed.</p>
</div>
khakiout