If I have the following:

content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened."

How would I completely remove the tag altogether so the big string no longer has any anchor tags?

I reached only so far:

var href = content.indexOf("href=\"");
var href1 = content.substring(href).indexOf("\"");
DemCodeLines

4 Answers

This is why God invented regular expressions, which the string.replace method accepts as the pattern to search for.

var contentSansAnchors = content.replace(/<\/?a[^>]*>/g, "");

If you're new to regex, some explanation:

/.../: Instead of wrapping the search string in quotes, you wrap it in forward slashes to denote a regular expression literal.

<...>: These are the literal angle brackets of an HTML tag.

\/?: The tag may or may not (?) start with a forward slash (\/). The forward slash must be escaped using the backslash or the regex will end prematurely here.

a: Literal anchor tag name.

[^>]*: After the a, the tag may contain zero or more (*) characters that are not (^) a closing angle bracket (>). The "anything but a closing bracket" expression is wrapped in square brackets ([...]) because a character class matches a single character.

g: This modifies the regular expression to be global, so that all matches are replaced. Otherwise, only the first match would be replaced.
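
To make the effect of the g modifier concrete, here is a small sketch that runs the same pattern with and without it on the string from the question. The variable names firstOnly and allRemoved are only illustrative.

```javascript
var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

// Without g: only the first match (the first opening tag) is removed.
var firstOnly = content.replace(/<\/?a[^>]*>/, "");

// With g: every opening and closing anchor tag is removed.
var allRemoved = content.replace(/<\/?a[^>]*>/g, "");

console.log(firstOnly);  // I</a> was going here and then <a href="that">that</a> happened.
console.log(allRemoved); // I was going here and then that happened.
```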

Depending on what strings you are expecting to parse, you may also want to add the i modifier for case insensitivity.
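
As a rough illustration of why the i modifier can matter, the sketch below runs the pattern on a made-up string with uppercase anchor tags and an abbr element. The \b word boundary in the second pattern is my own addition, not part of the pattern above; it is there because a[^>]* also matches any other tag whose name starts with "a", such as abbr.

```javascript
// Hypothetical input: uppercase anchor tags plus an <abbr> element.
var mixed = "<A HREF=\"1\">I</A> read the <abbr title=\"HyperText Markup Language\">HTML</abbr> spec.";

// Pattern as given above: case-sensitive, and "a" followed by anything.
// It misses <A ...> and </A>, but strips <abbr ...> and </abbr>.
var original = mixed.replace(/<\/?a[^>]*>/g, "");

// Assumed variant: i for case insensitivity, \b so the tag name must be exactly "a".
var stricter = mixed.replace(/<\/?a\b[^>]*>/gi, "");

console.log(original); // <A HREF="1">I</A> read the HTML spec.
console.log(stricter); // I read the <abbr title="HyperText Markup Language">HTML</abbr> spec.
```

If the input is known to contain only lowercase anchor tags, as in the question, the simpler pattern gives the same result.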

aaaantoine

You can use a regular expression to remove all anchor tags, where subject is the input string:

var result = subject.replace(/<a[^>]*>|<\/a>/g, "");
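
For a runnable version, here is a minimal sketch that wraps the same pattern in a function and applies it to the string from the question. The name stripAnchors is hypothetical and only used for illustration.

```javascript
// stripAnchors is a hypothetical helper name, not part of the answer above.
function stripAnchors(subject) {
    // First alternative removes opening tags, second removes closing tags.
    return subject.replace(/<a[^>]*>|<\/a>/g, "");
}

var result = stripAnchors("<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.");
console.log(result); // I was going here and then that happened.
```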
legendJSLC

Strip all tags keeping their text content:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

// parse the HTML string into DOM
var container = document.createElement('div');
container.innerHTML = content;

// retrieve the textContent, or innerText when textContent is not available
var clean = container.textContent || container.innerText;
console.log(clean); //"I was going here and then that happened."

As per OP's comment, the text only contains anchor tags, so this method should work fine.

You may drop the || container.innerText if you don't need IE <= 8 support.

Reference

  • textContent - Gets or sets the text content of a node and its descendants.
  • innerText - Sets or retrieves the text between the start and end tags of the object.

Just to answer the question in the title, here is a way to remove only the anchor elements:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

var container = document.createElement('div');
container.innerHTML = content;

var anchors = container.getElementsByTagName('a'),
    anchor;

while (anchor = anchors[0]) {
    var anchorParent = anchor.parentNode;

    while (anchor.firstChild) {
        anchorParent.insertBefore(anchor.firstChild, anchor);
    }
    anchorParent.removeChild(anchor);
}

var clean = container.innerHTML;
console.log(clean); //"I was going here and then that happened."

Reference

  • Node.insertBefore - Inserts the specified node before a reference element as a child of the current node.
  • Node.removeChild - Removes a child node from the DOM.
  • Element.getElementsByTagName - Returns a list of elements with the given tag name. The subtree underneath the specified element is searched, excluding the element itself.

Even though OP is not using jQuery, here is a practically equivalent jQuery version of the above, for anyone who is interested:

var content = "<a href=\"1\">I</a> was going here and then <a href=\"that\">that</a> happened.";

var clean = $('<div>').append(content).find('a').contents().unwrap().end().end().html();
console.log(clean); //"I was going here and then that happened."


NOTE

All of the solutions in this answer assume that the content is valid HTML; they won't handle malformed markup, unclosed tags, etc. They also assume that the markup is safe (XSS-sanitized).

If the criteria above are not met, you're better off using a regex solution. Regex should usually be your last resort when the use case involves parsing HTML, as it is very easy to break when tested against arbitrary markup (related: virgin-devouring ponies), but your use case seems very simple and a regex solution may be just what you need.

This answer provides non-regex solutions so that you may use these once (if ever) a regex solution breaks.

Fabrício Matté

If you can obtain the replacement string in JavaScript (say you hold it in a variable named replacedString), you can enclose your HTML content in a div, as shown below:

<div id="stringContent">
  <a href="1">I</a> was going here and then <a href="that">that</a> happened.
</div>

and then you can execute this through jQuery:

$("#stringContent").empty();
$("#stringContent").html(replacedString);