7

I am trying to trim the text that I get from the Kendo editor, like this:

var html = "&nbsp;&nbsp;T&nbsp;&nbsp;"; // Sample text as returned by the Kendo editor
console.log("Actual :" + html + ":");
var text = "";
try {
    // Decode the HTML entities into plain text
    var editorData = $('<div/>').html(html).text();
    text = editorData.trim();
    console.log("After trim :" + text + ":");
}
catch (e) {
    console.log("exception");
    text = html;
}

This code is in a separate JS file (generated from TypeScript). When the page loads, the trimming does not work. But when I run the same code in the developer tools console window, it works. Why is it not working?

Adding the TypeScript code:

const html: string = $(selector).data("kendoEditor").value();
console.log("Actual :" + html + ":");
let text: string = "";
try {
    // Decode the HTML entities into plain text
    const editorData = $('<div/>').html(html).text();
    text = editorData.trim();
    console.log("After trim :" + text + ":");
}
catch (e) {
    console.log("exception");
    text = html;
}
PSR
  • *When the page loads the trimming is not working* - we're going to need more info than that to answer this question – Jamiec May 23 '16 at 09:35
  • `&nbsp;` isn't actually white space. It's rendered as whitespace by the browser but as far as JavaScript is concerned it's not. It's a string. – Liam May 23 '16 at 09:36
  • There is a Kendo editor on the page. The user enters some text and clicks the Save button. Then this JavaScript gets called. Basically, the purpose of this code is to trim the trailing spaces and save. – PSR May 23 '16 at 09:36
  • Working fine : https://jsfiddle.net/rayon_1990/4v1kmc30/ – Rayon May 23 '16 at 09:38
  • @Rayon When I run this piece of code separately, it works fine for me. But in the application it fails. That is why I found it strange. – PSR May 23 '16 at 09:42

5 Answers

11

&nbsp; becomes a non-break-space character, \u00a0. JavaScript's String#trim is supposed to remove those, but historically browser implementations have been a bit buggy in that regard. I thought those issues had been resolved in modern ones, but...

If you're running into browsers that don't implement it correctly, you can work around that with a regular expression:

text = editorData.replace(/(?:^[\s\u00a0]+)|(?:[\s\u00a0]+$)/g, '');

That says to replace all whitespace or non-break-space chars at the beginning and end with nothing.
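
As a minimal sketch of the fallback above, the same regular expression can be wrapped in a small helper and compared with native `trim`. The name `trimWithNbsp` is hypothetical (not part of any library), and the sample string is made up for illustration.

```javascript
// Hypothetical helper (the name is an assumption, not a library API):
// strips whitespace and no-break spaces from both ends of a string.
function trimWithNbsp(str) {
  return str.replace(/(?:^[\s\u00a0]+)|(?:[\s\u00a0]+$)/g, "");
}

// "\u00a0" is the character that &nbsp; decodes to.
const sample = "\u00a0\u00a0T\u00a0 x\u00a0\u00a0";

// Only the leading and trailing characters are removed; the no-break
// space between "T" and "x" is left in place.
console.log(JSON.stringify(trimWithNbsp(sample)));

// On an engine that follows the specification, native trim agrees.
console.log(trimWithNbsp(sample) === sample.trim());
```

If the second line logs `false` in a given environment, that would point at a `trim` implementation that does not treat U+00A0 as white space.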

But having seen your comment:

When I run this piece of code separately, it is working fine for me. But in application its failing.

...that may not be it.

Alternately, you could remove the &nbsp; markup before converting to text:

html = html.replace(/(?:^(?:&nbsp;)+)|(?:(?:&nbsp;)+$)/g, '');
var editorData = $('<div/>').html(html).text();
text = editorData.trim();    

That removes any &nbsp;s at the beginning or end prior to converting the markup to text.
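
A small sketch of that markup-level step on its own, without jQuery, so the effect of the regular expression is visible. The helper name `stripEdgeNbspEntities` and the sample markup are assumptions made for illustration.

```javascript
// Hypothetical helper: removes runs of the literal "&nbsp;" entity from
// the start and end of a markup string, leaving interior ones alone.
function stripEdgeNbspEntities(html) {
  return html.replace(/(?:^(?:&nbsp;)+)|(?:(?:&nbsp;)+$)/g, "");
}

const markup = "&nbsp;&nbsp;T&nbsp;U&nbsp;&nbsp;";
console.log(stripEdgeNbspEntities(markup)); // prints T&nbsp;U
```

Note that this pattern only matches the literal text `&nbsp;` at the very start or end of the string, so markup wrapped in tags (for example `<p>&nbsp;T</p>`) or using the numeric form `&#160;` would pass through unchanged.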

T.J. Crowder
  • Your editorData.replace solution is working for me. Thanks. – PSR May 23 '16 at 10:08
  • @Sree: Great! I'm sorry to hear browsers are still getting this wrong, but glad that helped. :-) – T.J. Crowder May 23 '16 at 10:19
  • You are using a last-resort RegExp method, which is dirty, hard to read and maintain, and requires explicit encodings of a targeted type of whitespace. The solution you just down-voted is native, universal, generic and clean; easy to implement, maintain and customize for browsers that don't support the trim method natively. – Bekim Bacaj May 23 '16 at 17:47
  • @BekimBacaj: You clearly haven't read (or at least understood) my answer above, and clearly haven't understood the comments on your answer. Well, unfortunately, nothing more I can do about that. – T.J. Crowder May 23 '16 at 17:51
  • The terminology is being used and misused in all levels of publications. This " " is a whitespace, or to be even more precise, an interval, this should be trimmed. This "&nbsp;" however, is not a whitespace or interval, that's a whitespace character, or to be more precise that's a white character equal to T or any other nonwhite character. Therefore it should not be treated as whitespace interval, and therefore not to be trimmed. – Bekim Bacaj May 23 '16 at 18:01
  • @BekimBacaj: For crying out loud. Just *read the specification*. I linked it with that very purpose in mind. From the `trim` link: *"Let T be a String value that is a copy of S with both leading and trailing white space removed. The definition of white space is the union of WhiteSpace and LineTerminator."* From the Table 32 link: *"The ECMAScript white space code points are listed in Table 32:...U+00A0 NO-BREAK SPACE ..."* Thus: `trim` is supposed to remove no-break-space. And it *does*, on browsers where it isn't broken. As I said above. – T.J. Crowder May 23 '16 at 18:05
  • that's wrong interpretation "&nbsp;" is not the same as \u00a0. – Bekim Bacaj May 23 '16 at 18:08
  • @Bekim: To quote you: "Wrong again!" Here's another specification for you: https://www.w3.org/TR/html5/syntax.html#named-character-references, which says quite clearly that the `&nbsp;` is U+00A0. And proof that it's actually true in the real world: https://jsfiddle.net/nyd3yb0z/ And with that, I'm done. If you feel the need to post further FUD, please cite reliable references. – T.J. Crowder May 23 '16 at 18:22
  • @Bekim: What utter nonsense. – T.J. Crowder May 25 '16 at 06:32
  • @T.J.Crowder, thanks for this. it works for me. how would you change it to make it work so that this: "     hello   world     " becomes: "hello   world". Thanks! – Varun Jan 02 '20 at 22:41
  • @Varun - `.replace(/(?:^(?:&nbsp;|\s)+)|(?:(?:&nbsp;|\s)+$)/g, "")` should do it. :-) – T.J. Crowder Jan 03 '20 at 08:04
5

The easiest way to trim non-breaking spaces from a string is:

html.replace(/&nbsp;/g,' ').trim()
smjhunt
2

If you are using jQuery you can use jQuery.trim()

According to the jQuery documentation, the $.trim() function "removes all newlines, spaces (including non-breaking spaces), and tabs from the beginning and end of the supplied string."

mikewasmike
0

None of these worked for me. What I wanted was to remove "&nbsp;" only from the beginning or the end of the string, not from the middle. So my suggestion is:

  // str is the input string containing &nbsp; entities
  let ingredients = str.replace(/&nbsp;/g, ' ');
  ingredients = ingredients.trim();
  ingredients = ingredients.replace(/\s/g, '&nbsp;');

var txt = 'aa&nbsp;&nbsp;cc&nbsp; &nbsp; ';
var result = txt.replace(/&nbsp;/g, ' ');
result = result.trim();
result = result.replace(/\s/g, '&nbsp;');
console.log(result);
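
One side effect of the round trip above is that the final `replace(/\s/g, '&nbsp;')` also turns every ordinary interior space into `&nbsp;`. If that is not wanted, a possible alternative is to trim only the edges and leave the middle untouched. This is a sketch; the helper name `trimEdgeNbspAndSpace` is hypothetical.

```javascript
// Hypothetical helper: trims "&nbsp;" entities and whitespace from both
// ends of a markup string without rewriting anything in the middle.
function trimEdgeNbspAndSpace(str) {
  return str.replace(/^(?:&nbsp;|\s)+|(?:&nbsp;|\s)+$/g, "");
}

const txt = "aa&nbsp;&nbsp;cc&nbsp; &nbsp; ";
console.log(trimEdgeNbspAndSpace(txt)); // prints aa&nbsp;&nbsp;cc

// An ordinary interior space stays an ordinary space.
console.log(trimEdgeNbspAndSpace(" a b&nbsp;")); // prints a b
```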
0

This implementation has proven successful for me

s=s.replaceAll('&nbsp;', ' ').replaceAll('<br>', ' ').trim();
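
Because `replaceAll` with a string argument matches only that exact text, variants such as `<br/>`, `<br />` or `<BR>` would be left in place. A sketch that covers those forms with a regular expression follows; the helper name `normalizeAndTrim` and the sample input are assumptions for illustration.

```javascript
// Hypothetical helper: replaces "&nbsp;" entities and <br> tags (in any
// of the common spellings) with a space, then trims the result.
function normalizeAndTrim(s) {
  return s
    .replace(/&nbsp;/g, " ")
    .replace(/<br\s*\/?>/gi, " ")
    .trim();
}

console.log(normalizeAndTrim("&nbsp;<br />hello&nbsp;world<BR>&nbsp;")); // prints hello world
```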