
I'm looking for a way to trim a string to a certain number of characters without cutting words. I searched the questions and found this:

javascript shorten string without cutting words

I wanted to use @Hamish's answer with the regex replacement but ran into problems with multiline text.

@Hamish's answer:

"this is a longish string of test".replace(/^(.{11}[^\s]*).*/, "$1"); 
//"this is a longish"
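
For illustration, here is a small sketch of how that pattern behaves once the input contains a newline. The sample strings are made up for this example. The point is only that `.` stops at `\n`, so anything after the first line survives the replacement, and the pattern does not match at all when a newline falls inside the first 11 characters.

```javascript
// Hamish's pattern, unchanged.
const pattern = /^(.{11}[^\s]*).*/;

// Single line: the trailing .* swallows the rest of the string.
const single = "this is a longish string of test".replace(pattern, "$1");
console.log(JSON.stringify(single)); // "this is a longish"

// Newline later in the text: .* stops at "\n", so the second line is left in place.
const multi = "this is a longish string of test.\nsecond line".replace(pattern, "$1");
console.log(JSON.stringify(multi)); // "this is a longish\nsecond line"

// Newline within the first 11 characters: .{11} cannot match, so nothing is replaced.
const early = "short\nline then more text".replace(pattern, "$1");
console.log(JSON.stringify(early)); // "short\nline then more text"
```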

I searched for similar questions and found out that the dot '.' does not match newlines (\n). Normally one could add an 's' flag at the end of the regex to make the dot match newlines as well, but in JavaScript that did not work for me. I read in other threads that I should use [\s\S] to match any character. So I tried adapting @Hamish's regex like this:

infotext = infotext.replace(/^([\s\S]*{10}[^\s]*).*/, "$1");

But then I get an error message which says:

Uncaught SyntaxError: Invalid regular expression: /^([\s\S]*{10}[^\s]*).*/: Nothing to repeat.

Can somebody help me out with that? I really can't find a way to match any character. Thanks in advance. M

Merc
  • Can you post the content of infotext so we can reproduce the problem? – kasper Taeymans Mar 15 '15 at 10:06
  • `*` is already a repeater; you cannot put another repeater such as `{10}` directly after it. This is like writing `**`. If you think about it, it makes no sense to say "repeat this any number of times (`*`), then repeat that exactly 10 (`{10}`) times". –  Mar 15 '15 at 10:27
  • Hey Merc. It would be nice if you accepted answers on the questions you asked, because this is why people answer you in the first place. And I am sure that the answer to your question has already been provided. – Manticore Mar 19 '15 at 11:22
  • @Manticore you are absolutely right. I just did not have time to get back to this issue. I had to fix other things first due to a very tight deadline. I will have a look at it now. Sorry, all, for that! – Merc Mar 19 '15 at 14:44

3 Answers

You can try:

infotext = infotext.replace(/^([\s\S]{10}\S*)[\s\S]*/, "$1");

The problem is your use of [\s\S]*{10}: `*` and `{10}` are both quantifiers, and one cannot directly follow the other.
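
A minimal sketch of the difference, using the sample string from another answer in this thread: stacking `{10}` on top of `*` is rejected when the pattern is compiled, while `[\s\S]{10}` applies a single quantifier to the character class. `new RegExp` is used for the broken pattern only so that the error can be caught at runtime; the variable names are just for illustration.

```javascript
// Two quantifiers in a row: the engine rejects the pattern.
let errorName = "";
try {
  new RegExp("^([\\s\\S]*{10}[^\\s]*).*");
} catch (err) {
  errorName = err.name;
}
console.log(errorName); // "SyntaxError"

// One quantifier on the class: 10 characters of any kind, then the rest of the
// current word; the trailing [\s\S]* discards everything else, newlines included.
const sample = "this is a longish string of test.\n bla bla bla bla text here";
const trimmed = sample.replace(/^([\s\S]{10}\S*)[\s\S]*/, "$1");
console.log(trimmed); // "this is a longish"
```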

anubhava
  • @kasperTaeymans: I just tried fixing the **Invalid regular expression** problem. If OP tells us what is expected output then I can certainly tweak it further. – anubhava Mar 15 '15 at 10:13
  • Maybe `/^([\s\S]{10}\S*).*/mg` is [what OP needs here](http://jsfiddle.net/ao14vbyv/2/) – anubhava Mar 15 '15 at 10:16
  • I did not downvote but your solution does not make any difference compared to `/^(.{11}[^\s]*).*/`. The OP wants to return only a certain amount of characters. Your solution (and the regex he has already) does not work with new lines (ie: the new line is included while it should not) – kasper Taeymans Mar 15 '15 at 10:33
  • As commented above, the original answer I provided was just to fix the **Invalid regular expression** error OP was getting. I assumed OP already had a working regex. Coming back to this answer, it is effectively the same as `/^(.{11}\S*).*/mg`, but why do you think OP doesn't want it, since we haven't even seen OP's expected output? – anubhava Mar 15 '15 at 11:04
  • Well, I don't know for sure, but he is clearly saying that his regex (`/^(.{11}[^\s]*).*/`) is not working over multiple lines. So to me it makes sense that he only wants the output as if there were no multiple lines. – kasper Taeymans Mar 15 '15 at 11:18
  • @kasperTaeymans: No, he's saying `/^([\s\S]*{10}[^\s]*).*/` is not working. He fixed the original problem himself by changing the `.` to `[\s\S]`, but then he broke it again by changing `{11}` to `*{10}`. However, I think it's a mistake to add the `m` and `g` flags. The way I read it, he wants to do one replacement at the beginning of the string, not multiple replacements at the beginning of each line. – Alan Moore Mar 15 '15 at 16:21
  • @AlanMoore: Thanks for your comment. I think you've interpreted OP better than us. Based on that understanding I have updated my regex above. – anubhava Mar 15 '15 at 16:41
  • @anubhava your first answer seems to do the trick! thanks for that.. ;) – Merc Mar 19 '15 at 14:54
  • @kasperTaeymans yeah, you were right. I just wanted to trim a paragraph, which included several newline characters, down to the very first, let's say, 150 chars without cutting a word. I pasted the answer above and it seemed to work. Thanks to everyone discussing here, and apologies for taking so much time to get back to this thread. – Merc Mar 19 '15 at 14:59

var infotext = "this is a longish string of test.\n bla bla bla bla text here";
// Take the first line only, then keep 10 characters plus the rest of the current word.
infotext = infotext.match(/^.*$/m)[0].replace(/^([\s\S]{10}\S*).*/, "$1");
console.log(infotext); // "this is a longish"
kasper Taeymans

The regex you are probably looking for is

/^([\s\S]{9}((?=\s)\s|[^\s]*)).*/i

In your case, change your line to this:

infotext = infotext.replace(/^([\s\S]{9}((?=\s)\s|[^\s]*)).*/i, "$1");

This matches 9 characters of any kind. If the 10th character is whitespace, the regex stops after it; otherwise it includes the rest of that word up to the next whitespace, so the word won't get cut.
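
To make that behaviour concrete, here is a small sketch with made-up sample strings. Note that the trailing `.*` still stops at a newline, so text after the first line break is left in place.

```javascript
const pattern = /^([\s\S]{9}((?=\s)\s|[^\s]*)).*/i;

// 10th character is a space: the match stops right after it (trailing space kept).
const a = "this is a longish string".replace(pattern, "$1");
console.log(JSON.stringify(a)); // "this is a "

// 10th character is inside a word: the rest of that word is kept.
const b = "hello world foo".replace(pattern, "$1");
console.log(JSON.stringify(b)); // "hello world"

// The final .* does not cross "\n", so the second line is left untouched.
const c = "hello world foo\nbar".replace(pattern, "$1");
console.log(JSON.stringify(c)); // "hello world\nbar"
```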

I recommend visiting regex101.com and experimenting with some regexes to get a grip on them.

Manticore
  • This doesn't help. `((?=\s)\s|[^\s]*))` is exactly the same as `([^\s]*)` except it will sometimes consume a space. – Alan Moore Mar 15 '15 at 16:07
  • no, it cares about including the rest of the word if the character is a non-whitespace – Manticore Mar 15 '15 at 17:54
  • That's what `[^\s]*` does. After `[\s\S]{9}` is finished, it goes on to capture as many non-whitespace characters as it can. If there are no more characters, or if the next character is whitespace, it stops capturing and lets the `.*` consume whatever remains. Your regex does the same, except in the second case it also captures the whitespace character. – Alan Moore Mar 15 '15 at 19:55
  • he said he wants to 'trim a string to a certain length of chars' and by char I also understand whitespace characters – Manticore Mar 15 '15 at 22:29
  • @Manticore thanks for your answer. But unfortunately newlines are still breaking the pattern... I have something like this, which gets the text from an html object: `var infotext = $(this).text(); infotext = infotext.replace(/^([\s\S]{10}((?=\s)\s|[^\s]*)).*/i, "$1"); $(this).html(infotext);` But unfortunately it is trimmed for 10 chars and then I still have the part which comes after the newline in the string... – Merc Mar 19 '15 at 14:50