Just a quick one, but I want to make sure I'm catching cross-platform variations.

I'd like to convert new lines entered into a text area into a [comma], so that the output can be represented on a single line. My question:

Currently, when sending from Google Chrome and viewing the value, I find it uses \r\n for new lines. If I replace \r\n, I know it will work for Chrome on Windows 7, but what about other platforms? Are there variations in what other browsers insert as a new line inside a text area?

Peter Gluck
Ninjanoel
  • To simplify: do all browsers only ever send '\r\n' to represent a new line entered into a text area? (I'm not programmatically creating the value; it is only ever created by the user in their browser.) – Ninjanoel Jan 08 '13 at 15:08

4 Answers


By HTML specifications, browsers are required to canonicalize line breaks in user input to CR LF (\r\n), and I don’t think any browser gets this wrong. Reference: clause 17.13.4 Form content types in the HTML 4.01 spec.

In HTML5 drafts, the situation is more complicated, since they also deal with the processes inside a browser, not just the data that gets sent to a server-side form handler when the form is submitted. According to them (and browser practice), the textarea element value exists in three variants:

  1. the raw value as entered by the user, unnormalized; it may contain CR, LF, or CR LF pair;
  2. the internal value, called “API value”, where line breaks are normalized to LF (only);
  3. the submission value, where line breaks are normalized to CR LF pairs, as per Internet conventions.
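The three variants above can be sketched in a few lines of JavaScript. This is only an illustration of the normalization rules as described in this answer, not the browser's actual implementation; the function names `toApiValue` and `toSubmissionValue` are hypothetical and do not exist in any browser API.

```javascript
// Hypothetical helper: mimics the "API value" normalization,
// where CR LF pairs and lone CRs both become a single LF.
function toApiValue(rawValue) {
  return rawValue.replace(/\r\n?/g, "\n");
}

// Hypothetical helper: mimics the "submission value" normalization,
// where every line break becomes a CR LF pair.
function toSubmissionValue(rawValue) {
  return toApiValue(rawValue).replace(/\n/g, "\r\n");
}

// A raw value mixing all three line break styles.
const raw = "a\r\nb\rc\nd";

console.log(JSON.stringify(toApiValue(raw)));        // "a\nb\nc\nd"
console.log(JSON.stringify(toSubmissionValue(raw))); // "a\r\nb\r\nc\r\nd"
console.log(toApiValue(raw).length);                 // 7
console.log(toSubmissionValue(raw).length);          // 10
```

The differing lengths in the last two lines show why a character count taken in script can be smaller than the one measured on the server for the same text.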
Anshul
Jukka K. Korpela
  • HTML 5 spec: http://www.w3.org/html/wg/drafts/html/CR/forms.html#the-textarea-element: `the user agent should allow the user to edit, insert, and remove text, and to insert and remove line breaks in the form of "LF" (U+000A) characters`. – ComFreek Aug 12 '14 at 15:58
  • Thanks, you made my day! I was just confused because when I send contents from a textarea on osx/chrome, the browser sends it with CR LF. – starikovs Jan 26 '16 at 08:57
  • Another question is why, when you get ".length" of a textarea, it counts CR LF as only one character, but when you check on the server side (for example, with PHP strlen) it will be two chars... – starikovs Jan 26 '16 at 09:00
  • @ComFreek's link above is broken today, use: https://www.w3.org/TR/html5/forms.html#the-textarea-element – Glen Mazza Jun 21 '16 at 15:27
  • @starikovs, I suppose this has been answered by the *the internal value, called “API value”, where line breaks are normalized to LF (only);* part. What you see as one character (namely `\n`) is probably what is provided by the "internal API". No reference, this is just my supposition based on common sense. – d.k Nov 13 '17 at 15:33
  • **(1)** Suppose a form with an input-text field contains a newline. Can it be the case that when I get the value of this field in JS it will be '\n', but when I submit the form containing this text field, the value will be sent as '\r\n'? **(2)** Also, if the value of this field is sent via AJAX to the back end, will it be sent as '\n' or '\r\n'? – nishantbhardwaj2002 Nov 27 '17 at 13:04
  • The links posted above are broken as of March 2018 - http://w3c.github.io/html/sec-forms.html#the-textarea-element will work. – alexpls Mar 01 '18 at 00:17
  • And which of the three variants do I get when I call something like [request.getParameter](https://docs.oracle.com/javaee/7/api/javax/servlet/ServletRequest.html#getParameter-java.lang.String-)? Is it the last one (since the form has been submitted)? – theyuv Oct 09 '18 at 08:52
  • That's what I get for trying to get use of underused `FormData` – Klesun Dec 29 '19 at 00:44

Talking specifically about textareas in web forms, for all textareas, on all platforms, \r\n will work.

If you use anything else you will cause issues with cut and paste on Windows platforms.

The line breaks will be canonicalised by Windows browsers when the form is submitted, but if you send the form down to the browser with \n line breaks, you will find that the text will not copy and paste correctly between, for example, Notepad and the textarea.

Interestingly, in spite of the Unix line end convention being \n, the standard in most text-based network protocols including HTTP, SMTP, POP3, IMAP, and so on is still \r\n. Yes, it may not make a lot of sense, but that's history and evolving standards for you!

Ben

`&#10;` - Line Feed and `&#13;` - Carriage Return

These HTML entities will insert a new line or carriage return inside a text area.

Damodar Das

It seems that, according to the HTML5 spec, the value property of the textarea element should return '\r\n' for a newline:

The element's value is defined to be the element's raw value with the following transformation applied:

Replace every occurrence of a "CR" (U+000D) character not followed by a "LF" (U+000A) character, and every occurrence of a "LF" (U+000A) character not preceded by a "CR" (U+000D) character, by a two-character string consisting of a U+000D CARRIAGE RETURN "CRLF" (U+000A) character pair.

Following the link to 'value' makes it clear that it refers to the value property accessed in javascript:

Form controls have a value and a checkedness. (The latter is only used by input elements.) These are used to describe how the user interacts with the control.

However, in all five major browsers (using Windows, 11/27/2015), if '\r\n' is written to a textarea, the '\r' is stripped. (To test: var e=document.createElement('textarea'); e.value='\r\n'; alert(e.value=='\n');) This is true of IE since v9. Before that, IE was returning '\r\n' and converting both '\r' and '\n' to '\r\n' (which is the HTML5 spec). So... I'm confused.

To be safe, it's usually enough to use '\r?\n' in regular expressions instead of just '\n', but if the newline sequence must be known, a test like the above can be performed in the app.
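As a minimal sketch of that advice applied to the original question (turning line breaks into commas), assuming the text has already been read from the textarea or from the submitted form; the function name `newlinesToCommas` and its `separator` parameter are made up for this example:

```javascript
// Hypothetical helper: replaces each line break with a separator.
// The pattern /\r?\n/g matches both "\r\n" (submitted form data)
// and a bare "\n" (the value as read in script).
function newlinesToCommas(text, separator = ",") {
  return text.replace(/\r?\n/g, separator);
}

console.log(newlinesToCommas("one\r\ntwo\r\nthree")); // one,two,three
console.log(newlinesToCommas("one\ntwo\nthree"));     // one,two,three
console.log(newlinesToCommas("one\r\ntwo", ", "));    // one, two
```

Note that this pattern does not match a lone '\r'; if such input is possible, a pattern like /\r\n|\r|\n/g would cover it as well.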

barncat
  • From the same page, isn't the value obtained through JS called the API value? – Anshul Apr 18 '16 at 15:47
  • @Anshul - I see what you mean. The original question was "Currently, sending from google chrome, when I view the value, I find it uses \r\n for new lines..." So, since it's being "sent", I guess the value is being read on the server. I assumed it was with JS. Anyway, hopefully the facts I posted are of some use. Thanks for your comment. – barncat Apr 21 '16 at 18:32
  • @barncat, I think the server language shouldn't matter here. The HTML5 spec is very clear on 2 things for a `textarea`. 1. The request body will only have \r\n. 2. The JS value will have only \n, irrespective of whether you use \r, \r\n, or \n while typing. It also matches your finding with IE9+. – Anshul Apr 23 '16 at 03:48