I expect the following text (note newlines at the 80 character mark):

This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds - maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error message
- a Postgres constraint failure will print a useful message, or a Javascript
exception will bubble up, or Javascript will complain about an unhandled
rejection.

when run through a markdown editor, to generate the following HTML:

<p>
This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds - maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error message
- a Postgres constraint failure will print a useful message, or a Javascript
exception will bubble up, or Javascript will complain about an unhandled
rejection.
</p>

When I run it through cmark, I get the following:

<p>This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds - maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error message</p>
<ul>
<li>a Postgres constraint failure will print a useful message, or a Javascript
exception will bubble up, or Javascript will complain about an unhandled
rejection.</li>
</ul>
<p>With this error we didn't get any of those. We also observed it could happen
anywhere - it didn't seem to correlate with any individual test or test file.</p>

This seems incorrect: I definitely don't want a list to begin in the middle of a paragraph. Is this a problem in the spec, or is there something I can do differently to avoid triggering this behavior?

Kevin Burke
  • Interesting. This is a divergence from Markdown. See [Babelmark](http://johnmacfarlane.net/babelmark2/?normalize=1&text=This+worked+for+a+few+months.+Unfortunately%2C+occasionally+a+test+would+time%0Aout+and+fail+after+18+seconds+-+maybe+1+in+100+test+runs+would+fail+with+a%0Atimeout.+Normally+when+the+tests+fail+you+get+back+some+kind+of+error+message%0A-+a+Postgres+constraint+failure+will+print+a+useful+message%2C+or+a+Javascript%0Aexception+will+bubble+up%2C+or+Javascript+will+complain+about+an+unhandled%0Arejection.) – Waylan May 26 '16 at 15:59

2 Answers

In short, use the proper character, probably an emdash, rather than a hyphen.

The spec states:

In CommonMark, a list can interrupt a paragraph. That is, no blank line is needed to separate a paragraph from a following list:

Foo
- bar
- baz


<p>Foo</p>
<ul>
<li>bar</li>
<li>baz</li>
</ul>

It is later explained that this is because CommonMark adheres to the principle of uniformity. Specifically, a line that starts with a list marker is always a list item, regardless of any surrounding lines. The spec even acknowledges that Markdown behaves differently here specifically to avoid the problem you have encountered, but CommonMark favors uniformity over reasonableness/ease of use (apparently).
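To make the rule concrete, here is a rough sketch of a check for lines that would open a bullet list right after a line of paragraph text. It is not cmark's parser: the function names are made up for illustration, the regular expressions are my own simplified reading of the spec, and it ignores ordered lists, lazy continuation lines, and code blocks.

```python
import re
from typing import List

# Simplified bullet item: up to three spaces of indentation, one of - + *,
# at least one space or tab, then some content (an empty item is not matched).
BULLET_ITEM = re.compile(r"^ {0,3}[-+*][ \t]+\S")
# Thematic break such as "- - -" or "***": it wins over a list item.
THEMATIC_BREAK = re.compile(r"^ {0,3}([-*_])[ \t]*(?:\1[ \t]*){2,}$")


def interrupts_paragraph(line: str) -> bool:
    """Return True if the line looks like the start of a bullet list item."""
    if THEMATIC_BREAK.match(line):
        return False
    return BULLET_ITEM.match(line) is not None


def find_accidental_lists(text: str) -> List[int]:
    """Return 1-based numbers of bullet-like lines that directly follow prose."""
    hits = []
    previous_is_text = False
    for number, line in enumerate(text.splitlines(), start=1):
        starts_item = interrupts_paragraph(line)
        if starts_item and previous_is_text:
            hits.append(number)
        # Only a non-blank, non-bullet line counts as paragraph text here.
        previous_is_text = bool(line.strip()) and not starts_item
    return hits
```

Run over the paragraph from the question, this heuristic flags the line that begins with `- a Postgres`. Lines that start with an emdash or with `---` followed by text are not flagged, because the marker must be a single `-`, `+`, or `*` followed by whitespace.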

So the solution is to never start a line with a list marker when that line is not a list item. While you could carefully wrap your lines, future edits could reintroduce the problem. Fortunately, the character on your keyboard that serves as the list marker is the hyphen-minus (Unicode char \u002D; the dedicated hyphen is \u2010), and a hyphen is rarely used in proper English the way a list marker is. Specifically, a hyphen should never be followed by whitespace, which a list marker requires. If you want the character to be followed by whitespace, you probably want an endash (Unicode char \u2013), emdash (Unicode char \u2014), or minus sign (Unicode char \u2212) (see this question for an explanation). Use the appropriate character and the problem is averted.

Let's give that a try:

This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds &mdash; maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error message
— a Postgres constraint failure will print a useful message, or a Javascript
exception will bubble up, or Javascript will complain about an unhandled
rejection.

CommonMark's output is then:

<p>This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds — maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error message
— a Postgres constraint failure will print a useful message, or a Javascript
exception will bubble up, or Javascript will complain about an unhandled
rejection.</p>

Notice that for the first occurrence I used the HTML entity (&mdash;) whereas for the second occurrence I used the emdash character (—) directly. CommonMark converted the HTML entity to the emdash character and passed the literal character through unaltered. For readability, the actual character is certainly better, although it is difficult to type because it doesn't appear on most keyboards.

If you use SmartyPants, you could use double (--) or triple (---) hyphens which SmartyPants converts to endashes and emdashes respectively. And double and triple hyphens do not trigger lists as long as there are no spaces between the hyphens.
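If typing the character by hand is the sticking point, the substitution can be scripted. The following is only a sketch under stated assumptions: `dashify_paragraph` is a hypothetical helper, not part of cmark or SmartyPants, and it assumes the caller passes a single prose paragraph containing no real list items, code, or arithmetic such as `5 - 3`.

```python
import re

EM_DASH = "\u2014"

# A hyphen-minus with a space on both sides is acting as a dash,
# not as the hyphen inside a compound word like "well-known".
SPACED_HYPHEN = re.compile(r"(?<=\S) - (?=\S)")
# A hyphen-minus at the start of a line, followed by a space and text.
LEADING_HYPHEN = re.compile(r"^- (?=\S)")


def dashify_paragraph(paragraph: str) -> str:
    """Replace dash-like hyphens in one prose paragraph with emdashes."""
    fixed = []
    for index, line in enumerate(paragraph.split("\n")):
        line = SPACED_HYPHEN.sub(f" {EM_DASH} ", line)
        # The first line is left alone: it could be a genuine list item.
        if index > 0:
            line = LEADING_HYPHEN.sub(f"{EM_DASH} ", line)
        fixed.append(line)
    return "\n".join(fixed)
```

Because the leading hyphen is replaced as well, rewrapping the paragraph later cannot turn the dash back into a list marker.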

Waylan

is there something I can do different to not trigger this behavior?

The following should avoid the list:

This worked for a few months. Unfortunately, occasionally a test would time
out and fail after 18 seconds - maybe 1 in 100 test runs would fail with a
timeout. Normally when the tests fail you get back some kind of error 
message - a Postgres constraint failure will print a useful message, or a 
Javascript exception will bubble up, or Javascript will complain about an
unhandled rejection.
mdickin
  • Right, you can put the hyphen so it doesn't begin the line, but that doesn't prevent the problem from cropping up in the future. – Kevin Burke May 26 '16 at 16:05
  • Agreed, which is why I specifically addressed an alternative to avoid the behavior. – mdickin May 26 '16 at 16:06
  • @mdickin The problem is that in a future edit the paragraph could be rewrapped and the problem reintroduced. This is not a solution, only a temporary workaround. In my answer I provide a solution in which future edits will not reintroduce the problem. – Waylan May 26 '16 at 17:31