1

I'm using man along with man2html to automatically generate some html documentation, along the lines of man manpage | man2html. This is working well, except when I run it on Travis CI, man is not generating the proper escape sequences to make headers and options bold. Is there a way to force man to generate these codes?

I also took a look at using groffer --mode=tty instead of man, which works on my Mac, but on Linux (i.e., Travis CI), instead of generating the binary ANSI codes that man2html can read, it generates plain-text codes, like [1m.

asmeurer

2 Answers

4

There is some missing information, but I will attempt to fill it in:

  • there is more than one program named man2html. I believe you are referring to the Perl script, which I also use. (I have made some improvements which you can find on my scripts page, but that does not alter the issue).
  • by comparison, there is another program (see manual page), which expects to format the manpage itself - unlike the Perl script.
  • a while back, one of the developers working with groff added a (mis)feature, changing the default behavior of nroff to produce escape sequences for colors. Those would be something like ^[[34m or ^[[1m, for color or bold text.
  • aside from that, everyone else's nroff produced not escape sequences but backspace sequences, using overstriking to simulate underlining or bold text (for example, _^HX for an underlined X and X^HX for a bold X).
  • not everyone likes the groff feature (see for instance this mailing list comment).
  • the groff feature can be overridden by setting the environment variable GROFF_NO_SGR, as noted in the manual page for grotty.
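
To see which of the two styles a given environment (a Travis CI build, say) actually emits, a small helper can classify a captured stream. This is only a sketch: detect_format is a hypothetical name, not part of groff or man2html, and it checks for nothing more than an ESC [ pair or a backspace byte.

```shell
# Hypothetical helper (not part of groff or man2html): classify formatter
# output read from stdin as "sgr", "overstrike", or "plain".
detect_format() (
    esc=$(printf '\033')   # ESC byte that starts a sequence such as ^[[1m
    bs=$(printf '\b')      # backspace byte used for overstriking
    input=$(cat)
    case $input in
        *"$esc["*) echo sgr ;;          # escape sequences present
        *"$bs"*)   echo overstrike ;;   # backspace sequences present
        *)         echo plain ;;        # no formatting bytes at all
    esac
)
```

Piping the formatter's output through it in the CI job, for instance `man manpage | detect_format`, shows which case applies before man2html gets involved.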

Beyond the problem with escape sequences versus backspace sequences, groff may generate UTF-8 if you are using a locale whose encoding is UTF-8. There are a few places where this is noticeable:

  • hyphenation
  • special characters, such as © (copyright)
  • tables

The man2html script does not know anything about multibyte encodings such as UTF-8, and will do unexpected things. As a workaround, overriding the locale settings to POSIX fixes the problem, by setting these environment variables:

LANG=C
LC_ALL=C
LC_CTYPE=C
LANGUAGE=C
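
If changing the locale for the whole build is undesirable, the same four variables can be scoped to a single command. A minimal sketch, where with_posix_locale is a hypothetical name chosen for illustration:

```shell
# Hypothetical wrapper: run one command with the locale forced to POSIX ("C"),
# leaving the calling shell's own locale settings untouched.
with_posix_locale() {
    LANG=C LC_ALL=C LC_CTYPE=C LANGUAGE=C "$@"
}
```

It could then wrap the formatter only, for example `with_posix_locale env GROFF_NO_SGR=1 man manpage | man2html`.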
Thomas Dickey
  • `GROFF_NO_SGR` seems to work well with `groffer --mode=tty`, except I end up with `\xe2\x80\x90` where there are hyphenations, which leads to issues with man2html. – asmeurer Apr 22 '15 at 16:23
  • I'm also curious what your improvements to man2html are. I didn't see a changelog or anything. – asmeurer Apr 22 '15 at 16:57
0

I set this environment variable in my code so that man keeps generating overstrike codes when it runs in a non-interactive shell:

export MAN_KEEP_FORMATTING=1

The answer came from the question "How to run man with formatting in not interactive shell?"
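
Applied to the pipeline from the question, this might look like the sketch below. It assumes the man on the build machine honors the variable (man-db documents it) and that man2html is on the PATH; man_to_html is a hypothetical name.

```shell
# Hypothetical wrapper around the pipeline from the question.
# MAN_KEEP_FORMATTING is set for this one man invocation only, so the
# formatting bytes survive even though stdout is a pipe, not a terminal.
man_to_html() {
    MAN_KEEP_FORMATTING=1 man "$@" | man2html
}
```

Setting the variable on the command itself rather than with export keeps it from leaking into later steps of the build script.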

jcubic