When I try to add a Latin-9 encoded file to a Fossil repository I get the error:

... contains invalid UTF-8. Use --no-warnings or the "encoding-glob" setting to disable this warning.

But from the documentation I think this only suppresses the warning and still does the wrong thing: the Latin-9 file gets imported as if it were a UTF-8 file.

How can I import a Latin-9 file as a Latin-9 file? How can I specify the encoding of a single file or of all files?

Pekka
ceving
  • I don't know Fossil, but from how they put things [here:](http://fossil-scm.org/xfer/help/setting): `ignore when issuing warnings about text files that may use another encoding than ASCII or UTF-8.` it seems like it *could* be UTF-8 only. But maybe it doesn't matter as long as you check files just in and out? Any browsing the source, diffing, etc. while the file is inside fossil would probably be broken but it seems reasonable to expect that the file you get out will be ok – Pekka Jan 14 '16 at 09:24
  • I changed the title to make the question more easily searchable. Feel free to roll back if you don't like it. – Pekka Jan 14 '16 at 09:55
  • I would recommend not using Latin-9 but [UTF8 everywhere](http://utf8everywhere.org/) in 2016 – Basile Starynkevitch Jan 14 '16 at 09:57
  • @BasileStarynkevitch Of course it is easier to use UTF-8 everywhere, but it means just to give up getting the encoding right. And in this case I have no option to change the encoding. – ceving Jan 14 '16 at 11:58

1 Answer

During a commit, Fossil warns you when a file contains data it did not expect (binary content, text that is not valid Unicode, and so on). It does not alter the contents of the file unless the prompt offers the c=convert option and you select it. If you select the convert option, it first converts the file and then asks you to actually commit it in a separate step.

When you suppress warnings with --no-warnings, it will not show the warning and assume that you want to commit the file (without converting it).

For a more permanent solution, the encoding-glob setting (which can be either local to the repository or set globally) can contain a pattern (such as *.txt) that denotes files that contain text in other formats (and for binary files, the binary-glob setting does that). When Fossil encounters non-Unicode content, it will then not raise the warning and assume that you want this; again, it will not convert the file, it just tells Fossil that you know what you are doing and that the non-Unicode content is intentional.
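To illustrate the byte-level picture behind the warning, here is a minimal Python sketch. It is not how Fossil itself is implemented; the sample string and the helper `is_valid_utf8` are hypothetical and exist only for this example. It shows that Latin-9 bytes fail a UTF-8 validity check, that the same bytes decode losslessly when read as Latin-9, and what a viewer that assumes UTF-8 would display.

```python
# Hypothetical sample text; the euro sign is one of the characters
# that Latin-9 (ISO-8859-15) adds over Latin-1.
SAMPLE_TEXT = "Preis: 5 \u20ac"
latin9_bytes = SAMPLE_TEXT.encode("iso8859_15")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# In Latin-9 the euro sign is the single byte 0xA4, which is not a
# valid UTF-8 sequence on its own, so the check fails.
print(is_valid_utf8(latin9_bytes))

# Bytes that are stored unchanged still decode correctly as Latin-9.
print(latin9_bytes.decode("iso8859_15") == SAMPLE_TEXT)

# A viewer that assumes UTF-8 sees a replacement character instead.
print(latin9_bytes.decode("utf-8", errors="replace"))
```

Since the stored bytes are never changed, a file checked out again is byte-for-byte the Latin-9 file that went in; only tools that assume UTF-8 when displaying it are affected.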

Reimer Behrends
  • So the essence is: I cannot define the encoding of a file. But as long as I do not care about the GUI, I can disable the warning. – ceving Jan 14 '16 at 12:00
  • Fossil is largely encoding-agnostic as far as the contents of the files are concerned and generally doesn't do anything special based on their encoding. The warnings are there as a "hey, was that intentional" reminder when something looks a bit funky. – Reimer Behrends Jan 14 '16 at 12:06
  • I have not tested it yet, but I have read that if the warning is ignored, the web GUI will deliver the Latin-9 file as UTF-8, which means that it will be broken in the browser. – ceving Jan 14 '16 at 13:06
  • What the UI does and how it generates its webpages is independent of the warning. Any problems in this area would result from Latin-9 text being stored in the repository and the UI later not understanding Latin-9. Fossil does not even have a place for storing a file's encoding (though it does store the mimetype for some things). – Reimer Behrends Jan 14 '16 at 14:37