In a shell/bash script, I need to convert some text to hexadecimal, so I pipe the source text into hexdump. So far so good. The problem is the characters æøå. They show up fine in the console (UTF-8), but the hexadecimal values hexdump gives for them aren't what I expect. I run `echo -en "Some text containing æøåÆØÅ" | hexdump -v -e '"xx" 1/1 "%02X"'`, then use sed to replace the xx with %. All standard Latin letters, punctuation, newlines, etc. come out right, just not the non-standard Latin letters.

So, how do I go about solving this? Is the input codepage the problem, or is there some limitation with hexdump? Thanks!

EDIT: By codepage, I mean character encoding. Not 100% sure it is the same thing. Bear with me please! :)

  • What do you get, and what did you expect? In utf-8 æøå is encoded as two octets each. – asjo Apr 21 '15 at 19:30
  • Yes, I get two hexadecimal values from hexdump for letters like æøåäÖ, etc. So it's the input codepage that is the problem then? I need e.g. **ø** to be **F8**, but hexdump translates it to **C3B8**. – Laurentius Magnus Apr 21 '15 at 19:50
  • If you don't want to encode utf-8 you shouldn't input it :-) It sounds like you want iso-8859-1 (or similar)? – asjo Apr 21 '15 at 19:58
  • OK, I changed the codepage in PuTTY to Win 1252 and now hexdump translates æøå correctly. So it's the input that is the problem then. The thing is, this script will take dynamic content, output from logs etc. So I need to translate the text from UTF-8 to a more "valid" codepage, prior to piping to hexdump. Any ideas? Thanks! – Laurentius Magnus Apr 21 '15 at 19:59
  • OK, I'll experiment some more with iconv -f utf-8 -t iso-8859-1 utf-8-text > "hexdump-dump-happy-encoding". Thanks! :) – Laurentius Magnus Apr 21 '15 at 20:42
  • OK, just to conclude, this is what I ended up doing `$(iconv -f utf-8 -t ISO-8859-4 <<< $(echo -n $MSG))`. Works now. – Laurentius Magnus Apr 21 '15 at 21:27
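
For anyone landing here later, the approach worked out in the comments (convert UTF-8 to a single-byte encoding first, then dump the bytes) can be sketched roughly as below. This is only a sketch under a few assumptions: the function name `to_latin1_hex` is made up for illustration, `od` is swapped in for `hexdump` because it is more widely installed, ISO-8859-1 is used as the target encoding, and the input is given as octal escapes so the example does not depend on the terminal's encoding.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name invented for this example):
# reads UTF-8 text on stdin and prints every byte as %XX
# after converting the text to ISO-8859-1.
to_latin1_hex() {
    iconv -f UTF-8 -t ISO-8859-1 |   # two-byte UTF-8 sequences become single bytes
        od -An -v -tx1 |             # dump each byte as two hex digits, no offsets
        tr -d ' \n' |                # strip the spacing od adds between bytes
        tr 'a-f' 'A-F' |             # upper-case the hex digits
        sed 's/../%&/g'              # prefix every two-digit pair with %
}

# '\303\270' is the UTF-8 byte sequence for ø (C3 B8).
printf '\303\270' | to_latin1_hex    # prints %F8
```

Note that the conversion step can only succeed for characters that exist in the target encoding, so dynamic input such as log output may contain characters that iconv cannot convert.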
