
I'm writing Parquet files with the Java API and syncing them to HDFS.

When I fetch a Parquet file and open it in vi, I see many strange symbols like:

^U^B^U^@^U^H^U^H^\^X^H^A^@^@^@^@^@^@^@^X^H^A^@^@^@^@^@^@^@^V^@^@^@^@^A^@^@^@^@^@^@^@^U^@^U

I'm wondering how to interpret these.

Rich Churcher
ulysses

1 Answer


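These are non-printable characters. Parquet is a binary format, and vi displays each control byte in caret notation: a `^` followed by a printable character.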

Examples:

  • ^@ represents the NUL character (ASCII value 0)
  • ^M represents a carriage return (ASCII value 13)
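
To make the mapping concrete, here is a minimal Java sketch of caret notation: a control byte with value below 32 is shown as `^` plus the character 64 positions higher, and 127 (DEL) is shown as `^?`. The class name `CaretNotation` and its methods are hypothetical names chosen for illustration, and printing bytes above 127 as `<hex>` is a simplifying assumption, since what vi shows for those depends on its encoding settings.

```java
class CaretNotation {
    // Render a single byte roughly the way vi displays it.
    static String toCaret(int b) {
        b &= 0xFF; // treat the byte as unsigned
        if (b < 0x20) {
            // Control characters: ^@ for 0, ^A for 1, ..., ^M for 13, ...
            return "^" + (char) (b + 0x40);
        }
        if (b == 0x7F) {
            return "^?"; // DEL
        }
        if (b < 0x7F) {
            return String.valueOf((char) b); // printable ASCII is shown as-is
        }
        // Assumption for this sketch only: show high bytes as <hex>.
        return String.format("<%02x>", b);
    }

    // Render a whole byte array.
    static String render(byte[] data) {
        StringBuilder sb = new StringBuilder();
        for (byte b : data) {
            sb.append(toCaret(b));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Bytes 0x15, 0x02, 0x15, 0x00, a carriage return, and the digit '0'.
        byte[] sample = {0x15, 0x02, 0x15, 0x00, 0x0D, '0'};
        System.out.println(render(sample)); // prints ^U^B^U^@^M0
    }
}
```

Note that the digit `0` (byte value 48) is printed unchanged, while the byte with value 0 becomes `^@`.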
Ronan Boiteau
  • Thank you. This opened up a whole new world for me. I used `od -c` to view the Parquet file and got the correct data. Another question: why does Parquet use this format? I mean, why not use `0` instead of `^@`? – ulysses Dec 04 '17 at 03:08
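  • `0` is a printable character (the digit, ASCII value 48). `^@` is different: it is how vi displays the `NUL` character, a byte whose value is 0. Parquet stores data in binary, so bytes with value 0 show up as `^@`. – Ronan Boiteau Dec 04 '17 at 09:39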