0

I am a data engineer who uses both Apache Kafka for streaming as well as Delta for storage on lakes. This is more of a question to feed my curiosity. I can see both the Delta transaction log files (which are of .json extension ) as well as the Kafka topic log files (which are of .log extension) are named in the same way with zeros left padded up to 20 digit length.

E.g.:

00000000000000000000.log,00000000000000000001.log --> Kafka topic log files

00000000000000000000.json,00000000000000000001.json --> Delta transaction log files

Why the standard is 20 digit length? Why can't it be 19 or 21?

OneCricketeer
  • 179,855
  • 19
  • 132
  • 245
akhil pathirippilly
  • 920
  • 1
  • 7
  • 25
  • 4
    It likely has to do with the fact that 20 decimal digits is what you need to express every possible 64-bit unsigned integer. And padding with leading zeros is convenient because it makes lexicographic order (as used by `ls` and such) match numerical order. – Nate Eldredge Dec 20 '22 at 22:52
  • @NateEldredge : That makes sense !! You fed my curiosity :) – akhil pathirippilly Dec 20 '22 at 23:05

0 Answers0