
Why do two text files with the same content have different file sizes? Is it because they were not created at the same time?

  • How can I download an extension for my text editor, or turn something on in its settings, to solve this? – Steven Chen Sep 28 '19 at 22:32
  • Many programmers' editors have a [hex mode](https://stackoverflow.com/questions/38905181/how-do-i-see-a-bin-file-in-a-hex-editor-in-visual-studio-code) to show bytes. (Alongside the bytes, they might show characters using some particular character encoding and display rules, such as `.` for control characters. Don't assume that the correct character encoding is being used, though, unless you can set it yourself.) – Tom Blodget Sep 29 '19 at 15:29
  • How does the ASCII text encoding relate to your question? Since the ASCII encoding won't in fact explain the number of bytes in a text file, unless that file is in fact encoded using ASCII, it's not possible to answer as the title suggests. Beyond that, there is the question of what you consider to be "the same content". Not all characters are printable (so content may appear to be identical when not), some characters can be represented in multiple ways (so content may be identical but with different encoding), and of course different encodings produce different bytes. – Peter Duniho Sep 29 '19 at 17:19
  • Your question is unclear, too broad. It *might* be answered already (see proposed duplicate), but if that doesn't address your question, you need to improve this question significantly. See also [ask] for important information on how to present your question in a clear, answerable way. – Peter Duniho Sep 29 '19 at 17:20

1 Answer


The two files probably use different character encodings. See https://en.wikipedia.org/wiki/Text_file#Encoding and https://knowledgebase.progress.com/articles/Article/000057930.

You can use a hex-dump program, such as xxd, to view the bytes in the text files. From the bytes, you'll be able to tell whether the two files use different character encodings.
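To illustrate why the encoding changes the size, here is a minimal Python sketch. The sample string and the `hex_dump` helper are my own assumptions for illustration, not something from the question. It encodes the same text three ways and prints the byte count and a hex view of each result, similar to what a hex-dump tool would show.

```python
# Hypothetical sample content; any text with a non-ASCII character will do.
text = "café"

def hex_dump(data: bytes) -> str:
    """Return the bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)

# Encode the same characters with three different encodings and
# record how many bytes each one produces.
sizes = {}
for encoding in ("utf-8", "utf-16-le", "latin-1"):
    data = text.encode(encoding)
    sizes[encoding] = len(data)
    print(f"{encoding:10} {len(data)} bytes: {hex_dump(data)}")
```

The characters are identical in all three cases, but the byte counts are 5, 8 and 4, so files saved with different encodings can differ in size even though they look the same in an editor.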

mti2935
  • And, a Unicode BOM is either optional, required or forbidden, depending on the shared understanding of which character encoding is being used. – Tom Blodget Sep 29 '19 at 15:20
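A BOM alone can account for a size difference of a few bytes. As a rough sketch (the `detect_bom` helper and the sample text are hypothetical, and only the BOM constants from Python's standard `codecs` module are used), you could check whether a file's bytes start with one:

```python
import codecs

# Known BOM byte sequences. The UTF-32 entries come first because the
# UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

def detect_bom(data: bytes):
    """Return the encoding name implied by a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

# Same text, saved as UTF-8 without and with a BOM.
plain = "hello".encode("utf-8")
with_bom = "hello".encode("utf-8-sig")

print(len(plain), detect_bom(plain))
print(len(with_bom), detect_bom(with_bom))
```

Here the BOM version is three bytes longer than the plain one. Note that a missing BOM does not tell you which encoding was used; it only rules out a marker at the start of the file.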