119

Maybe I'm blind, but I can't find in the S3 documentation the maximum length of a file name that can be uploaded to S3.

ohe

1 Answer

153

As follows from the Amazon documentation,

These names are the object keys. The name for a key is a sequence of Unicode characters whose UTF-8 encoding is at most 1024 bytes long.

So the maximum file name length is 1024 characters, provided every character is ASCII and therefore takes one byte in UTF-8. If characters in the name require more than one byte in UTF-8, the number of available characters is reduced accordingly.
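To make the byte-versus-character distinction concrete, here is a minimal Python sketch that checks a candidate key against the 1024-byte limit quoted above. The function names `key_length_in_bytes` and `fits_s3_key_limit` are made up for illustration and are not part of any AWS SDK; the check only measures the UTF-8 length and does not contact S3.

```python
# Limit taken from the documentation quote above: the UTF-8 encoding
# of an object key may be at most 1024 bytes long.
S3_MAX_KEY_BYTES = 1024


def key_length_in_bytes(key: str) -> int:
    """Return the size of the key once it is encoded as UTF-8."""
    return len(key.encode("utf-8"))


def fits_s3_key_limit(key: str) -> bool:
    """Hypothetical helper: True if the UTF-8 encoded key is within the limit."""
    return key_length_in_bytes(key) <= S3_MAX_KEY_BYTES


ascii_key = "a" * 1024        # ASCII: 1 byte per character, 1024 bytes
umlaut_key = "\u00f6" * 512   # "ö": 2 bytes per character, 1024 bytes
too_long = "\u00f6" * 513     # 513 characters, but 1026 bytes

print(fits_s3_key_limit(ascii_key))   # True
print(fits_s3_key_limit(umlaut_key))  # True
print(fits_s3_key_limit(too_long))    # False
```

Note that `len(key)` alone would report 513 for the last key and wrongly suggest it fits; the encoded byte length is what counts.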

BSMP
S3 Browser Team
  • 5
    If your language represents Unicode characters with 16 bits, this is `((1024 bytes * 8 bits/byte) / 16 bits/character) = 512 characters`. But how to know what they use? – Ben May 05 '14 at 01:39
  • 13
    @Ben Unicode != UTF-8. UTF-8 is a way of encoding Unicode into a sequence of bytes. For characters in the (7-bit) ASCII set, UTF-8 uses only 1 byte / 8 bits. For other characters it'll probably use 2 bytes, but sometimes 3 or 4. So for file names that use exclusively ASCII characters, the max filename length will be 1024 characters. – Josh Gallagher Jul 03 '14 at 09:55
  • 25
    At first I was like "1024 bytes of UTF-8-encoded text != 1024 characters", and then I was like "ah yes, but the max is still 1024 characters". Funny how sometimes you can be so eager to prove a stranger wrong on the internets (and how you almost never realize you're wrong before posting :D) – Romain Jan 28 '15 at 16:15
  • 2
    @Romain That sounds right at first glance. But it is not only about being LOGICALLY right; beyond that, the answer should be helpful. If it doesn't note that Unicode != UTF-8, a reader may misunderstand it and believe a key is fine as long as "keystring".length() <= 1024, without considering the encoding. – Steve Oct 20 '17 at 19:49
  • 8
    It's pretty simple. If your key consists only of the US alphabet (ASCII set), you will have 1024 characters. If I use only German umlauts like öäü, I will only have space for 512 characters, because each of those is encoded as 2 bytes in UTF-8. – Logemann Apr 24 '18 at 23:50