Would this assumption stay true for the foreseeable future? I know it depends on the hash algorithm, so just by knowing which hash algorithm was used (SHA-256), the IPFS CID would be reproducible for the foreseeable future, right? Or is there other information that needs to be stored as well?
No, it is not safe to assume that ipfs add <file> will always give the same CID, because there are many parameters besides the hash function that the binary is free to change over time. At a high level, ipfs add turns a file or directory into a tree structure called UnixFS that represents the data. Since the default way in which ipfs add builds that tree is allowed to change over time, the CID output by ipfs add example.txt can change too.
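To make the point concrete, here is a toy model of how the tree-building parameters feed into the final hash. It is not real UnixFS and does not produce a real CID; the function `toy_root_hash` and the sample data are made up for illustration. It only shows that the same bytes hashed with the same function (SHA-256) give a different root when the chunk size differs.

```python
import hashlib


def toy_root_hash(data: bytes, chunk_size: int) -> str:
    """Toy stand-in for a Merkle tree: hash each fixed-size chunk,
    then hash the concatenation of the chunk digests.
    This is NOT the UnixFS format; it only mimics the dependency on chunking."""
    leaf_digests = [
        hashlib.sha256(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    ]
    return hashlib.sha256(b"".join(leaf_digests)).hexdigest()


data = b"hello ipfs " * 1000  # sample content, 11,000 bytes

root_small_chunks = toy_root_hash(data, chunk_size=1024)
root_large_chunks = toy_root_hash(data, chunk_size=4096)

# Same bytes, same hash function, different chunk size: the roots differ.
print(root_small_chunks == root_large_chunks)

# Same bytes and same parameters: the root is reproducible.
print(toy_root_hash(data, chunk_size=1024) == root_small_chunks)
```

So the hash algorithm alone is not enough to reproduce a root hash; everything that shapes the tree has to be known as well.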
Many of the UnixFS parameters are configurable (they are described in ipfs add --help) and include options such as raw leaves and chunk size. This means that if you really want to ensure that ipfs add example.txt results in the same CID, there is a set of flags you can pass to pin those parameters explicitly.
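As a rough sketch of what passing those flags could look like, the snippet below builds an ipfs add command line with several parameters set explicitly instead of left to the defaults. The helper names `pinned_add_command` and `compute_cid` are hypothetical, and the flag values are only an example, not a recommendation. The list is probably not exhaustive either, so check ipfs add --help for the version you have installed and store the chosen values alongside the CID.

```python
import subprocess


def pinned_add_command(path: str, only_hash: bool = True) -> list[str]:
    """Build an `ipfs add` invocation with tree-shaping parameters set explicitly.
    The values are illustrative assumptions; any flag left out still uses a default."""
    cmd = [
        "ipfs", "add",
        "--quieter",              # print only the final root CID
        "--cid-version=1",        # CID version
        "--hash=sha2-256",        # hash function
        "--chunker=size-262144",  # fixed-size chunks of 256 KiB
        "--raw-leaves=true",      # store leaf blocks as raw bytes
    ]
    if only_hash:
        cmd.append("--only-hash")  # compute the CID without writing to the repo
    cmd.append(path)
    return cmd


def compute_cid(path: str) -> str:
    """Run the command and return the CID; requires the ipfs binary on PATH."""
    result = subprocess.run(
        pinned_add_command(path), capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


# Show the command line without running it.
print(" ".join(pinned_add_command("example.txt")))
```

Calling `compute_cid("example.txt")` on two machines with the same flags is a simple way to check that they agree before relying on the CID.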
Note that in general I'd try to avoid importing the same data into IPFS multiple times (it's a waste of resources anyway), although there may be some scenarios where that's just the easiest, or best, thing to do to get your project off the ground.