
I need to import billions of very small image files into Foundry. I understand Foundry won't handle that many files very well, so best practice would be to concatenate them and make the import incremental.

However, I need the raw files and their names so that I can process them in my pipeline. How can I get the original files back from the concatenated file?

Adil B
  • Would it be a good idea to edit the title to make clear that this question is about binary/image files, since for unconcatenated text files there are different (and better) schema-based solutions? – ollie299792458 Jan 13 '22 at 11:10
  • Added a question on this: https://stackoverflow.com/questions/70695797/can-i-get-the-file-names-for-synced-text-files-in-my-pipeline-in-foundry/70695798 – ollie299792458 Jan 13 '22 at 11:26

1 Answer


The original file names, sizes, and modified dates are included in the transaction metadata:

[Screenshot: transaction metadata listing the original file names, sizes, and modified dates]

Assuming you'll need data across multiple transactions, you will likely want to read this metadata and process the files as incrementally as possible.
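
If the names and sizes from the metadata are available as an ordered list, the splitting itself could look roughly like the sketch below. This is plain Python, not a Foundry API: the function name `split_concatenated` and the `manifest` list are hypothetical, and it assumes the files were concatenated back to back, with no separators, in the same order as the metadata lists them, with sizes in bytes.

```python
import io
from typing import BinaryIO, Iterable, Iterator, Tuple


def split_concatenated(
    stream: BinaryIO, entries: Iterable[Tuple[str, int]]
) -> Iterator[Tuple[str, bytes]]:
    """Yield (name, content) pairs by reading `size` bytes per entry.

    Assumption: `entries` is in the same order the files were
    concatenated, and there are no separators between files.
    """
    for name, size in entries:
        data = stream.read(size)
        if len(data) != size:
            # The manifest and the concatenated file disagree.
            raise ValueError(f"{name}: expected {size} bytes, got {len(data)}")
        yield name, data


# Hypothetical manifest built from the (name, size) pairs in the metadata.
manifest = [("cat.png", 3), ("dog.png", 2)]
combined = io.BytesIO(b"\x01\x02\x03\x04\x05")  # stand-in for the concatenated file

files = dict(split_concatenated(combined, manifest))
```

Reading from a stream rather than slicing one large bytes object keeps only one small file in memory at a time, which matters at this volume.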

amy.bananagrams