I have to design a system that processes raw data and envelopes it into X12 or EDIFACT, depending on the requirement. During processing, multiple intermediate documents are created, each similar to either the raw data or an EDI document. I need to store these documents in distributed file storage, and I would like to know which option is best suited to my use case.

An example of an EDI document:

0120TRANA 770034661 PREPARER'S AGENTE20080522010080014302AV901005 TZ # 0120TRANB 7700346616220 GREENWICH DR SAN DIEGO CA 92122 8585258010 # 0120ACK 5618383330100800143020001000000000000C0004 200805220090100500838801 1 NJ# 0120ACKR 561838333 01FORM 1040 00001000000100100504 # 0120****RECAP 000000000001010080014302000000000000000001000000000000000000000001000000 #
So far I have been exploring HDFS, AWS S3, GPFS, and Elasticsearch.
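
To make the access pattern concrete, here is a rough sketch of what I think I need from the store: write a document for a given job and processing stage, read it back, and list the stages saved for a job. Everything in it is an assumption for illustration: the class name `LocalDocumentStore`, the `job_id`/`stage` key layout, and the stage names are hypothetical, and a local directory stands in for whichever distributed backend ends up being chosen.

```python
from pathlib import Path
import tempfile


class LocalDocumentStore:
    """Hypothetical stand-in for the distributed store; uses a local directory."""

    def __init__(self, root):
        self.root = Path(root)

    def _path(self, job_id, stage):
        # Assumed key layout: <root>/<job_id>/<stage>.txt
        return self.root / job_id / f"{stage}.txt"

    def put(self, job_id, stage, content):
        """Write one intermediate document for a job and stage."""
        path = self._path(job_id, stage)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content, encoding="utf-8")

    def get(self, job_id, stage):
        """Read back a previously stored document."""
        return self._path(job_id, stage).read_text(encoding="utf-8")

    def stages(self, job_id):
        """List the stage names stored for a job, sorted alphabetically."""
        return sorted(p.stem for p in (self.root / job_id).glob("*.txt"))


# Example usage with placeholder content (not real EDI data).
store = LocalDocumentStore(tempfile.mkdtemp())
store.put("job-001", "raw", "raw input record")
store.put("job-001", "x12", "enveloped X12 document")
print(store.stages("job-001"))
```

The documents are small text files that are written once per stage and then read by the next stage, so whatever backend is chosen mainly has to handle many small objects keyed this way.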