
Need some help understanding how HDFS and Storm are integrated. Storm can process an incoming stream of data using many nodes. My data is, let's say, log entries from different machines. So how do I store all of that? Ideally I'd like to store the logs from one machine in one or more files dedicated to that machine. But how does that work? Will I be able to append to the same file in HDFS from many different Storm nodes?

PS: I'm still working on getting all of this running, so I can't test it physically... but it does bother me.

Matthias J. Sax
Schultz9999

1 Answer


Write a file in hdfs with Java

No, you cannot write to the same file from more than one task at a time. Each task would need to write to its own file in a directory, and then you could process them all using directory/* if you are using Hadoop.
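As a minimal sketch of that pattern with the plain Hadoop FileSystem Java API (the namenode URL, directory, and class name here are placeholders, not anything from the question): each task opens a file named after its own task id, so no two tasks ever write to the same HDFS file.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: one HDFS output file per Storm task, keyed by the task id.
public class PerTaskHdfsWriter {

    private final FileSystem fs;
    private final FSDataOutputStream out;

    public PerTaskHdfsWriter(int taskId) throws IOException {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", "hdfs://namenode:8020");   // placeholder namenode address
        fs = FileSystem.get(conf);
        // One file per task, e.g. /logs/machine-A/part-0007
        Path path = new Path(String.format("/logs/machine-A/part-%04d", taskId));
        out = fs.create(path, /* overwrite = */ false);
    }

    public void write(String logLine) throws IOException {
        out.writeBytes(logLine + "\n");
        out.hsync();   // flush so downstream readers can see the data
    }

    public void close() throws IOException {
        out.close();
        fs.close();
    }
}
```

A downstream job can then read the whole directory (e.g. /logs/machine-A/*) and treat the per-task files as one logical data set.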

bridiver
  • I see. So it's like normal file system access. OK. I can live with that I suppose. My workers can create files for every minute. The other process (say, Shark based) will work with 1 minute files then. – Schultz9999 Jul 01 '14 at 22:52
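A hedged sketch of that per-minute scheme using the storm-hdfs HdfsBolt (not mentioned in the answer, but it packages exactly this pattern: a separate file per task, rotated on a timer). The HDFS URL, output path, prefix, and delimiter below are assumptions for illustration.

```java
import org.apache.storm.hdfs.bolt.HdfsBolt;
import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
import org.apache.storm.hdfs.bolt.format.FileNameFormat;
import org.apache.storm.hdfs.bolt.format.RecordFormat;
import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.TimedRotationPolicy;
import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

// Sketch only: builds an HdfsBolt that writes tuples as delimited text and
// starts a new file every minute. Paths and URLs are placeholders.
public class LogHdfsBoltFactory {

    public static HdfsBolt build() {
        // Tuples are written as tab-delimited text lines.
        RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("\t");

        // Sync to HDFS every 1000 tuples so data becomes visible to readers.
        SyncPolicy syncPolicy = new CountSyncPolicy(1000);

        // Roll over to a new file every minute (the "1-minute files" idea above).
        FileRotationPolicy rotationPolicy =
                new TimedRotationPolicy(1.0f, TimedRotationPolicy.TimeUnit.MINUTES);

        // DefaultFileNameFormat embeds the component/task id and a timestamp in
        // each file name, so concurrent tasks never collide on the same file.
        FileNameFormat fileNameFormat = new DefaultFileNameFormat()
                .withPath("/logs/")
                .withPrefix("machine-logs");

        return new HdfsBolt()
                .withFsUrl("hdfs://namenode:8020")
                .withFileNameFormat(fileNameFormat)
                .withRecordFormat(format)
                .withRotationPolicy(rotationPolicy)
                .withSyncPolicy(syncPolicy);
    }
}
```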