I'm working with big data and trying to parallelize my processing functions. I can use several threads and process every user in a different thread (I have 200k users).
Every thread should append the first n lines of the file it produces to an output file shared between all the threads.
I wrote a Java program that executes head -n 256 thread_processed.txt >> output
(every thread does this).
I need the output file to be written atomically.
If thread A writes lines 0 to 9 and thread B writes lines 10 to 19, the output should be: [0 ... 9 10 ... 19].
Lines from different threads can't interleave; it can't be something like [0 1 2 17 18 3 4 ...].
How can I manage concurrent write access to the output file in a bash script?
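To make the requirement concrete, here is a minimal sketch of the kind of serialization being asked for. It assumes Linux with the util-linux flock command available. The function name append_head and the lock file path (the output path with .lock appended) are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append the first N lines of one per-thread file to a
# shared output file while holding an exclusive lock, so that blocks written
# by different callers are never interleaved.
# Assumes the util-linux `flock` command is installed.
append_head() {
    local src=$1 out=$2 n=${3:-256}
    (
        # Wait until this subshell holds an exclusive lock on descriptor 9.
        flock -x 9 || exit 1
        # Only one caller at a time reaches this append.
        head -n "$n" "$src" >> "$out"
    ) 9>>"$out.lock"   # the lock is released when the subshell closes fd 9
}
```

Each thread would then call append_head thread_processed.txt output 256 instead of running head ... >> output directly. The lock is advisory, so this only helps if every writer goes through the same lock file. The order in which the blocks land in the output is whatever order the callers acquire the lock.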