The general approach in analyzing data sets in a functional way is to partition the data set in some way that makes sense, for a document you might cut it up into sections based on size. i.e. four threads means the doc is sectioned into four pieces.
The thread or process then executes its algorithm on each section of the data set and generates an output. All the outputs are gathered together and then merged. For word counts, for example, a collection of word counts are sorted by the word, and then each list is stepped through using looking for the same words. If that word occurs in more than one list, the counts are summed. In the end, a new list with the sums of all the words is output.
This approach is commonly referred to as map/reduce. The step of converting a document into word counts is a "map" and the aggregation of the outputs is a "reduce".
In addition to the advantage of eliminating the overhead to prevent data conflicts, a functional approach enables the compiler to optimize to a faster approach. Not all languages and compilers do this, but because a compiler knows its variables are not going to be modified by an outside agent it can apply transforms to the code to increase its performance.
In addition, functional programming lets systems like Spark to dynamically create threads because the boundaries of change are clearly defined. That's why you can write a single function chain in Spark, and then just throw servers at it without having to change the code. Pure functional languages can do this in a general way making every application intrinsically multi-threaded.
One of the reasons functional programming is "hot" is because of this ability to enable multiprocessing transparently and safely.