I want to store some value in map task into local disk in each data node. For example,
public void map (...) {
//Process
List<Object> cache = new ArrayList<Object>();
//Add value to cache
//Serialize cache to local file in this data node
}
How can I store this cache object to local disk in each data node, because if I store this cache in map function like above, then the performance will be terrible because I/O task?
I mean is there any way to wait for map task in this data node run completely and then we will store this cache into local disk? Or does Hadoop have a function to solve this issue?