
I have been learning about Phoenix CSV Bulk Load recently, and I found that the source code of org.apache.phoenix.mapreduce.CsvToKeyValueReducer causes an OOM (Java heap out of memory) when a row has many columns (in my case, 44 columns per row with an average row size of 4 KB).

What's more, this class is similar to the HBase bulk load reducer class, KeyValueSortReducer, which means an OOM may also happen when using KeyValueSortReducer in my case.

So I have a question about KeyValueSortReducer: why does it need to sort all KeyValues in a TreeSet first and then write them all to the context? If I remove the TreeSet sorting code and write all KeyValues directly to the context, will the result be different or wrong?

I am looking forward to your reply. Best wishes to you!

Here is the source code of KeyValueSortReducer:

public class KeyValueSortReducer extends Reducer<ImmutableBytesWritable, KeyValue, ImmutableBytesWritable, KeyValue> {
  protected void reduce(ImmutableBytesWritable row, java.lang.Iterable<KeyValue> kvs,
      org.apache.hadoop.mapreduce.Reducer<ImmutableBytesWritable, KeyValue, ImmutableBytesWritable, KeyValue>.Context context)
  throws java.io.IOException, InterruptedException {
    TreeSet<KeyValue> map = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
    for (KeyValue kv: kvs) {
      try {
        map.add(kv.clone());
      } catch (CloneNotSupportedException e) {
        throw new java.io.IOException(e);
      }
    }
    context.setStatus("Read " + map.getClass());
    int index = 0;
    for (KeyValue kv: map) {
      context.write(row, kv);
      if (++index % 100 == 0) context.setStatus("Wrote " + index);
    }
  }
}
2 Answers


Please have a look at this case study: http://www.deerwalk.com/blog/bulk-importing-data/. There are requirements where the KeyValue pairs of the same row must be ordered in the HFile.
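
For a quick illustration, here is a minimal sketch (the class name, row key, and column names are made up) showing that KeyValue.COMPARATOR orders cells of the same row by column family and qualifier, which is the order an HFile expects. The HFile writer rejects cells that arrive out of this order, which is why they cannot simply be written as-is.

import java.util.TreeSet;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyValueOrderingDemo {
  public static void main(String[] args) {
    byte[] row = Bytes.toBytes("row-1");
    byte[] family = Bytes.toBytes("d");

    // Two cells of the same row, created with qualifiers in non-lexical order.
    KeyValue second = new KeyValue(row, family, Bytes.toBytes("b_col"), Bytes.toBytes("v2"));
    KeyValue first = new KeyValue(row, family, Bytes.toBytes("a_col"), Bytes.toBytes("v1"));

    // KeyValue.COMPARATOR restores the order the HFile requires: a_col before b_col.
    TreeSet<KeyValue> sorted = new TreeSet<KeyValue>(KeyValue.COMPARATOR);
    sorted.add(second);
    sorted.add(first);
    for (KeyValue kv : sorted) {
      System.out.println(Bytes.toString(kv.getQualifier())); // prints a_col, then b_col
    }
  }
}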

  • Thanks for your reply, but I'm still stuck on why it needs to sort all KeyValues in a TreeSet first and then write all of them to the context. I know what a TreeSet does. Is this an HBase internal requirement that all columns of one row be sorted? I don't know the internal mechanism of the HBase data store. – CrazyPig May 08 '16 at 06:24
  • Sorry for the misunderstanding. I've rewritten the answer with a link to a case study; maybe it will answer your question. – Ram Ghadiyaram May 08 '16 at 09:01
  • Thank you, RamPrasad G. With some research and your case study, I think I know why it needs to sort all KeyValues of one row in the reducer: the purpose is total ordering, because the HFile needs this sorting. The Phoenix CSV BulkLoad reducer OOM I referred to is the Phoenix issue PHOENIX-2649. I've found the issue and resolved the exception now. Thank you very much. – CrazyPig May 09 '16 at 07:04
  • Good to know. What was your fix for the OOM? – Ram Ghadiyaram May 09 '16 at 07:17
  • I found that the Phoenix team fixed the OOM issue in version 4.7. See https://issues.apache.org/jira/browse/PHOENIX-2649 for more. – CrazyPig May 09 '16 at 07:37
  • Phoenix 4.6 uses a reducer called CsvToKeyValueReducer to bulk load data into HBase. In my case, a row is about 4 KB (44 columns), so an OOM is unreasonable given a reducer JVM heap of 768 MB. The cause of the OOM is that the comparator of CsvTableRowkeyPair compares two CsvTableRowkeyPair objects incorrectly and makes all of them pass through one reducer in one reduce call, so it runs out of memory very quickly. – CrazyPig May 09 '16 at 08:03

1. The main question: why does HBase's KeyValueSortReducer need to sort all KeyValues?

Thanks to RamPrasad G's reply, we can look at the case study: http://www.deerwalk.com/blog/bulk-importing-data/

This case study tells us more about HBase bulk import and the reducer class KeyValueSortReducer. The reason for sorting all KeyValues in the KeyValueSortReducer reduce method is that the HFile needs this sorting. You can focus on this section:

A frequently occurring problem while reducing is lexical ordering. It happens when keyvalue list to be outputted from reducer is not sorted. One example is when qualifier names for a single row are not written in lexically increasing order. Another being when multiple rows are written in same reduce method and row id’s are not written in lexically increasing order. It happens because reducer output is never sorted. All sorting occurs on keyvalue outputted by mapper and before it enters reduce method. So, it tries to add keyvalue’s outputted from reduce method in incremental fashion assuming that it is presorted. So, before keyvalue’s are written into context, they must be added into sorting list like TreeSet or HashSet with KeyValue.COMPARATOR as comparator and then writing them in order specified by sorted list.

So, when a row has a lot of columns, the sort will use a lot of memory. As the Javadoc of KeyValueSortReducer mentions:

/**
 * Emits sorted KeyValues.
 * Reads in all KeyValues from passed Iterator, sorts them, then emits
 * KeyValues in sorted order.  If lots of columns per row, it will use lots of
 * memory sorting.
 * @see HFileOutputFormat
 */
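
To make the quoted point concrete, here is a minimal, hypothetical mapper sketch (the class name, column family, and qualifier names are made up; this is not Phoenix's actual mapper). MapReduce only sorts the output keys (the rows) between map and reduce, so the KeyValue values reach the reducer in no particular order, and the reducer has to sort them itself before writing.

import java.io.IOException;

import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Hypothetical CSV mapper: emits one KeyValue per column of a row.
public class CsvLineMapper
    extends Mapper<LongWritable, Text, ImmutableBytesWritable, KeyValue> {

  private static final byte[] FAMILY = Bytes.toBytes("d");

  @Override
  protected void map(LongWritable offset, Text line, Context context)
      throws IOException, InterruptedException {
    String[] fields = line.toString().split(",");
    byte[] row = Bytes.toBytes(fields[0]);
    ImmutableBytesWritable outKey = new ImmutableBytesWritable(row);

    // Nothing here guarantees the qualifiers are emitted in lexical order;
    // the shuffle sorts only by outKey (the row), not by the KeyValue values,
    // so KeyValueSortReducer must re-sort the cells of each row before
    // handing them to the HFile writer.
    for (int i = 1; i < fields.length; i++) {
      byte[] qualifier = Bytes.toBytes("col_" + i);
      context.write(outKey, new KeyValue(row, FAMILY, qualifier, Bytes.toBytes(fields[i])));
    }
  }
}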

2. The referenced question: why does the Phoenix CSV BulkLoad reducer cause an OOM?

The reason the Phoenix CSV BulkLoad reducer causes an OOM is the issue tracked in PHOENIX-2649. The comparator inside CsvTableRowKeyPair compares two CsvTableRowKeyPair objects incorrectly and makes all rows go to one single reducer in one single reduce call, so it caused an OOM quickly in my case.
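
For intuition only, here is a minimal, hypothetical sketch of comparing a (table name, row key) pair lexicographically. This is not the actual Phoenix class or the actual fix (see PHOENIX-2649 for that), and the class name is made up; it only shows the shape of the comparison that has to be right, because if distinct rows ever compare as equal they are all grouped into one reduce call.

import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical stand-in for Phoenix's (table name, row key) composite key.
public class TableRowkeyPairSketch implements Comparable<TableRowkeyPairSketch> {
  private final byte[] tableName;
  private final byte[] rowKey;

  public TableRowkeyPairSketch(byte[] tableName, byte[] rowKey) {
    this.tableName = tableName;
    this.rowKey = rowKey;
  }

  @Override
  public int compareTo(TableRowkeyPairSketch other) {
    // Compare the full table name first, then the full row key, byte by byte.
    // If the comparison read the wrong offsets or lengths, distinct rows could
    // compare as equal and all land in one reduce call, which is the shape of
    // the OOM described above.
    int result = Bytes.compareTo(this.tableName, other.tableName);
    if (result != 0) {
      return result;
    }
    return Bytes.compareTo(this.rowKey, other.rowKey);
  }
}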

Fortunately, the Phoenix team fixed this issue in version 4.7. If your Phoenix version is below 4.7, please be aware of this and upgrade, or apply a patch to your version.

I hope this answer helps you!
