I wrote a MapReduce job to build a Solr index for my data. The indexing happens in the reducer, but it is very slow. The reducer code is listed below. Is there anything wrong with my program, or is there any way to speed up index generation?
private SolrClient solr;
private UpdateResponse response;
private SolrInputDocument document;

@Override
public void reduce(Text inputKey, Iterable<Text> values, Context context) throws IOException, InterruptedException {
    //process the values...
    document = new SolrInputDocument();
    document.addField("id", hid + "#" + refid);
    document.addField();
    .....
    response = solr.add(document);
    solr.commit();
}

public void setup(Context context) {
    if (solrServerMode.equals("Cloud")) {
        solr = new CloudSolrClient(solrServerPath);
        ((CloudSolrClient) solr).setDefaultCollection("gettingstarted");
    }
    else if (solrServerMode.equals("Local")) {
        solr = new HttpSolrClient(solrServerPath);
    }
}

@Override
public void cleanup(Context context) {
    solr.close();
}
Edit One:
There is one suspicious thing that may be causing the slowdown. As the picture shows, I have only updated 46,205 documents, but the index version number is very high.
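The high version number may be related to the reducer calling solr.commit() after every single document. One direction worth trying is to buffer documents and send them in batches, committing only once at the end. Below is a minimal sketch of that batching pattern in plain Java, not a tested Solr integration. BatchBuffer, its batch size, and the sink callback are hypothetical names introduced for illustration. In the reducer, the sink would presumably call solr.add(batch) (SolrClient accepts a collection of SolrInputDocument), with flush() followed by a single solr.commit() in cleanup().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/**
 * Hypothetical helper (not part of SolrJ or Hadoop): collects items and
 * passes them to a sink in fixed-size batches instead of one at a time.
 */
final class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink;
    private final List<T> buffer;
    private int flushCount = 0;

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
        this.buffer = new ArrayList<>(batchSize);
    }

    /** Buffers one item; sends the batch to the sink once it is full. */
    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Sends any buffered items to the sink. Call once more at the end (e.g. in cleanup). */
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        // Pass a copy so the sink can keep the list after the buffer is cleared.
        sink.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushCount++;
    }

    /** Number of batches handed to the sink so far. */
    int flushCount() {
        return flushCount;
    }
}
```

With this pattern, reduce() would only call add(document), so the number of round trips to Solr drops from one per document to one per batch. The batch size is a tuning assumption, and whether this explains the slowdown would need to be measured.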