
Follow these steps to reproduce the error:

1. Configure a large amount of data (around 4 GB, or more than 50 million records).
2. Provide a proper data-config.xml file for indexing the data from a remote database server.
3. While indexing the data into Solr from SQL Server 2010, unplug the network cable about
   halfway through, then check the status in Solr, e.g.
   localhost:8083/solr/core1/dataimport?command=status
   or
   localhost:8083/solr/core1/dataimport
4. Wait a few seconds, then plug the cable back in.
5. You can clearly see that only the "Time Elapsed" value keeps increasing;
   "Total Rows Fetched" and "Total Documents Processed" remain the same indefinitely.
6. You can reproduce this with small data as well.
7. The workaround is to restart Solr (but this is not a good solution).
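The stalled state described in step 5 can be detected programmatically by polling the status endpoint from step 3 and comparing consecutive responses. A minimal sketch (the host, port, and core name come from the steps above; the stall-detection logic itself is my own assumption, not anything built into Solr):

```python
import json
import urllib.request

# Status endpoint from step 3, with wt=json for machine-readable output.
STATUS_URL = "http://localhost:8083/solr/core1/dataimport?command=status&wt=json"

def fetch_status(url=STATUS_URL):
    """Fetch the DIH status response and return its statusMessages section."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp).get("statusMessages", {})

def is_stalled(prev, curr):
    """An import looks stalled when 'Time Elapsed' keeps advancing but the
    row/document counters do not -- the exact symptom described in step 5."""
    return (
        curr.get("Time Elapsed") != prev.get("Time Elapsed")
        and curr.get("Total Rows Fetched") == prev.get("Total Rows Fetched")
        and curr.get("Total Documents Processed")
            == prev.get("Total Documents Processed")
    )
```

Polling `fetch_status` on an interval and feeding pairs of snapshots to `is_stalled` would flag the hung import without watching the admin page by hand.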

Note: This is a very important issue, because many organizations avoid this valuable
product just because of this infinite database-connection problem. A solution could be to forcefully abort the data indexing, or to provide a mechanism for forcefully aborting it. Hopefully you know that the abort command is not working either.
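Since the root cause is a database connection that hangs forever after the network drops, one possible mitigation (my assumption, not something confirmed in this thread) is to set a socket timeout on the JDBC connection in data-config.xml, so a dead connection fails with an error instead of blocking indefinitely. The Microsoft JDBC driver accepts a `socketTimeout` connection property in milliseconds; the host, port, database, and credentials below are placeholders:

```xml
<dataConfig>
  <!-- socketTimeout (ms) makes a read on a dead connection fail
       instead of blocking forever; connection details are placeholders. -->
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://dbhost:1433;databaseName=mydb;socketTimeout=60000"
              user="solr"
              password="secret"/>
  <!-- document/entity definitions unchanged -->
</dataConfig>
```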

Sanket Thakkar

2 Answers


From the Solr documentation (http://wiki.apache.org/solr/DataImportHandler):

Abort an ongoing operation by hitting the URL http://<host>:<port>/solr/dataimport?command=abort .

I just checked the DIH source code, and the abort command is implemented.
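In line with that documentation, triggering the abort is just an HTTP GET on the dataimport handler. A small sketch (the core name and port are assumptions matching the question, and the helper names are mine):

```python
import urllib.request

def dih_command_url(base, command):
    """Build a DataImportHandler command URL, e.g. command='abort'
    per the Solr wiki quoted above."""
    return f"{base}/dataimport?command={command}&wt=json"

def abort_import(base="http://localhost:8083/solr/core1"):
    """Issue the abort command and return the raw response body."""
    with urllib.request.urlopen(dih_command_url(base, "abort")) as resp:
        return resp.read()
```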

Greg S

Good question!

You can get the URL from the network tab in Chrome.

  1. Go to Dataimport and select the Auto Refresh Status checkbox.

  2. Open the network tab in dev tools; you should see a status request.
     (screenshot of the network tab)

  3. Copy one of the URLs and replace status with abort.

from

https://solr.yourdomain.com/solr/<collectionname>/dataimport?_=1685514143962&command=status&indent=on&wt=json

to

https://solr.yourdomain.com/solr/<collectionname>/dataimport?_=1685514143962&command=abort&indent=on&wt=json
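The status-to-abort swap in a copied URL can also be done robustly in a few lines, editing only the `command` query parameter and leaving the rest of the URL (cache-buster, `indent`, `wt`) untouched. A sketch; the domain and collection name are placeholders as in the answer above:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def swap_command(url, new_command="abort"):
    """Replace the 'command' query parameter in a copied DIH status URL."""
    parts = urlsplit(url)
    query = [(k, new_command if k == "command" else v)
             for k, v in parse_qsl(parts.query)]
    return urlunsplit(parts._replace(query=urlencode(query)))
```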

mikoop