
I have a Solr 4 index and I want to delete all of its documents.

Attempt #1:

http://www.domain.com:8080/solr/collection1/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E

http://www.domain.com:8080/solr/collection1/update?stream.body=%3Ccommit/%3E

Result #1:

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
</lst>
</response>

Under Solr Admin > collection1, I still see Num Docs: 829060! I suppose this means the delete query did not work.

I also see results when going to

http://www.domain.com:8080/solr/collection1/select?q=*%3A*&wt=xml

Attempt #2 Using Solarium PHP library

    // Create a client instance
    $config = array(
        'endpoint' => array(
            'localhost' => array(
                'host' => '127.0.0.1',
                'port' => 8080,
                'path' => '/solr/',
            )
        )
    );
    $client = new Solarium\Client($config);

    // get an update query instance
    $update = $client->createUpdate();

    // add the delete query and a commit command to the update query
    $update->addDeleteQuery('*:*');
    $update->addCommit();

    // this executes the query and returns the result
    $result = $client->update($update);

    echo '<b>Update query executed</b><br/>';
    echo 'Query status: ' . $result->getStatus(). '<br/>';
    echo 'Query time: ' . $result->getQueryTime();

Output #2:

Update query executed
Query status: 0
Query time: 3

I still see Num Docs: 829060! This did not work either.

Any ideas how to solve the problem?


UPDATE

I manually deleted the index folder /collection1/data and did a DIH full-import, but I still can't delete the documents in the new index. Any suggestions?

solrconfig.xml

<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <!-- See below for information on defining 
       updateRequestProcessorChains that can be used by name 
       on each Update Request
    -->
  <!--
     <lst name="defaults">
       <str name="update.chain">dedupe</str>
     </lst>
     -->
</requestHandler>
Nyxynyx

3 Answers


Can you try querying Solr from the command line? For example:

curl "http://domain.com:8080/solr/collection1/update?commit=true" -H "Content-Type: text/xml" --data-binary '<delete><query>*:*</query></delete>'

After running this query you should see something like

INFO: [phisch-dev] webapp=/solr path=/update params={wt=javabin&version=2} {deleteByQuery=*:* (-1428803632004857856)} 0 126

in solr logs (e.g. /var/log/tomcat7/catalina.2013-03-07.log).

I am using POST here, just to be sure GET/stream.body does not encode things in odd ways. I added the commit attribute so the delete query gets committed automatically.
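To confirm whether the delete actually took effect, the POST above can be paired with a count query. The following is only a minimal sketch: the `SOLR_URL` default and the function names `delete_all`, `num_found`, and `count_docs` are illustrative assumptions, not part of Solr, and it assumes the XML response format (`wt=xml`), where the match count appears as a `numFound` attribute.

```shell
#!/bin/sh
# Hypothetical base URL; adjust host, port, and core name to your setup.
SOLR_URL="${SOLR_URL:-http://localhost:8080/solr/collection1}"

# POST a delete-by-query for every document and commit in the same request.
delete_all() {
    curl -s "$SOLR_URL/update?commit=true" \
        -H "Content-Type: text/xml" \
        --data-binary '<delete><query>*:*</query></delete>'
}

# Read a Solr XML select response on stdin and print its numFound value.
num_found() {
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p'
}

# Ask Solr how many documents match *:* without fetching any rows.
count_docs() {
    curl -s "$SOLR_URL/select?q=*:*&rows=0&wt=xml" | num_found
}
```

After sourcing these functions, running `delete_all` followed by `count_docs` should print 0 if the delete and commit went through. A non-zero count despite a status of 0 would point to the delete being silently ignored rather than to an encoding problem.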

Also, did you make any changes to your RequestHandler? Does it overwrite defaults or anything like that? Check your solrconfig.xml and search for /update. Mine contains (which is what ships with solr):

<requestHandler name="/update" class="solr.UpdateRequestHandler">
</requestHandler>

There should be no <lst name="defaults">, <lst name="appends"> or <lst name="invariants"> inside it.

BTW, changes to the index are not visible until a new searcher is opened. What happens if you delete from the index and then restart Solr? Are the documents still there?

EDIT: This turns out to be a bug: https://issues.apache.org/jira/browse/SOLR-3432 Adding a _version_ field to the schema fixes it (thanks to Nyxynyx for pointing this out).
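A quick way to see whether a core is likely to be affected is to check that its schema declares the field. This is a rough sketch: `has_version_field` is a hypothetical helper name, and the schema path in the usage note is an assumption about your layout. For reference, the example schema that ships with Solr 4 declares the field as `<field name="_version_" type="long" indexed="true" stored="true"/>`.

```shell
#!/bin/sh
# Succeed (exit status 0) if the given schema file declares a field
# named _version_; fail otherwise.
has_version_field() {
    grep -q '<field[^>]*name="_version_"' "$1"
}
```

For example, `has_version_field /path/to/collection1/conf/schema.xml || echo "_version_ field is missing"`, with the path replaced by wherever your core keeps its schema.xml. After adding the field, Solr needs to reload the core (or restart) for the schema change to apply.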

phisch
  • Doing your suggested curl from shell and from my browser both gave the same response `The requested resource (/collection1/update) is not available.` How should I check for changes to the RequestHandler? – Nyxynyx Mar 07 '13 at 15:20
  • Sorry, this should be /solr/collection1/update. (I have a dedicated solr instance, thus i forgot to mention the /solr part). It's fixed in the above answer – phisch Mar 07 '13 at 15:22
  • Thanks, I used the updated curl command, restarted Solr using Tomcat7, and a `*:*` query shows that all the documents remain undeleted. – Nyxynyx Mar 07 '13 at 15:29
  • `<lst name="defaults">` is commented out, there's nothing else in there. Updating original post with snippet... – Nyxynyx Mar 07 '13 at 15:31
  • Thanks for helping me along, it happens to be a bug in Solr... https://issues.apache.org/jira/browse/SOLR-3432 I edited my schema and now deleting works!! Will updating my version of solr help? – Nyxynyx Mar 07 '13 at 15:40
  • It's supposed to be fixed in 4.0/5.0. I am using 4.0.0 1394950, but I also have the _version_ number enabled. – phisch Mar 07 '13 at 15:42
  • I am using `4.0.0-BETA 1370099`. I guess mine is too old. – Nyxynyx Mar 07 '13 at 15:45
  • 4.1 is already available as a final release. Might be a good time to update. – phisch Mar 07 '13 at 15:46
  • Updated solr to `4.1.0 1434440` and DIH to 4.1.0. Everything's going great, thanks :) – Nyxynyx Mar 07 '13 at 15:59

In your first approach, after sending the delete request you still have to commit it:

http://www.domain.com:8080/solr/collection1/update?stream.body=%3Ccommit/%3E
nfechner
  • What query are you using to check? – nfechner Mar 07 '13 at 13:50
  • Is `collection1` your default collection? You sometimes specify it and sometimes not. Could be your delete and NumDocs query go to different collections. Also, sometimes you use domain.com and sometimes 127.0.0.1 – phisch Mar 07 '13 at 13:56
  • `collection1` is the default collection. Just realized that in attempt #2 I did not specify `collection1`, but that's the default collection so I guess it is fine. I also used `127.0.0.1` because the PHP script is executed on the same machine that hosts Solr. – Nyxynyx Mar 07 '13 at 13:58

Just run the command below in the browser.

http://localhost:8983/solr/update?stream.body=<delete><query>*:*</query></delete>&commit=true

bittu