
I am trying to schedule a delta-import with TikaEntityProcessor. The full import works fine, but the delta-import does not update anything, and there is no error either. The server log shows only the following, and I cannot figure out what went wrong:

121151 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Starting delta collection.
121155 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Running ModifiedRowKey() for Entity: message
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed ModifiedRowKey for Entity: message rows obtained : 0
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed DeletedRowKey for Entity: message rows obtained : 0
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed parentDeltaQuery for Entity: message
121156 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Running ModifiedRowKey() for Entity: messages
121157 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.JdbcDataSource  – Creating a connection for entity messages with URL: jdbc:oracle:thin:@//172.16.29.92:1521/d11gr21
121176 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.JdbcDataSource  – Time taken for getConnection(): 19
121182 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed ModifiedRowKey for Entity: messages rows obtained : 1
121182 [qtp966396367-15] INFO  org.apache.solr.handler.dataimport.DocBuilder  – Completed DeletedRowKey for Entity: messages rows obtained : 0

My dataconfig.xml is as follows:

 <document>

  <entity name="messages" pk="BLOB_PK" transformer='DateFormatTransformer'
    query="select * from BLOB_TEST"
    deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.id}'"
    deltaQuery="select BLOB_PK from BLOB_TEST where to_char(last_modified,'YYYY-MM-DD HH24:MI:SS') &gt; '${dataimporter.last_index_time}' "
    dataSource="db">
   <field column ="BLOB_PK" name ="id" />
   <field column="last_modified"  dateTimeFormat="YYYY-MM-DD HH24:MI:SS" locale="en"    />
     <entity 
         name="message" 
         dataSource="dastream"
          processor="TikaEntityProcessor"
         url="message"
         dataField="messages.MESSAGE"
         format="text">

        <field column="text" name="mxMsg" blob="true" />
        </entity>
     </entity>

</document>

When I run the delta-import manually from the web client, the status looks like this:


"statusMessages": {
  "Total Requests made to DataSource": "4",
  "Total Rows Fetched": "3",
  "Total Documents Skipped": "0",
  "Delta Dump started": "2013-12-16 14:48:28",
  "Identifying Delta": "2013-12-16 14:48:28",
  "Deltas Obtained": "2013-12-16 14:48:28",
  "Building documents": "2013-12-16 14:48:28",
  "Total Changed Documents": "3",
  "Total Documents Processed": "0",
  "Time taken": "0:0:0.50"
}

1 Answer


I was able to make it work. I had to remove the following from data-config.xml:

deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.id}'"

I had nothing configured for ${dataimporter.delta.id}, so that is probably why nothing was indexed even though the correct number of changed rows was detected.
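
An alternative to dropping the attribute, which I have not tested against this schema, would be to keep deltaImportQuery but name the delta variable after the column that deltaQuery actually returns. As far as I understand DIH, the placeholder has the form ${dataimporter.delta.<column>}, where <column> is a column selected by deltaQuery. Here that column is BLOB_PK, not id (id is only the Solr field name). The upper-case spelling is an assumption based on how Oracle normally reports column names. A minimal sketch of the outer entity:

```xml
<!-- Sketch only: same outer entity as in the question. The only change is the
     delta variable, which now uses the column returned by deltaQuery (BLOB_PK).
     Assumption: the column name is reported in upper case by the JDBC driver. -->
<entity name="messages" pk="BLOB_PK" transformer="DateFormatTransformer"
    query="select * from BLOB_TEST"
    deltaQuery="select BLOB_PK from BLOB_TEST where to_char(last_modified,'YYYY-MM-DD HH24:MI:SS') &gt; '${dataimporter.last_index_time}'"
    deltaImportQuery="select * from BLOB_TEST where BLOB_PK='${dataimporter.delta.BLOB_PK}'"
    dataSource="db">
  <field column="BLOB_PK" name="id" />
  <!-- nested TikaEntityProcessor entity "message" stays as in the question -->
</entity>
```

If the variable name does not match a column from deltaQuery, it presumably resolves to an empty string, so the import query matches no rows. That would be consistent with "Total Changed Documents" being 3 while "Total Documents Processed" stayed at 0.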