1

I am starting to learn Solr (using version 5.5.0). I am using managed-schema and data-congif.xml files to inex two sql server tables: Company & Contact.

I am able to execute from the UI, the data import, selecting one entity at a time.

This is the message I get for Company:

Indexing completed. Added/Updated: 8,293 documents. Deleted 0 documents. (Duration: 01s) Requests: 1 (1/s), Fetched: 8,293 (8,293/s), Skipped: 0, Processed: 8,293 (8,293/s) Started: less than a minute ago

This is the message I get for Contact:

Indexing completed. Added/Updated: 81 documents. Deleted 0 documents. Requests: 1, Fetched: 81, Skipped: 0, Processed: 81 Started: less than a minute ago

When I click the Query section, I want to perform a query to see all the Contact, and/ or Company records, not necessarily combined, but just be able to query them.

I am not sure how to do this, is it possible to get some help to understand how to specify against which entity I want to execute the query?

Here are the 2 files I modified:

data-cofig.xml:

<dataConfig>
  <dataSource type="JdbcDataSource" 
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" 
              url="jdbc:sqlserver://sql.server.com\test;databaseName=test"
              user="testusr" 
              password="testpwd"/>
  <document>
    <entity name="Company" pk="CompanyID" query="SELECT * FROM tblCompany">
       <field column="CompanyID" name="company_companyid"/>
       <field column="Name" name="company_name"/>
        <field column="Website" name="company_website"/>
        <field column="Description" name="company_description"/>
        <field column="NumberOfEmployees" name="company_numberofemployees"/>
        <field column="AnnualRevenue" name="company_annualrevenue"/>
        <field column="YearFounded" name="company_yearfounded"/>
    </entity>

    <entity name="Contact" pk="ContactID" query="SELECT * FROM tblContact">     
      <field column="ContactID" name="contact_contactid"/>
      <field column="FirstName" name="contact_firstname"/>
      <field column="MiddleInitial" name="contact_middleinitial"/>
      <field column="LastName" name="contact_lastname"/>
      <field column="Email" name="contact_email"/>
      <field column="Description" name="contact_description"/>
    </entity>
  </document>
</dataConfig>

managed-schema:

  <!-- Company Begin -->
  <field name="company_companyid" type="string" indexed="true"/>
  <field name="company_name" type="string" indexed="true"/>  
  <field name="company_website" type="string" indexed="true"/>  
  <field name="company_description" type="string" indexed="true"/>
  <field name="company_numberofemployees" type="string" indexed="true"/>
  <field name="company_annualrevenue" type="string" indexed="true"/>
  <field name="company_yearfounded" type="string" indexed="true"/>
  <!-- Company End -->

  <!-- Contact Begin -->
  <field name="contact_contactid" type="string" indexed="true" />
  <field name="contact_firstname" type="string" indexed="true"/>
  <field name="contact_middleinitial" type="string" indexed="true"/>
  <field name="contact_lastname" type="string" indexed="true"/>
  <field name="contact_email" type="string" indexed="true"/>
  <!-- Contact End -->

UPDATE

I tried using the fl field to select company_companyid, but I did not get any results.

I am including a screen shot:

enter image description here

erasmo carlos
  • 664
  • 5
  • 16
  • 37
  • Just to make sure I understand, you want to be able to execute a query and get, for example, only fields related to the Contact entity? – TMBT Mar 24 '16 at 14:27
  • I want to be able to execute a query and get fields for company, or contact, not combined, just as needed. Either a query against company fields, or a query for contact fields. I hope this makes sense. Thank you. – erasmo carlos Mar 24 '16 at 15:10

1 Answers1

0

To get fields as needed from a document, use fl. For example, if you were using SolrJ, you would have something like query.set("fl", "fieldA, fieldB").

In a URL, it looks like this: http://host:port/solr/coreName/select?q=*%3A*&fl=fieldA,fieldB&wt=json&indent=true

TMBT
  • 1,183
  • 10
  • 17
  • I am going to give that a try and let you know. I know I am using Solr version 5.5.0, not sure how to find out if this is also SolrJ – erasmo carlos Mar 24 '16 at 21:48
  • I tried your recommendation, but I did not get any results. I entered under the fl box: company_companyid - which you can see is included under the company entity in the data-config.xml file – erasmo carlos Mar 24 '16 at 22:12
  • Try adding stored=true to all of your schema fields. A stored field will be returned in search. You'll need to perform another full-import for the changes to take effect. – TMBT Mar 25 '16 at 14:02
  • I added stored=true to each, company and contact and then performed another full-import. When I do the full-import, under Entity, I do not select Company, nor Contact, I just click execute and it processes the correct amount of records between Company and Contact. Then I tried a new query entering first the parameter: company_name, and it returns all company name, that is good. But when I enter: contact_firstname, still does not return anything. Why do you think this is happening if the contact records are also being processed? – erasmo carlos Mar 25 '16 at 17:51
  • I didn't notice this before, but you have two entities in your config file. From my own tests and from what I know about the dih, Solr probably looks at your first entity as the root entity, imports accordingly, then doesn't necessarily know what to do with the second entity. You'll need to either nest the entities or use a join to bring your columns together in one statement. The join is preferable as using nested entities degrades performance during imports. – TMBT Mar 25 '16 at 18:07
  • How would I do this join if the tables are not related? – erasmo carlos Mar 25 '16 at 18:16
  • I'm not really sure you can if there are no shared keys between the tables. A cross join will join your tables, but you will have duplicated data across documents and poor performance during imports. It sounds like either the tables themselves are not well-designed, or you're trying to fit two unrelated concepts (Company and Contact) into your Solr index. I say unrelated because related concepts would have some kind of shared key (like a company_id or something). What are you trying to do with Solr? – TMBT Mar 25 '16 at 18:37
  • I am trying to so something like this person did on this post: http://stackoverflow.com/questions/22534121/how-to-index-and-search-two-different-tables-which-are-in-same-datasource-using – erasmo carlos Mar 25 '16 at 18:39
  • I mean your uses cases. For example, searching all contacts with a company or something like that. Depending on your needs, you may need to redesign your tables to be related, use multi-core (one for each table) and perform joins when querying, or perhaps reconsider your use cases entirely. – TMBT Mar 25 '16 at 18:44
  • I got it. I added "common fields" on each entity and use the template transformer property. Then in the admin interface under fq I do: commonfield: Company, or commonfield:Contact, and that gives me back records for either or entity. – erasmo carlos Mar 25 '16 at 18:54