
In Elasticsearch 7.7, support for multiple _types in an index has been removed. To query across multiple indexes, we currently do the following:

/index1,index2/_search?q=type:tweet 

In 7.7, what is the best way to query multiple indexes using the Transport Java API?

Edited:
1) Say I have two indexes, "user" and "tweet", and I want to search both of them as described below.

I want to query the "user" index on the field {"username" = "Opster"} and the "tweet" index on the field {"data" = "some_text"}.

Is this possible?

2) I understand that each index is a separate partition in Elasticsearch, but how does a search across indexes work internally?

Thanks,
Harry


1 Answer


I think the code below should help. Note that you can create the TransportClient instance as described in this link.

To execute the search using the Java API:

SearchResponse response = client.prepareSearch("index1", "index2")
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(QueryBuilders.termQuery("type", "tweet"))                 // Query
        .setFrom(0).setSize(60)                                             // Set whatever size you'd want
        .get();

Some of the below useful API links:

Note: Elastic recommends migrating to the Java REST Client, as mentioned in this link, and this guide should help you migrate from the Java API to the REST Client.
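If you would rather not depend on the Transport client at all, the same multi-index search can be reached over the HTTP endpoint from the question. The sketch below is only an illustration: it uses the JDK's own `java.net.http` classes (Java 11+), not Elastic's Java REST Client, and the class name `MultiIndexRestSearch`, the helper `buildSearchRequest`, and the address `http://localhost:9200` are assumptions of mine.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

public class MultiIndexRestSearch {

    // Builds GET <baseUrl>/<index1,index2,...>/_search?q=<query>.
    // Index names are joined with commas, as in the URL from the question.
    static HttpRequest buildSearchRequest(String baseUrl, String luceneQuery, String... indexes) {
        String target = String.join(",", indexes);
        // Percent-encode the query string so characters like ':' are safe in the URL.
        String q = URLEncoder.encode(luceneQuery, StandardCharsets.UTF_8);
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + target + "/_search?q=" + q))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Assumption: a node would be listening on the default HTTP port 9200.
        HttpRequest request =
                buildSearchRequest("http://localhost:9200", "type:tweet", "index1", "index2");
        // Only prints the request; nothing is sent, so no cluster is needed to run this.
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually execute the request against a running node, it could be passed to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`, which returns the raw JSON search response as a string.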

Updated Answer:

Assume I have two indexes:

  • user, which has a field username with the value Opster
  • tweet, which has a field data with the value some text

For the sake of simplicity, I have made both fields of type keyword.

What you are looking for is shown below.

In Elasticsearch's Query DSL:

POST /_search
{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              {
                "term": {
                  "_index": "user"
                }
              },
              {
                "term": {
                  "username": "Opster"
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must": [
              {
                "term": {
                  "_index": "tweet"
                }
              },
              {
                "term": {
                  "data": "some text"
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Java API:

import java.net.InetAddress;
import java.net.UnknownHostException;

import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.action.search.SearchType;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.index.query.BoolQueryBuilder;
import org.elasticsearch.index.query.QueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class QueryForMultipleIndexes {

    public static void main(String[] args) throws UnknownHostException {

        // On startup: connect to the node's transport port
        TransportClient client = new PreBuiltTransportClient(Settings.EMPTY)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("127.0.0.1"), 9300));

        // Matches documents in the "user" index whose username is "Opster"
        QueryBuilder firstQuery = new BoolQueryBuilder()
                .must(QueryBuilders.termQuery("_index", "user"))
                .must(QueryBuilders.termQuery("username", "Opster"));

        // Matches documents in the "tweet" index whose data is "some text"
        QueryBuilder secondQuery = new BoolQueryBuilder()
                .must(QueryBuilders.termQuery("_index", "tweet"))
                .must(QueryBuilders.termQuery("data", "some text"));

        // Outer should clause wrapping the two must clauses; a hit has to satisfy at least one of them
        QueryBuilder mainQuery = new BoolQueryBuilder()
                .minimumShouldMatch(1)
                .should(firstQuery)
                .should(secondQuery);

        SearchResponse response = client.prepareSearch("*")
                .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
                .setQuery(mainQuery)
                .setFrom(0).setSize(60)
                .get();

        // Prints the total number of matching documents
        System.out.println(response.getHits().getTotalHits());

        // On shutdown
        client.close();
    }
}

The console output should be:

2 hits
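To see why exactly two hits come back, the matching logic can be modelled in plain Java without a cluster. This is only a rough sketch of the semantics, not how Elasticsearch evaluates queries internally: the `Doc` class, the helper names, and the four sample documents are hypothetical stand-ins of mine.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ShouldOfMustsDemo {

    // Hypothetical stand-in for a stored document: its index name plus its fields.
    static final class Doc {
        final String index;
        final Map<String, String> fields;

        Doc(String index, Map<String, String> fields) {
            this.index = index;
            this.fields = fields;
        }
    }

    // Mirrors a bool "must" of two term queries: _index == index AND field == value.
    static Predicate<Doc> inIndexWithTerm(String index, String field, String value) {
        return doc -> doc.index.equals(index) && value.equals(doc.fields.get(field));
    }

    static long countMatches(List<Doc> docs) {
        Predicate<Doc> userClause = inIndexWithTerm("user", "username", "Opster");
        Predicate<Doc> tweetClause = inIndexWithTerm("tweet", "data", "some text");
        // "should" with minimum_should_match = 1: a document needs to satisfy only one clause.
        return docs.stream().filter(userClause.or(tweetClause)).count();
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
                new Doc("user", Map.of("username", "Opster")),
                new Doc("user", Map.of("username", "someone_else")),
                new Doc("tweet", Map.of("data", "some text")),
                new Doc("tweet", Map.of("data", "other text")));
        System.out.println(countMatches(docs)); // prints 2
    }
}
```

Each document is tested on its own, so the username condition only ever applies to documents in user and the data condition only to documents in tweet. Because every document belongs to exactly one index, requiring both clauses at once could never match a single document.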

Let me know if this helps!

Kamal Kunjapur
  • ES Ninja Kamal, thanks for the quick response, but I think I didn't explain the query properly. I have updated the question in the edit, please check. – Harry Jun 12 '20 at 15:47
  • @Harry I'm sure even that is possible. Please give me some time and I will update it. – Kamal Kunjapur Jun 12 '20 at 16:08
  • Thanks for the quick response again. Also, how does this search query work internally? – Harry Jun 12 '20 at 20:06
  • @Harry I've updated the `Java Code` as well. About downvote, fair enough & I'm not worried. Let me know if the above code helps you!! – Kamal Kunjapur Jun 12 '20 at 22:12
  • Thanks for the update, @Opster ED Ninja - Kamal. But this query will search all the tweets in the user index, right? – Harry Jun 13 '20 at 03:47
  • No, that is not the case. The queries work separately. This is not a join, if that is what you think. Both queries execute independently and give you the result. I thought you were aware of that. Sorry, I should have clarified what you were looking for. – Kamal Kunjapur Jun 13 '20 at 08:12
  • ED Ninja - Kamal, so you are saying the above query searches the username only in the user index and the tweets only in the tweet index, am I correct? – Harry Jun 13 '20 at 08:16
  • Thanks, I approved this. But why is this happening now: https://stackoverflow.com/questions/62352155/parent-join-in-elasticsearch-is-not-searching-as-expected – Harry Jun 13 '20 at 09:22
  • Let me check and get back to you on this. – Kamal Kunjapur Jun 13 '20 at 09:57
  • Hey Kamal, I have to do a similar query: search for field1: "abc" in index1 and field2: "xyz" in index2. The query looks pretty much like the one you have posted; it is a should of two musts. The problem is that I am not getting the right results. I am expecting an "AND" on the results of the two internal bool queries, but it is doing an "OR". Using minimum_should_match as 2 results in 0 hits. Help out if you can. Thanks! – Juhi Sharma Sep 09 '22 at 13:41