I am using Scala 2.12 with Elasticsearch 5.2.2. I only need to fetch/search documents based on some criteria. The search will return more than 10,000 documents (messages) in one go, so I cannot use a regular search. Each document is a complex JSON, which I can parse later. I need to fetch all such messages and store them in a single list of JSON (or anything similar). I am not very fluent in Scala. I can use elastic4s for the search, and I see it has scroll and scan options, but I could not find a full working example, so I am looking for some help.
I found some sample code like the one below, but I need more help to fetch everything and collect it as described above:
client.execute {
  search in "index" / "type" query <yourquery> scroll "1m"
}
client.execute {
  search scroll <id>
}
But how do I get the scroll id, and how do I proceed from there to get all the data?
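Independent of the client library, the general shape of a scroll loop can be sketched as below: run the initial search, then keep requesting the next page with the returned scroll id until a page comes back empty. This is only a sketch; `Page`, `ScrollAll`, `fetchAll`, `first`, and `next` are hypothetical names made up for illustration and are not part of the elastic4s API. The two function arguments stand in for the real `client.execute` calls.

```scala
import scala.annotation.tailrec

// Hypothetical stand-in for one page of a scroll response: the hits on this
// page (as raw JSON strings) plus the scroll id for the next request, if any.
final case class Page(hits: List[String], scrollId: Option[String])

object ScrollAll {
  // Collects every hit by following scroll ids until a page has no hits
  // or no scroll id.
  // `first` runs the initial search; `next` fetches the page for a scroll id.
  def fetchAll(first: () => Page, next: String => Page): List[String] = {
    @tailrec
    def loop(page: Page, acc: Vector[String]): Vector[String] =
      page.scrollId match {
        case Some(id) if page.hits.nonEmpty => loop(next(id), acc ++ page.hits)
        case _                              => acc ++ page.hits
      }
    loop(first(), Vector.empty).toList
  }
}
```

With a real client, `first` would wrap the initial search request with a scroll keep-alive and `next` would wrap the follow-up scroll request, each mapping the response into a `Page`.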
Update:
The Scala and ES versions are mentioned above.
I am using the following example:
SBT:
libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s-core" % "7.0.2"
libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s-http" % "5.5.10"
libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s-http-streams" % "6.5.1"
libraryDependencies += "org.elasticsearch" % "elasticsearch" % "5.6.0"
Code:
import com.sksamuel.elastic4s.ElasticsearchClientUri
import com.sksamuel.elastic4s.requests.common.RefreshPolicy
import com.sksamuel.elastic4s.http.{ElasticClient, ElasticProperties}
import com.sksamuel.elastic4s.http.Response
import com.sksamuel.elastic4s.http.search.SearchResponse
import com.sksamuel.elastic4s.HttpClient
import com.sksamuel.elastic4s.http.ElasticDsl._
val client = HttpClient(ElasticsearchClientUri("host", 9200))
val resp1 = client.execute {
  search("index")
    .matchQuery("key", "value")
    .scroll("1m")
    .limit(500)
}.await.result
val resp2 = client.execute {
  searchScroll(resp1.scrollId.get).keepAlive(1.minute)
}.await
I think I am not using the correct versions for elastic4s modules.
Issues:
import com.sksamuel.elastic4s.HttpClient: the HttpClient class is not recognized. I get the error "HttpClient not found" when I try to initialize the "client" variable.
Next, in resp2, "scrollId" is not recognized when I try to read it. How do I fetch the scrollId from resp1?
Basically, what is missing here?
Update 2:
I changed the version of the dependency below, following the example on GitHub (samples):
libraryDependencies += "com.sksamuel.elastic4s" %% "elastic4s-http" % "6.3.3"
Code:
val client = ElasticClient(ElasticProperties("http://host:9200"))
Now I am getting the following error:
Error:
Symbol 'type <none>.term.BuildableTermsQuery' is missing from the classpath.
[error] This symbol is required by 'method com.sksamuel.elastic4s.http.search.SearchHandlers.BuildableTermsNoOp'.
[error] Make sure that type BuildableTermsQuery is in your classpath and check for conflicting dependencies with `-Ylog-classpath`.
[error] A full rebuild may help if 'SearchHandlers.class' was compiled against an incompatible version of <none>.term.
[error] val client = ElasticClient(ElasticProperties("host:9200"))
[error] ^
[error] one error found
[error] (compile:compileIncremental) Compilation failed
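A "missing from the classpath" error like this one usually points to elastic4s modules of different versions being mixed on the classpath (here, for example, elastic4s-core 7.0.2 next to elastic4s-http 6.3.3). A minimal sketch of an aligned build, assuming every elastic4s module is pinned to the same version; `elastic4sVersion` is a hypothetical name, and 6.3.3 is used only because it is the version taken from the GitHub samples, not because it is confirmed to work against an ES 5.2.2 cluster:

```scala
// build.sbt (sketch): one shared version for every elastic4s module.
// Assumption: 6.3.3 is just the version from the samples; pick the release
// line that matches the Elasticsearch cluster actually in use.
val elastic4sVersion = "6.3.3"

libraryDependencies ++= Seq(
  "com.sksamuel.elastic4s" %% "elastic4s-core" % elastic4sVersion,
  "com.sksamuel.elastic4s" %% "elastic4s-http" % elastic4sVersion
)
```

Any additional elastic4s module would reuse the same `elastic4sVersion` value so that the modules cannot drift apart.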