2

So I know that in Mongo Shell, you use dot notation to get the field you want in any document.

How is dot notation achieved in MongoDB Scala. I'm confused as to how it works. Here is the code that fetches a document from a collection:

val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)

EDIT:

I'm trying to work on a mechanism to basically re-consume Kafka records at a point where the consumer was shutdown. To do this, I store my kafka records in an external database, and then try to fetch the most recent offset from there and start consuming from that point. Here is my Scala method that should do that:

def getLatestCommitOffsetFromDB(collectionName: String): Long = {

import com.mongodb.Block
import org.bson.Document

val printBlock = new Block[Document]() {
  override def apply(document: Document): Unit = {
    println(document.toJson)
  }
}

import com.mongodb.async.SingleResultCallback
val callbackWhenFinished = new SingleResultCallback[Void]() {
  override def onResult(result: Void, t: Throwable): Unit = {
    System.out.println("Latest offset fetched from database.")
  }
}

var obj: String = " "

try {

  val record = collection.find().projection(fields(include("offset"), excludeId())).limit(1)
  //TODO FIND A WAY TO GET THE VALUE AND STORE IT IN A VARIABLE

} catch {
  case e: RuntimeException =>
    logger.error(s"MongoDB Server Error : Unable to fetch data from collection : $collection")
    logger.error(e.printStackTrace().toString())
}

obj.toLong

}

The problem isn't that I can fetch documents from Mongo, more-so that I'm trying to access a particular field in Mongo. The Document has four fields in it: topic, partition, message, and offset. I want to get the "offset" field and store that in a variable, so I can use it as a restarting point to re-consume Kafka records.

where do I go from there?

POM.xml

<?xml version="1.0" encoding="UTF-8"?>

http://maven.apache.org/xsd/maven-4.0.0.xsd"> 4.0.0

<groupId>OffsetManagementPoC</groupId>
<artifactId>OffsetManagementPoC</artifactId>
<version>1.0-SNAPSHOT</version>

<dependencies>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-clients</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka_2.12</artifactId>
        <version>1.0.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.11.8</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-api</artifactId>
        <version>1.7.25</version>
    </dependency>
    <dependency>
        <groupId>org.apache.kafka</groupId>
        <artifactId>kafka-streams</artifactId>
        <version>0.10.0.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.apache.spark/spark-streaming-kafka-0-10 -->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.2.0</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-core -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-core</artifactId>
        <version>2.6.5</version>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-databind</artifactId>
        <version>2.6.5</version>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.fasterxml.jackson.core/jackson-annotations -->
    <dependency>
        <groupId>com.fasterxml.jackson.core</groupId>
        <artifactId>jackson-annotations</artifactId>
        <version>2.6.5</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>casbah_2.12</artifactId>
        <version>3.1.1</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>com.typesafe</groupId>
        <artifactId>config</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-driver_2.12</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-compiler</artifactId>
        <version>2.11.8</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongo-java-driver</artifactId>
        <version>3.4.2</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-driver_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>bson</artifactId>
        <version>3.3.0</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb</groupId>
        <artifactId>mongodb-driver-async</artifactId>
        <version>3.4.3</version>
    </dependency>
    <dependency>
        <groupId>org.mongodb.scala</groupId>
        <artifactId>mongo-scala-bson_2.11</artifactId>
        <version>2.1.0</version>
    </dependency>
</dependencies>

2 Answers2

3

You can modify your query this way:

import com.mongodb.MongoClient
import com.mongodb.client.MongoCollection
import com.mongodb.client.model.Projections

def getLatestCommitOffsetFromDB(
  databaseName: String,
  collectionName: String
): Long = {

  val mongoClient = new MongoClient("localhost", 27017);

  val collection =
    mongoClient.getDatabase(databaseName).getCollection(collectionName)

  val record = collection
    .find()
    .projection(
      Projections
        .fields(Projections.include("offset"), Projections.excludeId()))
    .first

  record.get("offset").asInstanceOf[Double].toLong
}

I think you were missing the com.mongodb.client.model.Projections imports in order to use fields, include and excludeId

I used first instead of limit(1) to make it easier to extract the result.

first returns a Document object on which you can call get to retrieve the value of the requested field.

But in fact, since you just want one record and one field, you can remove the projection!:

val record = collection.find().first
Xavier Guihot
  • 54,987
  • 21
  • 291
  • 190
  • It's not resolving record.get for some reason, and when I call "first" it requests for a SingleResultCallBack function. The function is easy; but it still doesn't resolve get. – Haisam Tarek Elkewidy Feb 28 '18 at 15:47
  • What's the reason why it's not resolving record.get (which exception)? What's your driver version? – Xavier Guihot Feb 28 '18 at 15:49
  • It's not an exception. I'm on IntelliJ and hover over get and it says "cannot resolve symbol get". I added my POM file to the post for reference. Maybe it will clarify things. – Haisam Tarek Elkewidy Feb 28 '18 at 15:54
  • Nevermind I figured it out. I was initially using MongoClients to create my MongoClient, when I switched to the one you used it fixed the problem. That's very strange though. – Haisam Tarek Elkewidy Feb 28 '18 at 15:58
  • okay, I will. I just have one more question. What if I want to sort in descending order to get the record with the maximum offset value. How would I include that? It appears that sorting and projections don't quite mix.. – Haisam Tarek Elkewidy Feb 28 '18 at 16:44
  • You can use: `val record = collection.find().sort(Sorts.orderBy(Sorts.descending("offset"))).first` with this import: `import com.mongodb.client.model.Sorts` – Xavier Guihot Feb 28 '18 at 16:49
  • Just a note that Double can't be converted to Long, so just say "asInstanceOf[Long]", other than that the problem is solved. – Haisam Tarek Elkewidy Feb 28 '18 at 19:18
0

According to the documentation, collection.find() accepts a com.mongodb.DBObject

One of the implementations of that interface that you can use is BasicDBObject which is basically like a mutable.Map[String, Object]. You can use the constructor which accepts a map like:

val query = new com.mongodb.BasicDBObject(Map(
  "foo.bar" -> "value1"
  "bar.foo" -> "value2"
))
val record = collection.find(query)....
JoseM
  • 4,302
  • 2
  • 24
  • 35