Questions tagged [mongodb-hadoop]

29 questions
1 vote, 1 answer

mongo.input.query with $date not filtering input to hadoop

I have a sharded input collection that I want to filter on before sending it to my hadoop cluster for map reduce computations. I have this parameter in my $ hadoop jar - command mongo.input.query='{_id.uuid:"device-964693"}' and it works. The…
marko
1 vote, 1 answer

com.mongodb.hadoop.MongoOutputFormat not found when submitting a MapReduce job in Hadoop

I followed this tutorial http://www.mongodb.org/display/DOCS/Hadoop+Quick+Start to build mongodb-hadoop, and I tried to build the Treasury Yield example (my Hadoop version is 0.20.2), but I got the following error when I submitted the MapReduce job…
user1651520
0 votes, 1 answer

MongoDB Hadoop Pig script throws "Undefined Parameter :gte" exception

I am importing data from MongoDB into HDFS. I am currently using a Pig script to LOAD the data. I need to fetch data from MongoDB every 3 hours. For this I need to pass in the mongo.input.query parameter. However, I am getting the following exception…
Sid
0 votes, 3 answers

Using MongoDB Spark Connector to filter based on timestamp

I am using the Spark MongoDB connector to fetch data from MongoDB. However, I cannot work out how to query Mongo from Spark using an aggregation pipeline (rdd.withPipeline). Following is my code where I want to fetch records based on timestamp &…
Akki
0 votes, 1 answer

mongo-hadoop package upsert with spark doesn't seem to be working

I am attempting to use the MongoDB Connector for Hadoop with Spark to query one collection in MongoDB and upsert all of the documents retrieved into another collection. The MongoUpdateWritable class is used for the value of the RDD to update a…
0 votes, 1 answer

How to load an array of subdocuments from MongoDB into Hive

We are trying to use MongoDB data in Hive; each document has an array of subdocuments. How can I load this complex data into Hive? Here is the sample JSON: { "_id" : ObjectId("582c8cb9913e2f21e062aaa6"), "acct" : NumberLong(12345), "history"…
Maddy
0 votes, 1 answer

MongoHadoop Connector used with Spark duplicates results by number of partitions

I am trying to read data into Spark using the mongo-hadoop connector. The problem is that when I try to set a limit on the data read, I get in the RDD the limit * the number of…
user3452075
0 votes, 1 answer

Spark: Mongo-Hadoop how to query

I am trying to do a $near query with MongoDB using Spark and mongo-hadoop with lat/lon coordinates that change. How can I do a query with mongo-hadoop? Apart from something like: mongodbConfig.set("mongo.input.query", "{'field':'value'}") I cannot…
Randomize
0 votes, 0 answers

Processing MongoDB in AWS EMR with Python

I'm trying to do a map reduce using mrjob and Python against a MongoDB database. The mongodb-hadoop connector has examples on how to use AWS EMR but not with mrjob, and I'm not quite getting all the bits together. Here is what I have already as far…
Photonica
0 votes, 1 answer

hadoop mongodb connector build failed

I have installed hadoop 2.3 and the basic tests are passing with it. So, I believe that it is working. Now I want to install mongodb hadoop connector and I am following the official guide and when I issue this command until a certain point…
SRC
0 votes, 1 answer

MongoDB hadoop connector fails to query on mongo hive table

I am using the MongoDB Hadoop connector to query MongoDB through a Hive table in Hadoop. I am able to execute select * from mongoDBTestHiveTable; but when I try to execute the following query, select id from mongoDBTestHiveTable; it throws the following…
Chetan Shirke
0 votes, 1 answer

Pig: STORE with MongoInsertStorage doesn't work

I'm executing this simple code in a pig script: REGISTER /home/myuser/mongodb/mongo-2.10.1.jar REGISTER /opt/cloudera/parcels/CDH-4.5.0-1.cdh4.5.0.p0.30/lib/mongo-hadoop-cdh4-1.2.0/mongo-hadoop-core_cdh4.3.0-1.2.0.jar REGISTER…
0 votes, 1 answer

Mongodb-Hadoop Adaptor

To get started with the mongodb-hadoop adaptor I am referring to the manual. The current Hadoop version running on my system is 0.20.2, so I edited the build.sbt file to hadoopRelease in ThisBuild := "0.20.2". But when I try the next command of…
user2135730
0 votes, 2 answers

Using MongoDB data inside Hadoop with the help of Morphia

I've been playing with the MongoInputFormat that allows having all documents in a MongoDB collection put through a MapReduce job written in Hadoop. As you can see in the provided examples (this, this and this) the type the document is in that is…
Niels Basjes