
I'm using Amazon EMR, Hive 0.13, Hadoop 2.x, and Presto Server 0.89, and I'm trying to set up Presto to query data that is usually queried through Hive. The Hive metadata is stored in MySQL. Presto Server is installed and set up on all nodes. For the most part, everything is set up as documented on prestodb.io.

I first start the server on all nodes (coordinator and workers), and then start the CLI on the coordinator/name node. When I try to run a query using the below commands I get a "Query ... No worker nodes available" error:

presto-cli --server localhost:8080 --catalog jmx --schema default
presto:default> SELECT * FROM sys.node;
"Query ... No worker nodes available"

If I set node-scheduler.include-coordinator=true in the coordinator's config.properties file, the query returns one node.
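Since only the coordinator shows up, one thing worth ruling out is whether the workers can reach the discovery.uri host and port at all. The following is a minimal Python sketch of such a check, not part of Presto; the function name `can_connect` is made up for illustration, and it only tests that a TCP connection can be opened, not that the discovery service itself is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and attempts the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Run from a worker, `can_connect("aws.internal.ip.of.coordinator", 8080)` (with the placeholder replaced by the real address) should return True if the host in discovery.uri resolves and the port is open.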

Configs:

etc/config.properties (only on coordinator)

coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://aws.internal.ip.of.coordinator:8080 

etc/config.properties (only on workers)

coordinator=false
http-server.http.port=8080
task.max-memory=1GB
discovery.uri=http://aws.internal.ip.of.coordinator:8080  

etc/catalog/hive.properties (all nodes)

connector.name=hive-hadoop2
hive.metastore.uri=thrift://aws.internal.ip.of.coordinator:9083  

etc/catalog/jmx.properties (all nodes)

connector.name=jmx   

etc/jvm.config (all nodes)

-server
-Xmx16G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:ReservedCodeCacheSize=150M    

etc/log.properties

com.facebook.presto=INFO 

etc/node.properties

node.environment=production
# generated with uuidgen; unique on each node
node.id=unique-uuid
node.data-dir=/mnt/presto-data
Piotr Findeisen
DJElbow

1 Answer


A simple mistake on my part kept this from working: I had a stray semicolon instead of a period in the coordinator's IP address (the aws.internal.ip.of.coordinator value). Looking at my configs, I just didn't see it.
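A typo like this is easy to miss by eye, so a small script can catch it. Below is a rough Python sketch, assuming config.properties uses plain key=value lines; the helpers `parse_properties` and `check_uri` are hypothetical names written for this example and are not part of Presto. It checks that the host in a discovery.uri-style value is either a literal IP address or made up of valid hostname labels.

```python
import ipaddress
import re
from urllib.parse import urlsplit

# One DNS label: letters, digits, hyphens; no leading or trailing hyphen.
_LABEL = re.compile(r"^(?!-)[A-Za-z0-9-]{1,63}(?<!-)$")


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_uri(uri):
    """Return a list of problems found in a discovery.uri-style value."""
    try:
        parts = urlsplit(uri)
        host, port = parts.hostname, parts.port  # .port raises on non-numeric ports
    except ValueError as exc:
        return [f"cannot parse {uri!r}: {exc}"]
    problems = []
    if parts.scheme not in ("http", "https"):
        problems.append(f"unexpected scheme {parts.scheme!r}")
    if not host:
        problems.append("missing host")
    else:
        try:
            ipaddress.ip_address(host)  # accept a literal IP address
        except ValueError:
            # Otherwise require every dot-separated label to be a valid DNS label.
            bad = [label for label in host.split(".") if not _LABEL.match(label)]
            if bad:
                problems.append(f"invalid host {host!r} (bad part: {bad[0]!r})")
    if port is None:
        problems.append("missing port")
    return problems
```

To use it, read etc/config.properties on each node, pass the text to `parse_properties`, and call `check_uri` on the `discovery.uri` value; an empty list means no problem was detected. A value such as `http://10.0.0;12:8080` (an invented address, used only to illustrate the typo) is reported as having an invalid host.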

With that typo fixed, the configuration above works on an Amazon EMR multi-node cluster similar to the one described.

DJElbow