I'm using Amazon EMR with Hive 0.13, Hadoop 2.x, and Presto Server 0.89. I'm trying to set up Presto to query data that is usually queried through Hive; the Hive metadata is stored in MySQL. Presto Server is installed and set up on all nodes, and for the most part everything is configured as documented on prestodb.io.
I first start the server on all nodes (coordinator and workers), and then start the CLI on the coordinator/name node. When I try to run a query using the below commands I get a "Query ... No worker nodes available" error:
presto-cli --server localhost:8080 --catalog jmx --schema default
presto:default> SELECT * FROM sys.node;
"Query ... No worker nodes available"
If I include node-scheduler.include-coordinator=true in my coordinator config.properties file, 1 node is returned from this query.
Configs:
etc/config.properties (only on coordinator)
coordinator=true
node-scheduler.include-coordinator=false
http-server.http.port=8080
task.max-memory=1GB
discovery-server.enabled=true
discovery.uri=http://aws.internal.ip.of.coordinator:8080
etc/config.properties (only on workers)
coordinator=false
http-server.http.port=8080
task.max-memory=1GB
discovery.uri=http://aws.internal.ip.of.coordinator:8080
etc/catalog/hive.properties (all nodes)
connector.name=hive-hadoop2
hive.metastore.uri=thrift://aws.internal.ip.of.coordinator:9083
etc/catalog/jmx.properties (all nodes)
connector.name=jmx
etc/jvm.config (all nodes)
-server
-Xmx16G
-XX:+UseConcMarkSweepGC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:+HeapDumpOnOutOfMemoryError
-XX:OnOutOfMemoryError=kill -9 %p
-XX:ReservedCodeCacheSize=150M
etc/log.properties
com.facebook.presto=INFO
etc/node.properties
node.environment=production
node.id=unique-uuid #used uuidgen
node.data-dir=/mnt/presto-data
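
Since workers can only join the cluster if their node.properties are consistent, one thing worth ruling out is a copied node.id or a mismatched node.environment (the Presto deployment docs require node.id to be unique per node and node.environment to be identical across the cluster). Below is a minimal sketch of such a check in Python. It assumes the text of each node's etc/node.properties has already been collected into a dict; the function names parse_properties and check_nodes and the host labels are hypothetical, not part of Presto.

```python
def parse_properties(text):
    """Parse simple key=value lines into a dict.

    Simplification: everything after '#' is treated as a comment. Note that
    java.util.Properties only recognizes '#' at the start of a line, so a
    trailing annotation would actually become part of the value there.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def check_nodes(node_files):
    """Return a list of problems found across nodes.

    node_files maps a host label (hypothetical) to the text of that
    host's etc/node.properties.
    """
    problems = []
    parsed = {host: parse_properties(text) for host, text in node_files.items()}

    # Every node in one cluster should report the same environment name.
    environments = {p.get("node.environment") for p in parsed.values()}
    if len(environments) > 1:
        names = sorted(str(e) for e in environments)
        problems.append("node.environment differs across nodes: %s" % names)

    # Every node should have its own node.id.
    seen = {}
    for host in sorted(parsed):
        node_id = parsed[host].get("node.id")
        if node_id is None:
            problems.append("%s: node.id is missing" % host)
        elif node_id in seen:
            problems.append(
                "duplicate node.id %s on %s and %s" % (node_id, seen[node_id], host)
            )
        else:
            seen[node_id] = host
    return problems
```

An empty list from check_nodes means neither of these two problems was found; it does not rule out other causes such as blocked ports or an unreachable discovery.uri.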