While running a query on Dremio 4.6.1 installed on Kubernetes, we get the following error message in the Dremio UI:

ExecutionSetupException: One or more nodes lost connectivity during query. Identified nodes were [dremio-executor-2.dremio-cluster-pod.dremio.svc.cluster.local:0].

The dremio-env config has the following settings:

DREMIO_MAX_DIRECT_MEMORY_SIZE_MB=13384
DREMIO_MAX_HEAP_MEMORY_SIZE_MB is not set

The cluster is sized as follows:

  • 10 workers with 16G / 8c each
  • 1 master coordinator with the same config
  • ZooKeeper with 1G / 1c
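
For reference, here is a rough sketch of the memory budget these settings imply. It assumes that "16G" is the pod memory limit and means 16 GiB (16384 MiB). The function name `headroom_mb` is only illustrative and is not part of Dremio. Because the heap size is not set, it is left as a parameter here rather than guessed.

```python
# Assumption: "16G" is the pod memory limit, read as 16 GiB = 16384 MiB.
POD_LIMIT_MB = 16 * 1024
# DREMIO_MAX_DIRECT_MEMORY_SIZE_MB from dremio-env.
DIRECT_MB = 13384


def headroom_mb(pod_limit_mb: int, direct_mb: int, heap_mb: int = 0) -> int:
    """Return the memory (in MB) left in the pod after direct and heap memory."""
    return pod_limit_mb - direct_mb - heap_mb


# Heap is not set in our config, so only direct memory is subtracted here.
remaining = headroom_mb(POD_LIMIT_MB, DIRECT_MB)
print(remaining)  # 3000
```

Under these assumptions, roughly 3000 MB remain for the Java heap, thread stacks, and any other native allocations on each worker.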

Any idea what's causing this behavior?

Tailing the worker's logs live just before it crashes shows the following:

An irrecoverable stack overflow has occurred.
Please check if any of your loaded .so files has enabled executable stack (see man page execstack(8))
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f41cdac4fa8, pid=1, tid=0x00007f41dc2ed700
#
# JRE version: OpenJDK Runtime Environment (8.0_262-b10) (build 1.8.0_262-b10)
# Java VM: OpenJDK 64-Bit Server VM (25.262-b10 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  0x00007f41cdac4fa8
#
# Core dump written. Default location: /opt/dremio/core or core.1
#
# An error report file with more information is saved as:
# /tmp/hs_err_pid1.log
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

[error occurred during error reporting , id 0xb]

Abba
  • The dremio-executor-2 pod is not able to connect during execution; check the logs for this pod. Also, the logs you shared show no error; it started successfully. – Tarun Khosla Jul 31 '20 at 08:08
  • It's worth saying that the worker node restarts several times while trying to run this query. I will try to get the logs before it crashes @TarunKhosla – Abba Jul 31 '20 at 08:27
  • If there is a restart, there is a connectivity loss. Am I missing something? – Tarun Khosla Jul 31 '20 at 09:01
  • The pod restarts after dremio-worker produces the logs below – Abba Jul 31 '20 at 10:54
