I have built and deployed the following docker-compose.yml file:

services:
  solr1:
    container_name: solr1
    image: solr:5-slim
    ports:
      - "9981:9983"
      - "8981:8983"
    volumes:
      - data:/var/solr
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/schema.xml
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/conf/managed-schema
      - ./solr_configs:/opt/solr/server/solr/configsets/mri_config/conf
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
    networks:
      - solr
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  solr2:
    image: solr:5-slim
    container_name: solr2
    ports:
      - "9982:9983"
      - "8982:8983"
    volumes:
      - data:/var/solr
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/schema.xml
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/conf/managed-schema
      - ./solr_configs:/opt/solr/server/solr/configsets/mri_config/conf
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
    networks:
      - solr
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  solr3:
    image: solr:5-slim
    container_name: solr3
    ports:
      - "9983:9983"
      - "8983:8983"
    volumes:
      - data:/var/solr
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/schema.xml
      - ./solr_configs/schema.xml:/opt/solr/server/solr/configsets/mri_config/conf/managed-schema
      - ./solr_configs:/opt/solr/server/solr/configsets/mri_config/conf
    environment:
      - ZK_HOST=zoo1:2181,zoo2:2181,zoo3:2181
    networks:
      - solr
    depends_on:
      - zoo1
      - zoo2
      - zoo3

  zoo1:
    image: zookeeper:3.4
    container_name: zoo1
    restart: always
    hostname: zoo1
    ports:
      - 2181:2181
    environment:
      ZOO_MY_ID: 1
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    networks:
      - solr

  zoo2:
    image: zookeeper:3.4
    container_name: zoo2
    restart: always
    hostname: zoo2
    ports:
      - 2182:2181
    environment:
      ZOO_MY_ID: 2
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    networks:
      - solr

  zoo3:
    image: zookeeper:3.4
    container_name: zoo3
    restart: always
    hostname: zoo3
    ports:
      - 2183:2181
    environment:
      ZOO_MY_ID: 3
      ZOO_SERVERS: server.1=zoo1:2888:3888 server.2=zoo2:2888:3888 server.3=zoo3:2888:3888
    networks:
      - solr

  sdc:
    image: streamsets/datacollector
    ports:
      - 18630:18630
    volumes:
      - /local/directory/path/streamsets:/data:rw
    networks:
      - solr

networks:
  solr:

volumes:
  data:

I then created a Solr collection called collection1 by running:

docker exec solr1 solr create -c collection1

After transforming my data in StreamSets, I installed the Solr 6.1.0 destination stage library from the Package Manager and restarted the sdc Docker container. These are my settings for both the SolrCloud and Single Node modes in StreamSets:

SolrCloud Streamsets Configuration

Single Node Streamsets Configuration

Every time I run the preview or the pipeline, I get the following error:

SOLR_3 - Could not connect to the Solr instance: java.lang.ClassCastException: org.apache.solr.common.util.SimpleOrderedMap cannot be cast to java.util.Map

What does this error mean? What should I change or add to my pipeline to be able to connect the pipeline and pipe the data directly into the Solr collection?

Here is also a screenshot of the collection's state.json in the Solr Admin UI.

Solr Admin UI

SDC Stack Trace:

SDC Stack Trace

Any help is much appreciated.

metadaddy
statsguy
  • Could you edit your question to include the full stack trace from sdc.log? Thanks! – metadaddy Oct 30 '19 at 19:39
  • Also, why are you running a Solr 5 cluster (via your `docker` commands), but using the Solr 6 StreamSets stage? – Jeff Evans Oct 30 '19 at 21:06
  • @metadaddy added sdc stack trace, it looks to be connecting to zookeeper but it's loading an empty cluster property... Is that main issue here? – statsguy Oct 31 '19 at 15:21
  • It says it found live nodes immediately after that, so I'm not sure that's the problem. – metadaddy Oct 31 '19 at 16:38
  • Is the problem as simple as the module only works for Solr 6.1.0 and higher? – statsguy Oct 31 '19 at 16:57
  • That seems like the most likely explanation. Is there a particular reason you're using an older version? – metadaddy Nov 01 '19 at 16:23
  • Our current stack uses Solr 4.10.3, so 5.5 was the closest we could get on Docker to develop on. – statsguy Nov 01 '19 at 16:34
  • @metadaddy we got it working with 5.5-slim, had to install the CDH 5.15.0 module from the streamsets package manager and then specify the node in Single Node and that let us connect. – statsguy Nov 07 '19 at 16:14
  • @statsguy Could you paste resolution into the answer (below) so that it's clear for everyone else? Thanks! – metadaddy Nov 09 '19 at 01:35

1 Answer

I ended up installing the CDH 5.15.0 stage library from the Package Manager, which allowed us to connect to the Solr 5.5-slim instance running in our Docker container. On the General tab of the Solr destination, we selected the CDH 5.15.0 package in the Stage Library field and then filled in the Single Node connection details. With that, the pipeline connected to Solr.

Solr Module General Tab
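
For anyone hitting the same error, it may help to confirm which Solr version the server actually reports before picking a stage library. The sketch below is only an illustration: it reads `solr-spec-version` from Solr's `/admin/info/system` endpoint and compares it with a minimum version. The helper names are hypothetical, and the 6.1.0 minimum is an assumption taken from the comment thread, not from StreamSets documentation.

```python
import json
import urllib.request

# Assumption: minimum server version for the Solr 6.1.0 stage library,
# taken from the comment thread rather than from StreamSets documentation.
ASSUMED_MINIMUM = (6, 1, 0)


def parse_version(spec: str) -> tuple:
    """Turn a version string such as '5.5.5' into a comparable tuple (5, 5, 5)."""
    return tuple(int(part) for part in spec.split(".")[:3])


def meets_minimum(spec: str, minimum: tuple = ASSUMED_MINIMUM) -> bool:
    """Return True when the reported server version is at least `minimum`."""
    return parse_version(spec) >= minimum


def fetch_solr_version(base_url: str) -> str:
    """Read 'solr-spec-version' from Solr's system info endpoint."""
    url = f"{base_url}/admin/info/system?wt=json"
    with urllib.request.urlopen(url, timeout=10) as response:
        info = json.load(response)
    return info["lucene"]["solr-spec-version"]


# Example (needs the cluster from the question to be running; 8981 is the
# host port mapped to solr1 in the compose file):
#   version = fetch_solr_version("http://localhost:8981/solr")
#   print(version, meets_minimum(version))
```

If `meets_minimum` returns False for the running server, that is consistent with the version mismatch suspected in the comments, and an older stage library such as the CDH one described above may be worth trying.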

statsguy