Data loss when using Fiware Orion Broker, QuantumLeap and CrateDB

Question

I'm using Fiware Orion Broker, QuantumLeap and CrateDB to record all temporal data in CrateDB.

My docker-compose configuration is this:

orion:
    image: fiware/orion:${ORION_VERSION}
    hostname: orion
    container_name: fiware-orion
    depends_on:
        - mongo-db
    networks:
        - fiware
    expose:
        - "${ORION_PORT}"
    ports:
        - "${ORION_PORT}:${ORION_PORT}"
    command: -dbhost mongo-db
    healthcheck:
        test: curl --fail -s http://orion:${ORION_PORT}/version || exit 1
        interval: 5s

mongo-db:
    image: mongo:latest
    hostname: mongo-db
    container_name: db-mongo
    expose:
        - "${MONGO_DB_PORT}"
    ports:
        - "${MONGO_DB_PORT}:${MONGO_DB_PORT}"
    networks:
        - fiware
    volumes:
        -  ./volumes/mongo-db:/data
    healthcheck:
        test: |
            host=`hostname --ip-address || echo '127.0.0.1'`; 
            mongo --quiet $host/test --eval 'quit(db.runCommand({ ping: 1 }).ok ? 0 : 2)' && echo 0 || echo 1
        interval: 5s

quantumleap:
    image: orchestracities/quantumleap:latest
    hostname: quantumleap
    container_name: fiware-quantumleap
    ports:
        - "${QUANTUMLEAP_PORT}:${QUANTUMLEAP_PORT}"
    depends_on:
        - crate-db
        - redis-db
    environment:
        - CRATE_HOST=crate-db
        - LOGLEVEL=WARNING
    healthcheck:
        test: curl --fail -s http://quantumleap:${QUANTUMLEAP_PORT}/version || exit 1
    networks:
        - fiware

crate-db:
    image: crate:latest
    hostname: crate-db
    container_name: db-crate
    ports:
        - "4200:4200"
        - "4300:4300"
    command: crate -Cauth.host_based.enabled=false  -Ccluster.name=democluster -Chttp.cors.enabled=true -Chttp.cors.allow-origin="*" -Cdiscovery.type=single-node
    environment:
        - CRATE_HEAP_SIZE=2g 
    volumes:
        - ./volumes/crate-db:/data
    networks:
        - fiware

I'm running performance tests with Apache JMeter, sending consecutive requests for 1 minute. Some of the data is not being stored in CrateDB: in my last test, about 18,000 requests were sent, but only about 10,000 records ended up in CrateDB.

I also tried using the TimescaleDB database in QuantumLeap, but the same problem happens, so I assume that the problem is not with the database.

Does anyone know what the problem could be?
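
To put a number on the gap described above, the stored row count can be read from CrateDB's HTTP endpoint (`/_sql` on port 4200, which the compose file exposes) and compared with the number of requests JMeter sent. The following is a minimal sketch, not a definitive check: the table name `doc.etdevice` is a hypothetical placeholder for whatever table QuantumLeap created, and the helper names are made up for illustration. CrateDB only makes new rows visible to queries after a table refresh, so the sketch refreshes before counting.

```python
import json
from urllib.request import Request, urlopen


def crate_sql(stmt, base_url="http://localhost:4200"):
    """Run one statement against CrateDB's HTTP endpoint and return the parsed JSON."""
    req = Request(
        f"{base_url}/_sql",
        data=json.dumps({"stmt": stmt}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


def stored_rows(table, run=crate_sql):
    """Refresh the table so recent inserts are visible, then return its row count."""
    run(f"REFRESH TABLE {table}")
    result = run(f"SELECT count(*) FROM {table}")
    return result["rows"][0][0]


def loss_ratio(sent, stored):
    """Fraction of sent requests that never became rows (0.0 means nothing lost)."""
    return (sent - stored) / sent if sent else 0.0


# Usage (assumes CrateDB is reachable on localhost:4200 and that
# "doc.etdevice" is replaced with the real table name):
#   stored = stored_rows("doc.etdevice")
#   print(f"lost {loss_ratio(18000, stored):.1%} of 18000 requests")
```

With the figures from the last test, `loss_ratio(18000, 10000)` comes out at roughly 0.44, meaning about 44% of the requests did not end up as rows.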

Comments

Do you see failed inserts in the `sys.jobs_log` table in CrateDB? For example, run `SELECT * FROM sys.jobs_log WHERE error IS NOT NULL`. - proddata, Mar 16 '23 at 08:00
If there are 0 results, there were no errors with the inserts. Did any of the containers restart? Which version of CrateDB did you use? Could you run `REFRESH TABLE tablename` before checking the record count? It might make more sense to discuss more complex issues in the CrateDB Community: community.crate.io - proddata, Mar 16 '23 at 13:34
I'm using the latest version of the CrateDB Docker image. I suspect that the problem is not with the database but with the Fiware components, because QuantumLeap also supports the TimescaleDB database and the same problem happens. I will add this information to the question. Thanks! - Nuno Rolo, Mar 16 '23 at 13:57
QL is implemented in Python, whereas Orion is implemented in C/C++. Orion is of course way faster than QL. We've seen this before, and the reason was that QL dropped notifications because it wasn't able to keep up. Can you increase the "size" of QL (more RAM, more cores) and see if the problem diminishes a little? - kzangeli, Mar 21 '23 at 09:36
@kzangeli Thanks for the reply and the tip. Yes, I've already tried increasing the number of QuantumLeap workers to 17 and I put the Orion Broker in threadpool mode. The data loss is smaller, but it still happens. Regardless of available resources, this data loss should not happen. Thanks! - Nuno Rolo, Mar 21 '23 at 16:46
Well, if you load your slow QL with too many notifications, internal TCP queues will overflow and the result will be that you lose packets. Not much you can do about that. Talk to Martel, or open an issue on QuantumLeap's GitHub. Or try Cygnus, perhaps? You could even upgrade to Orion-LD, which does its own history, but then you're stuck with Postgres/TimescaleDB, I'm afraid. FYI: Orion/Orion-LD is fully tested under stress and not one single notification is lost. Stressed with perhaps 20 times more than QL can handle. - kzangeli, Mar 22 '23 at 21:00
Thanks @kzangeli, I've already opened an issue on QL's GitHub, and I understand that the problem is in QL itself. Perhaps a machine with more resources will solve the problem, but I will try some alternatives for subscribing to Orion notifications. - Nuno Rolo, Mar 23 '23 at 12:45
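
The claim in the comments that Orion delivers every notification and QuantumLeap drops some could be checked by pointing a second Orion subscription at a trivial receiver that only counts what arrives. The following is a rough sketch using only the Python standard library; `CountingSink` and `start_sink` are hypothetical names, and the receiver ignores the notification body entirely. If its count matches the number of JMeter requests while CrateDB holds fewer rows, the loss happens downstream of Orion.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class CountingSink(BaseHTTPRequestHandler):
    """Counts every POST it receives; the notification payload is discarded."""

    received = 0
    lock = threading.Lock()

    def do_POST(self):
        # Drain the request body so the client can finish sending it.
        length = int(self.headers.get("Content-Length", 0))
        self.rfile.read(length)
        with CountingSink.lock:
            CountingSink.received += 1
        self.send_response(200)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        # Silence per-request logging so it does not slow the receiver down.
        pass


def start_sink(host="127.0.0.1", port=0):
    """Start the sink in a background thread; port 0 lets the OS pick a free port."""
    server = ThreadingHTTPServer((host, port), CountingSink)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Usage (assumption: the sink must listen on an address the Orion container
# can reach, e.g. host="0.0.0.0" with a fixed port, and a subscription whose
# notification URL points at it has to be created in Orion separately):
#   server = start_sink(host="0.0.0.0", port=8668)
#   ... run the JMeter test ...
#   print(CountingSink.received)
#   server.shutdown()
```

A single-process Python receiver can itself become the bottleneck under heavy load, so a lower count here is not conclusive on its own; a matching count is the informative outcome.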
