0

I am currently trying out the ignite database as an in-memory chache on top of a postgres database. The data that sits in the postgres database is basically data that was produced in conformity with the tpc-h schema. After the data was inserted into the postgres I loaded the data into the ignite cache. According to some count(*) queries on the ignite cache and the postgres, every row from the postgres is prensent in the ignite cache. Thats the given situation. Now I would asume a query on the postgres gives the same result as a query on the ignite cache. That' not the case for my queries.

This is the postgres query:

SELECT l_orderkey, SUM(l_extendedprice * ( 1 - l_discount )) AS revenue, o_orderdate, o_shippriority FROM customer, orders, lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1998-06-01' AND l_shipdate > DATE '1998-06-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate LIMIT 10;";

This is the ignite query:

SELECT l_orderkey, SUM(l_extendedprice * ( 1 - l_discount )) AS revenue, o_orderdate, o_shippriority FROM "CustomerCache".customer, "OrdersCache".orders, "LineitemCache".lineitem WHERE c_mktsegment = 'BUILDING' AND c_custkey = o_custkey AND l_orderkey = o_orderkey AND o_orderdate < DATE '1998-06-01' AND l_shipdate > DATE '1998-06-01' GROUP BY l_orderkey, o_orderdate, o_shippriority ORDER BY revenue DESC, o_orderdate LIMIT 10;";

As you can see above, the queries are nearly identical. Only the "FROM" parts are different. This must be like this because the ignite cache needs to be addressed in the following way: "CACHENAME".TABLENAME .

Ignite results:

Ignite results

Postgres results:

Postgres results

The ignite cache doesnt't return any results. The postgres returns the expected results. How is that possible? As a reminder: The complete data was loaded into the ignite cache. When I count the rows in the chached tables, they are as many as in the postgres tables. Question: Why doesn't the ignite resturn the right results for the query above.

The Ignite consists of two nodes deployed in a GKE Cluster. The Ignite config looks like this: NODE-CONFIGURATION.XML

The data was loaded from postgres to cache by deploying an ignite client to the cluster. This client runs the java function IgniteCache.#loadCache() on every cache.

Christian
  • 33
  • 4
  • What's your cluster configuration? How are your caches defined? Are they partitioned or replicated? Are you using affinity keys? How did you load your data from Postgres? Are you sure the o_orderdate column is a date? – Stephen Darlington Sep 17 '21 at 13:49
  • I added some information to the question above. The column "o_orderdate" is configured as "DATE". The config (linked in the question) configures the field in the cache as a "java.sql.Types.DATE" / "java.sql.Date". Seems correct to me, what do you think? – Christian Sep 20 '21 at 07:24
  • Since you're using SELECT clause with multiple tables you query actually is a query with JOIN. JOIN in Ignite requires proper affinity key configuration for records in joined table. So I would suggest you to review query entities configuration in cache config. See details here: https://www.gridgain.com/docs/latest/developers-guide/SQL/sql-api#query-entities https://ignite.apache.org/docs/latest/persistence/external-storage#cachejdbcpojostore – solveMe Sep 22 '21 at 07:39
  • Looks weird, I don't see any issues with the config - tables are REPLICATED, so the join should work. Can you try removing stuff from the WHERE clause in the query, expression by expression? Perhaps then you'll see which part causes an issue. Also, take a look at the logs of the server nodes - whether there are any warnings or errors. – Stanislav Lukyanov Sep 25 '21 at 13:59
  • Turns out this was a problem with our dataset. The dataset got some trailing spaces for the c_mktsegment field. This resulted in an empty resultset when executing a query with "c_mktsegment = 'BUILDING'" in the where statement. – Christian Oct 07 '21 at 14:03

1 Answers1

0

The empty resultset wasn't a problem regarding the cache config. It was a problem regarding the dataset. We got trailing spaces in some fields. Executing a query that compares the field with an = instead of a LIKE only works without trailing spaces ;)

Christian
  • 33
  • 4