2

I am using monetdb on a 16GB Macbook Pro with OSX 10.10.4 Yosemite.

I execute queries with SQLWorkbenchJ (configured with a minimum of 2048M RAM).

I find the performance overall erratic:

  • performance is acceptable / good with small size tables (<100K rows)
  • abysmal with tables with many rows: a query with a join of two tables (8670 rows and 242K rows) and a simple sum took 1H 20m!!

My 16GB of memory notwithstanding, in one run I never saw MSERVER5 using more than 35MB of RAM, 450MB in another. On the other hand the time is consumed swapping data onto disk (according to Activity Monitor over 160GB of data!).

There are a number of performance-related issues that I would like to understand better:

  • I have the impression that MonetDB struggles with understanding how much RAM to use / is available in OSX. How can I "force" MonetDB to use more RAM?
  • I use MonetDB through R. The MonetDB.R driver converts all the character fields into CLOB. I wonder if CLOBs create memory allocation issues?
  • I find difficult to explain the many GBs of writes (as mentioned >150GB!!) even for index creation or temporary results. On the other hand when I create the DB and load the tables overall the DB is <50MB. Should I create an artificial integer key and set it as index?
  • I join 2 tables on a timestamp field (e.g. "2015/01/01 01:00") that again is seen as a text CLOB by MonetDb / MonetDb.R. Should I just convert it to integer before saving it to MonetDb?
  • I have configured each table with a primary key, using a field of type integer. MonetDB (as a typical columnar database) doesn't need the user to specify an index. Is there any other way to improve performance?

Any recommendation is welcome.

For clarity the two tables I join have the following layout:

Calendar # classic calendar table with one entry per our in a year = 8760 rows
Fields: datetime, date, month, weekbyhour, monthbyday, yearbyweek, yearbymonth # all fields are CLOBs as mentioned

Activity # around 200K rows
Fields: company, department, subdepartment, function, subfunction, activityname, activityunits, datetime, duration # all CLOBs except activityunits; datetime refers to when the activity has occurred

I have tied various types of join syntax, but an example would (`*` used for brevity)
select * from Activity as a, Calendar as b where a.datetime=b.datetime
Enzo
  • 2,543
  • 1
  • 25
  • 38
  • More details please, schema, queries etc. My guess would be a huge intermediate result being created. Generally, MonetDB works fine on OSX. – Hannes Mühleisen Jul 23 '15 at 13:36
  • @HannesMühleisen tried to edit the question on your suggestion. Agreed: one of the issues is how to avoid huge intermediate results... – Enzo Aug 21 '15 at 09:59
  • You might consider declare calendar.datetime as primary key and double check your database for repeating values. That may produce a Cartesian product. – mkersten Aug 26 '15 at 15:09

0 Answers0