-1

I am looking for a document DB supporting Windows XP 32 bits, satisfying the following requirements:

  • The support must not be discontinued. I.e. I want to be able to install the most recent version of the DB. MongoDB does not fit, since they dropped supporting XP and CouchDB does not fit since they dropped supporting any Windows 32 bits.
  • It should be relatively simple. Obviously, the application is not an enterprise one, so a complex DB, like Cassandra, is out. In fact, I would like to avoid column databases, since I think they exist to solve enterprise level problems, which is not the case here. On the other hand, I do not want relational DBs, because I want to avoid DB upgrades each time new fields are added (and they will be added).
  • It should support indexing on part of the document, like MongoDB. I could use a relational DB, like hsqldb to store the data as json string. This lets adding new fields easy - no schema needs to be changed. But these fields would not be indexable by the database. Again, unlike MongoDB.
  • Finally, the DB will run on the same machine as the application itself - one more down for MongoDB, which would steal all the RAM from the application to itself.

So, in a sense, I am looking for something like MongoDB, but with the support of Windows XP 32 bits.

Any advices?

P.S.

I know that Windows XP has one year to live before MS drops supporting it. However, I have to support XP anyway.

Community
  • 1
  • 1
mark
  • 59,016
  • 79
  • 296
  • 580
  • Does the database HAVE to be on windows? Or jus the application? Also "Obviously, the application is not an enterprise one, so a complex DB, like Cassandra, is out." What is that supposed to mean? MySQL is a complex database... – Sammaye Mar 18 '13 at 15:47
  • MySQL may be complex, but working with it is simple. BTW, Mysql can be run embedded (libmysqld.dll), which makes it simple in using and deploying. – mark Mar 18 '13 at 16:16
  • The DB must be running on the same machine as the application - I will edit the question to emphasize the fact. – mark Mar 18 '13 at 16:17

2 Answers2

1

With HSQLDB and some other relational databases, you store the document as a CLOB. This clob can be accessed via a single table which contains the index for all the indexed fields. For example

CREATE TABLE DATAINDEX(DOCID BIGINT GENERATED BY DEFAULT AS IDENTITY, FIELDNAME VARCHAR(128), FIELD VARCHAR(10000),
    DOCUMENT CLOB, PRIMARY KEY (DOCID, FIELDNAME))

CREATE INDEX IDS ON (FIELDNAME, FIELD);

The whole document is the CLOB. A copy of selected fields that need an index for searching is stored in the (fieldnname, field) columns. The rows with the same DOCID will have the same CLOB in the DOCUMENT column. One row is inserted with the first field and the clob, then it is duplicated by selecting and inserting the existing DOCID and clob with the second field and so on.

-- use this to insert the CLOB with the first field
INSERT INTO DATAINDEX VALUES DEFAULT, 'f1', 'fieldvalue 1', ?
-- use this to insert the second, third and other fields
INSERT INTO DATAINDEX VALUES
    IDENTITY(), 'f2', 'filedvalue 2', 
    (SELECT DOCUMENT FROM DATAINDEX WHERE DOCID = IDENTITY() LIMIT 1)

The above is just one example. You can create your own DOCID. The principle is to use the same DOCID and to insert the first row with the CLOB. The second and third rows select the DOCID and the clob from the previously inserted row to create new rows with the other fields. You will probably use JDBC parameters to insert into the FIELDNAME and FIELD columns.

This allows you to perform searches such as:

SELECT DOCID, DOCUMENT FROM DATAINDEX 
    WHERE FIELDNAME = 'COMPANY NAME' AND FIELD LIKE  'Corp%'

This may not satisfy all your requirements, but the answer is intended to cover what is possible with HSQLDB.

fredt
  • 24,044
  • 3
  • 40
  • 61
  • I do not quite understand it. If I have a JSON document `{f1: x1, f2: x2, f3: {g1: y1, g2: y2}}`, then do you mean I will have 5 copies of the JSON document with `(fieldname, field)` taking values `('f1', x1)`, `('f2', x2)`, `('f3.g1', y1)`, `('f3.g2', y2)` and `()`? – mark Mar 18 '13 at 20:56
  • I have updated the answer. For example, you'll have 4 rows with the same CLOB reference. One clob can be repeatedly referenced in many rows. – fredt Mar 18 '13 at 21:36
  • How do you reference the same CLOB in several different records? – mark Mar 19 '13 at 02:02
  • I see. In my case, I store files with metadata (the schema of the latter is flexible). So, a single row contains a CLOB for metadata and a BLOB for a file. Can the SELECT statement inside INSERT fetch two fields, like DOCUMENT and FILE? Does the DB engine duplicate the contents of the LOBs or just a reference to them? – mark Mar 19 '13 at 09:53
  • The DB engine uses references to LOBs and does not duplicate contents. You can write a PROCEDURE to select both LOBs into two variables then insert the variables into new rows. – fredt Mar 19 '13 at 10:04
  • To make sure I understand you. Do you suggest using a stored procedure instead of using an INSERT statement with two SELECT substatements? That stored procedure would do one select statement to get the DOCUMENT and FILE LOBs and then issue an INSERT statement as many times as necessary. Am I correct? – mark Mar 19 '13 at 10:16
  • Yes, that's what I'm suggesting, although an INSERT with two subselects is OK too. Also consider the alternative of having one table for DOCID, YOURCLOB, YOURBLOB with single rows for each document, and another table with DOCID, FIELDNAME, FIELD, which has multiple rows per document. – fredt Mar 19 '13 at 10:19
1

Which programming framework are you using? If .NET is a possibility you can try RavenDB. It can be used as both an embedded and standalone database.

For Java you can try out OrientDB. It is also embeddable: https://github.com/nuvolabase/orientdb/wiki/Embedded-Server

uldall
  • 2,458
  • 1
  • 17
  • 31
  • We are using Java. The question is much does RavenDB add to the cost. – mark Mar 18 '13 at 18:28
  • I updated the answer with a suggestion for Java. – uldall Mar 18 '13 at 20:03
  • Need to check it. Thanks. – mark Mar 19 '13 at 09:20
  • I cannot find any sensible documentation/tutorial on using the Java object API of this database. Feels like one has to do much reverse engineer to actually use it. – mark Mar 19 '13 at 10:08
  • Have you looked here: 1. https://www.assembla.com/spaces/orientdb/wiki/The_Java_API_ 2. http://code.google.com/p/orient/wiki/JavaAPI 3. http://code.google.com/p/orient/wiki/ObjectDatabase – uldall Mar 19 '13 at 10:37
  • Thanks, will look there. I found https://code.google.com/p/orient/wiki/ObjectDatabase – mark Mar 19 '13 at 11:10
  • Checked it. It does not look mature. POJO binding is broken. Their google group has some scary questions posted just recently and addressing them seems to be the work in progress. I will not risk using them in a production system. Maybe in a couple of years. – mark Mar 27 '13 at 11:18