
I am testing MySQL Document Store. In order to properly compare it with our relational tables, I am attempting to transform our current table into a collection. There are approximately 320K records in the table that I wish to export and add to a new collection. I am attempting to use Connector/Node.js to do this. To avoid blowing it up, I am adding 10K records at a time, but only the first 10K records are inserted. I have confirmed that it is the first 10K records; it is not overwriting on each iteration. And those 10K records are correctly structured.

const mysqlx = require('@mysql/xdevapi');
const config = {
    password: 'notMyPassword',
    user: 'notMyUser',
    host: 'notMyHost',
    port: 33060,
    schema: 'sample'
};
var mySchema;
var myCollection;
var recCollection = [];
mysqlx.getSession(config).then(session => {
    mySchema = session.getSchema('sample');
    mySchema.dropCollection('sample_test');
    mySchema.createCollection('sample_test');
    myCollection = mySchema.getCollection('sample_test');
    var myTable = mySchema.getTable('sampledata');
    return myTable.select('FormDataId','FormId','DateAdded','DateUpdated','Version','JSON').orderBy('FormDataId').execute();
    }).then(result => {
        console.log('we have a result to analyze...');
        var tmp = result.fetchOne();
        while(tmp !== null && tmp !== '' && tmp !== undefined){
            var r = tmp;
            var myRecord = {
                'dateAdded': r[2],
                'dateUpdated': r[3],
                'version': r[4],
                'formId': r[1],
                'dataId': r[0],
                'data': r[5]
                };
            recCollection.push(myRecord);
            if (recCollection.length >= 10000){
                console.log('inserting 10000');
                try {
                    myCollection.add(recCollection).execute();
                } catch(ex){
                    console.log('error: ' + ex);
                }
                recCollection.length = 0;
            }
            tmp = result.fetchOne();
        }
        
    });
Loco
1 Answer


This looks like an issue with how you handle asynchronous execution. The execute() method of a CollectionAdd statement is asynchronous and returns a Promise.

In a while loop, unless you "await" each execution, the loop does not wait for the pending operations to finish before moving on, even though the first call to the asynchronous method always goes through. That's why only the first 10k documents are added.
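The failure mode can be sketched without MySQL at all. The following is a simplified model, not the connector's real internals: a hypothetical FakeSession that, like a single connection, cannot run two statements at once. Firing add() without await starts all the batches concurrently, so every batch after the first fails, while awaiting each call serializes them:

```javascript
// A fake session that rejects a statement started while another is in flight,
// loosely mimicking a single connection that cannot multiplex statements.
class FakeSession {
  constructor() {
    this.busy = false;
    this.inserted = 0;
  }

  async add(batch) {
    if (this.busy) {
      throw new Error('session busy');
    }
    this.busy = true;
    // simulate a network round trip
    await new Promise(resolve => setTimeout(resolve, 10));
    this.inserted += batch.length;
    this.busy = false;
  }
}

async function fireAndForget(session, batches) {
  for (const batch of batches) {
    // not awaited: the calls overlap, so everything after
    // the first batch fails (and fails silently here)
    session.add(batch).catch(() => {});
  }
}

async function awaited(session, batches) {
  for (const batch of batches) {
    // awaited: each insert completes before the next one starts
    await session.add(batch);
  }
}

const batches = [[1, 2], [3, 4], [5, 6]];

(async () => {
  const s1 = new FakeSession();
  await fireAndForget(s1, batches);
  await new Promise(resolve => setTimeout(resolve, 50));
  console.log('fire-and-forget inserted:', s1.inserted); // only the first batch lands

  const s2 = new FakeSession();
  await awaited(s2, batches);
  console.log('awaited inserted:', s2.inserted); // all three batches land
})();
```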

You also need to be careful with APIs like createCollection() and dropCollection(), because they are asynchronous as well and similarly return a Promise.

Using your own example (without going into the specifics), it can look something like the following:

const mysqlx = require('@mysql/xdevapi');

const config = {
  password: 'notMyPassword',
  user: 'notMyUser',
  host: 'notMyHost',
  port: 33060,
  schema: 'sample'
};

// without top-level await
const main = async () => {
  const session = await mysqlx.getSession(config);

  const mySchema = session.getSchema('sample');
  await mySchema.dropCollection('sample_test');

  const myCollection = await mySchema.createCollection('sample_test');
  const myTable = mySchema.getTable('sampledata');

  const result = await myTable.select('FormDataId', 'FormId', 'DateAdded', 'DateUpdated', 'Version', 'JSON')
    .orderBy('FormDataId')
    .execute();

  const recCollection = [];

  let tmp = result.fetchOne();
  while (tmp !== null && tmp !== '' && tmp !== undefined) {
    let r = tmp;
    let myRecord = {
      'dateAdded': r[2],
      'dateUpdated': r[3],
      'version': r[4],
      'formId': r[1],
      'dataId': r[0],
      'data': r[5]
    };

    recCollection.push(myRecord);

    if (recCollection.length >= 10000) {
      console.log('inserting 10000');

      try {
        await myCollection.add(recCollection).execute();
      } catch (ex) {
        console.log('error: ' + ex);
      }

      recCollection.length = 0;
    }

    tmp = result.fetchOne();
  }

  // flush any remaining records that did not fill a full batch
  if (recCollection.length > 0) {
    await myCollection.add(recCollection).execute();
  }
}

main();

Disclaimer: I'm the lead developer of the MySQL X DevAPI connector for Node.js

ruiquelhas
  • Thank you for that assistance. I am new to JS, so much of the nuance is over my head. My understanding of the await keyword is that it can only be used in a module, not in a script. So, my .js file throws an error on the use of await. I saved the file as a .mjs to help node recognize it as a module, but then the require keyword throws an error. I need to use an import according to the error, but I can't find any documentation on how to import the mysql/xdevapi. – Loco Feb 08 '23 at 18:17
  • Right, top-level await is only available under special conditions. What you can do to make it work without any of that is to simply wrap the workflow in an `async` function and call it normally. I'll edit the answer for clarity. – ruiquelhas Feb 08 '23 at 18:50
  • If you end up going the mjs route you should be able to use `import * as mysqlx from '@mysql/xdevapi'` instead of the regular CommonJS `require()`. – ruiquelhas Feb 08 '23 at 18:55
  • I figured out that I can use await in a module or an async function. So, I built an async function and put the code inside that. However, I am still only getting the first 10000 records. The subsequent addCollection() calls are not putting data into the collection. – Loco Feb 08 '23 at 21:11
  • I figured it out. Final solution was to use the async function, with the await calls as you recommended. Then, I removed the 10000 record batching and ran the add on each individual record. I think that the batching was making inserts that were too large and it was killing the connection. I very much appreciate your assistance on this. I would not have found the solution without your help. – Loco Feb 08 '23 at 21:26