Group multiple rows as one unit

Question

I'm very new to ML, so please forgive me if this is a basic question.

In my SQL database, I have an invoice table and a corresponding lineItems table. The invoice table has a flag indicating whether the customer paid the invoice.

Using the lineItems in the invoice, I am trying to predict if an invoice will be paid by the client.

I figured this was going to be similar to the Titanic dataset, which is used to predict if a passenger survived or not. However, what makes my case a bit more difficult is that I have multiple rows per invoice, whereas, in the Titanic dataset, all of the data is in one row.

In Azure ML Studio, I can import data from Mongo/Cosmos, so I was planning to copy the data into documents, each containing an invoice and its lineItems. That way, Azure ML would treat each invoice as a unit instead of as multiple rows in SQL.

I'm not sure whether I need to do this, or whether I can just join my tables and Azure ML can tell where the groups are from the InvoiceId field in the LineItem table.

Does the line items table have a small set of known values, or a large, possibly unlimited set of values such as free-form text? The feature engineering approach depends on this. — Roope Astala - MSFT, Oct 18 '19 at 00:16
A combination of values from dropdown lists, but also some "free-form" fields where we enter amounts. — Andy T, Oct 19 '19 at 00:47
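
For what it's worth, one common way to treat an invoice as a unit is to collapse its line items into a single feature row before training. Below is a minimal sketch in plain Python. The column names (`Category`, `Amount`, `Paid`) and the sample rows are hypothetical and not taken from the real schema, and whether these particular aggregates help the prediction is also an assumption:

```python
from collections import defaultdict

# Hypothetical sample data; the column names are assumptions, not the real schema.
invoices = [
    {"InvoiceId": 1, "Paid": True},
    {"InvoiceId": 2, "Paid": False},
]
line_items = [
    {"InvoiceId": 1, "Category": "Labor", "Amount": 100.0},
    {"InvoiceId": 1, "Category": "Parts", "Amount": 50.0},
    {"InvoiceId": 2, "Category": "Labor", "Amount": 300.0},
]


def build_invoice_features(invoices, line_items):
    """Collapse line items into one feature row per invoice."""
    # Dropdown-style values become one count column per known category.
    categories = sorted({item["Category"] for item in line_items})

    # Group line items by the invoice they belong to.
    grouped = defaultdict(list)
    for item in line_items:
        grouped[item["InvoiceId"]].append(item)

    rows = []
    for inv in invoices:
        items = grouped.get(inv["InvoiceId"], [])
        amounts = [i["Amount"] for i in items]
        row = {
            "InvoiceId": inv["InvoiceId"],
            "item_count": len(items),
            # Numeric "free-form" amounts are summarized with simple aggregates.
            "total_amount": sum(amounts),
            "max_amount": max(amounts, default=0.0),
        }
        for cat in categories:
            row[f"count_{cat}"] = sum(1 for i in items if i["Category"] == cat)
        row["Paid"] = inv["Paid"]  # label to predict
        rows.append(row)
    return rows


rows = build_invoice_features(invoices, line_items)
for row in rows:
    print(row)
```

The same aggregation could be written as a SQL `GROUP BY InvoiceId` query before importing the data, which would give one row per invoice, as in the Titanic dataset.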

