
I'm very new to ML, so please forgive me if this is a basic question.

In my SQL database, I have an invoice table and a corresponding lineItems table. The invoice table has a flag indicating whether the customer paid the invoice.

Using the line items on an invoice, I am trying to predict whether the client will pay that invoice.

I figured this would be similar to the Titanic dataset, which is used to predict whether a passenger survived. What makes my case a bit more difficult is that I have multiple rows per invoice, whereas in the Titanic dataset all of the data for a passenger is in one row.

In Azure ML Studio, I can import data from Mongo/Cosmos, so I was planning to copy the data over by creating one document per invoice that contains the invoice and its lineItems. That way Azure ML would treat each invoice as a unit, instead of seeing multiple rows as it would in SQL.

I'm not sure whether I need to do this, or whether I can just join my tables and Azure ML can tell where the groups are from the InvoiceId field in the LineItem table.
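To make the "one row per invoice" idea concrete, here is a minimal sketch in pandas of collapsing the line items into per-invoice summary columns and joining them back to the invoice table. This is an illustration under assumptions, not something Azure ML is known to do automatically: only `InvoiceId` comes from my schema, while the `Paid` and `Amount` column names and all the values are made up for the example.

```python
import pandas as pd

# Hypothetical stand-ins for the two SQL tables (column names other than
# InvoiceId are assumptions made for this example).
invoices = pd.DataFrame({
    "InvoiceId": [1, 2, 3],
    "Paid": [1, 0, 1],  # the paid/unpaid flag to predict
})
line_items = pd.DataFrame({
    "InvoiceId": [1, 1, 2, 3, 3, 3],
    "Amount": [100.0, 50.0, 200.0, 10.0, 20.0, 30.0],
})

# Collapse the line items to one row per invoice.
per_invoice = (
    line_items.groupby("InvoiceId")
    .agg(
        item_count=("Amount", "size"),
        total_amount=("Amount", "sum"),
        max_amount=("Amount", "max"),
    )
    .reset_index()
)

# One row per invoice: the label plus the aggregated line-item features.
dataset = invoices.merge(per_invoice, on="InvoiceId", how="left")
print(dataset)
```

The resulting table has the same shape as the Titanic data (one row per example, one label column), so it could be fed to a standard two-class classifier.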

Andy T
  • Does the line items table have a small set of known values, or a large, possibly unlimited set of values such as free-form text? The feature engineering approach depends on this. – Roope Astala - MSFT Oct 18 '19 at 00:16
  • A combination of values from dropdown lists, but also some "free-form" fields where we enter amounts. – Andy T Oct 19 '19 at 00:47
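For the mix described in the comments (dropdown values plus entered amounts), one possible approach is to count how often each dropdown value appears on an invoice and to sum the amounts per value. The sketch below assumes pandas; the `Category` and `Amount` column names and the sample values are hypothetical, and only `InvoiceId` comes from the question.

```python
import pandas as pd

# Hypothetical line items: one dropdown-style column and one amount column.
line_items = pd.DataFrame({
    "InvoiceId": [1, 1, 2, 3, 3],
    "Category": ["Labor", "Parts", "Parts", "Labor", "Labor"],
    "Amount": [100.0, 50.0, 200.0, 10.0, 20.0],
})

# How many line items of each category appear on each invoice.
category_counts = pd.crosstab(
    line_items["InvoiceId"], line_items["Category"]
).add_prefix("n_")

# Total amount per category on each invoice (0 where the category is absent).
amount_by_category = line_items.pivot_table(
    index="InvoiceId",
    columns="Category",
    values="Amount",
    aggfunc="sum",
    fill_value=0,
).add_prefix("amt_")

# One feature row per invoice.
features = category_counts.join(amount_by_category).reset_index()
print(features)
```

This only works well when the dropdown has a small, known set of values; a truly free-text field would need a different treatment.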

0 Answers