-1

I need to use ML.NET to process large blocks of text and determine whether any given block of text falls into one or more of many different categories.

I currently have multiple boolean columns, and I want to set each one to true when ML.NET finds a match for any given block of text.

I am completely new to ML, and when poring through the classification samples, each one seems to assign only one classification to any one block of text. Can anyone point me in a direction to handle many classifications for a single block of text? Perhaps some sample code?

chrisg229

1 Answer

1

This is a so-called multiclass classification problem. With a single boolean column you have the binary case, where the answer is either Yes or No, True or False. What you'll need instead is a single type column with multiple possible values, e.g. one value for each type of text it may be. A good example is the issue classifier here:

https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/end-to-end-apps/MulticlassClassification-GitHubLabeler

If you are comfortable working with Jupyter Notebooks, here's another example I've created: https://github.com/aslotte/mlnet-jupyter/blob/master/src/DataView/multi-class%20classification.ipynb
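
For a rough idea of what this looks like in code, here is a minimal sketch of a multiclass text pipeline. It is not taken from either sample above. The `TextSample` and `CategoryPrediction` classes, the category names, and the six training rows are all made up for illustration, and it assumes the `Microsoft.ML` NuGet package and a project that supports top-level statements.

```csharp
using System;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

var mlContext = new MLContext(seed: 0);

// Hypothetical toy data; a real model needs many more rows per category.
var samples = new[]
{
    new TextSample { Text = "App crashes with a null reference on startup", Category = "bug" },
    new TextSample { Text = "Unhandled exception when saving the file", Category = "bug" },
    new TextSample { Text = "Please add a dark mode option", Category = "feature" },
    new TextSample { Text = "It would be nice to export reports as PDF", Category = "feature" },
    new TextSample { Text = "How do I configure the connection string?", Category = "question" },
    new TextSample { Text = "Where can I find the installation guide?", Category = "question" },
};

IDataView trainingData = mlContext.Data.LoadFromEnumerable(samples);

// 1. Map the string category to a key (the "type column" with many values).
// 2. Turn the raw text into a numeric feature vector.
// 3. Train a multiclass trainer on those features.
// 4. Map the predicted key back to the original category string.
var pipeline = mlContext.Transforms.Conversion
        .MapValueToKey(outputColumnName: "Label", inputColumnName: nameof(TextSample.Category))
    .Append(mlContext.Transforms.Text.FeaturizeText(
        outputColumnName: "Features", inputColumnName: nameof(TextSample.Text)))
    .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy(
        labelColumnName: "Label", featureColumnName: "Features"))
    .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

ITransformer model = pipeline.Fit(trainingData);

// PredictionEngine is convenient for single predictions (it is not thread-safe).
var engine = mlContext.Model.CreatePredictionEngine<TextSample, CategoryPrediction>(model);
var prediction = engine.Predict(new TextSample { Text = "The program throws an exception on exit" });

Console.WriteLine($"Predicted category: {prediction.Category}");
Console.WriteLine("Per-category scores: " +
    string.Join(", ", prediction.Score.Select(s => s.ToString("0.00"))));

// Hypothetical input row: one block of text plus its single category.
public class TextSample
{
    public string Text { get; set; } = string.Empty;
    public string Category { get; set; } = string.Empty;
}

// Hypothetical output row: the predicted category and one score per category.
public class CategoryPrediction
{
    [ColumnName("PredictedLabel")]
    public string Category { get; set; } = string.Empty;

    public float[] Score { get; set; } = Array.Empty<float>();
}
```

The `Score` array holds one value per category seen during training, so it shows how strongly the model leans towards each type and not only which one won.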

I hope that helps!

Alex S