Can anybody tell me how to create training data for categorization? I am using OpenNLP for categorization. Is there a tool to create training data, or, if I have to create it manually, how should it be done? I am a complete noob in this field. Please help.
Please repeat [on topic](https://stackoverflow.com/help/on-topic) and [how to ask](https://stackoverflow.com/help/how-to-ask) from the [intro tour](https://stackoverflow.com/tour). "Show me how to solve this coding problem?" is off-topic for Stack Overflow. You have to make an honest attempt at the solution, and then ask a *specific* question about your implementation. Stack Overflow is not intended to replace existing tutorials and documentation. – Prune Dec 05 '20 at 22:24
1 Answer
Well, normally you have some kind of historical data from previous (manual) categorization. Otherwise you have to create the data that you need somehow. Such data is often created by observation, i.e. by looking at examples and recording the correct category for each.
How you do this depends heavily on the data you are trying to categorize.
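For OpenNLP's document categorizer in particular, the training data is a plain-text file with one sample per line: the category label first, then whitespace, then the document text. Below is a minimal sketch of turning manually labeled examples into that layout, using only the Java standard library. The class name `DoccatTrainingData`, its methods, and the sample texts and labels are hypothetical and only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of OpenNLP: formats hand-labeled examples
// as "category<space>text", one sample per line.
final class DoccatTrainingData {

    // Formats a single labeled example as one training line.
    static String toTrainingLine(String category, String text) {
        // The label is the first whitespace-separated token, so it must not contain spaces.
        String label = category.trim().replaceAll("\\s+", "_");
        // Each sample has to fit on one line, so collapse line breaks and repeated spaces.
        String singleLine = text.trim().replaceAll("\\s+", " ");
        return label + " " + singleLine;
    }

    // Builds the full file content from a map of document text -> category.
    static String toTrainingFile(Map<String, String> textToCategory) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> e : textToCategory.entrySet()) {
            out.append(toTrainingLine(e.getValue(), e.getKey())).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up examples standing in for your own manually categorized data.
        Map<String, String> labeled = new LinkedHashMap<>();
        labeled.put("The team won the match in extra time", "sports");
        labeled.put("Parliament passed the new budget", "politics");
        System.out.print(toTrainingFile(labeled));
    }
}
```

The resulting text can be saved to a file and, as far as I know, fed to OpenNLP's `DoccatTrainer` command-line tool or read through `DocumentSampleStream`; check the OpenNLP manual for the exact options of your version. You will need many examples per category for the model to be useful.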
If you were able to generate correctly labeled training data automatically, you would already have a perfect algorithm for the data, and you would not need to train a system, would you?
If it is not possible to obtain training data, you might have to look at algorithms that don't need to learn up front, i.e. ones that learn as data comes in while someone constantly corrects the system's mistakes.
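To make the "learn as data comes in" idea concrete, here is a rough sketch of one such approach: a simple multiclass perceptron over word tokens that only updates when a human supplies the correct category. This is a technique I am swapping in for illustration, not something OpenNLP's categorizer does, and the class `OnlineTextClassifier` and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an online learner: starts with no training data
// and adjusts its weights each time someone corrects a prediction.
final class OnlineTextClassifier {

    // category -> (token -> weight)
    private final Map<String, Map<String, Double>> weights = new HashMap<>();

    // Returns the best-scoring known category, or null if none has been seen yet.
    String predict(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (Map.Entry<String, Map<String, Double>> e : weights.entrySet()) {
            double score = 0.0;
            for (String token : tokenize(text)) {
                score += e.getValue().getOrDefault(token, 0.0);
            }
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }

    // Called when a human supplies the true category for a document.
    void correct(String text, String trueCategory) {
        String predicted = predict(text);
        weights.computeIfAbsent(trueCategory, k -> new HashMap<>());
        if (trueCategory.equals(predicted)) {
            return; // the prediction was already right, nothing to learn
        }
        // Perceptron update: reward the true category, penalize the wrong guess.
        for (String token : tokenize(text)) {
            weights.get(trueCategory).merge(token, 1.0, Double::sum);
            if (predicted != null) {
                weights.get(predicted).merge(token, -1.0, Double::sum);
            }
        }
    }

    private static String[] tokenize(String text) {
        return text.toLowerCase().split("\\W+");
    }
}
```

In use, you would call `predict` on each incoming document, show the result to a reviewer, and call `correct` with the reviewer's label. The corrections collected this way also double as training data for a batch-trained model later on.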

cimnine