I have a list of categories such as Sports, Game, Religion, Finance, Market Rates, I.T, Health, Adult, Business, B2B, Government, Politics, Education, etc. I want to classify a paragraph of text into these categories. In practice, I extract the whole text from a particular URL and want to assign it to my categories. At the moment I'm using DBpedia, and I have tried many other technologies, but I still haven't reached my goal. Can someone help me, please? I would be grateful.

Brian Tompsett - 汤莱恩
Azhar Ak
  • What kind of text is it, and by what criteria do you want to categorize it? Keywords? Semantic analysis? If you are looking for tools and libraries, you might want to ask this question on softwarerecs.stackexchange.com (but you will need to provide much more information for it to be a valid question). – Angelo Fuchs Apr 09 '14 at 10:10
  • Thanks for the reply. The text is English text, for example: "Finance is a field within economics that deals with the allocation of assets and liabilities over time under conditions of certainty and uncertainty." I don't know which criteria are most suitable for my program, but I want to do this with semantic analysis. I don't need tool recommendations, because I have already read about many tools (DBpedia SPARQL, ontologies, OWL, RDF, RDFS, OpenNLP, Stanford NLP, GATE, Mallet, Weka and many more), but I still couldn't find a good approach for my program. Please tell me what I should do for text-based categorization. – Azhar Ak Apr 09 '14 at 10:38
  • First: decide which tool to use. (If you need recommendations on this, ask here: http://softwarerecs.stackexchange.com/ ) Second: try an implementation in the tool of your choice. Third: if you have actual problems, ask here again. As it stands, this question is off topic on SO. – Angelo Fuchs Apr 09 '14 at 10:47
  • Actually, I want to build my program using the DBpedia endpoint, SPARQL queries, linked data, Java, etc. I have tried to categorize text into predefined categories, but my program is not giving good accuracy. At the moment I'm using static data for my program, which is not a good approach, so please help me or point me to a logic/algorithm/tool. – Azhar Ak Apr 09 '14 at 11:07
  • Unfortunately, you are in the wrong place to ask for "a logic/algorithm/tool". This is not within the scope of SO. If your program is not giving you the results you want, then post some details. Show in the question that you have at least tried this (use the edit button): show what you have and where your solution stops working. You won't get accurate answers to vague questions. The details you provided in the comments are a good start; edit them into the question. Then show your code: what does it do, and what doesn't it do? – Angelo Fuchs Apr 09 '14 at 11:30
  • Thank you, Mr. Angelo Neuschitzer, for the good response. One more thing: can you please tell me how to get a broader category for a token like game/finance from Wikipedia? – Azhar Ak Apr 09 '14 at 12:35

1 Answer

There is an old but very good paper that covers the task of text categorization. It may be useful to you as an introduction:

Machine Learning in Automated Text Categorization, Fabrizio Sebastiani, 2002 http://orb.essex.ac.uk/CE/CE807/Readings/sebastiani02.pdf
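To give a feel for the kind of supervised approach such an introduction discusses, here is a minimal sketch of a multinomial Naive Bayes text classifier in plain Python. It is only an illustration, not a method taken from the question or prescribed by the paper: the class name `NaiveBayesTextClassifier`, the `tokenize` helper and the tiny `TRAINING_DATA` set are all made up for this example, and a real system would need many labelled documents per category.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.doc_counts = Counter()              # number of documents per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # all words seen during training

    def train(self, labelled_docs):
        """Accumulate counts from (text, category) pairs."""
        for text, category in labelled_docs:
            self.doc_counts[category] += 1
            for word in tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)

    def scores(self, text):
        """Return log P(category) + sum of log P(word | category) per category."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        # Words never seen in training carry no information and are skipped.
        words = [w for w in tokenize(text) if w in self.vocabulary]
        result = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            log_prob = math.log(n_docs / total_docs)
            for word in words:
                log_prob += math.log((counts[word] + 1) / (total_words + vocab_size))
            result[category] = log_prob
        return result

    def classify(self, text):
        """Return the category with the highest log-probability score."""
        scores = self.scores(text)
        return max(scores, key=scores.get)


# Hypothetical, hand-written training examples (assumption: for illustration only).
TRAINING_DATA = [
    ("The bank raised interest rates on loans and assets", "Finance"),
    ("Investors manage assets and liabilities in the stock market", "Finance"),
    ("The team won the football match after scoring a late goal", "Sports"),
    ("The player scored in the tennis tournament final", "Sports"),
    ("The doctor recommended a diet and exercise for heart disease", "Health"),
    ("Hospital patients received a vaccine against the disease", "Health"),
]

classifier = NaiveBayesTextClassifier()
classifier.train(TRAINING_DATA)
print(classifier.classify(
    "Finance deals with the allocation of assets and liabilities over time"
))
```

The same idea carries over to the categories in the question: collect labelled example texts for each category, train on them, and classify the text extracted from a URL. Libraries mentioned in the comments, such as Weka or Mallet, provide ready-made implementations of this kind of classifier.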

jeojavi