Questions tagged [text-segmentation]

Text segmentation is the process of dividing written text into meaningful units, such as words, sentences, or topics.

197 questions
0
votes
0 answers

Segmentation of connected characters (Python, OpenCV)

Given the following image: My question is: how can I segment the characters? How can I put each character in its own ROI box? Is this even possible while keeping the characters more or less readable? I found this document and at page 6/11 there is…
lucians
  • 2,239
  • 5
  • 36
  • 64
0
votes
2 answers

Extracting content from documents

I want to extract the content from resumes that have various sections such as skills, certifications, work experience, etc. with NLP and tag each section by category. While I can write basic rules to extract text based on various punctuation marks, it may…
joel
  • 1,156
  • 3
  • 15
  • 42
0
votes
2 answers

onClick function causes “Uncaught SyntaxError: Unexpected token }” error

I have a problem: I'm trying to dynamically add some HTML via JavaScript, and the HTML has a JS function that's supposed to trigger when it's clicked, but I keep getting this error no matter what I do: Uncaught SyntaxError: Unexpected token } Has…
Peniyal Abraham
  • 87
  • 2
  • 12
0
votes
2 answers

Segmentation and Collocation

I am looking for new ideas for two features I am implementing. 1.) Text segmentation feature: Ex: User Query: Resolved Query: ----------- --------------- It has…
starkk92
  • 5,754
  • 9
  • 43
  • 59
0
votes
0 answers

Drawing bounding box using pixel data of image c#

I am trying to draw a bounding box using pixel data. I want to generate a bounding box like the one in the attached image. I tried to draw a line graph and find the box coordinates for each character. I am confused about the implementation. Can anyone kindly guide…
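
For readers landing on this question from the tag page, here is a minimal Python sketch of one common way to get per-character boxes from pixel data: a column projection, which is roughly what the "line graph" in the excerpt suggests. It is not the asker's C# code. The function name `character_boxes` and the 0/1 list-of-rows image format are assumptions made for illustration, and the approach only works when characters are separated by at least one empty column.

```python
def character_boxes(image):
    """Return one (left, top, right, bottom) box per run of inked columns.

    `image` is assumed to be a list of rows, each a list of 0/1 values
    where 1 means ink. Coordinates are inclusive.
    """
    if not image:
        return []
    width = len(image[0])
    # Column projection: does this column contain any ink at all?
    column_has_ink = [any(row[x] for row in image) for x in range(width)]

    boxes = []
    start = None
    # The trailing False closes a run that touches the right edge.
    for x, ink in enumerate(column_has_ink + [False]):
        if ink and start is None:
            start = x  # a new character begins
        elif not ink and start is not None:
            # Rows that contain ink inside the column run [start, x).
            rows = [y for y, row in enumerate(image) if any(row[start:x])]
            boxes.append((start, rows[0], x - 1, rows[-1]))
            start = None
    return boxes


demo = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 1],
]
print(character_boxes(demo))
```

Touching or overlapping characters would be merged into a single box by this sketch, so it is a starting point rather than a full solution.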
0
votes
1 answer

Word Segmentation in MATLAB

Code: img = imread ('G:\Stuff\RP\Database\0001_4.jpg'); %imshow(img); bin_img = imcomplement(im2bw(img, 0.8)); %Binarizing %figure; %imshow(bin_img); bin_img = bwareaopen(bin_img, 50); %for removing dots and commas %%%%%%%% Line Segmentation…
Junaid
  • 941
  • 2
  • 14
  • 38
0
votes
1 answer

How to use StanfordNLP Chinese segmentor in Java?

I have tried the following code, however the code does not work and only outputs null. String text = "我爱北京天安门。"; StanfordCoreNLP pipeline = new StanfordCoreNLP(); Annotation annotation = pipeline.process(text); String result =…
Fred Pym
  • 2,149
  • 1
  • 20
  • 29
0
votes
1 answer

How to extract the text part only from an image using opencv and python?

Here is the image of a water meter reading after preprocessing... But whenever I use Tesseract to recognize the digits, it does not give an appropriate output. So I want to extract/segment out only the digits as a region of interest…
Ankit
  • 1
  • 5
0
votes
1 answer

Word segmentation histogram explanation

I am trying to segment words in a handwritten text line. I am doing this based on a research paper whose word segmentation part is given in the image. I do not understand the quantities for which the histogram is to be made. Histogram for word…
0
votes
1 answer

How to use target label as feature in CRF++?

I'm trying to build a Chinese word segmenter as described in this paper. If I understand it correctly, they use a 2-tag segmentation approach with CRF++. My question is how to express the tag transition in that paper (e.g. T(-1)C(0)T(0)) as a feature template in…
陳乙山
  • 23
  • 3
0
votes
1 answer

UnsupportedClassVersionError from running Stanford Chinese Segmenter

I am getting UnsupportedClassVersionError when running the Stanford Chinese Segmenter. I have seen other posts saying that this results from not updating to the newest Java version. As seen below in the screenshot, I have the latest Java updated on…
YAL
  • 651
  • 2
  • 7
  • 22
0
votes
0 answers

NLP: TypeError: reduce expected at least 2 arguments, got 1

import math, functools def splitPairs(word): return [(word[:i+1], word[i+1:]) for i in range(len(word))] def segment(word): if not word: return [] allSegmentations = [[first] + segment(rest) for (first, rest) in…
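
The excerpt is cut off before the failing call, so the following is only a guess at the cause: `functools.reduce` raises this TypeError when it receives a single argument, since it needs both a function and an iterable. Below is a minimal, self-contained sketch of a probability-based word segmenter that calls `reduce` with both. The names `make_segmenter`, `word_probs`, and `unknown_prob`, and the toy probabilities, are assumptions for illustration and not the asker's code.

```python
import functools
import operator


def split_pairs(word):
    """Return every (first, rest) split with a non-empty first part."""
    return [(word[:i + 1], word[i + 1:]) for i in range(len(word))]


def product(numbers):
    # reduce takes a function AND an iterable; the initial value 1
    # also makes the product of an empty sequence well defined.
    return functools.reduce(operator.mul, numbers, 1)


def make_segmenter(word_probs, unknown_prob=1e-6):
    """Build a segmenter that maximises the product of word probabilities."""

    def score(words):
        return product(word_probs.get(w, unknown_prob) for w in words)

    @functools.lru_cache(maxsize=None)  # memoise repeated suffixes
    def segment(text):
        if not text:
            return ()
        candidates = [(first,) + segment(rest)
                      for first, rest in split_pairs(text)]
        return max(candidates, key=score)

    return segment


segment = make_segmenter({"the": 0.05, "cat": 0.01})
print(segment("thecat"))  # ('the', 'cat')
```

Returning tuples rather than lists keeps the results hashable, which is what lets `lru_cache` memoise the recursion.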
0
votes
2 answers

How to segment text into sub-sentences based on enumerators?

I am segmenting a text into sentences in Python using nltk's PunktSentenceTokenizer(). However, many long sentences are written as enumerations, and in those cases I need to get the sub-sentences. Example: The api allows the user to achieve…
Kristina
  • 3
  • 2
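
As a rough illustration of the idea in this question, the sketch below splits an already-tokenized sentence on parenthesised enumerators such as `(a)` or `(1)` using only the standard `re` module. The marker pattern, the function name `split_on_enumerators`, and the sample sentence are assumptions; the enumeration style in the asker's text is not visible in the excerpt, so the pattern would need adjusting to match it.

```python
import re

# Assumed marker style: a single lowercase letter or a 1-2 digit number
# in parentheses, e.g. "(a)" or "(12)".
ENUMERATOR = re.compile(r"\s*\((?:[a-z]|\d{1,2})\)\s*")


def split_on_enumerators(sentence):
    """Split one sentence into its lead-in and enumerated sub-sentences."""
    parts = [part.strip() for part in ENUMERATOR.split(sentence)]
    return [part for part in parts if part]  # drop empty fragments


sample = "The tool can: (a) read files, (b) parse them, and (c) write reports."
for piece in split_on_enumerators(sample):
    print(piece)
```

One way to use it would be to run the sentence tokenizer first and then apply this function to each sentence, so that enumerators never interfere with sentence boundary detection.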
0
votes
1 answer

How to detect objects in an image based on colour?

I am using a handwriting database for writer recognition. I'm using the QUWI database, it has a sample of an original image and a sample of the image segmented into lines by giving each line a different colour. For example here is the original…
0
votes
1 answer

How to make sequential contour in opencv-python from left to right

After segmenting a handwritten number using contours in opencv-python, it is giving a random output contour. How do I obtain one going sequentially from left to right and from top to bottom? contours,hierarchy =…
Nouman
  • 21
  • 4
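
One way to get a reading-order sequence is to sort the bounding boxes rather than the contours themselves. The sketch below is pure Python and assumes each box is an `(x, y, w, h)` tuple, which is the shape `cv2.boundingRect(contour)` returns. The function name `sort_boxes_reading_order` and the `row_tolerance` parameter are hypothetical, and the tolerance value would need tuning to the line height of the actual image.

```python
def sort_boxes_reading_order(boxes, row_tolerance=10):
    """Sort (x, y, w, h) boxes top to bottom, then left to right.

    Boxes whose top edge lies within `row_tolerance` pixels of the first
    box in the current row are treated as belonging to that row.
    """
    rows = []
    for box in sorted(boxes, key=lambda b: b[1]):  # by top edge
        if rows and abs(box[1] - rows[-1][0][1]) <= row_tolerance:
            rows[-1].append(box)  # same text row
        else:
            rows.append([box])  # start a new row
    # Within each row, order by the left edge.
    return [b for row in rows for b in sorted(row, key=lambda b: b[0])]


boxes = [(50, 12, 10, 20), (10, 10, 10, 20), (30, 60, 10, 20), (5, 62, 10, 20)]
print(sort_boxes_reading_order(boxes))
```

To reorder the contours themselves, the same sort can be applied to `(box, contour)` pairs, keyed on the box.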