Questions tagged [text-extraction]

Text extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents (text).

Text extraction mechanisms vary with the context and the language involved. Approaches range from regular expressions and classifiers to more complex or custom models.
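
As a minimal sketch of the regular-expression end of that range, the Python snippet below pulls date-shaped substrings out of free text. The function name `extract_dates`, the pattern, and the sample sentence are illustrative assumptions, not part of any question on this page. The pattern only checks the shape of the match, not whether the date is valid.

```python
import re
from typing import List

# Illustrative pattern: four digits, two digits, two digits, joined by hyphens.
DATE_PATTERN = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")


def extract_dates(text: str) -> List[str]:
    """Return every YYYY-MM-DD-shaped substring in text, in order of appearance."""
    return DATE_PATTERN.findall(text)


if __name__ == "__main__":
    sample = "Invoice issued 2019-06-14, due 2019-07-14."
    print(extract_dates(sample))
```

A classifier or a custom model becomes worthwhile when the target has no fixed surface shape that a pattern like this can describe.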

1282 questions
-2
votes
1 answer

Extracting a portion of a string in SQL

I am having trouble extracting a portion of a string in SQL. For example, I have to extract from the strings below: VAGGRAWA from PAYPAL *VAGGRAWA 4029357733 VAGGRAWA from PAYPAL *VAGGRAWA8 4029357733 VAGGRAWA from PAYPAL VAGGRAWA8…
-2
votes
1 answer

Get a number after a word pattern in R

I need to get the number after a word in a data table column, for example: y = data.table(status =c( "client rating 01 approved", "John Rating: 2 reproved", "Customer rating9") ) Then, I need to get the number after the word rating and create a new…
-2
votes
1 answer

Extract HTML source code from JavaScript-generated output

I am currently working on a project to find empty classrooms in our school in real time. For that purpose, I need to extract the substitutions published on our school page (https://ssnovohradska.edupage.org/substitution/?), since there might be any…
-2
votes
1 answer

Loops | list | extraction

I'm trying to import some elements from a file, do some work with them, and print [extract to a file in the future] the results in a list. When I break the code down piece by piece, I get all the information that I need, but when I try to extract all info…
-2
votes
1 answer

Extract specific string from URL

I want to extract a substring from this URL: https://s3-ap-southeast-1.amazonaws.com/mtpdm/2019-06-14/12-14/1001_1203_20190614120605_5dd404.jpg I want to extract the 2019-06-14. How do I do that using Java?
Halomaniac
  • 19
  • 5
-2
votes
1 answer

I have a string from which I want to extract both kinship and name

How can I write a regex that will extract "père" and "Tomy" from the following texts? myText = "Qui est le père de Tomy?"; myText = "Qui est le père aimé du Jeune Tomy?"; myText = "Qui est le père du petit de Tomy";
-2
votes
2 answers

Automatic e-mail extraction in Java

How can I scan for potential e-mail addresses in a text file using Java?
kiran
  • 22
  • 1
  • 2
-2
votes
1 answer

Split text lines into words and decide which one is correct based on voting

The following code splits each line into words and stores the first word of each line in one array list, the second word in another array list, and so on. Then it selects the most frequent word from each list as the correct word. Module…
myahia
  • 3
  • 2
-2
votes
1 answer

Regular Expression to extract a text with a variable length

I want to write a regex which will return the first occurrence of a pattern, which might have a variable length, for example: 1J-AB-AO08-F-15 ==> AB; 1P-ABCD-AO08-F-15 ==> ABCD; 1L-KK-KKK-F-1000 ==> KK; 1M-L-AO08L-F-15 ==> L. I referred to some online…
user67339
  • 71
  • 1
  • 7
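
One possible reading of the four samples in the question above is that the wanted value is the segment between the first and second hyphen. That rule is an assumption inferred from the examples, since the question text is truncated. The sketch below is in Python, and the names `SEGMENT_PATTERN` and `second_segment` are hypothetical.

```python
import re
from typing import Optional

# Skip the first hyphen-free run, then capture the next run of non-hyphen
# characters, which must itself be followed by a hyphen.
SEGMENT_PATTERN = re.compile(r"^[^-]+-([^-]+)-")


def second_segment(code: str) -> Optional[str]:
    """Return the text between the first and second hyphen, or None if absent."""
    match = SEGMENT_PATTERN.match(code)
    return match.group(1) if match else None


if __name__ == "__main__":
    for code in ("1J-AB-AO08-F-15", "1P-ABCD-AO08-F-15",
                 "1L-KK-KKK-F-1000", "1M-L-AO08L-F-15"):
        print(code, "==>", second_segment(code))
```

Because `[^-]+` matches any run of non-hyphen characters, the captured length varies with the input and needs no explicit quantifier bounds.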
-2
votes
1 answer

What is the simplest way of extracting text from image?

I am a beginner and am confused about which method I should use to extract text from an image for use in my project.
-2
votes
1 answer

Need a regular expression that will extract first 7 or 8 characters of string that ends with specific characters

First timer here and would greatly appreciate any assistance. Need the regular expression to get the first 7 or 8 characters of variable length strings that end with abcd.com. Example…
-2
votes
5 answers

How to find text between two specific strings in C

I want to extract only strings between and. How can I extract those? Please help. Example: hello world this is a text this is another text Result: hello world this is another text
Erfan
  • 35
  • 2
  • 10
-2
votes
2 answers

bag-of-words approach / tools / library for C++?

I have a folder that contains many .txt documents of tourism reviews. I want to use the bag-of-words approach to convert them to some kind of numeric representation for machine learning (Latent Dirichlet Allocation - LDA) in C++ to train the…
-2
votes
3 answers

Extracting email domain name from whole email address

I want to extract the domain name from the whole email address, as shown in the pictures below. Do you know how I should do it using VBA?
HelloWorld1
  • 13,688
  • 28
  • 82
  • 145
-2
votes
1 answer

Extract zipcodes from a string and build a select statement in PHP

I have a string that I grab from my DB which has town and zipcode information in it. I want to extract the zipcodes (always 5 digits) from it and build a select statement from it (using PHP) as follows: $townZip = 'Boston(02108, 02112, 02116),…
rogerb
  • 251
  • 1
  • 4
  • 11