Questions tagged [text-extraction]

Text extraction is the task of automatically extracting structured information from unstructured and/or semi-structured machine-readable documents (text).

Text extraction mechanisms vary with the context and the language involved. Approaches range from regular expressions to classifiers to more complex or custom models.
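
As a rough illustration of the simplest end of that range, here is a minimal regular-expression sketch in Python. The sample text, the pattern, and the function name `extract_at_numbers` are assumptions made up for this example, not part of any particular question or library.

```python
import re
from typing import List

# Illustrative pattern: one or more digits directly after an '@' sign.
AT_NUMBER = re.compile(r"@(\d+)")


def extract_at_numbers(text: str) -> List[int]:
    """Return every integer that immediately follows an '@' in text."""
    # findall returns the captured digit groups as strings, in order.
    return [int(digits) for digits in AT_NUMBER.findall(text)]


if __name__ == "__main__":
    # Hypothetical input, chosen only to exercise the pattern.
    sample = "@3115 Hello this is a test post. @115 Test quote"
    print(extract_at_numbers(sample))
```

A fixed pattern like this only works when the target has a predictable shape; inputs with varied phrasing are where classifiers or custom models tend to come in.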

1282 questions
-3
votes
1 answer

Match string and print output specified field side by side for multiple files

I'm new to programming so I might need explanation for each step and I have an issue: Say I have these (tab delimited) files: genelist.txt contains: start_position end_position description 1 840 putative replication protein 1839 2030 …
-3
votes
2 answers

Get text after 'x' and before 'y' in Excel

I have a source .csv file which is non-editable, I'd like to extract the quantity that is enclosed within a particular text. I've tried using MID with a combination of LEFT and RIGHT. However, I've only been able to get either of them (before /…
-3
votes
3 answers

Get numbers that immediately follow @ symbol in string

Imagine I have a textarea with the following value. @3115 Hello this is a test post. @115 Test quote I'm trying to find a way in PHP using regex that will get the numeric value that comes after the '@' symbol even if there's multiple symbols. I…
Tony
  • 351
  • 3
  • 19
-3
votes
1 answer

machine learning: keyword extraction from list of files

I have a list of PDF files with different numbers of pages and different layouts. Each file contains a list of information that I need to extract, but the problem is that the information is wrapped in different types of phrases and syntax. I need to…
abderr080
  • 13
  • 1
  • 3
-3
votes
1 answer

How can i extract keyword from pdf file asp.net c#?

I have a CV in PDF format and I want to extract keywords using NLP (natural language processing). I have attached images, but I don't know how to do it. I'm a beginner, please help me. Thanks
-3
votes
1 answer

script to parse a text file

Any help is greatly appreciated. I have a Cisco voice gateway that I connect to with SSH and can send a command to get all the current calls on the gateway. I'm trying to automate this so that I can pull out this information and display it on a…
-3
votes
1 answer

How do I extract data from pptx file using Apache POI?

I am using XSLFPowerPointExtractor to extract text from a pptx file. However, all the text in the pptx file is returned to me in a single string. Is there any way I can get the text on each slide separately? I am completely new to this concept, so…
-3
votes
1 answer

Extraction text line

I need to create an iOS app that lets the user capture a page, automatically detects the text lines, and then extracts each line as a new image. Example: an image containing 4 lines of text becomes 4 images after processing, and each image contains text…
-3
votes
1 answer

extracting data from formatted string (python)

I have a string like this: (63, 166) - (576, 366) I need to extract the values out so that I have: x1 = 63 y1 = 166 x2 = 576 y2 = 366 I can easily use the split() function and save the results in temporary arrays and then further split them…
user961627
  • 12,379
  • 42
  • 136
  • 210
-3
votes
4 answers

Extract numbers from a textbox

I have a textbox containing letters, numbers, and other symbols that you can find on your keyboard. I have this code, which works fine when I enter the data manually: it only lets me enter numbers and deletes letters. Everything that I…
user3290171
  • 121
  • 1
  • 3
  • 19
-4
votes
0 answers

Extra spaces, extra new line characters and unable to identify the headers, which are bold, while reading the pdf from python

Hi everyone, I have a PDF file which has some bold side headings (visually bold, not capital letters). The paragraphs in between the headings are considered as the sections. I am searching for a particular word in the PDF. If any section has that…
-4
votes
2 answers

Extract part of the content in a Google Sheets cell

I have this content in a cell and I would like to extract just the word overview in this case, but it could be any word between / and ?. What's the best way for…
-4
votes
1 answer

How can i compare if a text line starts with a word in Python?

I have a key text "Outros" and a text file I'm reading and enumerating. I need to print only the lines that start with that key. Is it possible? I'm already printing lines, but it prints every line that has this key inside the line, not only the…
Vitor Vito
  • 35
  • 5
-4
votes
3 answers

Get text that occurs after a number in a string

I have: $string = "Some minimal or large text 820 some minimal or large descr"; I need: some minimal or large descr
tsla
  • 1
  • 2
-4
votes
1 answer

Best text extractor for android

I am new to Android. Please tell me which framework/library is best for extracting text from an image. Thank you.
Suresh S
  • 49
  • 10