Here is my code:

main.py

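import fitz  # PyMuPDF
import spacy

# Raw string so the backslashes in the Windows path are not treated as escapes
location = r"D:\python\Resume-Sample.pdf"
text = ''

with fitz.open(location) as doc:
    for page in doc:
        # "text" returns the page content as a plain string
        text += page.get_text("text")

# Load the pretrained English pipeline once, outside the file-handling block
NER = spacy.load("en_core_web_sm")

text1 = NER(text)
for word in text1.ents:
    print(word.text, word.label_)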

Result

:Abdul Moeez :E-mail- amoeez14@gmail.com : Phone +1111111111 : Address Karachi, Sindh, Pakistan

How can I build and train a model so that it recognizes Name, Email, Phone, and Address?

1 Answer

You need to train a custom spaCy model for this. See the link below for training resources: https://spacy.io/universe/category/training/

For emails and phone numbers, a regex approach using the pyperclip library is described here: https://medium.com/@branzoldecode/phone-number-and-email-extractor-with-python-c88f88b42a8a

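import re
import pyperclip

# Read the text to search from the clipboard
Alltext = pyperclip.paste()
# Character ranges need a plain hyphen (0-9), not an en dash
EmailRegex = re.compile(r'[a-zA-Z0-9_.+-]+@[a-zA-Z0-9.-]+')
result = EmailRegex.findall(Alltext)
# pyperclip.copy() expects a string, so join the list of matches
pyperclip.copy('\n'.join(result))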

This is just an example of finding specific fields; you need to implement your own logic to read the file first.
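To make the regex idea concrete without the clipboard, here is a minimal sketch that runs directly on a string such as the text extracted from the PDF. The function name `extract_contacts` and both patterns are my own assumptions for illustration, not part of spaCy or the linked article, and real-world email and phone formats vary more than these patterns allow. Names and addresses do not follow a fixed pattern, so they are better left to an NER model.

```python
import re

# Assumed patterns, for illustration only; adjust them to the resumes you expect.
EMAIL_RE = re.compile(r"[A-Za-z0-9_.+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")
# An optional "+", then at least 9 digits, allowing spaces, parentheses and hyphens in between.
PHONE_RE = re.compile(r"\+?\d[\d\s()-]{7,}\d")


def extract_contacts(text):
    """Return the email addresses and phone numbers found in text (hypothetical helper)."""
    return {
        "emails": EMAIL_RE.findall(text),
        "phones": PHONE_RE.findall(text),
    }


# Sample text taken from the result shown in the question
sample = (
    ":Abdul Moeez :E-mail- amoeez14@gmail.com "
    ": Phone +1111111111 : Address Karachi, Sindh, Pakistan"
)
contacts = extract_contacts(sample)
print(contacts["emails"])  # ['amoeez14@gmail.com']
print(contacts["phones"])  # ['+1111111111']
```

In the question's script, you could pass the `text` variable built from the PDF pages to this function instead of `sample`.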

azhar