I extracted text from a PDF to a .txt file using pdfplumber, but the output contains extra spaces between words that are not in the PDF file.
I tried using NLTK to find them: I join the previous word and the current word ("previous_word" + "current_word") and check whether the combination exists in NLTK's words corpus, to locate where an extra space was inserted. This is not working well.
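A minimal sketch of the kind of check described above, for illustration only. The function name `merge_split_words` and the tiny vocabulary are hypothetical; in practice the vocabulary could be built from `nltk.corpus.words.words()`. The guard that skips a merge when both halves are already known words is an added assumption, meant to avoid joining pairs such as "in" + "to".

```python
def merge_split_words(tokens, vocabulary):
    """Join adjacent tokens when the joined form is a known word.

    tokens: list of whitespace-separated strings from the extracted text.
    vocabulary: set of lowercase known words (hypothetical stand-in for
    a set built from nltk.corpus.words.words()).
    """
    merged = []
    i = 0
    while i < len(tokens):
        if i + 1 < len(tokens):
            left, right = tokens[i], tokens[i + 1]
            candidate = left + right
            # Assumption: if both halves are valid words on their own,
            # the space is probably genuine, so leave them apart.
            both_known = left.lower() in vocabulary and right.lower() in vocabulary
            if candidate.lower() in vocabulary and not both_known:
                merged.append(candidate)
                i += 2  # consume both tokens
                continue
        merged.append(tokens[i])
        i += 1
    return merged


# Tiny illustrative vocabulary; a real one would be much larger.
vocab = {"the", "document", "was", "extracted", "in", "to", "into"}
print(merge_split_words("the docu ment was extr acted".split(), vocab))
```

This only looks at pairs of tokens and ignores punctuation, so a word split into three pieces, or one followed by a comma, would not be repaired.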
I am looking for suggestions. Thanks.