I have an image with mostly light text on a dark background. Some of the text is a darker color (purple).
I'm using opencv-python to manipulate the image for better OCR parsing.
There is a little more processing that happens before this, but I believe the steps giving me trouble are as follows.
The image gets converted to grayscale:
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
The image then gets inverted (this seems to keep the final text clearer):
img = cv2.bitwise_not(img)
The image then gets run through an Otsu threshold:
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
You can see I'm totally losing the darker text. Switching to an adaptive threshold does preserve that text better, but it creates a ton of noise (the background looks flat black but is not).
Any thoughts on how I can modify my current thresholding to preserve that darker text?