I am using PHP's preg_match_all, and this is what I have so far:
[A-Za-z+\W]+\s[\d]
The only problem is that I need the \W to not match a double quote (").
So I have tried:
[A-Za-z+[^\dA-Za-z"]\s?]+\s[\d]
[A-Za-z+]\s?+[^A-Za-z\d"]?\s[\d]
among other things, but every variant fails to match and I really can't figure out why.
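One possible explanation, offered as a sketch rather than a definitive diagnosis: PCRE character classes do not nest. In the first attempt, the inner `[` and `^` are taken as literal characters, the class ends at the first `]`, and the trailing `]+` then demands a literal closing bracket. In the default ASCII mode, `[A-Za-z+\W]` is equivalent to "anything except a digit or underscore", so the quote can be excluded with a single negated class. The variable names below are illustrative only.

```php
<?php
// [A-Za-z+\W] (ASCII mode) == [^\d_]; adding " to the negated set excludes it too.
$class = '~^[^\d_"]+$~';

$ok  = preg_match($class, 'Tea Powder + Oil'); // letters, spaces and a plus sign
$bad = preg_match($class, 'Goa". ');           // contains a double quote

// The nested attempt: the class is really [A-Za-z+[^\d"], followed by \s? and a literal ]+
$nested = preg_match('~^[A-Za-z+[^\dA-Za-z"]\s?]+$~', '"]');

var_dump($ok, $bad, $nested); // int(1), int(0), int(1)
```

The last result shows that the nested version still accepts a double quote, which is the opposite of what was intended.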
EDIT:
Here is the entire regex:
([A-Z][a-z]+\s){1,5}\s?[^a-zA-Z\d\s:,.\'\"]\s?
[A-Za-z+\W]+\s[\d]{1,2}\s[A-Z][a-z]+\s[\d]{4}
I split it into two lines; the second line begins with the part I posted above.
Strings I am trying to match (the line marked NOT should be rejected):
India – Adulterated Tea Powder Seized 18 April 2011
India – Importer of Haldiram’s Petha Sweet Cubes Issuing Voluntary Recall 26 April 2011
India – Undeclared Gluten Found in Sweets by Canadian Authorities 27 April 2011
India – Adulteration Found in Edible Oils 28 April 2011
India – Viral Disease Affects Chili Crop in Goa 28 April 2011
NOT ----> Chili – India: Goa”. 8 April 2011
Ivory Coast – Potential Cocoa Quality Decline despite Sufficient Surplus 11 April 2011
Japan – Sanuki Kanzume Co. and Failure to Comply with FDA Standards 27 April 2011
Madagascar – Toxic Sardines 14 April 2011
Madagascar – Update: Toxic Sardines 26 April 2011
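The samples above can be checked with a short script. This is a minimal sketch under two assumptions: the quote in the unwanted line is the curly U+201D shown in the sample (so both it and the straight `"` are excluded), and the input is UTF-8 (hence the `u` modifier). It uses preg_grep instead of preg_match_all purely to test each line separately; `$pattern`, `$lines` and `$matched` are hypothetical names.

```php
<?php
// The full pattern from the post, with [A-Za-z+\W] replaced by a negated class
// that rejects digits, underscore, the straight quote and the curly quote U+201D.
$pattern = '~([A-Z][a-z]+\s){1,5}\s?[^a-zA-Z\d\s:,.\'"]\s?'
         . '[^\d_"\x{201D}]+\s\d{1,2}\s[A-Z][a-z]+\s\d{4}~u';

$lines = [
    'India – Adulterated Tea Powder Seized 18 April 2011',
    'India – Importer of Haldiram’s Petha Sweet Cubes Issuing Voluntary Recall 26 April 2011',
    'Chili – India: Goa”. 8 April 2011', // should NOT match
    'Ivory Coast – Potential Cocoa Quality Decline despite Sufficient Surplus 11 April 2011',
];

// preg_grep keeps the entries that match; array_values reindexes the result.
$matched = array_values(preg_grep($pattern, $lines));
print_r($matched);
```

With these four inputs, only the line containing the curly quote should be filtered out. If the real data uses a straight `"` instead, the same class already covers it.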