
I am training an OCR model to recognize the MRZ (machine-readable zone) on passports. To improve its accuracy, I need to train it on as many images as possible. I searched Kaggle for a passport dataset but could not find one.

Can anybody tell me where I can get a passport image dataset that covers passports of almost every country, or at least North and South American passports?

Your help will be much appreciated.

Best, Asma

Asma Imtiaz
  • You can find related datasets through Google Dataset Search, a free search engine covering about 25 million datasets: https://datasetsearch.research.google.com/ – asim Feb 03 '20 at 14:16
  • Thanks @asim. I already checked that and could not find the dataset I need. Could you share the exact link you are referring to? – Asma Imtiaz Feb 03 '20 at 14:35

1 Answer


One such dataset is maintained by Edison TD: http://www.edisontd.net

Edison TD (Travel Documents) is a database of travel documents and other travel-related documents from most countries in the world. The database is developed by the Dutch authorities in cooperation with the authorities in Canada, Australia, USA, United Arab Emirates and Interpol.

Another one is PRADO: https://www.consilium.europa.eu/prado/en/prado-start-page.html

PRADO, a database created by the Council of the European Union, contains information on travel and ID documents and selected security features. The database is maintained by experts of EU countries together with experts from Iceland, Norway and Switzerland. PRADO mainly contains information on ID documents from EU countries but it also includes some countries outside the EU. PRADO is publicly accessible.

As far as I know, there are no other public datasets, because any such dataset would by definition contain personally identifiable data.

These two sources should give you a decent number of samples for training an OCR model. The sample count is still limited, though, so you will probably need to augment the images (for example with small shifts, brightness changes and noise) to get much better results.
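To make the augmentation idea concrete, here is a minimal sketch of what generating variants from a single sample could look like. It is only an illustration under stated assumptions: the image is a grayscale list of pixel rows, and the function names (`augment`, `augment_many`), the parameter values and the toy `sample` image are all hypothetical. It sticks to the standard library so it runs anywhere; in a real pipeline you would more likely use an image library such as Pillow or OpenCV.

```python
import random


def augment(image, rng, max_shift=2, max_brightness=30, noise_std=8.0):
    """Return a randomly perturbed copy of a grayscale image.

    `image` is a list of rows, each a list of ints in 0..255.
    All parameter defaults are illustrative assumptions, not tuned values.
    """
    height, width = len(image), len(image[0])
    # One random translation and brightness offset per generated variant.
    dx = rng.randint(-max_shift, max_shift)
    dy = rng.randint(-max_shift, max_shift)
    brightness = rng.randint(-max_brightness, max_brightness)

    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Clamp source coordinates so shifted-in borders repeat edge pixels.
            src_y = min(max(y - dy, 0), height - 1)
            src_x = min(max(x - dx, 0), width - 1)
            # Add the brightness offset plus per-pixel Gaussian noise.
            value = image[src_y][src_x] + brightness + rng.gauss(0, noise_std)
            # Keep the result a valid 8-bit pixel value.
            row.append(min(max(int(round(value)), 0), 255))
        out.append(row)
    return out


def augment_many(image, copies, seed=0):
    """Generate `copies` variants of `image`, reproducibly for a given seed."""
    rng = random.Random(seed)
    return [augment(image, rng) for _ in range(copies)]


# Toy 8x16 "scan": a dark vertical stripe on a light background.
sample = [[40 if 6 <= x < 10 else 220 for x in range(16)] for _ in range(8)]
variants = augment_many(sample, copies=5)
```

Each call leaves the original image untouched and returns new images of the same size, so you can keep the original labels (the MRZ text) for every variant. Keep the perturbations small enough that the MRZ characters stay legible, otherwise the labels no longer match the pixels.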

Cerovec