
I would like to know the difference between

from sklearn import datasets
dataset = datasets.fetch_mldata("MNIST Original")

and

from sklearn.datasets import load_digits
tempdigits = load_digits()  

How are these two related to the MNIST dataset?

Gayathri

1 Answer


sklearn ships with a few small standard datasets that do not require downloading any file from an external website. load_digits returns about 1,800 samples (1,797 to be exact), each an 8x8 image, taken from the UCI ML dataset:

http://archive.ics.uci.edu/ml/datasets/Optical+Recognition+of+Handwritten+Digits

fetch_mldata downloads the MNIST dataset from http://mldata.org/repository/data/viewslug/mnist-original/, which contains 70,000 samples, each a 28x28-pixel image.

So the two functions load different datasets.
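To see the difference concretely, here is a minimal sketch that inspects the bundled digits data. The second half is an assumption on my part rather than something from the question: fetch_mldata was removed in later scikit-learn releases, and fetch_openml with the dataset name "mnist_784" is the usual substitute for getting the 70,000-sample MNIST. The helper name load_full_mnist is hypothetical, and it is only defined here, not called, because it needs network access.

```python
from sklearn.datasets import load_digits

# Bundled with scikit-learn, so no download is needed.
digits = load_digits()
n_samples, n_features = digits.data.shape   # flattened images
image_shape = digits.images.shape[1:]       # per-image shape
print(n_samples, n_features, image_shape)


def load_full_mnist():
    """Hypothetical helper: fetch the full MNIST from OpenML.

    Assumes a scikit-learn version that provides fetch_openml with
    the as_frame parameter, plus a working network connection.
    """
    from sklearn.datasets import fetch_openml
    # Each row is one 28x28 image flattened to 784 values.
    return fetch_openml("mnist_784", version=1, as_frame=False)
```

Calling load_full_mnist().data.shape should show 784 features per sample (28 x 28), compared with 64 (8 x 8) for load_digits.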

Praveen
  • Are these two different from the standard MNIST dataset? – Gayathri Nov 29 '17 at 11:15
  • The dataset fetched by fetch_mldata is the original MNIST dataset. The dataset fetched by load_digits is a preprocessed version with some modifications. The UCI link describes the changes that were made. – Praveen Nov 29 '17 at 11:22