Having fun with the MNIST database

The MNIST database of handwritten digits, available from this page, has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger set available from NIST.

It consists of four files that store data in a simple file format (the idx format), which is documented on the MNIST database homepage. I wrote this C program to extract the descriptions of handwritten digits of a given value separately (e.g. only the 0s or only the 1s). The text output is pretty simple and closely resembles the format of a PGM file: it describes the grayscale pixel values of the image in decimal ASCII. Pixel values range from 0 to 255.
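
To give an idea of what reading the idx format involves, here is a minimal sketch of parsing the header of an image file. It is not taken from mnist.c; the names `idx_image_header`, `be32` and `parse_idx_image_header` are made up for this example. It assumes the layout as I read it in the format description: four 32-bit big-endian integers (magic number 0x00000803, image count, rows, columns), followed by the raw pixel bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct for this sketch; not part of mnist.c. */
typedef struct {
    uint32_t magic;  /* 0x00000803 for an image file */
    uint32_t count;  /* number of images */
    uint32_t rows;   /* rows per image */
    uint32_t cols;   /* columns per image */
} idx_image_header;

/* Decode a 32-bit big-endian integer from four bytes. */
uint32_t be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Parse the 16-byte header of an idx image file held in buf.
 * Returns 0 on success, -1 if the buffer is too short or the
 * magic number is not the one assumed for image files. */
int parse_idx_image_header(const unsigned char *buf, size_t len,
                           idx_image_header *out)
{
    assert(buf != NULL && out != NULL);
    if (len < 16)
        return -1;
    out->magic = be32(buf);
    if (out->magic != 0x00000803u)
        return -1;
    out->count = be32(buf + 4);
    out->rows  = be32(buf + 8);
    out->cols  = be32(buf + 12);
    return 0;
}
```

Decoding byte by byte keeps the sketch independent of the host machine's endianness. The pixel data for image number i would then start at byte offset 16 + i * rows * cols.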

As a bonus, the program can also write the extracted images as portable graymap (PGM) image files.
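
For reference, writing a plain-text (P2) PGM takes only a few lines. The sketch below is an illustration under my own assumptions, not the code from mnist.c: `write_pgm_ascii` is a hypothetical name, and I am assuming the plain (ASCII) PGM variant with a maximum gray value of 255, which matches the decimal ASCII output described above.

```c
#include <assert.h>
#include <stdio.h>

/* Write one grayscale image as a plain-text (P2) PGM.
 * pixels holds width * height bytes in row-major order.
 * Returns 0 on success, -1 on a write error.
 * Hypothetical helper for this sketch; not part of mnist.c. */
int write_pgm_ascii(FILE *out, const unsigned char *pixels,
                    unsigned width, unsigned height)
{
    assert(out != NULL && pixels != NULL);

    /* Header: magic "P2", image dimensions, maximum gray value. */
    if (fprintf(out, "P2\n%u %u\n255\n", width, height) < 0)
        return -1;

    for (unsigned y = 0; y < height; y++) {
        for (unsigned x = 0; x < width; x++) {
            /* Break the line at the end of each row, and also after
             * every 16 values so that no line exceeds 70 characters. */
            int end_of_line = (x + 1 == width) || ((x + 1) % 16 == 0);
            unsigned value = pixels[(size_t)y * width + x];
            if (fprintf(out, "%u%s", value, end_of_line ? "\n" : " ") < 0)
                return -1;
        }
    }
    return 0;
}
```

PGM readers treat any whitespace as a separator, so the extra line breaks inside wide rows do not change the image.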

This program was written with these guys mostly in mind.

[download mnist.c]

4 thoughts on “Having fun with the MNIST database”

  1. OK! We looked at the code, downloaded the files from MNIST, and looked at the Wikipedia page too. What happens from there on?

  2. You take the data you are interested in (the 0s and the 1s), load it into MatLab ...and do the exercise.
