This example shows how scikit-learn can be used to recognize images of handwritten digits, from 0 to 9.
The digits dataset consists of 8x8 pixel images of digits. The images attribute of the dataset stores 8x8 arrays of grayscale values for each image. We will use these arrays to visualize the first 4 images. The target attribute of the dataset stores the digit each image represents, and this is included in the title of the 4 plots below.
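A minimal sketch of this step, assuming the dataset is loaded with sklearn.datasets.load_digits and plotted with matplotlib. The figure size, colormap and title text are illustrative choices, not something the text prescribes.

```python
import matplotlib.pyplot as plt
from sklearn import datasets

digits = datasets.load_digits()

# digits.images has shape (n_samples, 8, 8); digits.target holds the labels.
fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, image, label in zip(axes, digits.images, digits.target):
    ax.set_axis_off()
    ax.imshow(image, cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_title(f"Training: {label}")
# Call plt.show() to display the figure in an interactive session.
```

Because zip stops at the shortest input, only the first 4 images are drawn, one per axis.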
Note: if we were working from image files (e.g., ‘png’ files), we would load them using
To apply a classifier on this data, we need to flatten the images, turning each 2-D array of grayscale values from shape (8, 8) into shape (64,). Subsequently, the entire dataset will be of shape (n_samples, n_features), where n_samples is the number of images and n_features is the total number of pixels in each image.
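The flattening can be sketched with a single NumPy reshape. The variable name data is an assumption made for illustration.

```python
from sklearn import datasets

digits = datasets.load_digits()

# Keep the sample axis and collapse each 8x8 image into a 64-element row.
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

print(digits.images.shape)  # (1797, 8, 8)
print(data.shape)           # (1797, 64)
```

Passing -1 lets NumPy infer n_features from the image size, so the same line would work for images of another resolution.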
We can then split the data into train and test subsets and fit a support vector classifier on the train samples. The fitted classifier can subsequently be used to predict the value of the digit for the samples in the test subset.
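One possible way to write the split, fit and predict steps is shown below. The hyperparameter gamma=0.001, the 50/50 split and shuffle=False are assumptions chosen for illustration; the text above does not specify them.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))

# Assumed settings: first half for training, second half for testing.
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)

# Support vector classifier; gamma=0.001 is an assumed value.
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)

# Predict the digit for every sample in the test subset.
predicted = clf.predict(X_test)
```

With shuffle=False the split is deterministic, so repeated runs give the same predictions.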
Below we visualize the first 4 test samples and show their predicted digit value in the title.
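A sketch of that visualization follows. It repeats the setup so that it runs on its own, with the same assumed settings (gamma=0.001, 50/50 split, no shuffling). Each flattened test sample is reshaped back to 8x8 before plotting.

```python
import matplotlib.pyplot as plt
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)
predicted = svm.SVC(gamma=0.001).fit(X_train, y_train).predict(X_test)

fig, axes = plt.subplots(nrows=1, ncols=4, figsize=(10, 3))
for ax, sample, prediction in zip(axes, X_test, predicted):
    ax.set_axis_off()
    # Undo the flattening so the sample can be drawn as an image.
    ax.imshow(sample.reshape(8, 8), cmap=plt.cm.gray_r, interpolation="nearest")
    ax.set_title(f"Prediction: {prediction}")
# Call plt.show() to display the figure in an interactive session.
```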
classification_report builds a text report showing the main classification metrics.
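As a sketch, the report can be produced from the true and predicted labels. The classifier settings are the same assumed values as before (gamma=0.001, 50/50 split, no shuffling).

```python
from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

digits = datasets.load_digits()
data = digits.images.reshape((len(digits.images), -1))
X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False
)
predicted = svm.SVC(gamma=0.001).fit(X_train, y_train).predict(X_test)

# Per-class precision, recall, F1-score and support, plus averages.
report = metrics.classification_report(y_test, predicted)
print(report)
```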
We can also plot a confusion matrix of the true digit values and the predicted digit values.
“I am thankful to mentors at https://internship.suvenconsultants.com for providing awesome problem statements and giving many of us a coding internship experience. Thank you www.suvenconsultants.com”