Machine Learning Projects
03 Jun 2017
TensorFlow - Image Classification
An image classification program that uses transfer learning from Inception v3 (trained for the ImageNet Challenge) to classify a given image as a daisy, sunflower, dandelion, tulip or rose. I used TensorFlow and Python for this project. It achieves an accuracy of ~90%. Link to code.
Image compression
This project uses the k-means clustering algorithm. It scans an image and reproduces it using only 16 representative colors (the centroids of the color clusters found in the image). Every other color is replaced by whichever of those 16 colors it is closest to. The compressed image is 7 KB smaller, a 64% reduction in file size.
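The color-reduction step can be sketched in plain Python as follows. This is a minimal illustration, not the project's actual code: the function name `kmeans_quantize`, the fixed iteration count and the random initialization from sampled pixels are all assumptions, and pixels are assumed to be `(r, g, b)` tuples.

```python
import random

def kmeans_quantize(pixels, k=16, iterations=10, seed=0):
    """Map every RGB pixel to the nearest of k centroid colors."""
    rng = random.Random(seed)
    # Assumption: start from k pixels sampled from the image itself.
    centroids = [tuple(float(c) for c in p) for p in rng.sample(pixels, k)]

    def nearest(p):
        # Index of the centroid with the smallest squared RGB distance.
        return min(range(k),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

    for _ in range(iterations):
        labels = [nearest(p) for p in pixels]
        for i in range(k):
            members = [p for p, label in zip(pixels, labels) if label == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))

    labels = [nearest(p) for p in pixels]
    palette = [tuple(round(c) for c in centroid) for centroid in centroids]
    return [palette[label] for label in labels], palette
```

With 16 colors, each pixel can be stored as a 4-bit palette index instead of a 24-bit RGB value, which is where the size reduction comes from. The actual savings depend on the image file format.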
NLP - Text Classifier
The program tries to find the most common questions customers ask by email. It takes a whole mailbox and filters it down to the statements sent by customers. Then it classifies each statement as Accept, Bye, Clarify, Continuer, Emotion, Emphasis, Greet, No Answer, Other, Reject, Statement, System, Wh-Question, Yes Answer or Yes/No Question.
It uses a naive Bayes classifier, trained on the NPS Chat Corpus.
On the training set, it reached 63% classification accuracy. That may seem like a lot, but the results were mostly unusable. Link to code.
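For readers unfamiliar with naive Bayes, here is a minimal bag-of-words sketch with add-one smoothing. It is not the project's code: the helper names `train_naive_bayes` and `classify` are hypothetical, the whitespace tokenizer is a simplification, and the toy sentences stand in for the NPS Chat Corpus.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (text, label). Returns the counts needed to classify."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return label_counts, word_counts, vocab

def classify(model, text):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy data for illustration only (not taken from the NPS Chat Corpus).
toy_examples = [
    ("hello there", "Greet"),
    ("hi everyone", "Greet"),
    ("bye for now", "Bye"),
    ("see you bye", "Bye"),
    ("where is my order", "Wh-Question"),
    ("what is the price", "Wh-Question"),
]
toy_model = train_naive_bayes(toy_examples)
```

Calling `classify(toy_model, "what is my price")` picks the label whose training sentences share the most words with the input. A classifier trained on chat data sees vocabulary quite different from customer email, which may be one reason the results were hard to use.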
SVM - Spam Classifier
This program uses support vector machines (SVMs) to build a spam classifier. It achieves 98% accuracy.
Neural Network - MNIST Dataset
The classic ML project. I built a neural network from scratch that recognizes handwritten digits from the MNIST data set. It got 97.52% accuracy on the training set.
Backpropagation for Neural Network
Expanding on the Neural Network project, I added backpropagation from scratch to the neural network.
‘Yes you should understand backprop’ by Andrej Karpathy
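The backward pass can be illustrated with a small sketch. It assumes one hidden layer, sigmoid activations, a single output, a cross-entropy loss and no bias terms. These choices, along with the function names, are assumptions made to keep the example short and are not a description of the project's network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w1, w2, x):
    """Hidden activations and output for one example."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    output = sigmoid(sum(w * h for w, h in zip(w2, hidden)))
    return hidden, output

def loss(w1, w2, x, y):
    """Cross-entropy loss for one example with label y in {0, 1}."""
    _, output = forward(w1, w2, x)
    return -(y * math.log(output) + (1 - y) * math.log(1 - output))

def backprop(w1, w2, x, y):
    """Gradients of the loss with respect to w1 and w2."""
    hidden, output = forward(w1, w2, x)
    # For sigmoid output + cross-entropy, the output error is (output - y).
    delta_out = output - y
    grad_w2 = [delta_out * h for h in hidden]
    # Push the error back through w2 and the hidden sigmoid derivative.
    delta_hidden = [delta_out * w * h * (1 - h) for w, h in zip(w2, hidden)]
    grad_w1 = [[d * xi for xi in x] for d in delta_hidden]
    return grad_w1, grad_w2
```

A common way to check a hand-written backward pass is to compare each gradient against a finite-difference estimate, `(loss(w + eps) - loss(w - eps)) / (2 * eps)`, for a small `eps`.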
Logistic Regression
A simple introductory project to get acquainted with the fundamentals of machine learning through logistic regression. The program computes the logistic regression cost (regularized and non-regularized), its gradients (regularized and non-regularized) and a prediction function. It achieves 89% prediction accuracy without regularization and 83% with it.
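The three pieces named above (cost, gradient, prediction) can be sketched as follows. This is an illustrative version under stated assumptions: each row of `X` starts with a 1 for the intercept, the intercept `theta[0]` is not regularized, `lam` is the regularization strength, and predictions use a 0.5 threshold. The function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def cost_and_gradient(theta, X, y, lam=0.0):
    """Logistic regression cost and gradient; lam=0 gives the non-regularized case."""
    m = len(y)
    preds = [sigmoid(sum(t * xi for t, xi in zip(theta, row))) for row in X]
    cost = -sum(yi * math.log(p) + (1 - yi) * math.log(1 - p)
                for yi, p in zip(y, preds)) / m
    # L2 penalty, skipping the intercept term theta[0].
    cost += lam / (2 * m) * sum(t ** 2 for t in theta[1:])
    grad = [sum((p - yi) * row[j] for p, yi, row in zip(preds, y, X)) / m
            for j in range(len(theta))]
    for j in range(1, len(theta)):
        grad[j] += lam / m * theta[j]
    return cost, grad

def predict(theta, X):
    """Label 1 when the predicted probability is at least 0.5."""
    return [1 if sigmoid(sum(t * xi for t, xi in zip(theta, row))) >= 0.5 else 0
            for row in X]
```

Setting `lam=0.0` reproduces the non-regularized cost and gradient, so one function covers both variants.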
Anomaly Detection & Recommender Systems
In this project, I implemented an anomaly detection algorithm and applied it to detect failing servers on a network. Next, I used collaborative filtering to build a recommender system for movies. The anomaly detection system estimates a Gaussian fit and highlights anomalies (servers which failed). It finds the optimal threshold value for an anomaly using a cross-validation set, scoring each candidate threshold by its F1 score (which combines precision and recall). The movie recommender system computes recommended movies based on other users' reviews. It computes gradients for each movie's features and each user's preferences based on their reviews.
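The anomaly detection part can be sketched as follows. This is a minimal illustration under assumptions: features are treated as independent (one mean and variance per feature), an example is flagged when its density falls below the threshold, and the threshold search is a simple linear scan. The function names and the `steps` parameter are hypothetical.

```python
import math

def fit_gaussian(X):
    """Per-feature mean and variance of the training examples."""
    m = len(X)
    columns = list(zip(*X))
    mu = [sum(col) / m for col in columns]
    var = [sum((v - mu_j) ** 2 for v in col) / m for col, mu_j in zip(columns, mu)]
    return mu, var

def density(x, mu, var):
    """Product of the univariate Gaussian densities of each feature."""
    p = 1.0
    for v, mu_j, var_j in zip(x, mu, var):
        p *= math.exp(-(v - mu_j) ** 2 / (2 * var_j)) / math.sqrt(2 * math.pi * var_j)
    return p

def select_threshold(p_val, y_val, steps=1000):
    """Scan candidate thresholds and keep the one with the best F1 score."""
    best_eps, best_f1 = 0.0, 0.0
    lo, hi = min(p_val), max(p_val)
    for i in range(1, steps):
        eps = lo + (hi - lo) * i / steps
        flagged = [p < eps for p in p_val]
        tp = sum(1 for f, y in zip(flagged, y_val) if f and y == 1)
        fp = sum(1 for f, y in zip(flagged, y_val) if f and y == 0)
        fn = sum(1 for f, y in zip(flagged, y_val) if not f and y == 1)
        if tp == 0:
            continue  # precision and recall are undefined or zero
        precision, recall = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_eps, best_f1 = eps, f1
    return best_eps, best_f1
```

In use, `fit_gaussian` runs on the training examples, `density` scores the labeled cross-validation examples, and `select_threshold` picks the cutoff from those scores and labels.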