Experienced ML Engineer with a Springer-published paper and business-scaling expertise, driven by the goal of contributing to humanity.
I'm an experienced Artificial Intelligence Engineer with a proven track record of building AI for startups, leading machine learning teams, and developing innovative AI solutions. Alongside a strong commitment to contributing to humanity, I have also contributed to the research community with a paper published in a Springer journal. My expertise lies in AI, computer vision, NLP, and data science.
In this research paper, I introduced a convolutional neural network model named Flynet to address the challenge of automatic building detection in high-resolution satellite images. Existing methods are often time-consuming and incomplete because of the complexity of the visual features and the presence of other objects in the images. Flynet is designed with an encoder-decoder architecture, incorporating improvements that make it faster, lighter, and more accurate. The experimental results demonstrate that it outperforms U-Net, providing more accurate predictions while being three times faster and 70% smaller in size. Through this research paper, my aim was not only to contribute to the development of state-of-the-art algorithms for satellite image analysis but also to open new possibilities for real-world applications in remote sensing, urban planning, and disaster management.
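For illustration, a minimal encoder-decoder segmentation network in Keras might look like the sketch below; the layer counts, filter sizes, and input resolution are readability assumptions, not the published Flynet configuration.

    # Minimal encoder-decoder segmentation sketch (illustrative only; not the
    # published Flynet configuration).
    from tensorflow.keras import layers, Model

    def build_encoder_decoder(input_shape=(256, 256, 3)):
        inputs = layers.Input(shape=input_shape)

        # Encoder: downsample while increasing channels
        x = layers.Conv2D(32, 3, padding="same", activation="relu")(inputs)
        skip = x
        x = layers.MaxPooling2D(2)(x)
        x = layers.Conv2D(64, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(2)(x)

        # Bottleneck
        x = layers.Conv2D(128, 3, padding="same", activation="relu")(x)

        # Decoder: upsample back to the input resolution
        x = layers.Conv2DTranspose(64, 3, strides=2, padding="same", activation="relu")(x)
        x = layers.Conv2DTranspose(32, 3, strides=2, padding="same", activation="relu")(x)
        x = layers.Concatenate()([x, skip])  # skip connection preserves spatial detail

        # One-channel mask: building vs. background
        outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
        return Model(inputs, outputs)

    model = build_encoder_decoder()
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])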
Illustration of predictions by the models on the validation dataset. Columns from left to right: raw satellite image, corresponding ground truth, prediction by the proposed model, and prediction by the U-Net model.
January 2022 - October 2023
1 yr 9 mos
Aftershoot is a software startup specializing in computer vision-powered solutions for photographers.
Played a key role as the first full-time employee in scaling the business to over 2M ARR and growing the ML team to 10 people.
Led the machine learning team and built computer vision solutions based on object detection, classification, segmentation, object recognition, and statistics-driven algorithms.
Developed a face detection algorithm achieving >99% accuracy, even on images with 6K resolution.
Trained new ML models with improved accuracy and faster inference times, leading to increased customer satisfaction and improved product performance.
June 2021 - January 2022
8 mos
Handled the development of build infrastructure and streamlined the CI/CD pipeline to ensure faster and more efficient product delivery.
Deployed a machine learning model to accurately predict ticket build time, leading to increased developer efficiency.
Designed and implemented an internal chatbot to instantly resolve queries from developers, improving productivity and collaboration within the team.
October 2018 - October 2019
1 yr
Oversaw machine, parts, and manpower planning, leading to significant improvements in efficiency and productivity.
Automated MIS reporting in Excel, reducing manual effort by more than 1 hour and improving accuracy.
Redesigned the coolant return tank of a machine, eliminating waiting and processing waste on the production line and achieving annual cost savings of Rs 100k.
Sept. 2020 – May 2022
2 yrs
Aug 2014 – May 2018
4 yrs
Problem Statement:
The project involved developing a face detection machine learning algorithm for high-resolution images that achieved an accuracy rate of over 99%. High-resolution images contain a substantial amount of detail, and the faces in them can be very small, making it harder for a face detection algorithm to accurately identify faces amidst the abundance of visual information.
Solution:
To address the challenges of developing a high-resolution face detection algorithm in TensorFlow, I took a comprehensive approach.
Dataset Preparation:
To prepare the dataset for training, I wrote a few Python scripts. I first divided each high-resolution image into 16 overlapping parts, ensuring that every possible face was covered. I then applied an open-source face detection model to these image segments. This process resulted in a well-annotated and diverse dataset, crucial for training a robust model.
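A minimal sketch of that tiling step is shown below, assuming a 4x4 grid with a 10% overlap between neighbouring tiles (the exact overlap used in the project is not specified).

    # Sketch: split a high-resolution image into a 4x4 grid of overlapping tiles
    # so faces near tile borders are not cut off. The 10% overlap is an assumption.
    import numpy as np
    from PIL import Image

    def tile_image(path, grid=4, overlap=0.1):
        img = np.array(Image.open(path))
        h, w = img.shape[:2]
        tile_h, tile_w = h // grid, w // grid
        pad_h, pad_w = int(tile_h * overlap), int(tile_w * overlap)

        tiles = []
        for row in range(grid):
            for col in range(grid):
                top = max(row * tile_h - pad_h, 0)
                left = max(col * tile_w - pad_w, 0)
                bottom = min((row + 1) * tile_h + pad_h, h)
                right = min((col + 1) * tile_w + pad_w, w)
                # keep the offset so detected boxes can be mapped back to the full image
                tiles.append((img[top:bottom, left:right], (left, top)))
        return tiles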
Model Architecture:
Afterward, I crafted the model architecture from scratch and trained the model on this dataset. This custom architecture allowed fine-tuning the algorithm to perform optimally on high-resolution images with tiny faces, a challenging task in itself.
Inferencing Optimization:
I also optimized the inferencing process by dividing the input image into 4 parts, ensuring efficient and accurate detection of faces during application usage.
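A minimal sketch of that inference-time split is shown below; detect_faces() is a hypothetical wrapper around the trained detector, not an actual API from the project.

    # Sketch: run detection on four quadrants and map the boxes back to
    # full-image coordinates. detect_faces() is a hypothetical stand-in.
    def detect_on_quadrants(img, detect_faces):
        h, w = img.shape[:2]
        boxes = []
        for top in (0, h // 2):
            for left in (0, w // 2):
                quad = img[top:top + h // 2, left:left + w // 2]
                for (x1, y1, x2, y2, score) in detect_faces(quad):
                    boxes.append((x1 + left, y1 + top, x2 + left, y2 + top, score))
        return boxes  # optionally follow with non-max suppression to merge duplicates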
The combination of these steps resulted in an exceptional high-resolution face detection algorithm, which not
only achieved an accuracy rate of over 99% but also significantly improved the overall speed and accuracy of
face detection in real-world scenarios.
Impact:
This project led to rapid growth for the company, as we received significantly fewer complaints about face
detection. Furthermore, people began recommending the product to others due to the enhanced accuracy and
reliability of the face detection algorithm.
Problem Statement:
The task was to classify images into three categories: 'kiss,' 'almost kiss,' and 'no kiss.' The primary challenge was the high resolution of the images, averaging around 6K pixels. This high resolution made it difficult for machine learning models to achieve accurate results, particularly because the kiss event typically occurred in a very small region of the entire image. The initial model had an accuracy of only 24%, lower than what a random model would provide (33% in this case).
Solution:
To address these challenges, I adopted a multi-step approach. First, I leveraged a face detection model's predictions to identify the region of interest within the image. Next, I implemented a smart cropping algorithm to extract and prepare the input for my machine learning model. I then trained a Convolutional Neural Network (CNN) on these cropped images.
One additional challenge was class imbalance in the dataset: I had a limited number of kiss images (15k) compared to a much larger number of 'almost kiss' and 'no kiss' images (approximately 100k each). To mitigate this imbalance, I employed weighted loss functions and down-sampled the dataset.
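A minimal sketch of the cropping and class-weighting ideas is shown below; the crop margin, class order, and weight formula are illustrative assumptions, not the exact production values.

    # Sketch: crop around detected faces and weight the loss toward the rare
    # 'kiss' class. Margin, class order, and weights are illustrative assumptions.
    def crop_around_faces(img, face_boxes, margin=0.3):
        # Union of the face boxes, expanded by a margin, so the classifier
        # only sees the region where a kiss could occur.
        x1 = min(b[0] for b in face_boxes)
        y1 = min(b[1] for b in face_boxes)
        x2 = max(b[2] for b in face_boxes)
        y2 = max(b[3] for b in face_boxes)
        dx, dy = int((x2 - x1) * margin), int((y2 - y1) * margin)
        h, w = img.shape[:2]
        return img[max(y1 - dy, 0):min(y2 + dy, h), max(x1 - dx, 0):min(x2 + dx, w)]

    # Inverse-frequency class weights for Keras (0: kiss, 1: almost kiss, 2: no kiss)
    counts = {0: 15_000, 1: 100_000, 2: 100_000}
    total = sum(counts.values())
    class_weight = {c: total / (len(counts) * n) for c, n in counts.items()}
    # model.fit(train_ds, epochs=20, class_weight=class_weight)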
Impact:
This approach yielded significant improvements. The model's accuracy increased to approximately 80%, a substantial improvement over the initial 24%. Additionally, the preprocessing step significantly improved inference speed; as a result, the overall speed of the application increased by 10%, leading to higher customer satisfaction.
Problem Statement:
The project's primary goal was to assess the composition quality of an image, specifically in terms of adherence to the rule of thirds and accurate identification of the subject's position within the image.
Solution:
To address this challenge, I employed the YOLO machine learning model to detect and locate the subjects within
the images. YOLO is known for its real-time object detection capabilities, making it suitable for identifying
the subjects swiftly and accurately.
Subsequently, I developed an algorithm that assigned a composition score to each image based on two key factors: the subject's position within the image and adherence to the rule of thirds. The algorithm considered the subject's placement relative to the rule-of-thirds grid and evaluated how well the image composition aligned with this fundamental principle of visual design.
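A minimal sketch of such a scoring function is shown below; the exponential falloff and its constant are illustrative assumptions rather than the exact production formula.

    # Sketch: score how close the subject centre is to the nearest rule-of-thirds
    # intersection. The exponential falloff is an illustrative choice.
    import math

    def composition_score(box, img_w, img_h):
        x1, y1, x2, y2 = box  # subject bounding box from the detector
        cx, cy = (x1 + x2) / 2 / img_w, (y1 + y2) / 2 / img_h  # normalised centre

        # Four rule-of-thirds intersections in normalised coordinates
        points = [(1/3, 1/3), (1/3, 2/3), (2/3, 1/3), (2/3, 2/3)]
        d = min(math.hypot(cx - px, cy - py) for px, py in points)

        # 1.0 when the subject sits on an intersection, decaying with distance
        return math.exp(-8 * d)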
Impact:
The implemented solution performed strongly in assessing image composition. By efficiently detecting subjects and evaluating composition quality, it provided valuable insights for selecting the best images.
Problem Statement:
The project's primary challenge was to correct the white balance of an image by accurately predicting the appropriate temperature and tint values for editing in Adobe Lightroom. Adobe Lightroom offers a wide range of white balance temperatures, from 2,000 K to 50,000 K, allowing photographers to adjust them to their preferences. Predicting the correct values within this extensive range can be a daunting task.
Solution:
To address this challenge, I conducted a detailed analysis of the impact of temperature values on images. Through this analysis, I observed that changes in temperature beyond 9,000 K and below 2,400 K had minimal visual impact, so I focused on developing a regression-based machine learning model within this temperature range.
The model's objective was to predict the ideal temperature and tint values for white balance correction, simplifying the editing process in Adobe Lightroom. I employed transfer learning and designed a CNN architecture that takes not only the image but also the current Temp and Tint values into account, and then trained the model on the prepared dataset.
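A minimal sketch of a two-input regression model of this kind is shown below; the MobileNetV2 backbone, input size, and head dimensions are assumptions, not necessarily the architecture used in production.

    # Sketch: transfer-learning regressor that takes an image plus the current
    # Temp/Tint values and predicts corrected ones. MobileNetV2 is an assumed
    # backbone; input preprocessing is omitted for brevity.
    import tensorflow as tf
    from tensorflow.keras import layers, Model

    image_in = layers.Input(shape=(224, 224, 3))
    temp_tint_in = layers.Input(shape=(2,))  # current temperature and tint

    backbone = tf.keras.applications.MobileNetV2(include_top=False, pooling="avg")
    backbone.trainable = False  # start with frozen pretrained weights
    features = backbone(image_in)

    x = layers.Concatenate()([features, temp_tint_in])
    x = layers.Dense(128, activation="relu")(x)
    outputs = layers.Dense(2)(x)  # predicted temperature and tint

    model = Model([image_in, temp_tint_in], outputs)
    model.compile(optimizer="adam", loss="mae")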
Impact:
The implementation of this machine learning solution had a significant impact on image editing efficiency. By
predicting temperature and tint values within a practical range, it streamlined the white balance correction
process for photographers, saving them time and improving the overall quality of image editing.
Problem Statement:
The challenge was to categorize products into 10,000 different categories based on their product title and description. The dataset posed a significant challenge due to its vast size, comprising 3 million data points.
Solution:
Given the immense dataset, my approach began with thorough data analysis and cleaning to gain a comprehensive understanding. I identified the primary challenge as data imbalance: roughly 95% of the data points belonged to approximately 2,700 categories, leaving the remaining 5% distributed across roughly 7,300 categories.
To address this, I adopted a two-fold solution. First, I trained a machine learning model using TensorFlow/Keras on the 95% of the dataset belonging to the 2,700 head classes, experimenting with various Natural Language Processing (NLP) techniques, starting with simple approaches like Bag of Words, TF-IDF, and Word2Vec, and using Python's NLTK library for data preparation. For the remaining 5% of the dataset, covering approximately 7,300 classes, I implemented a simple baseline that returned random predictions for these categories.
The attached graph shows the cumulative distribution of the target classes.
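For illustration, a head-class pipeline along these lines could be sketched with scikit-learn as below; the sample data, feature limits, and logistic-regression classifier are assumptions standing in for the actual TensorFlow/Keras models.

    # Sketch: TF-IDF over title + description, then a linear classifier for the
    # head categories. Sample data and the classifier choice are illustrative.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    titles = ["Wireless Mouse", "Cotton T-Shirt"]              # tiny illustrative sample
    descriptions = ["2.4GHz optical mouse", "Round-neck tee"]
    labels = ["electronics>accessories", "apparel>tops"]

    texts = [t + " " + d for t, d in zip(titles, descriptions)]
    pipeline = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), stop_words="english"),
        LogisticRegression(max_iter=1000),
    )
    pipeline.fit(texts, labels)            # labels restricted to the head categories
    predictions = pipeline.predict(texts)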
Impact:
While I was unable to submit my solution before the deadline, my approach demonstrated strong results. Even with the simplicity of careful data preparation and a basic Bag of Words model, the accuracy achieved was on par with the best solutions at the time, which used advanced models like BERT. This project underscored my ability to handle complex, large-scale datasets and devise effective strategies for data imbalance, leading to competitive results in a challenging machine learning competition.
Problem Statement:
The goal was to create a machine learning model to accurately predict ticket build times. The objective was to enhance developer efficiency by providing a more precise estimate of the time required to build the software after a commit was submitted.
Solution:
To achieve this, the initial challenge was to create a dataset, because a readily available one did not exist. I spoke with developers from different departments to determine whether the number of files affected build time and which kinds of changes took more or less time. I then conducted data analysis and feature engineering using Python libraries such as pandas and matplotlib to prepare the necessary features. I trained several regression models and found that a decision tree performed best; usefully, it not only predicted the build time but also provided the importance of each feature.
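A minimal sketch of a decision-tree regressor with feature importances is shown below; the feature names and sample data are hypothetical, not the internal dataset.

    # Sketch: decision-tree regressor for build-time prediction with feature
    # importances. The columns and values are hypothetical examples.
    import pandas as pd
    from sklearn.tree import DecisionTreeRegressor

    df = pd.DataFrame({                     # tiny illustrative sample, not real data
        "files_changed": [3, 12, 1, 25, 7, 4],
        "lines_added":   [40, 500, 5, 1200, 150, 60],
        "module":        ["api", "ui", "api", "core", "ui", "core"],
        "build_minutes": [6, 22, 4, 45, 14, 8],
    })

    X = pd.get_dummies(df[["files_changed", "lines_added", "module"]])  # one-hot encode
    y = df["build_minutes"]

    tree = DecisionTreeRegressor(max_depth=4, random_state=42)
    tree.fit(X, y)

    print(tree.predict(X[:2]))                                  # predicted build times
    print(sorted(zip(tree.feature_importances_, X.columns), reverse=True))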
Impact:
The deployment of the machine learning model delivered significant benefits. Developers now had access to precise predictions for ticket build times, leading to increased efficiency and better project management. This project showcased my ability to leverage machine learning to optimize processes and enhance productivity within the development team, ultimately contributing to more effective software development practices.
Problem Statement:
The challenge was to create a machine learning model capable of automatically removing backgrounds from images. This involves distinguishing the main subject from the background, which can be complex and vary significantly across different images.
Solution:
To tackle this problem, I leveraged a diverse dataset of images containing various subjects and backgrounds. The objective was to develop an efficient model that could accurately identify and isolate the main subject from the background.
Initially, I explored existing deep learning models, such as Mask R-CNN and U-Net, known for their segmentation capabilities. However, these models often come with computational overhead and larger model sizes. To optimize for speed and model size, I devised a custom CNN-based architecture, drawing inspiration from U-Net but tailoring it to the specific task of background removal. The challenge lay in achieving real-time or near-real-time processing while preserving accuracy.
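The network itself resembled the encoder-decoder pattern sketched earlier for Flynet; the snippet below is a minimal sketch of the post-processing step that turns a predicted foreground mask into a transparent cut-out, assuming a hypothetical trained model that outputs a 0-1 mask.

    # Sketch: turn a predicted foreground mask into a transparent PNG.
    # `model` is a hypothetical trained segmentation network returning a 0..1 mask.
    import numpy as np
    from PIL import Image

    def remove_background(model, path, size=(256, 256)):
        img = Image.open(path).convert("RGB")
        small = np.array(img.resize(size), dtype=np.float32) / 255.0
        mask = model.predict(small[None, ...])[0, ..., 0]         # HxW, values in [0, 1]
        alpha = Image.fromarray((mask * 255).astype(np.uint8)).resize(img.size)
        cutout = img.copy()
        cutout.putalpha(alpha)                                    # background becomes transparent
        return cutout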
Web Application Development:
In addition to the model development, I created a user-friendly web application using Flask. The web app allowed users to upload their images and receive instant background removal results, providing a seamless and intuitive interface for individuals and businesses to use this image editing capability without extensive technical expertise.
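A minimal sketch of such an upload endpoint is shown below; the route name is an assumption, and remove_background() and model refer to the hypothetical helper and trained network sketched above.

    # Sketch: Flask endpoint that accepts an image upload and returns the cut-out.
    # The route name and the remove_background()/model names are assumptions.
    import io
    from flask import Flask, request, send_file

    app = Flask(__name__)

    @app.route("/remove-background", methods=["POST"])
    def handle_upload():
        uploaded = request.files["image"]            # image field from the upload form
        result = remove_background(model, uploaded)  # hypothetical helper + model
        buf = io.BytesIO()
        result.save(buf, format="PNG")
        buf.seek(0)
        return send_file(buf, mimetype="image/png")

    if __name__ == "__main__":
        app.run(debug=True)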
Impact:
The result of this project was a highly efficient model for automatic background removal, coupled with a user-friendly web application. It offered real-time or near-real-time performance while effectively removing complex backgrounds.