Clarifying some terms I came across in my CS3244 project on ML and Computer Vision.

It’s easy to forget to appreciate what really matters in life…

The project that my team of six is working on is a pneumonia detection tool that uses machine learning to help medical staff. This is my first time working on an actual ML project, so I would like to list my learning points so that I don’t forget them.

What is Jupyter Notebook?

What are keypoints and descriptors?

Keypoints and descriptors provide information about the raw image data that we want to feed into a machine learning model.

Keypoints usually contain information about their position, and sometimes their coverage area in the image. Knowing only the general characteristics of the extracted keypoints (they are centered around blobs, edges, prominent corners…) does not tell you how different or similar one keypoint is to another.

This is where descriptors come in. A descriptor assigns a numerical description to the area of the image that a keypoint refers to. It summarises some characteristics of that area, which lets us compare keypoints with one another.
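To make the idea concrete, here is a minimal sketch of what a descriptor does. It is not SIFT or any real library's descriptor: the `describe_patch` and `descriptor_distance` helpers, the patch size and the histogram layout are all assumptions chosen for illustration. Each keypoint gets a small vector of numbers summarising the pixels around it, and two keypoints are compared by the distance between their vectors.

```python
import numpy as np

def describe_patch(image, row, col, half=2, bins=4):
    """Toy descriptor: normalised intensity histogram of the patch around (row, col)."""
    patch = image[row - half:row + half + 1, col - half:col + half + 1]
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    return hist / hist.sum()

def descriptor_distance(d1, d2):
    """Euclidean distance: smaller means the two neighbourhoods look more alike."""
    return float(np.linalg.norm(d1 - d2))

# Hypothetical 8-bit grayscale image: dark left half, bright right half.
image = np.zeros((10, 20), dtype=np.uint8)
image[:, 10:] = 200

# Three hypothetical keypoint positions: two in the dark half, one in the bright half.
dark_a = describe_patch(image, 5, 3)
dark_b = describe_patch(image, 4, 5)
bright = describe_patch(image, 5, 15)

print(descriptor_distance(dark_a, dark_b))  # similar neighbourhoods
print(descriptor_distance(dark_a, bright))  # different neighbourhoods
```

The two dark keypoints sit at different positions, yet their descriptors are close, which is exactly the comparison that position alone cannot give us.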

What is SIFT?

It stands for Scale-Invariant Feature Transform. It is a feature detection algorithm used in computer vision to extract keypoints from images.

We can also use the keypoints generated by SIFT as features for the image during model training. The major advantage of SIFT features over edge features or HOG features is that they are not affected by the size or orientation of the image.

What is SVM?

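Support vector machines (SVMs) are supervised learning models, with associated learning algorithms, that analyze data for classification and regression. For classification, an SVM looks for the boundary that separates the classes with the widest possible margin.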
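As a rough illustration, here is a minimal sketch of a linear SVM trained with sub-gradient descent on the hinge loss. This is not how SVM libraries are implemented internally, and the function names, learning rate, regularisation strength and toy data are all assumptions made for the example.

```python
import numpy as np

def train_linear_svm(X, y, lr=0.1, reg=0.01, epochs=50):
    """Fit w and b by sub-gradient descent on the regularised hinge loss.
    Labels must be +1 or -1."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:
                # Inside the margin or misclassified: move the boundary away from this point.
                w += lr * (yi * xi - reg * w)
                b += lr * yi
            else:
                # Safely classified: only shrink w (the regularisation term).
                w -= lr * reg * w
    return w, b

def predict(X, w, b):
    """Classify by which side of the boundary w.x + b = 0 each point falls on."""
    return np.where(X @ w + b >= 0, 1, -1)

# Hypothetical, clearly separable 2D points.
X = np.array([[2.0, 2.0], [3.0, 3.0], [-2.0, -2.0], [-3.0, -3.0]])
y = np.array([1, 1, -1, -1])

w, b = train_linear_svm(X, y)
print(predict(np.array([[4.0, 4.0], [-4.0, -4.0]]), w, b))
```

The `if` branch is where the margin shows up: a point only changes the model while it is closer to the boundary than the margin allows.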


What is XGBoost?

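XGBoost is another supervised learning model, like SVM. Instead of a single model, it combines many decision trees using gradient boosting, where each new tree tries to correct the errors left by the trees before it.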
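The "boosted" part can be sketched in a few lines. This is only the general boosting idea for squared error, not XGBoost's actual implementation (which adds regularisation and other refinements). The trees here are one-split "stumps" on 1D data, and the helper names, learning rate and toy data are assumptions for illustration.

```python
import numpy as np

def fit_stump(x, residual):
    """Find the single threshold split on x that best fits the residuals (least squares)."""
    best = None
    for threshold in np.unique(x)[:-1]:
        left = residual[x <= threshold].mean()
        right = residual[x > threshold].mean()
        error = np.sum((residual - np.where(x <= threshold, left, right)) ** 2)
        if best is None or error < best[0]:
            best = (error, threshold, left, right)
    return best[1:]

def predict_stump(stump, x):
    threshold, left, right = stump
    return np.where(x <= threshold, left, right)

def boost(x, y, rounds=20, lr=0.5):
    """Each round fits a stump to what the current ensemble still gets wrong."""
    base = y.mean()
    prediction = np.full_like(y, base, dtype=float)
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(x, y - prediction)
        prediction += lr * predict_stump(stump, x)
        stumps.append(stump)
    return base, stumps, prediction

x = np.arange(10, dtype=float)
y = np.sin(x)  # hypothetical target values

base, stumps, fitted = boost(x, y)
mse_before = np.mean((y - y.mean()) ** 2)  # error of predicting the mean everywhere
mse_after = np.mean((y - fitted) ** 2)     # error after 20 boosted stumps
print(mse_before, mse_after)
```

Every stump on its own is a very weak model. The improvement comes from stacking many of them, each one trained on the remaining error.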

What is K-means?

Image classifiers and image detectors are two different things.

The former assigns a whole image to a class. For example, it classifies an image as either a rose or a lily. The latter checks for the presence of an item. For example, it checks whether the image contains a face.

How does machine learning code and model work?

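Feed data to a model, and the model produces an output. During training, the data comes with known labels so that the model can learn from them. Afterwards, the model predicts outputs for new data.

Instead of raw image data, you can also feed descriptors to the model.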
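Here is a minimal sketch of that workflow. The `NearestCentroidClassifier` class is a toy model written for this example, not a library API, and the feature vectors and labels are made up. The point is only the shape of the code: fit on labelled feature vectors, then predict on new ones.

```python
import numpy as np

class NearestCentroidClassifier:
    """Toy model: remembers the mean feature vector of each class."""

    def fit(self, features, labels):
        labels = np.array(labels)
        self.classes_ = sorted(set(labels.tolist()))
        self.centroids_ = np.array(
            [features[labels == c].mean(axis=0) for c in self.classes_]
        )
        return self

    def predict(self, features):
        # Distance from every sample to every class centroid; pick the closest class.
        diffs = features[:, None, :] - self.centroids_[None, :, :]
        distances = np.linalg.norm(diffs, axis=2)
        return [self.classes_[i] for i in distances.argmin(axis=1)]

# Hypothetical feature vectors: these could be raw pixel values or descriptors.
train_features = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
train_labels = ["rose", "rose", "lily", "lily"]

model = NearestCentroidClassifier().fit(train_features, train_labels)
predictions = model.predict(np.array([[0.85, 0.1], [0.0, 1.0]]))
print(predictions)
```

Whether the rows of `train_features` hold pixels or descriptors, the model code stays the same. Only the numbers being fed in change.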

How is computer vision different from machine learning? Are they the same field?

Computer vision and machine learning are most probably different fields, but they overlap.

“The invention of machine learning rendered the entire field of computer vision redundant.”

This is probably because classical computer vision is the study of mathematical concepts and their application to detecting or classifying images. For example, we can compute values from the image pixels and use mathematical formulas to detect changes in those values, which in turn tells us which part of the image we are looking at. Machine learning, on the other hand, gets the computer to learn how to classify images or detect objects from examples, without us having to hand-write those calculations (the model still performs calculations internally). Machine learning is a different method for doing things with images.

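Convolutional Neural Networks?

They are like neural networks, but made of convolutional layers.

The sliding grid example: a convolutional layer slides a small grid of weights (called a kernel or filter) across the image and computes one output value at each position.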
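The sliding grid idea can be sketched with plain numpy. This is a simplified illustration with no padding and a step of one pixel; the `convolve2d` helper, the toy image and the kernel values are assumptions for the example. Like most deep learning code, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide the kernel over the image and sum the element-wise products at each position."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    output = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            window = image[r:r + kh, c:c + kw]  # the part of the image under the grid
            output[r, c] = np.sum(window * kernel)
    return output

# Hypothetical 4x4 grayscale image with a vertical edge down the middle.
image = np.array([
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
], dtype=float)

# Kernel that responds to a dark-to-bright change from left to right.
kernel = np.array([[-1, 1],
                   [-1, 1]], dtype=float)

feature_map = convolve2d(image, kernel)
print(feature_map)
```

The output is large only where the grid sits on top of the edge. In a real CNN the kernel values are not chosen by hand like this; they are learned during training.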

Representation of Image by the computer

The computer views an image as a 3D array. In numpy and cv2 its shape is (height, width, 3), where 3 is the number of channels holding the red, green and blue values.

To let the computer work with images, we can use numpy to convert images into arrays.

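Alternatively, use cv2 (OpenCV).

The code is just:

import cv2

# img is a numpy array of shape (height, width, 3); cv2 stores the channels in blue, green, red order
img = cv2.imread("image.jpg")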

You can also use skimage (scikit-image).
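To see what such an array looks like without needing an image file, here is a small sketch that builds one by hand. The pixel values are made up, and the red, green, blue channel order is an assumption of this example (cv2.imread, for instance, returns blue, green, red instead). Note that in numpy the first axis counts rows, so the shape reads (height, width, channels).

```python
import numpy as np

# Hypothetical image: 2 rows (height), 3 columns (width), 3 colour channels.
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]  # top-left pixel is pure red
image[1, 2] = [0, 0, 255]  # bottom-right pixel is pure blue

height, width, channels = image.shape
red_channel = image[:, :, 0]  # 2D array holding only the red values

print(image.shape)
print(red_channel)
```

Each pixel is just three numbers between 0 and 255, and slicing along the last axis pulls out a single colour channel as a 2D array.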






This is a repository of my thoughts on my personal life, my random interests & notes taken down as I navigate my way through the tech world!