Introduction to Basic Computer Vision & Image Processing

Bishal Bose
Apr 27, 2022


Computer Vision

Q: What is Computer Vision?

As the name suggests, Computer Vision is about giving a computer or machine the ability to see. That is the layman’s definition, so let’s understand more formally what Computer Vision is. I’ll use CV as shorthand for Computer Vision from now on.

Computer vision is a field of computer science that focuses on enabling computers to identify and understand objects and people in images and videos.

Computer vision combines cameras, edge- or cloud-based computing, software, and artificial intelligence (AI) to enable systems to “see” and identify objects.

It is a field of artificial intelligence (AI) that enables computers and systems to derive meaningful information from digital images, videos, and other visual inputs — and take actions or make recommendations based on that information.

The concept of computer vision is based on teaching computers to process an image at a pixel level and understand it.

Modern computer vision commonly relies on deep learning: neural networks are trained on images and then guide systems in their image processing and analysis.

Once fully trained, computer vision models can perform object recognition, detect and recognize people, and even track movement.

In practice, CV needs Image Processing as its very first step: the images have to be prepared before we can feed them to the machine.

Let’s understand what Image Processing means.

Q: What is Image Processing?

Image Processing is the step in which an image is analysed and processed digitally before it is given to a model as input.

Image processing is a way to convert an image to digital form and perform mathematical operations on it, to get an enhanced image or extract other useful information from it.

Image processing involves the following three steps.

Importing the image with an optical scanner or through digital photography.

Analysing and manipulating the image, which includes data compression, image enhancement, and detecting visual patterns, for example in satellite imagery.

Producing the output, which can be an altered image or a report based on the image analysis.

Image processing lets you enhance the quality of an image or gather useful insights from it, which can then be fed to an algorithm to make predictions.

Q: What are the steps involved in Image Processing?

Now let’s look at the different types of image processing you can perform. They are as follows:

Rearrange channels: from BGR to RGB

There are several reasons to convert an image from BGR to RGB or back. One of them is that image processing libraries use different channel orderings.

OpenCV reads images in BGR order, whereas Matplotlib expects them in RGB order. So if you load an image with OpenCV and want to display it using Matplotlib, you first have to convert it to RGB. Otherwise the red and blue channels appear swapped.
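The channel swap itself is just a reversal of the last axis. Here is a minimal sketch, assuming the image is a NumPy array of shape height × width × 3; the function name `bgr_to_rgb` is made up for illustration. With OpenCV installed, `cv2.cvtColor(image, cv2.COLOR_BGR2RGB)` is the usual way to do the same thing.

```python
import numpy as np

def bgr_to_rgb(image):
    """Reverse the channel axis of an H x W x 3 array (works for BGR -> RGB and back)."""
    return image[..., ::-1]

# A tiny 1 x 2 "image" in BGR order: one pure-blue pixel and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

# In RGB order the blue pixel now has its 255 in the last position.
print(rgb[0, 0], rgb[0, 1])
```

Applying the function twice gives back the original array, which is why the same call covers both directions.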

Visualize channels separately

This is an analysis step where you try to find out information hidden in any one channel which might not be properly visible in the other channels. This is more of a human analysis than a machine analysis.
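One possible way to look at channels separately is sketched below, assuming an RGB NumPy array of shape height × width × 3. The helper names `split_channels` and `isolate_channel` are hypothetical; the first returns three grayscale planes, the second keeps one channel and zeroes the others so it can still be shown as a color image.

```python
import numpy as np

def split_channels(rgb):
    """Return the R, G and B planes as three separate 2-D arrays."""
    return rgb[..., 0], rgb[..., 1], rgb[..., 2]

def isolate_channel(rgb, index):
    """Keep only one channel (0=R, 1=G, 2=B) and set the other two to zero."""
    out = np.zeros_like(rgb)
    out[..., index] = rgb[..., index]
    return out

# A 2 x 2 test image where every pixel is (R=10, G=20, B=30).
image = np.zeros((2, 2, 3), dtype=np.uint8)
image[...] = (10, 20, 30)

r, g, b = split_channels(image)
green_only = isolate_channel(image, 1)
```

Each of `r`, `g`, `b` or `green_only` could then be passed to a plotting call such as Matplotlib's `imshow` for visual inspection.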

Visualize histograms

In an image processing context, the histogram of an image normally refers to a histogram of the pixel intensity values. This histogram is a graph showing the number of pixels in an image at each different intensity value found in that image.

Histograms have many uses. One of the more common ones is deciding which threshold value to use when converting a grayscale image to a binary one by thresholding.
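As a rough sketch of both ideas, the snippet below counts pixels per intensity and then applies a threshold. It assumes an 8-bit grayscale image stored as a 2-D NumPy `uint8` array; `intensity_histogram` and `to_binary` are illustrative names, and the threshold of 100 is picked by hand for this toy image.

```python
import numpy as np

def intensity_histogram(gray):
    """Count how many pixels have each intensity value 0..255."""
    return np.bincount(gray.ravel(), minlength=256)

def to_binary(gray, threshold):
    """Pixels above the threshold become 255 (white), the rest become 0 (black)."""
    return (gray > threshold).astype(np.uint8) * 255

# Toy image with two clear intensity groups: dark (10) and bright (200).
gray = np.array([[10, 10, 200],
                 [10, 200, 200]], dtype=np.uint8)

hist = intensity_histogram(gray)
# The histogram has two peaks (at 10 and at 200), so any threshold
# between them separates the groups; 100 is an arbitrary choice here.
binary = to_binary(gray, threshold=100)
```

On real images the peaks are rarely this clean, so plotting `hist` first is a sensible way to choose the threshold.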

Crop an image

Image cropping is a common photo manipulation process that improves the overall composition by removing unwanted regions. It is widely used in photography, film processing, graphic design, and printing. Cropping allows us to focus on the subject alone rather than on its surroundings.
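When the image is a NumPy array, cropping comes down to slicing rows and columns. The sketch below assumes that layout (rows first, then columns); the `crop` helper and its parameter names are made up for this example.

```python
import numpy as np

def crop(image, top, left, height, width):
    """Return the region starting at (top, left) with the given height and width."""
    return image[top:top + height, left:left + width]

# A 4 x 6 grayscale image whose pixel values are simply 0..23.
image = np.arange(24, dtype=np.uint8).reshape(4, 6)

# Keep the 2 x 3 block that starts at row 1, column 2.
patch = crop(image, top=1, left=2, height=2, width=3)
```

Note that slicing returns a view of the original array, so call `.copy()` on the result if you plan to modify the crop without touching the source image.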

Image Augmentation

Image augmentation is a technique of altering the existing data to create more data for the model training process. In other words, it is the process of artificially expanding the available dataset for training a deep learning model.

Image Rotation

A rotation transformation rotates the image about an axis (usually its center) by an angle between 1° and 359°.
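The simplest case to sketch is rotation by multiples of 90°, which needs no interpolation. The snippet assumes NumPy arrays and uses `np.rot90`, which rotates counterclockwise; `random_quarter_rotation` is a hypothetical helper. Arbitrary angles need resampling and are usually left to a library such as OpenCV.

```python
import numpy as np

def rotate_quarter_turns(image, k):
    """Rotate the image counterclockwise by k * 90 degrees."""
    return np.rot90(image, k)

def random_quarter_rotation(image, rng):
    """Pick 90, 180 or 270 degrees at random and rotate by that amount."""
    k = int(rng.integers(1, 4))  # upper bound is exclusive, so k is 1, 2 or 3
    return rotate_quarter_turns(image, k), k * 90

image = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)

rotated = rotate_quarter_turns(image, 1)  # one counterclockwise quarter turn
augmented, angle = random_quarter_rotation(image, np.random.default_rng(0))
```

For non-square images a quarter turn swaps height and width, which matters if the model expects a fixed input shape.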

Random Shifts

Shifting all the pixels of an image from one position to another is called shift augmentation.

There are two types of shifts:

Horizontal Shift Augmentation

Shifting all pixels of an image in a horizontal direction is called horizontal shift augmentation.

Vertical Shift Augmentation

Shifting all pixels of an image in a vertical direction is called vertical shift augmentation.
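Both kinds of shift can be sketched with one function. The version below assumes NumPy arrays and fills the vacated area with a constant value, which is only one of several possible choices; the names `shift` and `random_shift` and the `max_shift` parameter are illustrative.

```python
import numpy as np

def shift(image, dx=0, dy=0, fill=0):
    """Move all pixels dx columns to the right and dy rows down (negative = left/up)."""
    h, w = image.shape[:2]
    out = np.full_like(image, fill)
    # Region of the source that stays inside the frame after the shift.
    src_x0, src_x1 = max(0, -dx), min(w, w - dx)
    src_y0, src_y1 = max(0, -dy), min(h, h - dy)
    # Where that region lands in the output.
    dst_x0, dst_y0 = max(0, dx), max(0, dy)
    if src_x0 < src_x1 and src_y0 < src_y1:
        out[dst_y0:dst_y0 + (src_y1 - src_y0),
            dst_x0:dst_x0 + (src_x1 - src_x0)] = image[src_y0:src_y1, src_x0:src_x1]
    return out

def random_shift(image, max_shift, rng):
    """Shift by a random horizontal and vertical offset in [-max_shift, max_shift]."""
    dx, dy = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))
    return shift(image, dx=dx, dy=dy)

image = np.arange(1, 13, dtype=np.uint8).reshape(3, 4)
right_by_one = shift(image, dx=1)   # horizontal shift
down_by_one = shift(image, dy=1)    # vertical shift
```

Filling with zeros leaves a black border; other common options are repeating the edge pixels or wrapping around.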

Random Flips

Flipping means mirroring an image about a horizontal or vertical axis.

In a horizontal flip, the image is mirrored about a vertical axis. In a vertical flip, it is mirrored about a horizontal axis.

Horizontal Flip Augmentation:

Reversing the order of the columns of an image, so that it is mirrored left to right, is called horizontal flip augmentation.

Vertical Flip Augmentation:

Reversing the order of the rows of an image, so that it is mirrored top to bottom, is called vertical flip augmentation.
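With NumPy arrays, both flips are a single reversed slice. The sketch below assumes the usual rows-then-columns layout; `random_flip` and its probability `p` are hypothetical and only show how the flips might be applied randomly during training.

```python
import numpy as np

def horizontal_flip(image):
    """Mirror left to right by reversing the column order."""
    return image[:, ::-1]

def vertical_flip(image):
    """Mirror top to bottom by reversing the row order."""
    return image[::-1, :]

def random_flip(image, rng, p=0.5):
    """Apply each flip independently with probability p."""
    if rng.random() < p:
        image = horizontal_flip(image)
    if rng.random() < p:
        image = vertical_flip(image)
    return image

image = np.array([[1, 2, 3],
                  [4, 5, 6]], dtype=np.uint8)

h_flipped = horizontal_flip(image)
v_flipped = vertical_flip(image)
```

Whether a flip is a valid augmentation depends on the data: mirrored text or digits, for example, may no longer mean the same thing.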

Random Scale

Scaling is used to change the visual appearance of an image, to alter the quantity of information stored in a scene representation, or as a low-level preprocessor in a multi-stage image processing chain that operates on features of a particular scale. Scaling is a special case of an affine transformation.

Image scaling is an essential part of image processing. Images need to be scaled up or down for multiple reasons.

We will assume we have an image with a resolution of width×height that we want to resize to new_width×new_height. First, we introduce the scaling factors scale_x = new_width / width and scale_y = new_height / height.

A scale factor <1 indicates shrinking while a scale factor >1 indicates stretching.
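To make the scale factors concrete, here is a minimal nearest-neighbor resize, assuming a NumPy image array. Nearest-neighbor is chosen only because it is short to write; library resizers normally offer smoother interpolation. The function name `resize_nearest` is made up for this example.

```python
import numpy as np

def resize_nearest(image, new_width, new_height):
    """Resize by copying, for each output pixel, the nearest source pixel."""
    height, width = image.shape[:2]
    scale_x = new_width / width     # > 1 stretches, < 1 shrinks horizontally
    scale_y = new_height / height   # > 1 stretches, < 1 shrinks vertically
    # Map every output coordinate back to a source coordinate.
    src_x = np.minimum((np.arange(new_width) / scale_x).astype(int), width - 1)
    src_y = np.minimum((np.arange(new_height) / scale_y).astype(int), height - 1)
    return image[src_y[:, None], src_x[None, :]]

image = np.array([[1, 2],
                  [3, 4]], dtype=np.uint8)

bigger = resize_nearest(image, new_width=4, new_height=4)   # scale factors of 2
smaller = resize_nearest(bigger, new_width=2, new_height=2)  # scale factors of 0.5
```

For random scale augmentation you would draw the two scale factors at random and compute the new size from them before calling a resize like this one.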

Gaussian Noise

Gaussian noise is statistical noise whose probability density function (PDF) is equal to that of the normal distribution, also known as the Gaussian distribution. A random Gaussian function is added to the image function to generate this noise. It is also called electronic noise because it arises in amplifiers or detectors.

Gaussian noise is spread evenly throughout the signal: every pixel receives its own random offset, rather than only a few pixels being corrupted.

A noisy image has pixels that are the sum of their original pixel values and a random Gaussian noise value. The probability density function of a Gaussian distribution has a bell shape. Additive white Gaussian noise is the most common model in which Gaussian noise is used.
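The "original value plus a random Gaussian value" idea can be sketched as follows, assuming an 8-bit NumPy image. The zero mean, the standard deviation `sigma` and the name `add_gaussian_noise` are assumptions for illustration; the result is clipped so it stays a valid 8-bit image.

```python
import numpy as np

def add_gaussian_noise(image, sigma=10.0, seed=None):
    """Add zero-mean Gaussian noise with standard deviation sigma to every pixel."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=0.0, scale=sigma, size=image.shape)
    noisy = image.astype(np.float64) + noise
    # Round and clip back into the 0..255 range before converting to uint8.
    return np.clip(np.round(noisy), 0, 255).astype(np.uint8)

# A flat mid-gray image makes the added noise easy to see.
image = np.full((64, 64), 128, dtype=np.uint8)
noisy = add_gaussian_noise(image, sigma=10.0, seed=0)
```

A larger `sigma` gives stronger noise. Doing the addition in floating point avoids the wrap-around that would happen if the noise were added directly to `uint8` values.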

Removing Gaussian noise involves smoothing within the distinct regions of an image. Classical linear filters such as the Gaussian filter reduce this noise efficiently, but they also blur the edges significantly.

Filtering image data is a standard process used in almost every image processing system. Filters remove noise from images while trying to preserve their details. The choice of filter depends on the filter behavior and the type of data.

Noise shows up as abrupt changes in pixel values in an image. So when it comes to filtering images, the first intuition is to replace the value of each pixel with the average of the pixels around it. This process smooths the image.
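That averaging intuition is known as a mean (box) filter. Below is a minimal sketch for a grayscale NumPy image; the function name `mean_filter`, the odd window size `k` and the choice to repeat edge pixels at the border are all assumptions made for this example.

```python
import numpy as np

def mean_filter(gray, k=3):
    """Replace each pixel with the average of its k x k neighborhood (k must be odd)."""
    pad = k // 2
    # Repeat the border pixels so every pixel has a full neighborhood.
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    h, w = gray.shape
    total = np.zeros((h, w), dtype=np.float64)
    # Sum the k * k shifted copies of the image, then divide by the window size.
    for dy in range(k):
        for dx in range(k):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (k * k)

# A single bright "noise" pixel in an otherwise black 5 x 5 image.
gray = np.zeros((5, 5), dtype=np.uint8)
gray[2, 2] = 90

smoothed = mean_filter(gray, k=3)
```

The spike of 90 is spread over its 3 × 3 neighborhood, which shows both the noise reduction and the blurring that comes with it.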

Conclusion:

These are the types of processing you’ll most often encounter in Computer Vision tasks.

So there you go, now you understand the steps required for any kind of Image Processing project you’re working on. Keep up the good work.

Stay updated with all my blogs and updates on LinkedIn. Welcome to my network. Follow me on LinkedIn here: https://www.linkedin.com/in/bishalbose294/


Bishal Bose

Senior Lead Data Scientist @ MNC | Applied & Research Scientist | Google & AWS Certified | Gen AI | LLM | NLP | CV | TS Forecasting | Predictive Modeling