Inside the mathematics of Gradient Descent.


In the first part of this article, we built an intuitive understanding of Gradient Descent and covered some of the concepts needed to understand it mathematically. In this part, we dive into the mathematical details of Gradient Descent. If you have not read the first, intuitive part of this blog, I suggest reading it first by clicking here.

Outline of this blog:

  1. What is optimization in machine learning?
  2. How can we perform optimization using Gradient Descent?

We are ready to go now.

1. What is optimization in machine learning?

You may be wondering why we are discussing optimization at all. The reason is simple: Gradient Descent is an optimization algorithm, and to understand it we must first know what optimization is. In simple terms, optimization is “the action of making the best or most effective use of a situation or resource.” But what are those situations or resources in machine learning? We use machine learning algorithms to perform predictive tasks, and every algorithm has a loss function that measures how well it performs the task, that is, how close its predictions are to the desired output. The lower the loss, the better the algorithm, so the aim of a machine learning task is to minimize the loss, which usually goes hand in hand with higher accuracy. The loss is essentially a measure of error and is also known as the cost function. We can think of it as a function that calculates the difference between the desired output and what our model predicts. …
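To make the idea of a loss function concrete, here is a minimal sketch in Python. It assumes mean squared error as the loss, which is one common choice and not necessarily the one used later in this blog. The function name and the sample numbers are made up for illustration.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between desired and predicted outputs.

    Illustrative loss function (an assumption, not taken from the article).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must be non-empty")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical desired outputs and two sets of model predictions.
desired = [3.0, 5.0, 7.0]
good_predictions = [2.5, 5.0, 7.5]   # close to the desired outputs
poor_predictions = [0.0, 9.0, 2.0]   # far from the desired outputs

good_loss = mean_squared_error(desired, good_predictions)
poor_loss = mean_squared_error(desired, poor_predictions)

print("loss of the closer predictions: ", good_loss)
print("loss of the farther predictions:", poor_loss)
```

The closer predictions give a loss of roughly 0.17, while the farther ones give roughly 16.67. This is the sense in which a lower loss means a better model, and it is this number that an optimization algorithm tries to make as small as possible.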

