Hello guys! In my previous blog, Mathematical Implementation of Linear Regression, I explained the internal operations of the algorithm. In this blog I will explain the evaluation metrics that are used to evaluate the model.
Before going to the evaluation metrics, let me show you a graph.
- (x, y) is an actual value.
- (x, ŷ) is a predicted value.
- y - ŷ is the difference between the actual and predicted value, also called the residual.
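To make the idea of a residual concrete, here is a minimal sketch in Python. The numbers in `y_actual` and `y_predicted` are assumed illustrative values, not read off the graph.

```python
# Assumed illustrative data: actual values (y) and model predictions (yhat).
y_actual = [3, 4, 2, 4, 5]
y_predicted = [2.8, 3.2, 3.6, 4.0, 4.4]

# A residual is the actual value minus the predicted value, point by point.
residuals = [y - y_hat for y, y_hat in zip(y_actual, y_predicted)]

# Rounded only for display, to hide floating-point noise.
print([round(r, 2) for r in residuals])  # [0.2, 0.8, -1.6, 0.0, 0.6]
```

A positive residual means the model predicted too low, a negative one means it predicted too high.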
EVALUATION METRICS
- MEAN SQUARED ERROR(MSE)
- MEAN ABSOLUTE ERROR(MAE)
- ROOT MEAN SQUARED ERROR (RMSE)
- R-SQUARED
MEAN SQUARED ERROR (MSE)
- MSE is the average of the squared differences between the actual and the predicted values.
- Let me take an example
- Here y is the actual value and yhat is the predicted value.
- Let's begin the process:
- Calculate the difference between the actual (y) and predicted (yhat) values.
- Here the error is the difference between y and yhat:
- Error = y - yhat
- Our next step is to square this error.
- Now that we have the squared errors, we calculate ∑ of the squared errors, which means adding up all the elements in the squared error column.
- Adding up all the elements in the squared error column, we get 3.6.
- The total number of elements in our dataset is 5 (just count the number of rows).
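- Now let's calculate the mean squared error.
- The formula to calculate the mean squared error is $\mathrm{MSE} = \frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$, that is, the sum of squared errors divided by the number of data points.
- Number of data points (n) = 5
- Sum of squared errors = 3.6
- We have the values now, so let's calculate:
- MSE = 3.6 / 5 = 0.72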
- The smaller the mean squared error (MSE), the closer the data points lie to the best-fit line (regression line).
- A small MSE indicates the model is performing well and the distance between the actual and predicted values is small.
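The MSE steps above can be sketched in a few lines of Python. The table from the example is not reproduced in the text, so the values in `y_actual` and `y_predicted` are assumed; I picked them so that the sum of squared errors comes out to 3.6 with 5 data points, as in the walkthrough. The function name `mean_squared_error` is just my choice for this sketch.

```python
# Assumed example data, chosen so the sum of squared errors is 3.6 (n = 5).
y_actual = [3, 4, 2, 4, 5]
y_predicted = [2.8, 3.2, 3.6, 4.0, 4.4]


def mean_squared_error(actual, predicted):
    """Return the average of the squared differences (y - yhat)^2."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # Step 1 and 2: error = y - yhat, then square it.
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    # Step 3 and 4: add them up and divide by the number of data points.
    return sum(squared_errors) / len(squared_errors)


mse = mean_squared_error(y_actual, y_predicted)
print(round(mse, 2))  # 0.72
```

Squaring makes every error positive and punishes large errors more than small ones, which is why one badly predicted point can push the MSE up a lot.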
MEAN ABSOLUTE ERROR (MAE)
- Mean absolute error is another loss function used for Regression models.
- MAE is the average of the absolute differences between our actual (y) and predicted (yhat) values.
- The absolute difference between the two numbers 4 and 2 is |4-2| = 2.
- The absolute difference between the two numbers -4 and -2 is |-4-(-2)| = |-2| = 2.
- It measures the average magnitude (size) of the errors in a set of predictions, without considering their direction.
- So, let's apply this to our dataset now
- As I have mentioned above, y is the actual value and yhat is the predicted value.
- In MAE we take the absolute difference. Even though a difference might be negative, its absolute value is never negative, because the modulus of a number is always zero or positive.
- Now let's calculate ∑ of the absolute errors, which comes to 3.2.
- Let's apply the calculated values to the formula $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert y_i - \hat{y}_i\rvert$:
- MAE = 3.2 / 5 = 0.64
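Here is a small Python sketch of the MAE calculation. Again, the data is assumed (the original table is not shown in the text); the values are chosen so the sum of absolute errors is 3.2 over 5 points. The helper name `mean_absolute_error` is my own for this sketch.

```python
# Assumed example data, chosen so the sum of absolute errors is 3.2 (n = 5).
y_actual = [3, 4, 2, 4, 5]
y_predicted = [2.8, 3.2, 3.6, 4.0, 4.4]


def mean_absolute_error(actual, predicted):
    """Return the average of the absolute differences |y - yhat|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # abs() drops the sign, so over- and under-predictions do not cancel out.
    absolute_errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


mae = mean_absolute_error(y_actual, y_predicted)
print(round(mae, 2))  # 0.64
```

Because MAE does not square the errors, it is in the same units as y and is less affected by a single large error than MSE.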
ROOT MEAN SQUARED ERROR (RMSE)
- RMSE can be read as the standard deviation of the residuals (the prediction errors).
- RMSE is a measure of how spread out the residuals are.
- RMSE is calculated by taking the square root of MSE.
- The formula to calculate RMSE is $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$.
- The formula inside the square root is the formula for MSE.
- Taking the square root of MSE, √MSE, gives us RMSE.
- MSE = 0.72 (from the above calculation)
- RMSE = √0.72 ≈ 0.849
- Lower values of RMSE indicate a better fit.
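As a quick check of the RMSE arithmetic, here is a minimal Python sketch. The data is assumed (chosen so that MSE is 0.72, as in the calculation above), and `root_mean_squared_error` is just a name I use here.

```python
import math

# Assumed example data, chosen so that MSE = 0.72.
y_actual = [3, 4, 2, 4, 5]
y_predicted = [2.8, 3.2, 3.6, 4.0, 4.4]


def root_mean_squared_error(actual, predicted):
    """Return the square root of the mean squared error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


rmse = root_mean_squared_error(y_actual, y_predicted)
print(round(rmse, 3))  # 0.849
```

Taking the square root brings the error back to the same units as y, which makes RMSE easier to interpret than MSE.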
R-SQUARED
- R-squared is a statistical measure of how close the data are to the fitted regression line.
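- For a least-squares linear regression with an intercept, evaluated on the data it was fitted to, R-squared is always between 0 and 100%. (For other models or on new data it can even be negative.)
- An R-squared of 0% tells us that the model explains none of the variation in the actual values around their mean.
- An R-squared of 100% tells us that the model explains all of the variation, i.e. every prediction matches the actual value.
- In general, the higher the R-squared, the better the model fits your data.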
- The formula to calculate R-squared is $R^2 = 1 - \frac{SS_{RES}}{SS_{TOT}}$.
- Here,
- SSRES = sum of squares of the residuals, $\sum (y_i - \hat{y}_i)^2$
- SSTOT = total sum of squares, $\sum (y_i - \bar{y})^2$
- We already calculated SSRES while calculating the MSE: it is the sum of squared errors, 3.6.
- SSTOT is the sum of the squared differences between each actual value and the average of the actual values. For our dataset it is 5.2.
- So let's plug the calculated values into the formula:
- R-squared = 1 - 3.6 / 5.2 ≈ 0.3077
- 0.3077 × 100 = 30.77%
- A commonly used rule-of-thumb threshold for R-squared is 50%, although what counts as a good value depends on the problem.
- If the calculated R-squared value is less than the threshold (50%), it is not considered a good fit.
- If the calculated R-squared value is more than the threshold (50%), it is considered a good fit.
- Our model explains only about 30.77% of the variation in the data.
- So, our model is not a good fit.
- So, the R-squared value tells us how well the model fits our data.
- The higher the R-squared score, the better the model fits your data.
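The R-squared calculation can be sketched in Python as follows. The data is assumed (the original table is not shown in the text); I chose values for which SSRES is 3.6 and SSTOT is 5.2, matching the numbers used above. The function name `r_squared` is my own.

```python
# Assumed example data, chosen so that SSRES = 3.6 and SSTOT = 5.2.
y_actual = [3, 4, 2, 4, 5]
y_predicted = [2.8, 3.2, 3.6, 4.0, 4.4]


def r_squared(actual, predicted):
    """Return 1 - SSRES / SSTOT for the given actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # SSRES: squared distance of each actual value from its prediction.
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    # SSTOT: squared distance of each actual value from the mean of the actuals.
    ss_tot = sum((y - mean_actual) ** 2 for y in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1 - ss_res / ss_tot


r2 = r_squared(y_actual, y_predicted)
print(round(r2, 4))        # 0.3077
print(round(r2 * 100, 2))  # 30.77
```

Note that a model that always predicts the mean of y gets an R-squared of exactly 0, so the score tells us how much better the model is than that simple baseline.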


