1. What is linear regression?
Linear regression is a supervised machine learning method used to model the relationship between one or more independent (x) variables and a dependent (y) variable.
2. How do you represent a simple linear regression?
Simple linear regression is a type of linear regression in which the value of a numerical dependent variable is predicted using only one independent variable.
Line equation: y = mx + c
where:
y is the dependent variable,
x is the independent variable,
m is the slope (the regression coefficient), and
c is the intercept of the line.
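As an illustration, here is a minimal Python sketch (the data values are made up) that fits the slope m and intercept c from data:

import numpy as np

# Illustrative data: x is the independent variable, y the dependent variable
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 4.3, 6.2, 8.1, 9.9])

# np.polyfit with degree 1 performs a least-squares fit of y = mx + c
m, c = np.polyfit(x, y, 1)
print(f"y = {m:.2f}x + {c:.2f}")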
3. What is multiple linear regression?
Multiple linear regression is the term for a type of linear regression in which the value of a numerical dependent variable is predicted using more than one independent variable.
Line equation: y = m1x1 + m2x2 + ... + mNxN + c, where m1, ..., mN are the coefficients of the N independent variables x1, ..., xN, and c is the intercept.
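A minimal sketch of multiple linear regression with scikit-learn, assuming made-up data with two independent variables:

import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: two independent variables (x1, x2) per row, one dependent y
X = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
y = np.array([7.2, 6.9, 13.1, 12.8, 17.0])

model = LinearRegression().fit(X, y)
print("coefficients (m1, m2):", model.coef_)
print("intercept (c):", model.intercept_)
print("prediction for x1=6, x2=2:", model.predict([[6.0, 2.0]]))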
4. What are the assumptions of the Linear regression model?
1. Linearity: There must be a linear relationship between the independent and dependent variables.
2. Independence: The observations (and therefore the residuals) are independent of one another.
3. Residual Normality: The residuals, i.e. the discrepancies between the actual and predicted values of y, are normally distributed.
4. No Multicollinearity: The independent variables are not highly correlated with one another.
5. Homoscedasticity: At every x-level, the residual variance is constant; this property is called homoscedasticity.
5. What is the assumption of homoscedasticity? How do you check linearity? How do you prevent heteroscedasticity?
Assumption of homoscedasticity:
The residuals have constant variance at every level of the independent variable(s). When the spread of the residuals changes with x (for example, fanning out as x grows), the data are heteroscedastic.
How to check linearity:
1. Coefficient of correlation
2. Scatter plot
3. Correlation matrix
How to prevent heteroscedasticity? (A residual-plot check is sketched after this list.)
It may be due to outliers, so inspect and handle them.
It may be due to omitted variable bias, so check for relevant predictors left out of the model.
Apply a log transformation to the dependent variable.
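A minimal sketch of the residual-plot check, assuming synthetic data whose noise grows with x (so the raw fit is heteroscedastic); the log transformation at the end is one common remedy:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, 200).reshape(-1, 1)
# Noise whose standard deviation grows with x makes the data heteroscedastic
y = 2 * x.ravel() + rng.normal(0, x.ravel())

model = LinearRegression().fit(x, y)
residuals = y - model.predict(x)

# A funnel shape in this plot (spread growing with the fitted values)
# suggests heteroscedasticity
plt.scatter(model.predict(x), residuals)
plt.axhline(0, color="red")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()

# One common remedy: log-transform the target and refit
y_log = np.log(y - y.min() + 1)  # shift so all values are positive before the log
model_log = LinearRegression().fit(x, y_log)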
6. What does multicollinearity mean?
This phenomenon occurs when two or more independent variables (predictors) are highly correlated with one another; in other words, one variable can be predicted linearly from the others. The related term collinearity usually refers to a high correlation between exactly two predictors, while multicollinearity covers the general case.
7. What is VIF? What is the best value of VIF?
An independent variable's VIF (Variance Inflation Factor) score shows how well it can be explained by the other independent variables; it is computed as VIF = 1 / (1 - R^2), where R^2 comes from regressing that variable on the others. The ideal value for VIF is 1 (no correlation with the other predictors); values above 5-10 are commonly taken to indicate problematic multicollinearity. A minimal computation sketch follows.
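A minimal sketch of computing VIF with statsmodels; the DataFrame and its column names are made up, with x2 deliberately close to a multiple of x1 so its VIF comes out high:

import pandas as pd
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.tools.tools import add_constant

# Illustrative predictors: x2 is nearly a multiple of x1, so both get a high VIF
X = pd.DataFrame({
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2, 4, 6, 8, 10, 12.1],
    "x3": [5, 3, 6, 2, 7, 1],
})
X = add_constant(X)  # VIF should be computed with an intercept column present

for i, col in enumerate(X.columns):
    if col != "const":
        print(col, variance_inflation_factor(X.values, i))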
8. What are the feature selection methods in Linear Regression?
Feature selection is the process of identifying and selecting a subset of input variables that are most relevant to the target variable.
1. Stepwise Regression
In the stepwise regression technique, predictors are added and removed iteratively: at each step we add the predictor with the lowest p-value, and drop any already-included predictor whose p-value rises above the threshold.
2. Forward Selection
Forward selection is similar to stepwise regression; the only difference is that in forward selection we only keep adding features and never remove them.
3. Backward Elimination
In backward elimination, we include all predictors in the first step and in subsequent steps keep removing the one with the highest p-value (above the threshold, e.g. p > 0.05) until every remaining predictor is significant. A sketch is shown after this list.
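A minimal sketch of backward elimination with statsmodels, assuming made-up data in which only x1 and x2 actually drive y; the 0.05 threshold matches the text above:

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
X = pd.DataFrame(rng.normal(size=(100, 4)), columns=["x1", "x2", "x3", "x4"])
y = 3 * X["x1"] + 2 * X["x2"] + rng.normal(size=100)  # x3 and x4 are irrelevant

features = list(X.columns)
while features:
    model = sm.OLS(y, sm.add_constant(X[features])).fit()
    pvalues = model.pvalues.drop("const")
    worst = pvalues.idxmax()
    if pvalues[worst] > 0.05:  # remove the predictor with the highest p-value
        features.remove(worst)
    else:
        break  # every remaining predictor is significant
print("selected features:", features)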
9. What is feature scaling? Is it required in Linear Regression?
Feature scaling is the technique of normalizing the range of a dataset's features.
Features in real-world datasets frequently vary in magnitude, range, and unit of measurement, so feature scaling lets machine learning models treat the various features on the same scale. For linear regression solved in closed form, scaling does not change the fitted predictions, but it matters when the model is trained with gradient descent (faster, more stable convergence) or with regularization (which penalizes coefficients and is therefore scale-sensitive).
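A minimal scaling sketch with scikit-learn's StandardScaler; the age/income numbers are invented to show two features on very different scales:

import numpy as np
from sklearn.preprocessing import StandardScaler

# Two features on very different scales, e.g. age in years and income in dollars
X = np.array([[25, 40_000], [32, 85_000], [47, 120_000], [51, 60_000]], dtype=float)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)  # each column now has mean 0 and unit variance
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))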
10. How to find the best fit line in a linear regression model?
A regression line is referred to as the best fit line (BFL) if it yields the lowest error.
The linear regression model uses the gradient descent approach to determine the most appropriate line, i.e. the one with the lowest sum of squared errors; for ordinary least squares the same line can also be found in closed form.
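For the closed-form route, a minimal sketch of the least-squares solution via numpy (the data values are made up); gradient descent itself is sketched under question 12:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# Design matrix with a column of ones so the fit includes the intercept c
X = np.column_stack([x, np.ones_like(x)])

# lstsq solves for the (m, c) that minimize the sum of squared errors
m, c = np.linalg.lstsq(X, y, rcond=None)[0]
print(f"best fit line: y = {m:.3f}x + {c:.3f}")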
11. What is the cost function in Linear Regression?
The cost (or loss) function measures the error between the actual and predicted values of y. For linear regression it is usually the mean squared error, J(m, c) = (1/n) * sum((y - (mx + c))**2), which the training procedure minimizes.
12. Briefly explain the gradient descent algorithm
Gradient descent is an optimization algorithm used when training a machine learning model: for a convex cost function, it tweaks the parameters iteratively to drive the cost toward its minimum (the point where the slope of the cost is 0).
To start, we pick random values for the bias and weights, then repeatedly update them in the direction opposite to the gradient. The size of each update is controlled by a variable called the learning rate, which must be chosen carefully:
A small learning rate may make the model take a long time to learn. A large learning rate may keep the model from converging at all, because each update overshoots and we never settle into the minimum. A minimal sketch follows.
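A minimal gradient descent sketch for simple linear regression, assuming the MSE cost; the learning rate and iteration count are arbitrary choices:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

m, c = 0.0, 0.0  # start from arbitrary weight and bias
learning_rate = 0.01
n = len(x)

for _ in range(2000):
    y_pred = m * x + c
    # Gradients of the MSE cost (1/n) * sum((y - y_pred)**2) w.r.t. m and c
    dm = (-2 / n) * np.sum(x * (y - y_pred))
    dc = (-2 / n) * np.sum(y - y_pred)
    m -= learning_rate * dm
    c -= learning_rate * dc

print(f"learned line: y = {m:.3f}x + {c:.3f}")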
13. How to evaluate regression models?
MAE, or Mean Absolute Error
This is the most straightforward metric: the average of the absolute differences between the actual values and the predictions.
RMSE, or root mean square error
By taking the square root of the average of the squared difference between the predicted and actual values, the Root Mean Square Error is calculated. It shows the sample standard deviation of the variations between observed and expected values (also known as residuals).
MSE, or mean squared error
The mean of the squared errors (residuals) over all data points. To put it simply, it is the average of the squared differences between the predicted and actual values.
R^2, or the Coefficient of Determination
It gauges how effectively the regression line reproduces the actual results: it tells you how much of the variation in the dependent variable is accounted for by the independent variables, and therefore how well the model fits the dataset.
Adjusted R-squared
A drawback of R^2 is that it never decreases when new variables are added to the model: whenever you add a variable, it either improves the model or it does not, yet R^2 keeps increasing (or stays constant) even when the new variable has no significant impact on the prediction. Adjusted R^2 corrects for this by penalizing the number of predictors: Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1), where n is the number of observations and k the number of predictors, so it rises only when a new variable improves the model more than chance would predict. All of these metrics are computed in the sketch below.
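A minimal sketch computing MAE, MSE, RMSE, R^2, and Adjusted R^2 with scikit-learn; the actual and predicted values are made up, and k = 1 assumes a single predictor:

import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, 5.0, 7.5, 10.0, 12.0])
y_pred = np.array([2.8, 5.3, 7.0, 10.4, 11.6])

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = np.sqrt(mse)  # RMSE is the square root of MSE
r2 = r2_score(y_true, y_pred)

# Adjusted R^2 with n observations and k predictors (k = 1 assumed here)
n, k = len(y_true), 1
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(mae, mse, rmse, r2, adj_r2)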
14. Which evaluation technique should you prefer to use for data with many outliers in it?
When there are many outliers in the dataset, Mean Absolute Error (MAE) is recommended, since it is robust to outliers, while MSE and RMSE are very sensitive to them: squaring the error terms (residuals) makes large errors dominate the metric.
15. What is a residual? How is it computed?
A residual, also called the error, is the difference between the actual value (ya) and the predicted value (yp):
residual = ya - yp
where ya is the actual value and yp is the predicted value.
16. What are SSE, SSR, and SST? And what is the relationship between them?
SSE
SSE is the sum of squared errors, defined as the sum of the squared differences between the actual and predicted values.
SSE = sum(ya - yp)**2
SSR
SSR is the sum of squares due to regression, defined as the sum of the squared differences between the predicted values and the mean.
SSR = sum(yp - y(mean))**2
SST
SST is the total sum of squares, defined as the sum of the squared differences between the actual values and the mean of the dependent variable.
SST = sum(ya - y(mean))**2
The relationship between them is SST = SSR + SSE, which is verified numerically in the sketch below.
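A minimal numerical check of SST = SSR + SSE, using a made-up dataset and an ordinary least-squares fit (the identity holds exactly for OLS with an intercept):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
ya = np.array([3.0, 5.2, 6.8, 9.1, 10.9])  # actual values

# Fit an OLS line and use it to generate the predicted values yp
m, c = np.polyfit(x, ya, 1)
yp = m * x + c

y_mean = ya.mean()
sse = np.sum((ya - yp) ** 2)
ssr = np.sum((yp - y_mean) ** 2)
sst = np.sum((ya - y_mean) ** 2)

print(np.isclose(sst, ssr + sse))  # True: SST = SSR + SSE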
17. What does the coefficient of determination explain?
In a linear regression model, R-squared (R^2), also called the coefficient of determination, indicates the percentage of the variation in the dependent variable (Y) that is explained by the independent variables (X). The primary issue with R-squared is that it either stays constant or increases whenever additional independent variables are added, which is what Adjusted R-squared corrects for.
18. What's the intuition behind R-Squared?
R-Squared, or the coefficient of determination, is a statistical metric in a regression model that indicates how much of the variance in the dependent variable can be explained by the independent variables; in terms of the sums of squares above, R^2 = 1 - SSE/SST. Put differently, R-squared is the goodness of fit: it illustrates how well the data fit the regression model.
19. What is the coefficient of correlation? Definition and formula
Correlation coefficients measure how strong the relationship is between two variables.
Correlation coefficient (r) = covariance(x, y) / (std(x) * std(y))
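A minimal sketch computing r both from the formula above and with numpy's built-in, on made-up data; bias=True matches np.std's default normalization so the two agree:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8])

# r = covariance(x, y) / (std(x) * std(y))
r_manual = np.cov(x, y, bias=True)[0, 1] / (np.std(x) * np.std(y))
r_builtin = np.corrcoef(x, y)[0, 1]
print(r_manual, r_builtin)  # the two values agree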
20. What is the difference between overfitting and underfitting?
Overfitting happens when the model performs well on the training set but poorly on the test data.
Underfitting happens when the model performs well on neither the training set nor the test set. Both are illustrated in the sketch below.
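A minimal sketch illustrating both failure modes, assuming quadratic ground-truth data: a degree-1 model underfits, degree 2 fits well, and degree 15 typically overfits (high train score, lower test score):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 60).reshape(-1, 1)
y = x.ravel() ** 2 + rng.normal(0, 0.5, 60)  # quadratic truth plus noise

x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)

for degree in (1, 2, 15):  # underfit, good fit, overfit
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(x_train, y_train)
    print(degree, model.score(x_train, y_train), model.score(x_test, y_test))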