Thursday, 21 December 2023

Top 10 Logistic Regression Algorithm Questions and Answers

 

1.  What is logistic regression?


Logistic regression is a machine learning technique for solving classification problems. It is a predictive analysis method based on the concept of probability: it predicts the probability of a categorical dependent variable. In binary logistic regression, the dependent variable takes one of two values, either 1 or 0.

A logistic regression model is similar to a linear regression model, but instead of using the output of the linear function directly, it passes that output through the "sigmoid function," also called the "logistic function."
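As a rough illustration of the sigmoid function mentioned above, here is a minimal Python sketch. The function name `sigmoid` and the two-branch form (chosen only to avoid overflow for large negative inputs) are assumptions made for this example, not part of any particular library.

```python
import math


def sigmoid(z):
    """Map a real number z to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an equivalent form so exp() never overflows.
    e = math.exp(z)
    return e / (1.0 + e)


for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

An input of 0 lands exactly in the middle at 0.5, negative inputs fall below 0.5, and positive inputs rise above it.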

2. How will you deal with the multiclass classification problem using logistic regression?


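Multi-class classification involves more than two classes. In the one-vs-rest method, a separate binary logistic regression model is trained for each class: that class is labelled 1 and all the remaining classes are labelled 0. A new record is then assigned to the class whose model gives the highest probability.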
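The one-vs-rest relabelling can be sketched in a few lines of Python. The helper names `one_vs_rest_labels` and `predict_one_vs_rest`, the example classes, and the probabilities are all hypothetical, and the sketch assumes the per-class probabilities have already come from separately trained binary models.

```python
def one_vs_rest_labels(labels, positive_class):
    """Relabel a multi-class target: positive_class becomes 1, the rest 0."""
    return [1 if label == positive_class else 0 for label in labels]


def predict_one_vs_rest(class_probabilities):
    """Pick the class whose binary model reports the highest probability."""
    return max(class_probabilities, key=class_probabilities.get)


labels = ["cat", "dog", "bird", "dog"]
print(one_vs_rest_labels(labels, "dog"))

# Hypothetical outputs of three binary models for one new record.
print(predict_one_vs_rest({"cat": 0.2, "dog": 0.7, "bird": 0.4}))
```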

3. Why is logistic regression very popular/widely used?


Logistic regression is widely used because it transforms logits (log-odds), which can range from −∞ to +∞, into values between 0 and 1. Because the logistic function gives the probability of an event occurring, it can be applied to a wide range of real-world situations. This explains why the logistic regression model, which can handle categorical outcomes, is so widely used.

4. Why can’t linear regression be used instead of logistic regression for classification?


Distribution of error terms: Linear and logistic regression assume different distributions for the data. Linear regression assumes that the error terms are normally distributed. This assumption does not hold for binary classification, where the outcome can only be 0 or 1.

Model output: The output of a linear regression is continuous, which makes little sense for binary classification. Linear regression may predict values below 0 or above 1 for binary classification problems. If the output is to be read as the probability of belonging to one of two classes, its range must be limited to between 0 and 1. The logistic regression model is favoured over linear regression because its logistic/sigmoid function produces such probabilities.

Variance of residual errors: Linear regression assumes that the variance of the random errors is constant. This assumption is also violated with a binary outcome, whose variance p(1 - p) changes with the probability p.
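To see the output problem concretely, the sketch below fits an ordinary least-squares line to a small made-up binary dataset and evaluates it outside the observed range. The data, the helper names `fit_line` and `predict`, and the query points are assumptions chosen only for illustration.

```python
def fit_line(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up binary outcome: 0 for small x, 1 for large x.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
slope, intercept = fit_line(xs, ys)


def predict(x):
    return intercept + slope * x


def bernoulli_variance(p):
    """Variance of a 0/1 outcome with success probability p."""
    return p * (1.0 - p)


print(predict(0), predict(10))  # the line leaves the [0, 1] range
print(bernoulli_variance(0.1), bernoulli_variance(0.5))  # not constant
```

The fitted line gives a negative value at x = 0 and a value above 1 at x = 10, neither of which can be read as a probability.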

5. What are the assumptions of logistic regression?


1. The independent variables should have little to no multicollinearity; that is, the predictors should not be highly correlated with one another.

2. Each predictor variable should have a linear relationship with the logit of the outcome. The logit function is logit(p) = log(p/(1-p)), where p is the probability of the outcome.

3. A large sample size is typically necessary in order to make accurate predictions.

4. Binary logistic regression assumes that the target variable is binary, i.e., it has two categories, whereas ordinal logistic regression requires the categories of the target variable to be ordered.
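The logit formula from the second assumption, logit(p) = log(p/(1-p)), can be tried out directly. This is a small sketch; the function names `logit` and `inverse_logit` are chosen for this example, and p is assumed to lie strictly between 0 and 1.

```python
import math


def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Recover the probability from log-odds z (the sigmoid function)."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    z = logit(p)
    print(p, z, inverse_logit(z))
```

A probability of 0.5 corresponds to log-odds of 0, probabilities below 0.5 give negative log-odds, and probabilities above 0.5 give positive log-odds.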

6. Why is logistic regression called regression and not classification?


Logistic regression follows the same basic approach as linear regression, but it regresses for the probability of a categorical outcome.

Like linear regression, logistic regression estimates a coefficient for each independent variable and uses the same linear equation of all the independent variables to predict the target variable. Because it is used to solve classification problems, logistic regression then passes the result of that equation through a sigmoid function to obtain a probability, and classifies the record using an appropriate threshold (such as 0.5 or the mean of the probabilities).

The linear regression formula is Y = b0 + b1X1 + b2X2 + ... + bnXn.

The logistic regression formula is Y = sigmoid(b0 + b1X1 + b2X2 + ... + bnXn).

Logistic regression is a generalized linear model and uses the same basic formula as linear regression.
So basically, logistic regression is just the sigmoid of a linear regression equation.
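The two formulas can be compared side by side in code. In this sketch the coefficient values, the feature values, and the function names `linear_output` and `logistic_output` are all hypothetical; they are not fitted from any data.

```python
import math


def linear_output(intercept, coefficients, features):
    """b0 + b1*X1 + ... + bn*Xn"""
    return intercept + sum(b * x for b, x in zip(coefficients, features))


def logistic_output(intercept, coefficients, features):
    """sigmoid(b0 + b1*X1 + ... + bn*Xn)"""
    z = linear_output(intercept, coefficients, features)
    return 1.0 / (1.0 + math.exp(-z))


b0 = -1.0          # hypothetical intercept
b = [0.8, -0.5]    # hypothetical coefficients b1, b2
x = [3.0, 2.0]     # one record with features X1, X2

print(linear_output(b0, b, x))    # unbounded real number
print(logistic_output(b0, b, x))  # probability between 0 and 1
```

Both functions share the same linear equation; the only difference is the final sigmoid step.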

7.  Explain the significance of the sigmoid function.


The sigmoid function is a mathematical function used to convert predicted values into probabilities. It maps any real number to a value between 0 and 1.

The output of logistic regression must lie between 0 and 1 and cannot go beyond those limits, so it forms an "S"-shaped curve. This S-shaped curve is called the sigmoid function or the logistic function. Logistic regression uses the concept of a threshold value to turn a probability into a class of either 0 or 1: values above the threshold are assigned to 1, and values below it are assigned to 0.
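The thresholding step can be sketched as follows. The function name `classify`, the example probabilities, and the choice to treat a probability exactly equal to the threshold as class 1 are assumptions of this example.

```python
def classify(probability, threshold=0.5):
    """Turn a probability into a class label using a threshold."""
    return 1 if probability >= threshold else 0


probabilities = [0.1, 0.45, 0.5, 0.9]

print([classify(p) for p in probabilities])                 # threshold 0.5
print([classify(p, threshold=0.4) for p in probabilities])  # threshold 0.4
```

Lowering the threshold from 0.5 to 0.4 moves the record with probability 0.45 from class 0 to class 1, which shows that the threshold is a modelling choice and not part of the sigmoid itself.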

8. Explain the general intuition behind logistic regression.


Logistic regression uses a logit function to fit the data and forecast the probability of an event occurring.
The logistic or sigmoid function is an S-shaped curve that maps any real number to a value between 0 and 1, but never exactly at those boundaries. The goal of logistic regression is to find the best line or plane that separates the two classes. A logistic regression model is trained by adjusting the slope m and the intercept c to obtain the best possible line separating the points of the two classes, so that the model can quickly determine which class a new, unseen data point belongs to.
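The idea of adjusting m and c can be sketched with plain gradient descent. This is one possible training procedure, not necessarily the one any given library uses: it assumes the log loss (cross-entropy) as the cost, a single feature, and a made-up dataset of study hours and pass/fail outcomes. The names `train`, `hours`, and `passed` are hypothetical.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(xs, ys, learning_rate=0.1, epochs=10000):
    """Fit slope m and intercept c by gradient descent on the log loss."""
    m, c = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_m = 0.0
        grad_c = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(m * x + c) - y  # predicted probability minus label
            grad_m += error * x
            grad_c += error
        m -= learning_rate * grad_m / n
        c -= learning_rate * grad_c / n
    return m, c


# Made-up data: hours studied and whether the exam was passed.
hours = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

m, c = train(hours, passed)
print(m, c)
print(sigmoid(m * 0.5 + c), sigmoid(m * 4.0 + c))
```

After training, the line m*x + c is negative for the short study times and positive for the long ones, so the sigmoid output falls below 0.5 on one side and above 0.5 on the other.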

9. What are outliers and how can the sigmoid function mitigate the problem of outliers in logistic regression?


Outliers are observations that lie far from the rest of the data. The assumptions of logistic regression are sensitive to irregular data, including outliers, high-leverage observations, and influential observations. The sigmoid function helps mitigate the outlier problem because it restricts the output value to fall between 0 and 1, however extreme the input.
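The bounding effect can be seen by comparing a raw linear score with its sigmoid for a typical value and an extreme one. The slope, intercept, and input values below are made up for this sketch, and `linear_score` is a hypothetical helper.

```python
import math


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def linear_score(x, m=0.5, c=0.0):
    """Unbounded linear score m*x + c with made-up m and c."""
    return m * x + c


typical, outlier = 2.0, 1000.0

for x in (typical, outlier):
    score = linear_score(x)
    print(x, score, sigmoid(score))
```

The linear score grows without limit as the input becomes more extreme, while the sigmoid output stays within the range 0 to 1.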

10. Why can’t we use Mean Square Error (MSE) as a cost function for logistic regression?


Logistic regression uses the sigmoid function to perform a non-linear transformation and obtain the probability. Squaring this non-linear transformation makes the cost function non-convex, with local minima, so gradient descent may fail to reach the global minimum. MSE is therefore not appropriate for use with logistic regression.
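The non-convexity can be checked numerically with a midpoint test: for a convex function, the value at the midpoint of two points is never above the average of the values at those points. The sketch below applies that test to the squared error and, for comparison, to the log loss (cross-entropy), using a single made-up training example with x = 1 and y = 1. The function names and the chosen interval are assumptions of this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mse_loss(w, x=1.0, y=1.0):
    """Squared error of the sigmoid output for one example."""
    return (sigmoid(w * x) - y) ** 2


def log_loss(w, x=1.0, y=1.0):
    """Cross-entropy of the sigmoid output for one example."""
    p = sigmoid(w * x)
    return -(y * math.log(p) + (1.0 - y) * math.log(1.0 - p))


def violates_convexity(loss, a, b):
    """True if the loss at the midpoint exceeds the average of the endpoints."""
    midpoint = (a + b) / 2.0
    return loss(midpoint) > (loss(a) + loss(b)) / 2.0


print(violates_convexity(mse_loss, -6.0, -2.0))
print(violates_convexity(log_loss, -6.0, -2.0))
```

On this interval the squared error fails the midpoint test while the log loss passes it. A single example only shows the loss of convexity, not the local minima themselves.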
