Logistic Regression: The Discrete Beauty!
“Why was your recent credit card application rejected? How did your telecom service provider figure out you were unhappy? How did your friend predict that your favourite team would lose in the World Cup?”
It’s logistic regression at work.
How many times have we wondered if life could be a simple yes or no? One day, logistic regression might take us there. Let’s learn more about it.
What is Logistic Regression?
Logistic regression is a predictive technique, like other forms of regression analysis. The key difference is that the outcome being predicted is discrete. In other words, logistic regression outputs the probability of an event happening, and that probability is then mapped to a discrete outcome. Depending on the type of output required, the commonly used types of logistic regression are:
Type | Output of dependent variable |
Binomial Logistic Regression | Only 2 possible outcomes (0, 1). E.g., Yes/No, Win/Loss, Fraud/Not-Fraud |
Multinomial Logistic Regression | 3 or more discrete outcomes with no natural order. E.g., Yes/No/Maybe |
Ordinal Logistic Regression | 3 or more discrete outcomes with a natural order. E.g., Excellent/Good/Average/Poor, High/Medium/Low |
Therefore, logistic regression is widely used for classification problems.
What is under the hood?
At the heart of logistic regression models is the S-curve (sigmoid curve). The sigmoid function takes any real value as input and produces an output between 0 and 1.
Source: https://en.wikipedia.org/wiki/Sigmoid_function
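To make the S-curve concrete, here is a minimal sketch of the standard sigmoid function, 1 / (1 + e^(-z)), in plain Python. The function name `sigmoid` and the sample inputs are chosen purely for illustration; they are not taken from any particular library.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    # Two algebraically equivalent branches, so exp() never receives
    # a large positive argument and cannot overflow.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Illustrative inputs: large negative values approach 0,
# large positive values approach 1, and 0 maps to exactly 0.5.
for z in (-6, -2, 0, 2, 6):
    print(f"z = {z:>2}  ->  sigmoid(z) = {sigmoid(z):.4f}")
```

In a logistic regression model, z would be a weighted sum of the input variables, and the sigmoid output is read as the probability of the event.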
The sigmoid's bounded output makes it a natural fit for classification problems. Take an example where you want to predict an outcome as loan-approved (1) or loan-denied (0). There could be many input variables (demographics, salary, credit score, etc.) on the basis of which the outcome is decided. You might build a business process like the one below:
Output (0 – 1 Range) | Business Outcome |
< 0.5 | Loan – Denied |
0.5 to < 0.75 | Loan – Hold (perform secondary verification) |
≥ 0.75 | Loan – Approved |
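The business process in the table can be sketched as a small Python function. The name `loan_decision` and the short labels are hypothetical, and the handling of the exact boundary values is an assumption: here a probability of exactly 0.5 goes to "Hold" and exactly 0.75 goes to "Approved".

```python
def loan_decision(probability: float) -> str:
    """Translate a model probability into a business outcome."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.5:
        return "Denied"
    # Assumption: exactly 0.5 is treated as Hold, exactly 0.75 as Approved.
    if probability < 0.75:
        return "Hold"  # perform secondary verification
    return "Approved"


for p in (0.20, 0.60, 0.90):
    print(f"{p:.2f} -> {loan_decision(p)}")
```

Keeping the thresholds in one place like this makes it easy for the business to tune them later, for example by tightening the approval cut-off if too many risky loans get through.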
Further reading & learning
For those interested in going deeper into the subject, these are the topics that will come in handy.
Maximum Likelihood Estimate | https://en.wikipedia.org/wiki/Maximum_likelihood |
Sensitivity & Specificity | https://en.wikipedia.org/wiki/Sensitivity_and_specificity |
ROC Curve | https://en.wikipedia.org/wiki/Receiver_operating_characteristic |
This brings us to the end of the commonly used regression tools. In the upcoming blog, let’s learn about decision trees.