This analysis aimed to develop a logistic regression model to predict loan defaults based on financial data, facilitating risk assessment and decision-making in the lending industry. The dataset comprised various financial attributes of loan applicants, and the target variable indicated loan status (defaulted or not). Exploratory data analysis revealed class imbalance, with a larger proportion of non-defaulting loans. The machine learning process followed standard stages, focusing solely on logistic regression for the classification task.
- Logistic Regression:
- Accuracy: 0.99
- Precision:
- Healthy Loans: 1.00
- High-Risk Loans: 0.87
- Recall:
- Healthy Loans: 1.00
- High-Risk Loans: 0.89
The Logistic Regression model exhibited high accuracy and precision for both loan classes, with better performance in predicting healthy loans compared to high-risk loans. This asymmetry suggests potential implications for decision-making, emphasizing the importance of correctly identifying high-risk loans to mitigate financial losses. Given the model's balanced accuracy and robust precision-recall scores, further evaluation and potential deployment are recommended. However, continuous monitoring and refinement are crucial to address class imbalance and adapt to evolving financial landscapes.