AI Models
Machine learning models powering CAREN fraud detection
Random Forest
Ensemble of decision trees with bagging. Primary production model, with the highest score on every reported metric. Class imbalance is handled with SMOTE oversampling.
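The core idea of SMOTE can be sketched in a few lines: each synthetic minority sample is placed at a random point on the segment between a real minority sample and one of its nearest minority neighbors. The helper `smote_sample`, the neighbor count `k=2`, and the 2-D points below are illustrative assumptions, not CAREN's actual preprocessing code or data.

```python
import random

def smote_sample(minority, k=2, rng=None):
    """Create one synthetic minority point by interpolating toward a near neighbor."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Rank the other minority points by squared Euclidean distance to the base point.
    others = [p for p in minority if p is not base]
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbor = rng.choice(others[:k])
    # Pick a random position on the segment between base and neighbor.
    gap = rng.random()
    return [a + gap * (b - a) for a, b in zip(base, neighbor)]

# Hypothetical 2-D minority-class (fraud) points, not taken from CAREN data.
fraud_points = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [5.0, 5.0]]
synthetic = [smote_sample(fraud_points, rng=random.Random(seed)) for seed in range(5)]
```

Because every synthetic point is an interpolation between two real fraud samples, it always stays inside the region the minority class already occupies.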
XGBoost
Gradient boosted trees with regularization. Secondary production model providing ensemble diversity. Excels at capturing complex non-linear fraud patterns.
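As a rough illustration of what regularization does in gradient boosted trees, the sketch below computes the optimal value of a single leaf under an L2 penalty, using the standard second-order formula w = -G / (H + lambda). The function `leaf_weight` is a hypothetical helper written for this example, not a call into the XGBoost library, and the gradients assume log loss with a starting prediction of 0.5.

```python
def leaf_weight(gradients, hessians, reg_lambda=1.0):
    """Optimal leaf value for summed gradients G and hessians H with an L2 penalty."""
    return -sum(gradients) / (sum(hessians) + reg_lambda)

# Hypothetical leaf holding three samples with labels [1, 1, 0] and prediction p = 0.5.
# For log loss: gradient = p - y, hessian = p * (1 - p).
grads = [-0.5, -0.5, 0.5]
hess = [0.25, 0.25, 0.25]

unregularized = leaf_weight(grads, hess, reg_lambda=0.0)
regularized = leaf_weight(grads, hess, reg_lambda=1.0)
```

A larger penalty shrinks the leaf value toward zero, so each tree makes a more cautious correction.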
Logistic Regression
Linear classification baseline model. Provides interpretable probability estimates and serves as the performance baseline for all other models in the pipeline.
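The interpretable probability estimate comes from passing a weighted sum of the features through the sigmoid function. The sketch below shows that calculation. The weights, bias, and feature values are made up for illustration and are not CAREN's fitted coefficients.

```python
import math

def fraud_probability(features, weights, bias):
    """Logistic regression: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients and inputs, chosen only for illustration.
weights = [-0.8, 0.6]
bias = -2.0
p = fraud_probability([-3.0, 2.0], weights, bias)
```

Each weight is the change in the log-odds of fraud per unit change in its feature, which is what makes the model easy to inspect.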
K-Nearest Neighbors
Instance-based lazy learner using proximity voting. On standby for production failover scenarios. Strong performance on localized fraud clusters in feature space.
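Proximity voting can be sketched as follows: find the k training points closest to the query and return the majority label. The function `knn_predict`, the choice of `k=3`, the Euclidean distance, and the toy points are assumptions for illustration only.

```python
from collections import Counter
import math

def knn_predict(train, query, k=3):
    """Label a query point by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labeled points: a small fraud cluster near (5, 5).
train = [
    ((0.0, 0.1), "legit"), ((0.2, 0.0), "legit"), ((0.1, 0.3), "legit"),
    ((5.0, 5.1), "fraud"), ((5.2, 4.9), "fraud"), ((4.8, 5.0), "fraud"),
]
label_near_cluster = knn_predict(train, (5.1, 5.0))
```

No model is fitted ahead of time. All the work happens at query time, which is why the method is called a lazy learner.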
Decision Tree (AdaBoost)
Adaptive boosting ensemble of shallow decision trees. Currently under evaluation for potential production deployment. Shows promising results on recent fraud patterns.
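The "adaptive" part of AdaBoost is the sample reweighting between rounds. The sketch below shows one round of the classic discrete AdaBoost update with labels encoded as +1 and -1. The function `adaboost_round` and the toy labels are illustrative assumptions, and the weak learner's predictions are given directly instead of being produced by a fitted tree.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost round: score the weak learner, then reweight the samples."""
    # Weighted error of the weak learner (assumed to be strictly between 0 and 1).
    err = sum(w for w, y, h in zip(weights, y_true, y_pred) if y != h)
    alpha = 0.5 * math.log((1 - err) / err)  # the learner's vote weight
    # Misclassified samples are scaled up, correctly classified ones scaled down.
    updated = [w * math.exp(-alpha * y * h) for w, y, h in zip(weights, y_true, y_pred)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

# Hypothetical toy data: five samples, the weak learner misclassifies only the last one.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
alpha, new_weights = adaboost_round([0.2] * 5, y_true, y_pred)
```

After the update the misclassified sample carries half of the total weight, so the next shallow tree is pushed to get it right.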
Performance Comparison
Side-by-side metrics for all trained models
| Model | Accuracy | Precision | Recall | F1 Score | AUC-ROC | Status |
|---|---|---|---|---|---|---|
| Random Forest (best) | 99.94% | 94.12% | 81.63% | 87.43% | 98.21% | Production |
| XGBoost | 99.92% | 92.35% | 79.59% | 85.49% | 97.84% | Production |
| Logistic Regression | 97.41% | 85.71% | 61.22% | 71.43% | 95.12% | Baseline |
| K-Nearest Neighbors | 99.65% | 89.47% | 69.39% | 78.16% | 96.53% | Standby |
| Decision Tree (AdaBoost) | 99.87% | 91.18% | 77.55% | 83.78% | 97.42% | Evaluation |
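The F1 Score column is the harmonic mean of precision and recall, so it can be recomputed from the other two columns as a quick consistency check. The sketch below does this with the precision, recall, and F1 values copied from the table. The helper `f1_score` is written here for illustration and is not a library call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in, same units out)."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) in percent, copied from the table above.
reported = {
    "Random Forest": (94.12, 81.63, 87.43),
    "XGBoost": (92.35, 79.59, 85.49),
    "Logistic Regression": (85.71, 61.22, 71.43),
    "K-Nearest Neighbors": (89.47, 69.39, 78.16),
    "Decision Tree (AdaBoost)": (91.18, 77.55, 83.78),
}
for model, (p, r, f1) in reported.items():
    print(f"{model}: recomputed F1 = {f1_score(p, r):.2f}%, reported {f1:.2f}%")
```

Because the inputs are already rounded to two decimals, the recomputed values may differ slightly from the reported column.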
Live Model Testing
Generate and analyze test transactions in real time
Generates a test transaction on demand and shows how the CAREN ensemble model analyzes it.
Feature Importance
Top PCA features contributing to fraud detection
Feature importance is derived from the Random Forest model's Gini impurity reduction. V14, V4, and V12 are the most discriminative PCA components for separating fraudulent from legitimate transactions.
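To make "Gini impurity reduction" concrete, the sketch below computes the impurity of a node and the decrease achieved by one split. The functions `gini` and `gini_reduction` and the toy labels are illustrative assumptions. A typical random forest implementation accumulates such reductions, weighted by node size, over every split that uses a feature and averages them across trees, although CAREN's exact procedure is not specified here.

```python
def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_reduction(parent, left, right):
    """Impurity decrease from splitting `parent` into `left` and `right`."""
    n = len(parent)
    weighted_children = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted_children

# Hypothetical node: 1 = fraud, 0 = legitimate. The split isolates all fraud cases on the left.
parent = [1, 1, 1, 0, 0, 0, 0, 0]
left, right = [1, 1, 1, 0], [0, 0, 0, 0]
reduction = gini_reduction(parent, left, right)
```

A feature such as V14 would rank highly when the splits made on it produce large reductions of this kind across many trees.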