XGBoost Classifier

1 minute read

Using XGBoost to predict staff promotion.

XGBoost is one of the most popular machine learning algorithms these days, regardless of the type of prediction task at hand: regression or classification.

XGBoost is well known for providing better solutions than many other machine learning algorithms. In fact, since its inception, it has become a “state-of-the-art” algorithm for structured (tabular) data.

But what makes XGBoost so popular?

  • Speed and performance: Originally written in C++, it is comparatively faster than many other ensemble classifiers.

  • Core algorithm is parallelizable: Because the core XGBoost algorithm is parallelizable, it can harness the power of multi-core computers. It can also run on GPUs and across networks of machines, making it feasible to train on very large datasets.

  • Consistently strong performance: It has shown better performance than many other algorithms on a variety of machine learning benchmark datasets.

  • Wide variety of tuning parameters: XGBoost exposes parameters for cross-validation, regularization, user-defined objective functions, missing-value handling, and tree construction, along with a scikit-learn compatible API (see the sketch after this list).
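To make the scikit-learn compatible API concrete, here is a minimal sketch of fitting an `XGBClassifier` with a few common tuning parameters. The synthetic data and the parameter values are illustrative assumptions, not recommendations for the promotion dataset.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Synthetic data purely for illustration
X, y = make_classification(n_samples=1000, n_features=20, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# A few of the many tunable parameters; values are illustrative, not tuned
model = XGBClassifier(
    n_estimators=200,    # number of boosting rounds
    max_depth=4,         # maximum tree depth
    learning_rate=0.1,   # shrinkage applied to each round
    reg_lambda=1.0,      # L2 regularization on leaf weights
)
model.fit(X_train, y_train)
print(model.score(X_test, y_test))  # mean accuracy via the scikit-learn API
```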

# Using XGBoost in Python

First of all, just like with any other dataset, you import the dataset and store it in a variable called “Main_Data”. To import it we use the Pandas Python package, and we import the other libraries as we did for the EDA (Exploratory Data Analysis); the loading step itself is sketched after the imports below.

  1. Importing libraries (adding Pandas, since it is used for loading the data):

```python
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_score
from imblearn.over_sampling import SMOTE
from xgboost import XGBClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
from sklearn.metrics import mean_squared_error, f1_score, precision_score, recall_score
```
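
  2. Loading the dataset: a minimal sketch of this step might look like the following, reusing the imports from step 1. The file name “staff_promotion.csv”, the target column “Promoted_or_Not”, and the assumption that the features are already numeric are placeholders for illustration, not the actual dataset.

```python
# Load the dataset into Main_Data (file name is an assumed placeholder)
Main_Data = pd.read_csv("staff_promotion.csv")

# Assumed target column; features are assumed numeric for this sketch
X = Main_Data.drop(columns=["Promoted_or_Not"])
y = Main_Data["Promoted_or_Not"]

# Hold out a test set, then oversample the minority class with SMOTE
# (on the training portion only, to avoid leaking into the test set)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train, y_train = SMOTE(random_state=42).fit_resample(X_train, y_train)

# Standardize features and fit the classifier
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

model = XGBClassifier()
model.fit(X_train, y_train)
pred = model.predict(X_test)

print(accuracy_score(y_test, pred))
print(confusion_matrix(y_test, pred))
print(classification_report(y_test, pred))
```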