A walkthrough of the Gradient Boosted Trees algorithm’s maths

XGBoost (https://github.com/dmlc/xgboost) is one of the most popular and efficient implementations of the Gradient Boosted Trees algorithm, a supervised learning method that approximates a target function by optimizing a chosen loss function while applying several regularization techniques.

The original XGBoost paper can be found here: https://arxiv.org/pdf/1603.02754.pdf
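
For orientation, the regularized objective that the paper minimizes can be written as follows. This is a brief restatement in the paper's notation, not a derivation: l is a differentiable loss, each f_k is one regression tree in the ensemble, T is the number of leaves in a tree, w is its vector of leaf weights, and γ and λ are regularization parameters.

```latex
\mathcal{L}(\phi) = \sum_{i} l\left(\hat{y}_i, y_i\right) + \sum_{k} \Omega\left(f_k\right),
\qquad
\Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert w \rVert^{2}
```

The first sum measures how well the predictions fit the training labels, and the second penalizes the complexity of each tree.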

The purpose…


…and a “Robust” scikit-learn Label Encoder

Machine learning algorithms and models take numerical features as input. Real-world examples include age, income, days since last transaction, and many others.

Why do we need to encode categorical values?
Logistic regression and Neural Nets are simple or complex nested numerical functions, Random Forests and GBMs…
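
To make the encoding idea concrete, here is a minimal sketch of a label encoder that tolerates categories it did not see during fitting. It is not the author's implementation and it does not use scikit-learn. The class name `RobustLabelEncoder` and the `unknown_code` parameter are hypothetical names chosen for illustration. The fallback is included because scikit-learn's own `LabelEncoder.transform` raises a `ValueError` on labels that were not present at fit time.

```python
class RobustLabelEncoder:
    """Hypothetical sketch: map categories to integers and
    send unseen categories to a reserved code instead of failing."""

    def __init__(self, unknown_code=-1):
        # Assumption: -1 is reserved for categories not seen during fit.
        self.unknown_code = unknown_code
        self.mapping_ = {}

    def fit(self, values):
        # Sort the unique values so the mapping is deterministic.
        self.mapping_ = {v: i for i, v in enumerate(sorted(set(values)))}
        return self

    def transform(self, values):
        # Look up each value, falling back to the reserved code.
        return [self.mapping_.get(v, self.unknown_code) for v in values]


encoder = RobustLabelEncoder().fit(["red", "green", "blue", "green"])
codes = encoder.transform(["green", "red", "purple"])
print(codes)
```

Here "purple" was never seen during fitting, so it receives the reserved code rather than raising an error. Whether a single reserved code is the right choice depends on the downstream model.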


This post is about Kaggle’s “Home Depot Product Search Relevance” competition: Home Depot is asking Kagglers to help them improve their customers’ shopping experience by developing a model that can accurately predict the relevance of search results.

The work we present here was done by our team consisting of the…

Dimitris Leventis

Senior Software Engineer at Microsoft | ML/AI | Kaggle Master
