Hands-on tutorial to effectively use different Regression Algorithms

Linear regression is usually the first algorithm that people learn for Machine Learning and Data Science. It’s simple and easy to understand; however, it’s unlikely to be the best choice for real-world data because of its limited capabilities. …


Hands-on tutorial to effectively use Hyperparameter tuning in Machine Learning

In Machine Learning, hyperparameters refer to the parameters that cannot be learned from data and need to be provided before training. The performance of machine learning models relies heavily on finding the optimal set of hyperparameters.

Hyperparameter tuning refers to tweaking the hyperparameters of the model, which is basically…
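
To make the distinction between learned parameters and hyperparameters concrete, here is a minimal sketch in plain Python. It is not taken from the article: the toy data, the one-dimensional ridge model, and the helper names (fit_ridge_1d, mse, grid_search) are all assumptions made for illustration. The weight w is learned from the training data, while alpha has to be chosen beforehand, here by trying a small grid of candidates against a validation set.

```python
# Toy data where y is roughly 2 * x (values invented for illustration).
train = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]
valid = [(4.0, 8.1), (5.0, 9.8)]


def fit_ridge_1d(data, alpha):
    """Fit y = w * x with an L2 penalty.

    w is learned from the data; alpha is a hyperparameter supplied up front.
    """
    sxy = sum(x * y for x, y in data)
    sxx = sum(x * x for x, _ in data)
    return sxy / (sxx + alpha)


def mse(data, w):
    """Mean squared error of the model y = w * x on the given data."""
    return sum((y - w * x) ** 2 for x, y in data) / len(data)


def grid_search(alphas):
    """Score every candidate alpha on the validation set and pick the best."""
    scores = {a: mse(valid, fit_ridge_1d(train, a)) for a in alphas}
    best = min(scores, key=scores.get)
    return best, scores


best_alpha, scores = grid_search([0.0, 0.1, 1.0, 10.0])
print(best_alpha, scores)
```

In practice a library routine would usually replace the hand-written loop, but the structure is the same: a fixed set of candidate values, a score for each, and a choice of the best one.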


5 tricks to effectively use the Pandas count() method

The Pandas library is one of the most preferred tools for data manipulation and analysis. Data Scientists often spend most of their time exploring and preprocessing the data. …


Pandas tricks for Exploratory Data Analysis and Data Preprocessing

Data Scientists often spend most of their time exploring and preprocessing data. When it comes to data profiling and understanding the data structure, Pandas value_counts() is one of the top favorites. The function returns a Series containing counts of unique values. …
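
As a small illustration of the behaviour described above, the following sketch calls value_counts() on a made-up Series. The sample values are assumptions for demonstration only and do not come from the article.

```python
import pandas as pd

# Invented sample data for illustration.
s = pd.Series(["a", "b", "a", "c", "a", "b"])

# Counts of each unique value, sorted from most to least frequent.
counts = s.value_counts()

# The same counts expressed as fractions of the total.
shares = s.value_counts(normalize=True)

print(counts)
print(shares)
```

The result is itself a Series indexed by the unique values, so an individual count can be looked up with counts["a"].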


Creating animation graph with matplotlib FuncAnimation in Jupyter Notebook

When fitting values to a line using Linear Regression, it can be very helpful to illustrate how the line fits the data as more data are added. …


Tips and tricks to transform numerical data into categorical data

Numerical data is common in data analysis. Often you have numerical data that is continuous, on a very large scale, or highly skewed. Sometimes, it can be easier to bin those data into discrete intervals. Dividing values into meaningful categories is very helpful when computing descriptive statistics. …
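
One common way to do this kind of binning in Pandas is pd.cut(). The sketch below is only an example under stated assumptions: the ages, bin edges, and labels are invented, and the article itself may use different functions or settings.

```python
import pandas as pd

# Invented ages for illustration.
ages = pd.Series([3, 17, 25, 41, 68, 90])

# Bin edges and labels are assumptions. With the default right=True,
# each interval includes its right edge: (0, 18], (18, 65], (65, 120].
bins = [0, 18, 65, 120]
labels = ["child", "adult", "senior"]

age_group = pd.cut(ages, bins=bins, labels=labels)

# Descriptive statistics per category are now straightforward.
print(age_group.value_counts())
```

The result is a categorical Series, so it keeps the order of the labels and can be used directly for grouping or counting.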


Creating animation graph with matplotlib FuncAnimation in Jupyter Notebook

Matplotlib is one of the most popular plotting libraries for exploratory data analysis. It’s the default plotting backend in Pandas, and other popular plotting libraries, such as seaborn, are based on it. Plotting a static graph should work well in most cases, but when you are running simulations or doing…


Pandas tips and tricks to help you get started with Data Analysis

Pandas is one of the most important libraries for Data Manipulation and Analysis. It not only offers data structures and operations for manipulating data but also prints the result in a pretty tabular form with labeled rows and columns.

In most cases, the default settings of Pandas display should work…


Tips and tricks to improve your Google Colab Experience

Colab (short for Colaboratory) is a free platform from Google that allows users to code in Python. Colab is essentially the Google version of a Jupyter Notebook. Some of the advantages of Colab over Jupyter include zero configuration, free access to GPUs & CPUs, and seamless sharing of code.

More…


Pandas tips and tricks to help you get started with data analysis

A MultiIndex (also known as a hierarchical index) DataFrame allows you to have multiple columns acting as a row identifier and multiple rows acting as a header identifier. With MultiIndex, you can do some sophisticated data analysis, especially for working with higher dimensional data. …
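
A minimal sketch of such a DataFrame is shown below, with two index levels identifying each row and two header levels identifying each column. The city names, years, and temperature values are invented for illustration and are not taken from the article.

```python
import pandas as pd

# Two levels identify each row: (city, year). Values are invented.
index = pd.MultiIndex.from_tuples(
    [("London", "2019"), ("London", "2020"),
     ("Oxford", "2019"), ("Oxford", "2020")],
    names=["city", "year"],
)

# Two levels identify each column: (measure, statistic).
columns = pd.MultiIndex.from_product([["temp"], ["min", "max"]])

df = pd.DataFrame(
    [[1, 20], [2, 22], [0, 19], [1, 21]],
    index=index,
    columns=columns,
)

# Select every row for one city using the outer index level.
london = df.loc["London"]

# Select a single cell with a full row key and a full column key.
value = df.loc[("Oxford", "2020"), ("temp", "max")]

print(df)
print(london)
print(value)
```

Selecting by the outer level alone returns a smaller DataFrame indexed by the remaining level, which is what makes this layout convenient for higher-dimensional data.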

B. Chen

Machine Learning practitioner | Formerly health informatics at University of Oxford | Ph.D. | https://www.linkedin.com/in/bindi-chen-aa55571a/
