Reading data is the first step in any data science project. As a machine learning practitioner or a data scientist, you have almost certainly come across JSON (JavaScript Object Notation) data. JSON is a widely used format for storing and exchanging data. For example, NoSQL databases like MongoDB store data in a JSON-like format, and REST API responses are mostly available in JSON.
Although this format works well for storing and exchanging data, it needs to be converted into a tabular form for further analysis. You are likely to deal with two types of JSON structure: a JSON object or…
Numerical data is common in data analysis. Often you have numerical data that is continuous, spans a very large scale, or is highly skewed. Sometimes it is easier to bin the values into discrete intervals. Binning helps with descriptive statistics because the values are divided into meaningful categories. For example, we can divide age into Toddler, Child, Adult, and Elder.
Pandas’ built-in cut() function is a great way to transform numerical data into categorical data. In this article, you’ll learn how to use it to deal with the following common tasks.
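As a minimal sketch of the age example, the snippet below bins a handful of ages with cut(). The ages, bin edges, and labels are assumptions chosen for illustration and are not taken from the article.

```python
import pandas as pd

# Hypothetical ages, chosen only for illustration.
ages = pd.Series([2, 7, 15, 34, 70])

# Bin edges and labels are assumptions. By default each bin includes its
# right edge, so (0, 3] maps to Toddler and (3, 17] maps to Child.
age_groups = pd.cut(
    ages,
    bins=[0, 3, 17, 63, 99],
    labels=["Toddler", "Child", "Adult", "Elder"],
)

print(age_groups.tolist())
```

The result is a categorical Series, so it can be passed straight to value_counts() or groupby() for descriptive statistics.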
DataFrame and Series are the two core data structures in Pandas. A DataFrame is a 2-dimensional labeled data structure with rows and columns, much like a spreadsheet or a SQL table. A Series is a 1-dimensional labeled array, somewhat like a more powerful version of the Python list. Understanding Series is very important, not only because it is one of the core data structures, but also because it is the building block of a DataFrame.
In this article, you’ll learn the most commonly used data operations with Pandas Series, which should help you get started with Pandas. …
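The relationship between a Series and a DataFrame can be sketched roughly as follows. The data and the names used here (prices, df) are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical data: a Series built from a list with custom labels.
prices = pd.Series([1.5, 2.0, 3.25], index=["apple", "banana", "cherry"])

# Label-based and position-based access to the same element.
first_by_label = prices["apple"]
first_by_position = prices.iloc[0]

# Arithmetic is vectorized: it applies to every element at once.
doubled = prices * 2

# Selecting a single DataFrame column gives back a Series.
df = pd.DataFrame({"price": prices})
column = df["price"]
print(type(column).__name__)
```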
The activation functions are at the very core of Deep Learning. They determine the output of a model, its accuracy, and computational efficiency. In some cases, activation functions have a major effect on the model’s ability to converge and the convergence speed.
In this article, you’ll learn why ReLU is used in Deep Learning and the best practices for using it with Keras and TensorFlow 2.
In artificial neural networks (ANNs), the activation function is a mathematical “gate” in between the input feeding the current neuron and its output going to the next layer [1].
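To make the “gate” idea concrete, here is a minimal sketch in plain Python, with no Keras or TensorFlow involved. The helper names relu and neuron_output are hypothetical; in Keras, the same activation is typically selected by passing activation="relu" to a layer.

```python
def relu(x: float) -> float:
    """Rectified linear unit: pass positive inputs through, clamp the rest to 0."""
    return max(0.0, x)


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by the ReLU gate."""
    pre_activation = sum(i * w for i, w in zip(inputs, weights)) + bias
    return relu(pre_activation)


# A negative pre-activation is blocked, a positive one passes unchanged.
print(neuron_output([1.0, 2.0], [0.5, -1.0], 0.0))
print(neuron_output([1.0, 2.0], [0.5, 0.5], 0.5))
```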
The activation functions are at the very core of Deep Learning. They determine the output of a model, its accuracy, and computational efficiency. In some cases, activation functions have a major effect on the model’s ability to converge and the convergence speed.
In this article, you’ll learn the following most popular activation functions in Deep Learning and how to use them with Keras and TensorFlow 2.
Callbacks are an important type of object in Keras and TensorFlow. They are designed to monitor the model’s performance metrics at certain points in the training run and to perform actions that may depend on those metric values.
Keras provides a number of built-in callbacks, for example, EarlyStopping, CSVLogger, ModelCheckpoint, LearningRateScheduler, etc. Apart from these popular built-in callbacks, there is a base class called Callback which allows us to create our own callbacks and perform some custom actions. …
Reading data is the first step in any data science project. Often, you’ll work with data in JSON format and run into problems at the very beginning. In this article, you’ll learn how to use the Pandas built-in functions read_json() and json_normalize() to deal with the following common problems:
Please check out Notebook for the source code.
Let’s begin with a simple example.
[ { "id"…
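As a rough sketch of what flattening looks like, the snippet below applies json_normalize() to a small nested structure. The records and their field names are made up for illustration and are not the article’s data.

```python
import pandas as pd

# Hypothetical nested records; the field names are assumptions.
records = [
    {"id": 1, "name": "Ann", "contact": {"email": "ann@example.com", "city": "Oxford"}},
    {"id": 2, "name": "Bob", "contact": {"email": "bob@example.com", "city": "Leeds"}},
]

# Nested keys are flattened into dotted column names such as "contact.city".
flat = pd.json_normalize(records)
print(flat.columns.tolist())
```

Each nested key becomes its own column, so the result is an ordinary table that can be filtered and aggregated like any other DataFrame.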
Web scraping is the process of collecting and parsing data from the web. The Python community has come up with some pretty powerful web scraping tools. However, many modern websites are dynamic: their content is loaded and populated by client-side JavaScript. Therefore, some extra setup is required in order to scrape data from JavaScript webpages.
In this article, you’ll learn how to scrape tables from a JavaScript webpage using Selenium, BeautifulSoup, and Pandas.
Please check out the source code from…
Web scraping is the process of collecting and parsing data from the web. The Python community has come up with some pretty powerful web scraping tools. Among them, Pandas read_html() is a quick and convenient way to scrape data from HTML tables.
In this article, you’ll learn how to use Pandas read_html() to deal with the following common problems, which should help you get started with web scraping.
parse_dates
converters
match
Suppose you need to shift all rows in a DataFrame, or to use the previous row’s values in a calculation. Maybe you want to calculate the difference between consecutive rows. Pandas shift() would be an ideal way to achieve these objectives.
In this article, we’ll go through some examples of manipulating data using the Pandas shift() function. We will focus on practical problems, which should help you get started with data analysis.
periods
freq
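The consecutive-row difference mentioned above can be sketched as follows. The column names and values are hypothetical, and only the periods argument is shown.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
df = pd.DataFrame({"sales": [10, 12, 9, 15]})

# shift(periods=1) moves every value down one row; the first row becomes NaN.
df["previous"] = df["sales"].shift(periods=1)

# Difference between each row and the row before it.
df["change"] = df["sales"] - df["previous"]

print(df)
```

Because the first row has no predecessor, its previous and change values are NaN, and both new columns are stored as floats.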
Machine Learning practitioner | Formerly health informatics at University of Oxford | Ph.D.