Predicting Customer Satisfaction for the purchase made from the Brazilian e-commerce site Olist.

This article includes:
1. Introduction
2. Business Problem
3. Problem Statement
4. Business Objectives and Constraints
5. Machine Learning Formulation
i. Data Overview
ii. Data Description
iii. Machine Learning Problem
iv. Performance Metrics
6. Exploratory Data Analysis (EDA)
a. Data Cleaning and Deduplication
b. High Level Statistics
c. Univariate Analysis
d. Bivariate Analysis
e. Multivariate Analysis
f. RFM Analysis
g. Conclusion
7. Data Preprocessing and Feature Engineering
8. Model Selection
9. Summary
10. Deployment
11. Improvements to Existing Approach
12. Future Work
13. References

The e-commerce sector is evolving rapidly as internet access has expanded to more parts of the world over the years. This sector is redefining commercial activities worldwide…


A box plot (also called a box-and-whisker plot) is a popular and widely used plot for visualizing data in statistics and data analysis. Compared with other graphical techniques, a box plot shows not only the distribution and spread of the data but also its minimum and maximum values, its quartiles, and its symmetry and skewness. A box plot is also used to detect outliers. In machine learning, you may have used this plot during exploratory data analysis.

Let us understand more about it.

This article includes:
1. What is Box Plot?
2. How to read a Box Plot?
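The quantities a box plot displays can be computed by hand. The following is a minimal sketch, not the article's own code: the function name `box_plot_stats`, the sample values, and the choice of the common 1.5 × IQR whisker rule are all assumptions made for illustration.

```python
import statistics


def box_plot_stats(values):
    """Compute the numbers a box plot draws (hypothetical helper).

    Assumes the common convention that whiskers extend to the most
    extreme points within 1.5 * IQR of the box.
    """
    data = sorted(values)
    # Quartiles using linear interpolation between data points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Made-up sample: one value sits far above the rest.
sample = [2, 4, 4, 5, 6, 7, 8, 9, 30]
stats = box_plot_stats(sample)
print(stats)
```

For this sample the box spans 4 to 8 with the median at 6, the whiskers reach 2 and 9, and 30 is reported as an outlier. Other quartile conventions exist and can give slightly different box edges.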


As we know, outliers are patterns in a dataset that do not conform to the expected behaviour. They may appear in a dataset because of low-quality measurements, malfunctioning equipment, manual error, etc. The presence of outliers can cause problems when building a good machine learning model.

There are two main types of outliers:

Global Outliers - Data points that are significantly different from the rest of the dataset are called global outliers.

Local Outliers - Data points that are significantly different from their neighbours in the dataset are called local outliers.
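The difference between the two types can be illustrated on a short sequence of values. This is a rough sketch under stated assumptions, not the article's method: the helper names, the 1.5 × IQR rule for global outliers, the sliding window of two neighbours on each side, and the threshold of 5.0 are all hypothetical choices.

```python
import statistics


def global_outliers(values, k=1.5):
    """Indices of values outside the IQR fences of the whole dataset."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, x in enumerate(values) if x < low or x > high]


def local_outliers(values, window=2, threshold=5.0):
    """Indices of values far from the median of their nearby neighbours."""
    flagged = []
    for i, x in enumerate(values):
        # Up to `window` neighbours on each side, excluding the point itself.
        neighbours = values[max(0, i - window):i] + values[i + 1:i + 1 + window]
        if abs(x - statistics.median(neighbours)) > threshold:
            flagged.append(i)
    return flagged


# Made-up series: a steady upward trend with two unusual values.
series = [10, 11, 12, 13, 25, 15, 16, 17, 18, 19, 20,
          21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 90]

global_idx = global_outliers(series)
local_idx = local_outliers(series)
print("global:", [series[i] for i in global_idx])
print("local: ", [series[i] for i in local_idx])
```

Here 90 stands out against the whole dataset, so both checks flag it. The 25 at position 4 lies inside the overall range of the data and is flagged only because it differs sharply from the values around it.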


The NumPy library is an important foundational tool for machine learning. Many of its functions are useful for mathematical and scientific calculations. Since mathematics is the foundation of machine learning, most of the mathematical tasks involved can be performed with NumPy.
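As a small illustration of the kind of mathematical task meant here, the sketch below standardizes a feature matrix and computes a linear combination of its columns. The array values and the names `X`, `w`, and `X_scaled` are made up for this example, and it assumes NumPy is installed.

```python
import numpy as np

# Hypothetical feature matrix: 3 samples, 2 features.
X = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

# Hypothetical weight vector for a linear model.
w = np.array([0.5, -1.0])

# Column-wise mean and (population) standard deviation.
col_mean = X.mean(axis=0)
col_std = X.std(axis=0)

# Standardize each feature; broadcasting applies the operation per column.
X_scaled = (X - col_mean) / col_std

# Matrix-vector product: one linear prediction per sample.
predictions = X @ w

print(col_mean)
print(X_scaled)
print(predictions)
```

After standardization each column has mean 0 and standard deviation 1, which is a common preprocessing step before fitting a model.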


Matplotlib is one of the oldest and most popular plotting libraries in Python and is widely used in machine learning, where it helps make sense of large amounts of data through different visualisations.

Now, let us explore more about Matplotlib.

Contents
1. Introduction to Matplotlib
2. How to Install?
3. How to import?
4. Understanding the basics of Graph/Plots using Matplotlib
5. Important plots used in Machine Learning
6. Three-Dimensional Plotting with Matplotlib

1. Introduction to Matplotlib
Matplotlib is an open-source plotting library in Python introduced in the year 2003. …


Pandas for Machine Learning

Pandas is one of the tools used in machine learning for data cleaning and analysis. It has features for exploring, cleaning, transforming and visualizing data.
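A brief sketch of the cleaning and transforming steps mentioned above is shown here. The column names (`order_id`, `price`, `quantity`, `total`) and the values are invented for illustration, filling missing prices with the median is only one possible choice, and the example assumes pandas is installed.

```python
import pandas as pd

# Hypothetical order data with one duplicated row and one missing price.
df = pd.DataFrame({
    "order_id": [1, 2, 2, 3, 4],
    "price": [10.0, 20.0, 20.0, None, 30.0],
    "quantity": [1, 2, 2, 1, 3],
})

# Exploring: summary statistics of the numeric columns.
print(df.describe())

# Cleaning: remove exact duplicate rows, then fill the missing price
# with the median of the known prices.
clean = df.drop_duplicates().copy()
clean["price"] = clean["price"].fillna(clean["price"].median())

# Transforming: derive a new column from existing ones.
clean["total"] = clean["price"] * clean["quantity"]

print(clean)
```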

Paritosh Mahto

Currently working on: Lithium-Ion Battery Capacity Estimation using Machine Learning
