Python Visualization with Seaborn
Hello world! Last time we talked about data visualization with matplotlib. Today we will go through another visualization library, one which I personally like: Seaborn. Seaborn is a statistical plotting library. One reason it is a personal favorite is that it has beautiful default styles. It is built on top of matplotlib, so it is not a replacement and your matplotlib knowledge is not wasted. It is also designed to work very well with pandas DataFrame objects.
Installing Seaborn
To use Seaborn in your Jupyter Notebook, first install it with conda or pip by entering one of the following commands on your command line (the commands are lowercase):
conda install seaborn
Or
pip install seaborn
Importing Seaborn
import seaborn as sns
%matplotlib inline
Types of Plots
Before we get into the coding in Seaborn, let us list the types of plots we will be discussing. They are as follows:
1. Distribution Plots
2. Categorical Plots
3. Matrix Plots
4. Regression Plots
5. Grid Plots
1. Distribution Plots:
To visualize the distribution of a dataset we use the plots below. These plots are:
- distplot
- jointplot
- pairplot
- rugplot
- kdeplot
We will be using one of seaborn's built-in example datasets, tips, to work here. It is loaded into a DataFrame named tips with sns.load_dataset('tips').
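A minimal sketch of the loading step. load_dataset downloads the file from seaborn's online example-data repository the first time it is called, so it needs an internet connection. The fallback frame is my own assumption: a few hand-typed stand-in rows with the same column names, there only so the snippet still runs offline.

```python
import pandas as pd
import seaborn as sns

try:
    # Downloads the CSV from seaborn's online example-data repository
    # (it is cached locally after the first call)
    tips = sns.load_dataset("tips")
except Exception:
    # Assumption: offline stand-in with the same column names, NOT the real dataset
    tips = pd.DataFrame({
        "total_bill": [16.99, 10.34, 21.01, 23.68],
        "tip": [1.01, 1.66, 3.50, 3.31],
        "sex": ["Female", "Male", "Male", "Male"],
        "smoker": ["No", "No", "No", "No"],
        "day": ["Sun", "Sun", "Sun", "Sun"],
        "time": ["Dinner", "Dinner", "Dinner", "Dinner"],
        "size": [2, 3, 3, 2],
    })

# Preview the first rows and the column names
print(tips.head())
```

In a notebook you can simply end the cell with tips.head() instead of print to get the nicely rendered table.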
· distplot
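The distplot shows the distribution of a univariate set of observations as a histogram with a KDE curve layered on top. Note that distplot has been deprecated since seaborn 0.11; histplot and displot are its replacements.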
To remove the KDE layer and keep just the histogram, pass kde=False to distplot. You can also set the number of bars with the bins argument.
· jointplot
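jointplot() lets you match up two distplots for bivariate data: the two variables are plotted against each other in the middle, with the distribution of each one shown along the margins. The kind parameter chooses how the two variables are compared: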
- “scatter”
- “reg”
- “resid”
- “kde”
- “hex”
You can try out each of these kinds to find the visualization you want.
· pairplot
pairplot plots pairwise relationships across an entire DataFrame (for the numerical columns) and supports a color hue argument (for categorical columns). Enter the code below:
sns.pairplot(tips)
We can color the points by a categorical column such as “sex” by passing hue='sex' to pairplot (note: palette='coolwarm' is just a color style I use; it is not an essential part of the code).
· rugplot
Rugplots are a very simple concept: they just draw a dash mark for every point on a univariate distribution. They are the building block of a KDE plot.
· kdeplot
kdeplots are Kernel Density Estimation plots. A KDE plot replaces every single observation with a Gaussian (Normal) distribution centered around that value, then adds these curves together to get one smooth density curve. For a deeper mathematical understanding you can read up on kernel density estimation; here we will just plot it.
Next time I will be discussing the categorical plot types.