# 11. Mosaic Plots

Mosaic plots let you arrange several subplots of different sizes in one figure by describing the layout as a small grid of labels. Let’s see how they work.

## 11.1. Housing data

For these examples, we will use the California housing data that ships with scikit-learn.

[1]:

from sklearn.datasets import fetch_california_housing

X, y = fetch_california_housing(return_X_y=True, as_frame=True)

[2]:

X.shape, y.shape

[2]:

((20640, 8), (20640,))


## 11.2. Mosaic plot by string configuration

You can configure a mosaic plot with a string in which each character is the code for a subplot and each line is a row of the grid. Here, we want 3 subplots coded a, b and c. Note how a and b each occupy a single cell in the first column, while c spans both rows of the second column.

[3]:

import matplotlib.pyplot as plt
import seaborn as sns

fig = plt.figure(layout='constrained', figsize=(5, 3.5))
ax = fig.subplot_mosaic('''
ac
bc
''')

X['HouseAge'].plot(kind='kde', ax=ax['a'])
sns.kdeplot(X['HouseAge'], ax=ax['b'], cumulative=True, color='g')
X['HouseAge'].plot(kind='box', ax=ax['c'])

ax['a'].set_title('PDF')
ax['b'].set_title('CDF')

[3]:

Text(0.5, 1.0, 'CDF')
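
As a minimal sketch of how the string is read, the same layout can also be written on one line, with `;` separating the rows. The `Agg` backend line is an assumption made here so that the snippet runs outside a notebook; the loop simply prints the grid rows and columns that each subplot covers.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

fig = plt.figure(layout="constrained", figsize=(5, 3.5))

# ';' separates rows, so "ac;bc" describes the same grid as the multi-line string
ax = fig.subplot_mosaic("ac;bc")

# Report which grid rows and columns each labelled subplot occupies
for name, axes in ax.items():
    spec = axes.get_subplotspec()
    print(name, "rows:", list(spec.rowspan), "cols:", list(spec.colspan))
```

A `.` in the string leaves that cell empty, which is what the diagonal of the larger example later in this chapter relies on.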


## 11.3. Mosaic plot by list configuration

We can also give the subplots more meaningful names by placing them in a matrix (a list of lists), one inner list per row. Instead of a, b and c, we now use pdf, cdf and box, respectively.

[4]:

fig = plt.figure(layout='constrained', figsize=(5, 3.5))
ax = fig.subplot_mosaic([
    ['pdf', 'box'],
    ['cdf', 'box']
])

X['HouseAge'].plot(kind='kde', ax=ax['pdf'])
sns.kdeplot(X['HouseAge'], ax=ax['cdf'], cumulative=True, color='g')
X['HouseAge'].plot(kind='box', ax=ax['box'])

ax['pdf'].set_title('PDF')
ax['cdf'].set_title('CDF')

[4]:

Text(0.5, 1.0, 'CDF')
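
Because the keys are now descriptive, they can double as labels. The following is a small sketch, not part of the original notebook: it sets every title from its key in one loop, and it passes `gridspec_kw` to make the left column twice as wide as the right one. The ratio `[2, 1]` and the `Agg` backend line are assumptions chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; not needed inside a notebook
import matplotlib.pyplot as plt

fig = plt.figure(layout="constrained", figsize=(5, 3.5))
ax = fig.subplot_mosaic(
    [["pdf", "box"],
     ["cdf", "box"]],
    gridspec_kw={"width_ratios": [2, 1]},  # assumption: left column twice as wide
)

# The dictionary keys are the names from the layout, so reuse them as titles
for name, axes in ax.items():
    axes.set_title(name.upper())
```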


## 11.4. Fancy example

Here’s a fancier mosaic plot: a grid of pairwise scatter plots of the features, with the diagonal left empty. In the upper triangle, each point is colored by whether its y value is above the mean of y (red) or at or below it (blue). In the lower triangle, the criterion is different: a point is blue if its y value is at least one standard deviation below the mean of y, and red otherwise.
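
To make the two coloring criteria concrete, here is a small sketch on a made-up series. The name `target` and its five values are hypothetical stand-ins for y, chosen so that the two rules split the points differently.

```python
import pandas as pd

target = pd.Series([0.0, 4.0, 5.0, 6.0, 10.0])  # hypothetical stand-in for y

# Upper-triangle rule: red if above the mean, blue otherwise
above_mean = target > target.mean()

# Lower-triangle rule: blue if at least one standard deviation below the mean
low_tail = target <= target.mean() - target.std()

print(above_mean.sum(), (~above_mean).sum())  # 2 red, 3 blue
print((~low_tail).sum(), low_tail.sum())      # 4 red, 1 blue
```

The second rule flags far fewer points as blue, which is why the lower-triangle plots are mostly red.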

[5]:

def get_key(r, c, x_col, y_col):
    # '.' marks an empty cell, so the diagonal is left blank
    if r == c:
        return '.'
    if c < r:
        return f'{x_col}_{y_col}_lower'
    return f'{x_col}_{y_col}_upper'

def do_plot(r, c, x_col, y_col, ax):
    k = get_key(r, c, x_col, y_col)
    if k == '.':
        return
    elif c < r:
        # lower triangle: blue if y is at least one standard deviation below its mean
        low = y <= y.mean() - y.std()
        X[~low].plot(kind='scatter', y=x_col, x=y_col, ax=ax[k], s=1, color='r')
        X[low].plot(kind='scatter', y=x_col, x=y_col, ax=ax[k], s=1, color='b')
    else:
        # upper triangle: red if y is above its mean, blue otherwise
        high = y > y.mean()
        X[high].plot(kind='scatter', x=x_col, y=y_col, ax=ax[k], s=1, color='r')
        X[~high].plot(kind='scatter', x=x_col, y=y_col, ax=ax[k], s=1, color='b')

# one row of keys per feature, one key per feature pair
mosaic = [[get_key(r, c, x_col, y_col) for c, y_col in enumerate(X.columns)] for r, x_col in enumerate(X.columns)]

fig = plt.figure(layout='constrained', figsize=(20, 20))
ax = fig.subplot_mosaic(mosaic)

for r, x_col in enumerate(X.columns):
    for c, y_col in enumerate(X.columns):
        do_plot(r, c, x_col, y_col, ax)

/opt/anaconda3/lib/python3.9/site-packages/pandas/plotting/_matplotlib/core.py:1114: UserWarning: No data for colormapping provided via 'c'. Parameters 'cmap' will be ignored
scatter = ax.scatter(
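
To see what the nested comprehension hands to `subplot_mosaic`, the sketch below builds the same kind of layout for three hypothetical column names (`A`, `B`, `C`) instead of the eight housing features. It repeats `get_key` so that it runs on its own.

```python
def get_key(r, c, x_col, y_col):
    if r == c:
        return '.'  # empty cell on the diagonal
    if c < r:
        return f'{x_col}_{y_col}_lower'
    return f'{x_col}_{y_col}_upper'

cols = ['A', 'B', 'C']  # hypothetical column names, for illustration only

layout = [[get_key(r, c, x_col, y_col) for c, y_col in enumerate(cols)]
          for r, x_col in enumerate(cols)]

for row in layout:
    print(row)
# ['.', 'A_B_upper', 'A_C_upper']
# ['B_A_lower', '.', 'B_C_upper']
# ['C_A_lower', 'C_B_lower', '.']
```

Every off-diagonal key is unique, so each cell becomes its own subplot, and the suffix records which triangle (and therefore which coloring rule) it belongs to.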