When exporting a styled pandas DataFrame to HTML, I would like to use background_gradient on data that is not directly numeric. Specifically, I'd like to colour a table according to the mean value while still displaying "mean ± stdv" to the user. The problem is that "mean ± stdv" is a string. I don't think df.style.applymap is ideal, because it works element-wise and I want a continuous colourmap, which requires knowledge of all the values in a column.

As an example:

from pathlib import Path

import pandas as pd

# cannot be easily used with background_gradient
df_with_stdv = pd.DataFrame({"A": ["0.40±0.14", "0.70±0.14", "0.28±0.08"],
                             "B": ["0.13±0.15", "0.40±0.25", "0.23±0.11"]})

# can be used, but missing other information
df = pd.DataFrame({"A": [0.40, 0.7, 0.28],
                   "B": [0.13, 0.4, 0.23]})

s = df.style.background_gradient(cmap='viridis', subset="A")

path = Path("<output path here>")

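with open(path, "w", encoding="utf-8") as f:
    # Styler.to_html() (pandas >= 1.3) replaces render(), which newer pandas no longer provides
    f.write(f"<!DOCTYPE html><html><body>{s.to_html()}</body></html>")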

I am imagining some sort of formatter that splits on "±" and then colours by the first value. If this is not possible, a solution that takes columns A_mean and A_stdv, colours by A_mean, and displays "A_mean±A_stdv" would also work.

Thanks!

Hemmer

1 Answer

Closely inspired by "Pandas - Style - Background Gradient using other dataframe", I have managed to find a solution:

from pathlib import Path


import matplotlib.pyplot as plt
import pandas as pd
from matplotlib import colors

# cannot be easily used with background_gradient
B = pd.DataFrame({"A": ["0.40±0.14", "0.70±0.14", "0.28±0.08"],
                  "B": ["0.13±0.15", "0.40±0.25", "0.23±0.11"]})


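def b_g(s, cmap='PuBu', low=0, high=0):
    """Return one background-color CSS string per cell of column s."""
    # take the text preceding the "±" as the numeric mean
    a = s.apply(lambda x: float(x.split("±")[0]))

    # widen the colour range by the fractions low/high of the data range
    rng = a.max() - a.min()
    norm = colors.Normalize(a.min() - (rng * low),
                            a.max() + (rng * high))
    normed = norm(a.values)
    # plt.get_cmap accepts a name or a Colormap object
    # (plt.cm.get_cmap is no longer available in recent matplotlib)
    c = [colors.rgb2hex(x) for x in plt.get_cmap(cmap)(normed)]
    return ['background-color: %s' % color for color in c]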


cmap_higher_better = colors.LinearSegmentedColormap.from_list("", ["red", "yellow", "green"])
cmap_lower_better = colors.LinearSegmentedColormap.from_list("", ["green", "yellow", "red"])

s = (B.style
     .apply(b_g, cmap=cmap_higher_better, subset="A")
     .apply(b_g, cmap=cmap_lower_better, subset="B"))

path = Path("<output path here>")

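with open(path, "w", encoding="utf-8") as f:
    # Styler.to_html() (pandas >= 1.3) replaces render(), which newer pandas no longer provides
    f.write(f"<!DOCTYPE html><html><body>{s.to_html()}</body></html>")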

Output

[Image: coloured dataframe with mean±std format]

Hemmer