
I am trying to format the index names so that LaTeX special characters in them are escaped when using .to_latex(). Using .format_index() escapes only the index values, not the index names.

failed_table

Here is a Minimal, Reproducible Example.

import pandas as pd
import numpy as np
import pylatex as pl

dict1= {
    'employee_w': ['John_Smith','John_Smith','John_Smith', 'Marc_Jones','Marc_Jones', 'Tony_Jeff', 'Maria_Mora','Maria_Mora'],
    'customer&client': ['company_1','company_2','company_3','company_4','company_5','company_6','company_7','company_8'],
    'calendar_week': [18,18,19,21,21,22,23,23],
    'sales': [5,5,5,5,5,5,5,5],
}

df1 = pd.DataFrame(data = dict1)

ptable = pd.pivot_table(
    df1,
    values='sales',
    index=['employee_w','customer&client'],
    columns=['calendar_week'],
    aggfunc=np.sum
)

mystyler = ptable.style
mystyler.format(na_rep='-', precision=0, escape="latex") 
mystyler.format_index(escape="latex", axis=0)
mystyler.format_index(escape="latex", axis=1)

latex_code1 = mystyler.to_latex(
    column_format='|c|c|c|c|c|c|c|',
    multirow_align="t",
    multicol_align="r",
    clines="all;data",
    hrules=True,
)

# latex_code1 = latex_code1.replace("employee_w", "employee")
# latex_code1 = latex_code1.replace("customer&client", "customer and client")
# latex_code1 = latex_code1.replace("calendar_week", "week")

doc = pl.Document(geometry_options=['a4paper'], document_options=["portrait"], textcomp = None) 

doc.packages.append(pl.Package('newtxtext,newtxmath')) 
doc.packages.append(pl.Package('textcomp')) 
doc.packages.append(pl.Package('booktabs'))
doc.packages.append(pl.Package('xcolor',options= pl.NoEscape('table')))
doc.packages.append(pl.Package('multirow'))

doc.append(pl.NoEscape(latex_code1))
doc.generate_pdf('file1.pdf', clean_tex=False, silent=True)

When I replace the names using .replace(), as in the commented-out lines, it works (desired result): desired_table

But I'm dealing with hundreds of tables with unknown index/column names.

The goal is to generate PDF files automatically using PyLaTeX, so any HTML-based option is not helpful for me.

Thanks in advance!

Jaime
  • Unrelated to your question, but don't use `booktabs` together with vertical lines, this causes all these gaps. – samcarter_is_at_topanswers.xyz Jun 22 '22 at 14:00
  • @samcarter_is_at_topanswers.xyz Thank you for the observation. I'm aware that nobody uses vertical lines nowadays. But my colleagues are stubborn and say they can't read tables properly, so vlines must stay. Since `.to_latex` depends on `booktabs` I haven't found another way. If you know another way I will be happy to hear it :) . – Jaime Jun 22 '22 at 14:20
  • If vlines must stay, remove the booktabs package (and load the array package, this will make the line joints better) :) – samcarter_is_at_topanswers.xyz Jun 22 '22 at 14:21
  • @samcarter_is_at_topanswers.xyz It worked! no more gaps. I just had to change `rules` to `hlines` within `styler.set_table_styles()` and no need of `booktabs` anymore. Thank you. – Jaime Jun 23 '22 at 07:00
  • You're welcome! And I keep my fingers crossed that you now also get answer to the question you actually asked :) – samcarter_is_at_topanswers.xyz Jun 23 '22 at 07:56

1 Answer

I coded all the Styler.to_latex features, and I'm afraid the index names are currently not formatted, which also means that they are not escaped. So there is no direct function to do what you want. (By the way, it's great to see an example where many of the features, including the hrules table styles definition, are being used.) I actually just created an issue about this on the pandas GitHub.

However, the code itself contains an _escape_latex(s) function in pandas/io/formats/style_render.py (module pandas.io.formats.style_render):

def _escape_latex(s):
    r"""
    Replace the characters ``&``, ``%``, ``$``, ``#``, ``_``, ``{``, ``}``,
    ``~``, ``^``, and ``\`` in the string with LaTeX-safe sequences.

    Use this if you need to display text that might contain such characters in LaTeX.

    Parameters
    ----------
    s : str
        Input to be escaped

    Return
    ------
    str :
        Escaped string
    """
    return (
        s.replace("\\", "ab2§=§8yz")  # rare string for final conversion: avoid \\ clash
        .replace("ab2§=§8yz ", "ab2§=§8yz\\space ")  # since \backslash gobbles spaces
        .replace("&", "\\&")
        .replace("%", "\\%")
        .replace("$", "\\$")
        .replace("#", "\\#")
        .replace("_", "\\_")
        .replace("{", "\\{")
        .replace("}", "\\}")
        .replace("~ ", "~\\space ")  # since \textasciitilde gobbles spaces
        .replace("~", "\\textasciitilde ")
        .replace("^ ", "^\\space ")  # since \textasciicircum gobbles spaces
        .replace("^", "\\textasciicircum ")
        .replace("ab2§=§8yz", "\\textbackslash ")
    )
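Because `_escape_latex` is a private function, its location could change between pandas versions. As a hedged alternative, here is a minimal standalone sketch that does the same kind of escaping in a single regex pass. The names `escape_latex_name` and `_LATEX_SPECIALS` are hypothetical and not part of pandas. Note that it is a different technique from the chained `.replace()` calls above: a single pass never re-scans its own output, so no placeholder string is needed for the backslash, and an empty `{}` after the text commands is used so a following space is not gobbled.

```python
import re

# LaTeX special characters and their safe replacements (assumed mapping,
# modelled on the characters handled by the pandas function above).
# The trailing "{}" terminates the command name so a following space survives.
_LATEX_SPECIALS = {
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
}

# One alternation that matches any single special character.
_LATEX_PATTERN = re.compile("|".join(re.escape(ch) for ch in _LATEX_SPECIALS))


def escape_latex_name(name):
    """Escape LaTeX special characters in one pass (hypothetical helper)."""
    return _LATEX_PATTERN.sub(lambda m: _LATEX_SPECIALS[m.group(0)], name)


# The index names from the question.
print(escape_latex_name("employee_w"))
print(escape_latex_name("customer&client"))
```

The output for a backslash, tilde, or caret differs textually from what pandas produces (braces instead of a trailing space), so treat this as an illustration rather than a drop-in replacement.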

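So your best bet is to escape the index names on the input DataFrame before you apply any styling to it:

from pandas.io.formats.style_render import _escape_latex

df.index.name = _escape_latex(df.index.name)
# for a MultiIndex (as in your pivot table), escape each entry of df.index.names instead
# then continue with your previous styling code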
Attack68