
I use Python to analyze data in Jupyter Notebooks, which I convert to PDFs to share with coauthors (jupyter nbconvert --to pdf). I often use linearmodels.panel.results.compare() to compare panel regression estimates from the linearmodels package. However, the PDF conversion process converts the compare() output to a fixed-width font that is much too wide for the PDF (I will provide the code below):

[Image: much too wide compare() output]

Can I pretty print the output of compare() when I convert a Jupyter Notebook to PDF?

A possible solution is to convert the compare() output to a data frame. The option pd.options.display.latex.repr = True pretty prints data frames when I convert to PDF. For example:

[Image: pretty print of a data frame]

In the notebook, the compare() output formats nicely and looks like a data frame. However, it is not a data frame, and I have failed to convert it to a data frame.

Is there an alternative way to pretty print a comparison of results from the linearmodels package?

Here is the code that generates the tables above (copy and paste into a Jupyter Notebook code cell):

import pandas as pd
from linearmodels.panel import FamaMacBeth
from linearmodels.panel.results import compare

pd.options.display.latex.repr = True

from statsmodels.datasets import grunfeld
df = grunfeld.load_pandas().data
df.set_index(['firm','year'], inplace=True)

display(df.head())

table = {
    '(1)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(2)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(3)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(4)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit()
}

display(compare(table))
Richard Herron

2 Answers


compare returns a PanelModelComparison. This class has a summary property that returns a linearmodels.compat.statsmodels.Summary, which is virtually identical to the Summary objects available in statsmodels. Summary instances have an as_latex() method that converts the table to LaTeX.

import pandas as pd
from linearmodels.panel import FamaMacBeth
from linearmodels.panel.results import compare

pd.options.display.latex.repr = True

from statsmodels.datasets import grunfeld
df = grunfeld.load_pandas().data
df.set_index(['firm','year'], inplace=True)

display(df.head())

table = {
    '(1)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(2)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(3)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit(),
    '(4)': FamaMacBeth.from_formula(formula='value ~ 1 + invest + capital', data=df).fit()
}

display(compare(table))
comparison = compare(table)
summary = comparison.summary
print(summary.as_latex())

This prints

\begin{center}
\begin{tabular}{lcccc}
\toprule
                               &         \textbf{(1)}        &         \textbf{(2)}        &         \textbf{(3)}        &         \textbf{(4)}         \\
\midrule
\textbf{Dep. Variable}         &            value            &            value            &            value            &            value             \\
\textbf{Estimator}             &         FamaMacBeth         &         FamaMacBeth         &         FamaMacBeth         &         FamaMacBeth          \\
\textbf{No. Observations}      &             220             &             220             &             220             &             220              \\
\textbf{Cov. Est.}             &  Fama-MacBeth Standard Cov  &  Fama-MacBeth Standard Cov  &  Fama-MacBeth Standard Cov  &  Fama-MacBeth Standard Cov   \\
\textbf{R-squared}             &            0.6964           &            0.6964           &            0.6964           &            0.6964            \\
\textbf{R-Squared (Within)}    &           -1.8012           &           -1.8012           &           -1.8012           &           -1.8012            \\
\textbf{R-Squared (Between)}   &            0.8660           &            0.8660           &            0.8660           &            0.8660            \\
\textbf{R-Squared (Overall)}   &            0.6964           &            0.6964           &            0.6964           &            0.6964            \\
\textbf{F-statistic}           &            248.83           &            248.83           &            248.83           &            248.83            \\
\textbf{P-value (F-stat)}      &            0.0000           &            0.0000           &            0.0000           &            0.0000            \\
\textbf{=====================} & =========================== & =========================== & =========================== & ===========================  \\
\textbf{Intercept}             &            114.16           &            114.16           &            114.16           &            114.16            \\
\textbf{ }                     &           (3.8390)          &           (3.8390)          &           (3.8390)          &           (3.8390)           \\
\textbf{capital}               &            0.1457           &            0.1457           &            0.1457           &            0.1457            \\
\textbf{ }                     &           (0.8510)          &           (0.8510)          &           (0.8510)          &           (0.8510)           \\
\textbf{invest}                &            6.3899           &            6.3899           &            6.3899           &            6.3899            \\
\textbf{ }                     &           (11.618)          &           (11.618)          &           (11.618)          &           (11.618)           \\
\bottomrule
\end{tabular}
%\caption{Model Comparison}
\end{center}

T-stats reported in parentheses
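
The row of `=` characters in the middle of the table above is the text separator carried over into the LaTeX output. As a minimal sketch (the helper name `replace_separator_rows` is hypothetical and not part of linearmodels or statsmodels), plain string processing can swap such rows for `\midrule` before the LaTeX is displayed or saved. It assumes a separator row is one whose cells contain only `=` characters once any `\textbf{...}` wrapper is stripped.

```python
import re


def replace_separator_rows(latex: str, rule: str = r"\midrule") -> str:
    """Replace table rows made up solely of '=' cells with a LaTeX rule.

    Hypothetical helper for illustration; not part of linearmodels.
    """
    cleaned = []
    for line in latex.splitlines():
        body = line.strip()
        # Only table rows end with the row terminator (two backslashes).
        if body.endswith(r"\\"):
            body = body[:-2]
            # Strip the \textbf{...} wrapper that the first column carries.
            cells = [
                re.sub(r"\\textbf\{(.*)\}", r"\1", cell.strip())
                for cell in body.split("&")
            ]
            if all(cell and set(cell) == {"="} for cell in cells):
                cleaned.append(rule)
                continue
        cleaned.append(line)
    return "\n".join(cleaned)
```

With `summary` defined as in the code above, `print(replace_separator_rows(summary.as_latex()))` would print the same table with the `=` row replaced by a rule.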
Richard Herron
Kevin S

Here is an alternative that builds on the .summary property from Kevin S.'s answer. Instead of calling as_latex(), the function below parses compare().summary.as_csv() into a data frame, which jupyter nbconvert --to pdf converts to a table.

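from io import StringIO
import warnings

import pandas as pd
from linearmodels.panel.results import compare

def compare_df(x, fit_stats=['Estimator', 'R-squared', 'No. Observations']):
    # Parse the CSV form of the summary, skipping its first and last lines.
    with warnings.catch_warnings():
        warnings.simplefilter(action='ignore', category=FutureWarning)
        y = pd.read_csv(StringIO(compare(x, stars=True).summary.as_csv()), skiprows=1, skipfooter=1, engine='python')
    # Use the row labels as the index and (model, dependent variable) pairs as column labels.
    z = pd.DataFrame(
        data=y.iloc[:, 1:].values,
        index=y.iloc[:, 0].str.strip(),
        columns=pd.MultiIndex.from_arrays(
            arrays=[y.columns[1:], y.iloc[0][1:]],
            names=['Model', 'Dep. Var.']
        )
    )
    # Assumes the first 11 rows are the fit statistics plus the separator row,
    # so the coefficients start at row 11; the selected fit statistics go below them.
    return pd.concat([z.iloc[11:], z.loc[fit_stats]])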

The PDF output is:

[Image: pretty table]

Note this solution requires pd.options.display.latex.repr = True.
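
As a possible alternative that skips the data frame, here is a rough sketch of a wrapper that exposes the LaTeX from .summary.as_latex() through IPython's `_repr_latex_` display hook, with `_repr_html_` kept for the live notebook. The class name `LatexSummary` is hypothetical, and the sketch assumes the wrapped object offers `as_latex()`, `as_html()`, and `as_text()` as statsmodels-style Summary objects do. Whether the PDF export picks up the LaTeX representation depends on nbconvert's display priority, so treat this as a starting point rather than a verified recipe.

```python
class LatexSummary:
    """Hypothetical wrapper: HTML in the notebook, LaTeX for LaTeX-based exports."""

    def __init__(self, summary):
        # `summary` is assumed to provide as_latex(), as_html() and as_text().
        self._summary = summary

    def _repr_latex_(self):
        # IPython calls this hook to obtain the text/latex representation.
        return self._summary.as_latex()

    def _repr_html_(self):
        # IPython calls this hook to obtain the text/html representation.
        return self._summary.as_html()

    def __repr__(self):
        # Plain-text fallback for consoles and logs.
        return self._summary.as_text()
```

With `table` defined as in the question, usage would be `display(LatexSummary(compare(table).summary))`.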

Richard Herron