How to use comparison operators within pandas query string in a markdown cell of Jupyter Notebook

Question

There is a similar question, "Print Variable In Jupyter Notebook Markdown Cell Python", but that one was about problems with compiling. My question is what Python code I should use so that the value is visible when the HTML is compiled.

I have a dataset df4 that I want to reference inline in the Markdown of my Jupyter notebook.

df4['amount'].sum()

works fine. But (all on the same line in Markdown)

df4.groupby(['customer_name'])['amount_due'].sum().reset_index() \
    .query('amount_due > 0')['amount_due'] \
    .sum()

returns the error **SyntaxError**: only a single expression is allowed ()

I could of course define all the variables I need in a Python cell above the Markdown cell and then refer to them in the Markdown cell by name only, e.g. "The total amount is {{x}}". But since inline expressions are supported (and I have already compiled this report in R Markdown with inline R code), I wanted to use them.
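
For reference, the fallback described above could look roughly like this. It is only a sketch using the sample data from the example further down; the names per_customer and positive_total are made up for illustration and are not part of my real notebook.

```python
# Python3 cell: compute the value up front so the Markdown cell only needs a name.
import pandas as pd

dt = {'customer_name': ['a', 'a', 'b', 'b', 'c'], 'amount': [-1, -1, 1, 1, 1000]}
df4 = pd.DataFrame(data=dt)

# Sum per customer, then keep only the customers with a positive balance.
per_customer = df4.groupby(['customer_name'])['amount'].sum().reset_index()
positive_total = per_customer.query('amount > 0')['amount'].sum()
```

The Markdown cell would then contain only {{positive_total}}, so no comparison operator appears inside the braces.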

This is an example of what I am trying to achieve within jupyter notebook:

# Python3 cell
import pandas as pd

dt = {'customer_name': ['a','a','b','b','c'], 'amount': [-1,-1,1,1,1000]}
df4 = pd.DataFrame(data = dt)
df4

#### Markdown cell
This is a large amount : {{df4['amount'].sum()}} - let me explain further...

Output: This is a large amount : 1000 - let me explain further...

#### Markdown cell
This is a *larger* amount : {{df4.groupby(['customer_name'])['amount'].sum().reset_index().query('amount > 0')['amount'].sum()}} - let me explain further...

Expected: This is a *larger* amount : 1002 - let me explain further...

When running the last cell the output is **SyntaxError**: only a single expression is allowed ().

The problem is with the ">" operator. It seems that markdown does not like the use of this character. Related issue on GitHub: Python-markdown syntax error if contains < or >.

Btw, does this code work in a normal "code cell"? Because I just searched on GitHub in the Python Markdown repository for the source of this error and found nothing. Instead, I found this error in the source code of pandas: [link](https://github.com/pandas-dev/pandas/blob/master/pandas/core/computation/expr.py#L445). So, could it be that the error has nothing to do with Python Markdown or Jupyter Notebook but with pandas? — Georgy, Jul 19 '19 at 13:05
No, it definitely works in a Python cell within Jupyter Notebook. On the duplicate question issue: is any more information needed to remove the flag? — alicook, Jul 22 '19 at 08:14
I'm out of ideas then. Consider opening an issue on [GitHub](https://github.com/ipython-contrib/jupyter_contrib_nbextensions/issues). The duplicate flag is removed now. — Georgy, Jul 22 '19 at 08:20
It is an issue with the ">" operator. It seems markdown does not like the use of this character. I am trying a workaround using operator.gt(1,3) for 1>3 but I am not sure how to get it to work for a column within a query() — alicook, Jul 22 '19 at 09:30
How about `.query('amount.__gt__(0)')`? Or `(1).__gt__(3)` for the example from your last comment. — Georgy, Jul 22 '19 at 09:37
rather strange error "TypeError: 'Series' objects are mutable, thus they cannot be hashed" - as though I am trying to rename something, I guess. Will continue to investigate as even {{(1).__gt__(3)}} does not work — alicook, Jul 22 '19 at 09:57
No - still seeing the same error. However, using that for {{(1).\_\_gt\_\_(3)}} does work. — alicook, Jul 22 '19 at 12:30
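
One more direction that might be worth trying, as a sketch only: pandas has a named comparison method, Series.gt, so the filter can be written without a ">" character and without query(). Whether the Python Markdown extension accepts this expression inside {{ }} has not been verified here; the code below only shows that it gives the expected value in a normal code cell. The name positive_total is made up for illustration.

```python
import pandas as pd

dt = {'customer_name': ['a', 'a', 'b', 'b', 'c'], 'amount': [-1, -1, 1, 1, 1000]}
df4 = pd.DataFrame(data=dt)

# Series.gt(0) is the method form of "series > 0"; .loc accepts a callable,
# so the whole filter stays a single expression with no angle brackets.
positive_total = (
    df4.groupby(['customer_name'])['amount'].sum()
    .loc[lambda s: s.gt(0)]
    .sum()
)
```

If the lambda itself turns out to be a problem inside the braces, .clip(lower=0).sum() on the grouped sums gives the same total for this data, because negative sums are replaced by zero.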
