
I am writing Python code in R Markdown with bookdown. The Python code itself runs fine. The problem is that when the book PDF is generated, some long lines of Python code, and sometimes their output, run past the edge of the PDF page and are therefore not visible. Please see the images below.

In the image you can see that the code for print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1])) is not fully visible, although its output is. In another case, for new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_'), neither the whole line of code nor its output is visible. The same issue occurs with the sns.scatterplot () line of code.

I am wondering whether there is any way, in a bookdown PDF, to keep both the code and its output inside the page.

Note: I tried splitting the Python code across multiple lines in R Markdown, but it did not work; in most cases the code is not executed when it is written across multiple lines.

pdfoutput1

Here is the code that I used to generate the output in the image:

from sklearn import datasets
iris = datasets.load_iris()
iris.keys()
iris_df = pd.DataFrame (data = iris.data, columns = iris.feature_names)
iris_df['target'] = iris.target

iris_df.sample(frac = 0.05)
iris_df.shape
print ('The total number of rows and columns in the dataset is {} and {} respectively.'.format(iris_df.shape[0],iris_df.shape[1]))
iris_df.info()

new_col = iris_df.columns.str.replace('\(.*\)','').str.strip().str.upper().str.replace(' ','_')
          
new_col
iris_df.columns = new_col
iris_df.info()

sns.scatterplot(data = iris_df, x = 'SEPAL_LENGTH', y = 'SEPAL_WIDTH', hue = 'TARGET', palette = 'Set2')
plt.xlabel('Sepal Length'),
plt.ylabel('Sepal Width')
plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
plt.show()
shafee
Sharif
  • You should provide a reproducible example. I am getting errors by running the script not only because of lack of import statements but also due to column names `SEPAL_LENGTH`. – shafee May 17 '23 at 07:26

1 Answer

I do not know why writing Python code across multiple lines did not work in your case, or whether you tried it the right way, since you did not provide much information about that.

From PEP 8 – Style Guide for Python Code:

The preferred way of wrapping long lines is by using Python's implied line continuation inside parentheses, brackets and braces. Long lines can be broken over multiple lines by wrapping expressions in parentheses. These should be used in preference to using a backslash for line continuation.

So if you wrap your code as suggested above, it should run fine in R Markdown (and in bookdown) too.
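As a minimal sketch of that advice, the snippet below uses only built-in types so that it runs without pandas. The `shape` tuple and the `feature_names` list are stand-ins chosen for illustration, not the real data frame.

```python
# Stand-in values (assumptions for illustration, not the real iris_df).
shape = (150, 5)
feature_names = ['sepal length (cm)', 'sepal width (cm)']

# Implied line continuation: everything between the parentheses is one
# expression, so the long string and the .format() call can span lines.
message = (
    'The total number of rows and columns in the dataset is '
    '{} and {} respectively.'
    .format(shape[0], shape[1])
)
print(message)

# A method chain wrapped in parentheses, one call per line.
new_names = [
    (name
     .strip()
     .upper()
     .replace(' ', '_'))
    for name in feature_names
]
print(new_names)
```

No backslashes are needed: the open parenthesis or bracket tells Python that the statement continues on the next line.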

In addition, you can reduce the font size of the source code and output a bit using LaTeX packages and commands (since your intended output format is PDF). The LaTeX package fvextra provides some nice options for reducing font sizes and even for automatically wrapping long code lines.

With all of this in mind, try the following.

(Note how I have wrapped all of the long lines inside parentheses.)

intro.Rmd

# Hello bookdown 

```{r setup, include=FALSE}
library(reticulate)
# reticulate::py_install(c("scikit-learn","pandas", "matplotlib", "seaborn"))
use_virtualenv("r-reticulate/")
```


```{python}
import pandas as pd
import seaborn as sns
from sklearn import datasets
import matplotlib.pyplot as plt
```

```{python}
iris = datasets.load_iris()
iris_df = pd.DataFrame(data = iris.data, columns = iris.feature_names)
iris_df['target'] = iris.target

print(
  'The total number of rows and columns in the dataset is {} and {} respectively.'
  .format(iris_df.shape[0], iris_df.shape[1]))
```


```{python}
new_col = (iris_df.columns
            .str
            .replace(r'\(.*\)','')
            .str.strip()
            .str.upper()
            .str.replace(' ','_'))
new_col
```

\newpage

```{python}
iris_df.columns = new_col
sns.scatterplot(
  data = iris_df, 
  x = 'SEPAL_LENGTH_(CM)', 
  y = 'SEPAL_WIDTH_(CM)', 
  hue = 'TARGET', 
  palette = 'Set2')
plt.xlabel('Sepal Length')
plt.ylabel('Sepal Width')
plt.title('Scatterplot of Sepal Length and Width for the Target Variable')
plt.show()

```

Then add these lines to your preamble.tex file:

\usepackage{fvextra}
\DefineVerbatimEnvironment{Highlighting}{Verbatim}{commandchars=\\\{\},fontsize=\footnotesize}

\makeatletter
\def\verbatim{\footnotesize\@verbatim \frenchspacing\@vobeyspaces \@xverbatim}
\makeatother

If you need a bigger or smaller font size than this, try \small or \scriptsize in place of \footnotesize.

Then reference that preamble.tex file under includes, in_header in the _output.yml file:

bookdown::pdf_book:
  includes:
    in_header: preamble.tex
  latex_engine: xelatex
  citation_package: natbib
  keep_tex: yes
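The settings above handle overflow on the LaTeX side. As a complementary option on the Python side (not part of the setup above), long printed output can be wrapped before it reaches the PDF by using the standard-library `textwrap` module. This is only a sketch: the width of 60 characters is an assumed value that would need tuning to the page and font size, and `rows` and `cols` stand in for `iris_df.shape`.

```python
import textwrap

# Stand-ins for iris_df.shape (assumption for illustration).
rows, cols = 150, 5

message = (
    'The total number of rows and columns in the dataset is '
    '{} and {} respectively.'.format(rows, cols)
)

# Break the text at word boundaries so no output line exceeds 60 characters.
wrapped = textwrap.fill(message, width=60)
print(wrapped)
```

This only helps with strings you print yourself; it does not change how source code is typeset.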

Rendered PDF output (screenshots of page one and page two not shown)

shafee
  • Thanks for your response. Your code works when I knit it and produce the PDF, but one problem is that when I run the code from R Markdown, it is not fully executed in the console. For example, when I run `new_col = ()` from your code (I left out the arguments in the parentheses for brevity), the console looks like this: `>>> new_col = (iris_df.columns ... ... ` Do you have any idea why this happens and how I can fix it? – Sharif May 17 '23 at 16:33
  • By "code not fully executed in console", do you mean that you get errors? I can run all of the code fine from R Markdown! [See here](https://i.stack.imgur.com/LFt3D.png) – shafee May 17 '23 at 16:56
  • Actually, by "code not fully executed in console" I mean that only the first line is shown in the console, and three dots `...` are shown for the remaining lines, as I mentioned above, like this: `>>> new_col = (iris_df.columns ... ... `. Please note that I am trying to write a book in both HTML and PDF using the minimal demo example from here - . I believe you are just writing an Rmd file and then knitting it into a PDF. Thanks for your help. – Sharif May 17 '23 at 17:09
  • No, I am not just writing an Rmd file and knitting it. I am using bookdown and calling `render_book()` to render the whole book. I provided a solution for PDF only because the question asked about PDF only! – shafee May 17 '23 at 17:13
  • However, I used `reticulate::repl_python` to run Python from the Rmd file. Assuming you are using that too, you should probably look into why the output got truncated (which is not happening in my case, so it is probably a local issue). Nevertheless, the code generates the intended PDF output, right? – shafee May 17 '23 at 17:18
  • Thanks. I agree with you that it is a local issue. I am wondering whether your folder of all bookdown files can be shared with me. Thanks – Sharif May 17 '23 at 17:19
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/253703/discussion-between-shafee-and-sharif). – shafee May 17 '23 at 17:28
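On the `...` lines discussed in the comments: in an interactive Python session, `...` is the continuation prompt, shown while a statement is still incomplete, for example because a parenthesis is open. That is likely what the console output above shows, though this has not been confirmed for that particular setup. The sketch below uses the standard-library `codeop` module, which interactive consoles can use to decide whether more input is needed. The source strings are only compiled, never run, so `iris_df` does not need to exist.

```python
import codeop

# First line only: the open parenthesis leaves the statement unfinished.
incomplete = codeop.compile_command("new_col = (iris_df.columns\n")

# Both lines: the parenthesis is closed, so the statement is complete.
complete = codeop.compile_command(
    "new_col = (iris_df.columns\n"
    "    .str.upper())\n"
)

# compile_command returns None when more input is required,
# and a code object once the statement is complete.
print(incomplete is None)
print(complete is not None)
```

Under that reading, the `...` lines mean the console is waiting for the rest of the wrapped statement and are not an error in themselves.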