I'm trying to remove the HTML markup that wraps the RichTextField content. I thought I could do it using "raw_data", but that doesn't seem to work. I could strip it with a regex, but surely there is a Wagtail/Django way to do this?

for block in post.faq.raw_data:
    print(block['value']['answer'])

Outputs:

<p data-block-key="y925g">The time is almost 4.30</p>

Expected output (just the raw text):

The time is almost 4.30

StructBlock:

class FaqBlock(blocks.StructBlock):
    question = blocks.CharBlock(required=False)
    answer = blocks.RichTextBlock(required=False)

1 Answer

You can do this in Beautiful Soup easily.

from bs4 import BeautifulSoup
from html import unescape
soup = BeautifulSoup(unescape(html), "html.parser")
inner_text = ' '.join(soup.find_all(string=True))

In your case, html = value.answer, which you can pass into a template tag or filter.

EDIT: example filter:

from bs4 import BeautifulSoup
from django import template
from html import unescape

register = template.Library()

@register.filter()
def plaintext(richtext):
    return BeautifulSoup(unescape(richtext), "html.parser").get_text(separator=" ")

BeautifulSoup also has a get_text() method that takes a separator; it does the same job as the join statement I wrote earlier. The default separator is an empty string, which joins all the text elements together without a gap.

<h3>Rich Text</h3>
<p>{{ page.intro|richtext }}</p>
<h3>Plain Text</h3>
<p>{{ page.intro|plaintext }}</p>

If you want to retain line breaks, it needs a bit more parsing to replace block elements with a \n. The streamvalue.render_as_block() method does that for you, but there's no method like this for RichTextField since it's just a string. You can find code examples to do this if you need.
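
As a rough illustration of the "replace block elements with a \n" idea, here is a minimal sketch that uses only the standard library's html.parser instead of Beautiful Soup. The names BLOCK_TAGS, PlainTextExtractor and plaintext_with_breaks are made up for this example, the list of block tags is an assumption rather than a complete one, and the sample data only imitates the shape of raw_data from the question.

```python
from html.parser import HTMLParser

# Tags treated as block-level in this sketch (an assumption, not an exhaustive list).
BLOCK_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6", "blockquote"}


class PlainTextExtractor(HTMLParser):
    """Collect text nodes and emit a newline at the end of each block element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode entities such as &amp;
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # <br> and <br/> both arrive here; treat them as a line break.
        if tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        self.parts.append(data)


def plaintext_with_breaks(richtext):
    """Hypothetical helper: strip tags but keep one line per block element."""
    parser = PlainTextExtractor()
    parser.feed(richtext or "")
    parser.close()
    # Trim each line and drop the empty ones left behind by nested blocks.
    lines = (line.strip() for line in "".join(parser.parts).splitlines())
    return "\n".join(line for line in lines if line)


# Stand-in for post.faq.raw_data from the question (same dict shape, sample value).
sample_raw_data = [
    {"value": {"answer": '<p data-block-key="y925g">The time is almost 4.30</p>'}},
]
for block in sample_raw_data:
    print(plaintext_with_breaks(block["value"]["answer"]))
```

The same function could be registered as a template filter in the same way as plaintext above. It deliberately ignores things like tables and embedded images, so treat it as a starting point rather than a complete converter.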

  • Is there no wagtail/django way to do this? – squidg Feb 18 '23 at 10:12
  • A RichTextField is just a TextField with a DraftailEditor set as the widget. All that's stored in the db is the rendered HTML. When you open an editor page with a RichTextField, an input field is created with a JSON value representing all the HTML in that field, which is parsed by draft.js (I think). I've added an example filter above ^ – Rich - enzedonline Feb 19 '23 at 14:13