I use Python-Markdown to render user-generated content. I'd like to turn images from external sources into links.

I have a list of allowed storage hosts:

storages = ['foo.com', 'bar.net']

and I need to replace

![](http://external.com/image.png)

with something like:

[http://external.com/image.png](http://external.com/image.png)

if the host is not in `storages`.
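The host test itself might be sketched as follows. This is only an illustration: `is_external` is a hypothetical helper name, and it assumes exact host matching, so a subdomain such as `cdn.foo.com` would count as external.

```python
from urllib.parse import urlparse

storages = ['foo.com', 'bar.net']


def is_external(url, allowed=storages):
    """Return True if url names a host that is not in the allowed list."""
    # hostname is the lowercased host without the port; it is None for
    # relative URLs such as /path/to/image.png, which are treated as local.
    host = urlparse(url).hostname
    return host is not None and host not in allowed


print(is_external('http://external.com/image.png'))  # True
print(is_external('http://foo.com/image.png'))       # False
print(is_external('/path/to/image.png'))             # False
```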

I tried editing the Markdown text before saving it to the database, but that is not a good solution: users may want to edit their content later and would discover that it had been modified. So I want to do the replacement at render time.

Alex Zaitsev
  • For me, the "best way" is any code that solves the problem; it doesn't matter whether it uses `regex`, `find("![](")`, or a `markdown extension`. – furas Apr 02 '19 at 00:54
  • Note that opinion-based questions are off-topic here. I would suggest editing your question to remove the opinion-based question at the end and replace it with what you have tried so far and an explanation of why that doesn't solve your problem. – Waylan Apr 02 '19 at 15:19
  • @Waylan Done. – Alex Zaitsev Apr 02 '19 at 15:57
  • This is a possible duplicate of [Check image urls using python-markdown](https://stackoverflow.com/q/5930542/866026). – Waylan Apr 03 '19 at 18:25

1 Answer

One solution to your question is demonstrated in this tutorial:

from markdown.treeprocessors import Treeprocessor
from markdown.extensions import Extension
from urllib.parse import urlparse


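class InlineImageProcessor(Treeprocessor):
    def __init__(self, md, hosts):
        super().__init__(md)  # stores md as self.md
        self.hosts = hosts

    def is_unknown_host(self, url):
        # Relative URLs have an empty netloc and are treated as local.
        url = urlparse(url)
        return bool(url.netloc) and url.netloc not in self.hosts

    def run(self, root):
        for element in root.iter('img'):
            # Copy the attributes: element.clear() below may empty the original dict.
            attrib = dict(element.attrib)
            if self.is_unknown_host(attrib['src']):
                tail = element.tail  # clear() also resets the tail, so save it
                element.clear()
                element.tag = 'a'
                element.set('href', attrib.pop('src'))
                element.text = attrib.pop('alt')
                element.tail = tail
                # Carry over any remaining attributes, such as title.
                for k, v in attrib.items():
                    element.set(k, v)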


class ImageExtension(Extension):
    def __init__(self, **kwargs):
        self.config = {'hosts': [[], 'List of approved hosts']}
        super().__init__(**kwargs)

    def extendMarkdown(self, md):
        md.treeprocessors.register(
            InlineImageProcessor(md, hosts=self.getConfig('hosts')),
            'inlineimageprocessor',
            15
        )

Testing it out:

>>> import markdown
>>> from image_extension import ImageExtension
>>> input = """
... ![a local image](/path/to/image.jpg)
... 
... ![a remote image](http://example.com/image.jpg)
... 
... ![an excluded remote image](http://exclude.com/image.jpg)
... """
>>> print(markdown.markdown(input, extensions=[ImageExtension(hosts=['example.com'])]))
<p><img alt="a local image" src="/path/to/image.jpg"/></p>
<p><img alt="a remote image" src="http://example.com/image.jpg"/></p>
<p><a href="http://exclude.com/image.jpg">an excluded remote image</a></p>
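The question's example has empty alt text (`![](http://external.com/image.png)`), in which case the processor above would produce a link with no text. A possible variation, sketched here with only the standard library, falls back to the URL as the link text. `image_to_link` is a hypothetical helper name, not part of Python-Markdown; something like it could be called from `run` for each image with an unknown host.

```python
import xml.etree.ElementTree as etree


def image_to_link(element):
    """Turn an <img> element into an <a> element in place.

    The link text is the alt text, or the URL itself when alt is empty.
    """
    attrib = dict(element.attrib)  # copy, since clear() may empty the original
    tail = element.tail            # clear() also resets the tail, so save it
    element.clear()
    element.tag = 'a'
    src = attrib.pop('src')
    element.set('href', src)
    element.text = attrib.pop('alt', '') or src
    element.tail = tail
    for key, value in attrib.items():  # keep remaining attributes such as title
        element.set(key, value)


img = etree.Element('img', {'src': 'http://external.com/image.png', 'alt': ''})
image_to_link(img)
print(etree.tostring(img, encoding='unicode'))
# <a href="http://external.com/image.png">http://external.com/image.png</a>
```

This matches the `[url](url)` form asked for in the question while still using the alt text when the author supplied one.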

Full disclosure: I am the lead developer of Python-Markdown. We needed another tutorial which demonstrated some additional features of the extension API. I saw this question and thought it would make a good candidate. Therefore, I wrote up the tutorial, which steps through the development process to end up with the result above. Thank you for the inspiration.

Waylan