38

How can rich text or HTML source code be obtained from the X clipboard? For example, if you copy some text from a web browser and paste it into kompozer, it pastes as HTML, with links etc. preserved. However, xclip -o for the same selection just outputs plain text, reformatted in a way similar to that of elinks -dump. I'd like to pull the HTML out and into a text editor (specifically vim).

I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses. The X clipboard API is to me yet a mysterious beast; any tips on hacking something up to pull this information are most welcome. My language of choice these days is Python, but pretty much anything is okay.

intuited

3 Answers

58

To complement @rkhayrov's answer, a command for this already exists: xclip. More precisely, support for it came from a patch that was added to xclip in 2010 but went unreleased for years, so at the time you needed an OS that, like Debian, shipped the Subversion head of xclip. (2019 edit: version 0.13 with those changes was eventually released in 2016 and pulled into Debian in January 2019.)

To list the targets for the CLIPBOARD selection:

$ xclip -selection clipboard -o -t TARGETS
TIMESTAMP
TARGETS
MULTIPLE
SAVE_TARGETS
text/html
text/_moz_htmlcontext
text/_moz_htmlinfo
UTF8_STRING
COMPOUND_TEXT
TEXT
STRING
text/x-moz-url-priv

To select a particular target:

$ xclip -selection clipboard -o -t text/html
 <a href="https://stackoverflow.com/users/200540/rkhayrov" title="3017 reputation" class="comment-user">rkhayrov</a>
$ xclip -selection clipboard -o -t UTF8_STRING
 rkhayrov
$ xclip -selection clipboard -o -t TIMESTAMP
684176350
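
Since the question mentions Python, the same two steps (list the targets, then request one) can be wrapped with subprocess. This is only a rough sketch, assuming xclip 0.13 or later is on the PATH; the function names are made up for illustration and are not part of xclip.

```python
import subprocess


def xclip_read_cmd(target, selection="clipboard"):
    """Build the xclip command line that prints one target of a selection."""
    return ["xclip", "-selection", selection, "-o", "-t", target]


def parse_targets(output):
    """Split the output of `xclip -o -t TARGETS` into a list of target names."""
    return [line.strip() for line in output.splitlines() if line.strip()]


def read_target(target, selection="clipboard"):
    """Run xclip and return the raw bytes of the requested target."""
    result = subprocess.run(
        xclip_read_cmd(target, selection),
        check=True,
        stdout=subprocess.PIPE,
    )
    return result.stdout


def clipboard_html(selection="clipboard"):
    """Return the text/html payload as bytes, or None if it is not offered."""
    listing = read_target("TARGETS", selection).decode("ascii", "replace")
    if "text/html" not in parse_targets(listing):
        return None
    return read_target("text/html", selection)
```

Calling `clipboard_html()` in a running X session should return the HTML bytes. They are left undecoded on purpose, because the encoding depends on the application that owns the selection.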

And xclip can also set and own a selection (-i instead of -o).
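
A possible Python counterpart for the -i direction, again just a sketch that assumes xclip is installed. The helper names are hypothetical, and UTF-8 is my own choice of encoding for the payload.

```python
import subprocess


def xclip_write_cmd(target="text/html", selection="clipboard"):
    """Build the xclip command line that reads stdin and owns the selection."""
    return ["xclip", "-selection", selection, "-i", "-t", target]


def set_clipboard(data, target="text/html", selection="clipboard"):
    """Offer `data` on the given selection under the given target."""
    if isinstance(data, str):
        # Assumption: the receiving application accepts UTF-8 encoded HTML.
        data = data.encode("utf-8")
    subprocess.run(xclip_write_cmd(target, selection), input=data, check=True)
```

For example, `set_clipboard('<a href="https://example.com">link</a>')` should let applications that ask for text/html paste a real link rather than the markup.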

Stephane Chazelas
  • Nice! Any idea why it's not been released yet? – intuited Jun 07 '13 at 21:41
  • This seems to be the easiest method now to get HTML contents from the clipboard on Unix systems. – xji Jun 23 '18 at 10:28
  • Just an update on the question: [xclip 0.13 was _finally_ released in 2016](https://github.com/astrand/xclip/releases), including all pending changes since version 0.12, released in 2009. – Maëlan Nov 12 '19 at 02:34
  • @Maëlan, thanks, I've now mentioned it in the answer. – Stephane Chazelas Nov 12 '19 at 11:31
  • Very useful to use this with: `thisOutputMarkdown | pandoc -s -f markdown -t html | xclip -selection clipboard -t text/html` to get formatted HTML on clipboard. – Pablo Bianchi Aug 06 '20 at 22:56
25

In X11 you have to communicate with the selection owner, ask which formats it supports, and then request the data in a specific format. I think the easiest way to do this is to use an existing windowing toolkit, e.g. with Python and GTK:

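#!/usr/bin/python
# Written for Python 2 with PyGTK (the glib and gtk modules).

import glib, gtk

def test_clipboard():
    # With no arguments, gtk.Clipboard() refers to the CLIPBOARD selection.
    clipboard = gtk.Clipboard()
    # Ask the selection owner which formats (targets) it can provide.
    targets = clipboard.wait_for_targets()
    if not targets:
        # Nothing owns the selection, or the owner offers no targets.
        print "No targets available"
        return
    print "Targets available:", ", ".join(map(str, targets))
    # Request the data once per target and print whatever comes back.
    for target in targets:
        print "Trying '%s'..." % str(target)
        contents = clipboard.wait_for_contents(target)
        if contents:
            print contents.data

def main():
    # Run the clipboard test from inside a GLib main loop, then quit.
    mainloop = glib.MainLoop()
    def cb():
        test_clipboard()
        mainloop.quit()
    glib.idle_add(cb)
    mainloop.run()

if __name__ == "__main__":
    main()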

Output will look like this:

$ ./clipboard.py 
Targets available: TIMESTAMP, TARGETS, MULTIPLE, text/html, text/_moz_htmlcontext, text/_moz_htmlinfo, UTF8_STRING, COMPOUND_TEXT, TEXT, STRING, text/x-moz-url-priv
...
Trying 'text/html'...
I asked <a href="http://superuser.com/questions/144185/getting-html-source-or-rich-text-from-the-x-clipboard">the same question on superuser.com</a>, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/_moz_htmlcontext'...
<html><body class="question-page"><div class="container"><div id="content"><div id="mainbar"><div id="question"><table><tbody><tr><td class="postcell"><div><div class="post-text"><p></p></div></div></td></tr></tbody></table></div></div></div></div></body></html>
...
Trying 'STRING'...
I asked the same question on superuser.com, because I was hoping there was a utility to do this, but I didn't get any informative responses.
Trying 'text/x-moz-url-priv'...
http://stackoverflow.com/questions/3261379/getting-html-source-or-rich-text-from-the-x-clipboard
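
The script above targets Python 2 and PyGTK. A rough port to Python 3 might look like the following, assuming PyGObject with GTK 3 is installed. The decode_payload helper is my own addition, and the UTF-16 branch rests on the assumption that some applications deliver text/html with a byte-order mark.

```python
import codecs


def decode_payload(data):
    """Best-effort decoding of clipboard bytes for display."""
    # Assumption: a UTF-16 byte-order mark means the payload is UTF-16.
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return data.decode("utf-16")
    return data.decode("utf-8", errors="replace")


def dump_clipboard():
    """Print every target offered on the CLIPBOARD selection and its data."""
    # Imported here so the helper above stays usable without GTK installed.
    import gi
    gi.require_version("Gtk", "3.0")
    from gi.repository import Gdk, Gtk

    clipboard = Gtk.Clipboard.get(Gdk.SELECTION_CLIPBOARD)
    ok, targets = clipboard.wait_for_targets()
    if not ok:
        print("No targets available")
        return
    print("Targets available:", ", ".join(t.name() for t in targets))
    for target in targets:
        print("Trying '%s'..." % target.name())
        contents = clipboard.wait_for_contents(target)
        if contents is not None:
            print(decode_payload(contents.get_data()))
```

Calling `dump_clipboard()` from a script inside an X session should produce output shaped like the listing above. Binary targets such as TIMESTAMP will not decode to anything readable.
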
rkhayrov
  • Works well, thanks a bunch! I rolled this functionality into a primordial command-line interface (http://github.com/intuited/clipcli). Any tips on how to parse the TIMESTAMP target? It doesn't seem to be a UNIX timestamp. Presumably there's info in the GTK documentation; I only took time for a cursory search for it. – intuited Jul 17 '10 at 04:49
  • The TIMESTAMP type as defined by the X11 protocol has nothing to do with seconds since the Epoch. It is a 32-bit unsigned integer containing a time in milliseconds, typically since the X server startup. I don't think it has any direct use for an end-user application. – rkhayrov Jul 17 '10 at 09:45
  • I would like a KDE version of this. Any suggestions? Searching "kde clipboard API python" turned up nothing useful for me. – MountainX Jun 06 '13 at 03:37
  • I have gotten as far as deciding to use PySide, but I cannot find anything about the KDE clipboard... – MountainX Jun 06 '13 at 04:04
  • still relevant 25 years later – Tom Mar 29 '18 at 02:07
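
On the TIMESTAMP question in the comments above: if the value really is a millisecond count as described there, it can at least be made readable as a duration. A tiny sketch, with a hypothetical function name, treating the "since X server startup" reference point as an assumption:

```python
from datetime import timedelta


def x_timestamp_to_timedelta(ms):
    """Interpret an X11 TIMESTAMP value (32-bit unsigned, milliseconds)."""
    if not 0 <= ms < 2**32:
        raise ValueError("X11 timestamps are 32-bit unsigned integers")
    # Assumption: the count starts at X server startup, which is typical
    # but not guaranteed, so this is a duration and not a wall-clock time.
    return timedelta(milliseconds=ms)
```

Fed the 684176350 printed by xclip earlier on this page, this gives a duration of a little under eight days.
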
3

Extending the ideas from Stephane Chazelas, you can:

  • Copy from the formatted source.
  • Run this command to read the HTML that the clipboard already holds and then (with a pipe |) put that HTML source back in the clipboard as plain text, again using xclip:
xclip -selection clipboard -o -t text/html | xclip -selection clipboard
  • Next, when you paste with Ctrl+v, it will paste the HTML source.
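
The steps above can also be sketched in Python, for anyone who prefers a script to a shell pipeline. It assumes xclip is installed; the function name and the injectable run parameter are my own, the latter only there so the logic can be exercised without an X session.

```python
import subprocess


def html_source_to_plain_text(run=subprocess.run):
    """Re-offer the clipboard's text/html payload as ordinary text."""
    # Step 1: read the HTML markup the selection owner already provides.
    read = run(
        ["xclip", "-selection", "clipboard", "-o", "-t", "text/html"],
        check=True,
        stdout=subprocess.PIPE,
    )
    # Step 2: store the same bytes again without naming a target, so that
    # they are offered as plain text and paste as visible markup.
    run(["xclip", "-selection", "clipboard"], input=read.stdout, check=True)
    return read.stdout
```

Nothing is converted along the way: the markup is simply stored again as text, so editors that only request plain text will paste the tags themselves.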

Going further, you can make it a shortcut, so that you don't have to open the terminal and run the exact command each time. ✨

To do that:

  • Open the settings for your OS (in my case it's Ubuntu)
  • Find the section for the Keyboard
  • Then find the section for shortcuts
  • Create a new shortcut
  • Set a Name, e.g.: Copy as HTML
  • Then as the command for the shortcut, put:
bash -c "xclip -selection clipboard -o -t text/html | xclip -selection clipboard"

Note: this is the same command as above, but wrapped in an inline Bash script. That is necessary in order to use the | (pipe), which sends the output of one command as input to the next.

  • Set the shortcut to whatever combination you want, preferably not overwriting another shortcut you use. In my case, I set it to: Ctrl+Shift+c

  • After this, you can copy some formatted text as normal with: Ctrl+c
  • And then, before pasting it, replace it with its HTML source with: Ctrl+Shift+c
  • Next, when you paste it with: Ctrl+v, it will paste the HTML source. ✨

Shortcut edition window

tiangolo
  • "Run this command to extract from the clipboard, convert to HTML, and then (with a pipe |) put that HTML back in the clipboard, again using the same xclip" That's not what the pipeline does. The first command requests html that is already stored in the clipboard -- no conversion. The second command stores the retrieved html content (again, no conversion) _as plain text_, so that it can be managed and returned by programs that only interact with plaintext in the clipboard. So the clipboard contains html tags but thinks it's handling plain text. – alexis Aug 23 '23 at 09:59