9

I have the following setup: Ubuntu 12.04, Mathematica 9 and IntelliJ IDEA 12. Every time I copy some text from Mathematica and paste it into IDEA, there are a lot of additional bytes at the end of the pasted text. What first appeared to be a bug in IDEA now seems rather to be a bug in Java itself. I have appended a minimal Java example which shows the behavior.

When I type Plot in Mathematica, select and copy it, and then run the example, I get the following output. The first line is the printed text and the second line shows its bytes:

[Screenshot of the program output; image not preserved]

As you can see, Plot is followed by a 0 byte and some other, not necessarily zero, bytes. In all of my tests, a working fix was to use the string only up to the first 0 byte, but that does not solve the underlying problem. I really want to see this fixed, because I often copy code between Mathematica and IntelliJ IDEA, but first I need to know who to blame.
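The cut-at-the-first-0 workaround mentioned above could look roughly like the following. This is only a sketch: the class name `NulTruncate` and the helper `untilFirstNul` are made up for illustration, and the sample string merely simulates the clipboard content described in this question.

```java
class NulTruncate {

  // Returns the part of s before the first NUL character,
  // or s unchanged if it contains no NUL at all.
  static String untilFirstNul(String s) {
    final int nul = s.indexOf('\u0000');
    return nul >= 0 ? s.substring(0, nul) : s;
  }

  public static void main(String[] args) {
    // Simulated clipboard content: "Plot" followed by four extra characters.
    final String pasted = "Plot\u0000\u007f\u0000\u0000";
    final String cleaned = untilFirstNul(pasted);
    System.out.println(cleaned + " (" + cleaned.length() + " chars)");
  }
}
```

Unlike a whitespace-based cleanup, this leaves leading and trailing spaces of the copied text untouched, but it only hides the symptom.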

Question:

How can I find out whether Mathematica or Java is doing something wrong here? I can copy Mathematica content to different editors, browsers, etc., and I have never seen anything like this. On the other hand, I have never seen IntelliJ (Java) copy garbage either. What is a good way to find out whether Mathematica is using the clipboard incorrectly or Java has a bug?

Minimal example

Select some text in Mathematica, press Ctrl+C, and run the following:

import java.awt.*;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;

public class CopyPasteTest {

  public static void main(String[] args) {
    final String text;
    try {
      final Clipboard systemClipboard =
        Toolkit.getDefaultToolkit().getSystemClipboard();
      text = (String) systemClipboard.getData(DataFlavor.stringFlavor);
      System.out.println(text);
      for (byte a : text.getBytes()) {
        System.out.print(a + " ");
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
  }
}

Further information requested in comments

Could you just take a look at the clipboard contents after the copy-from-Mathematica operation?

Sure. Unfortunately, it returns absolutely nothing. When I select and copy something from the browser instead, for instance "this here", I get

patrick@lenerd:~$ xclip -out | hexdump -C
00000000  74 68 69 73 20 68 65 72  65                       |this here|
00000009

Edit

I tried the following things, always using the same "Plot" string copied from Mathematica. First, I tried the larger test class from David, as suggested in his comment. With both the Oracle JRE and the OpenJRE that comes with Ubuntu, I got the following output:

===========
Plot[00][7f][00][00]
===========
Obtained transferrable of type sun.awt.datatransfer.ClipboardTransferable
Plot[00][7f][00][00]
===========

My short snippet from above gives the same result (although not in hex representation). Then I tried the different selections with xclip; using the value clipboard produced the following:

patrick@lenerd:~$ xclip -o -verbose -selection clipboard | hexdump -C
Connected to X server.
Using selection: XA_CLIPBOARD
Using UTF8_STRING.
00000000  50 6c 6f 74 00 00 00 00                           |Plot....|
00000008

Note that when I don't use verbose output with xclip, I only see "Plot" in the terminal. Above, you can see that there are exactly 4 more bytes in the buffer, which are probably not shown because they start with a 00. Additionally, the four extra bytes are 00 00 00 00, at least according to this output. In Java, the second of them is 7f (127).
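To compare the Java side with the hexdump output more directly, the short snippet could print non-printable characters in a bracketed hex form. The following is a minimal sketch and not David's test class; `BracketDump` and `dump` are names invented here, and the input string only simulates what Java reported.

```java
class BracketDump {

  // Renders printable ASCII characters as-is and every other
  // character as bracketed hex, e.g. [00] or [7f].
  static String dump(String s) {
    final StringBuilder sb = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
      final char c = s.charAt(i);
      if (c >= 0x20 && c < 0x7f) {
        sb.append(c);
      } else {
        // Two hex digits for values up to 0xff, four digits otherwise.
        sb.append(String.format(c <= 0xff ? "[%02x]" : "[%04x]", (int) c));
      }
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Simulated clipboard string as seen from Java in this question.
    System.out.println(dump("Plot\u0000\u007f\u0000\u0000"));
  }
}
```

Working on characters rather than on `getBytes()` avoids depending on the platform default charset.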

I guess this all suggests that the bug comes from Mathematica, since it copies additional data into the buffer, and Java is just a bit sloppy because it doesn't cut at the first 00.

halirutan
  • Could you just take a look at the clipboard contents after the copy-from-Mathematica operation? I am not fully conversant in Ubuntu (i.e. in this case, Gnome), but the "xclip" program should be used for this AFAIK. Examine the "primary clipboard" which is the one used by CTRL-C etc: "xclip -out | hexdump -C" – David Tonhofer Nov 18 '13 at 00:34
  • @DavidTonhofer Sure, but it seems the Ctrl+C content from *Mathematica* is stored somewhere else. Please see my edit at the end of the question. – halirutan Nov 18 '13 at 01:01
  • Subsidiary question to make this clear: So if you copy from Mathematica and do "xclip -out", nothing is returned, whereas if you copy from e.g. the browser then a valid clipboard content (e.g. "this here") is returned? Please also try with "xclip -out -selection secondary" and "xclip -out -selection clipboard". – David Tonhofer Nov 18 '13 at 14:02
  • @DavidTonhofer I was a bit in a rush yesterday, so I tried only what you asked for. I will test `xclip` more thoroughly tonight and post the results here. Thanks for your ideas. – halirutan Nov 18 '13 at 14:54
  • Sure, tell me how it goes. You may also want to try another JDK (OpenJDK? Oracle JDK?) to see what happens. I added more output to your Clipboard test, too, maybe it will bring something up: http://kyubee.homelinux.org/stackoverflow/CopyPasteTest.java – David Tonhofer Nov 18 '13 at 18:31
  • @DavidTonhofer Please see my edit. Do you agree with my conclusion? – halirutan Nov 19 '13 at 00:37

3 Answers

2

These conclusions look sound.

I found the following references about the behaviour of the X clipboard:

X11r6 Inter-Client Communication Conventions Manual, in particular Peer-to-Peer Communication by Means of Selections, and also a more compressed explanation (and Python test tools) at Developer’s corner: copy-paste in Linux

Thus, the data "Plot[00][7f][00][00]" or maybe "Plot[00][00][00][00]" is the data that is actually provided by Mathematica on request to the application that "reads" the clipboard. I can only imagine that Mathematica says "here is the string with eight bytes" and the reading application tries to deal with this, reading past the end of the actual character array.

It could also be a bug in X (but Ubuntu 12.04 doesn't use Mir yet, so probably not.)

Note that in Java Strings are not NUL-terminated and "Plot[00][7f][00][00]" is a valid string indeed.
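A small sketch to illustrate that point: the string below is built by hand to mimic the clipboard content reported in the question (the class name `EmbeddedNulDemo` and the helper `sample` are invented for this example), and Java treats the NUL characters like any other character.

```java
class EmbeddedNulDemo {

  // Builds the eight-character string reported in the question by hand.
  static String sample() {
    return "Plot" + '\u0000' + '\u007f' + '\u0000' + '\u0000';
  }

  public static void main(String[] args) {
    final String s = sample();
    System.out.println(s.length());          // prints 8: the NULs count as characters
    System.out.println(s.indexOf('\u0000')); // prints 4: position of the first NUL
    System.out.println(s.equals("Plot"));    // prints false
  }
}
```

So any cut at the first NUL has to be done explicitly by the receiving code.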

A quick glance at the source of xclip (obtained with yumdownloader --source xclip on my Fedora) seems to reveal that it just calls XFetchBuffer or memcpy (not fully sure) to obtain bytes, then calls fwrite on those, so it will happily write the NULs to the output.

David Tonhofer
  • Thank you very much for your help along the way. This answer is precisely what I asked for and with your help it was possible to show that the extra-bytes can be found with `xclip` too. – halirutan Nov 19 '13 at 15:35
0

This looks like an issue with the string terminator character (I had similar issues with data modified by a C++ DLL and sent through an external system). I don't know how to fix the underlying problem, but as a simple workaround you can strip trailing control characters such as NUL by calling trim() on the text.

text = (String) systemClipboard.getData(DataFlavor.stringFlavor);
text = text.trim();
System.out.println(text);
SathOkh
  • The catch here is, of course, that it will also remove any whitespace at the beginning or the end of the string. – halirutan Nov 14 '13 at 21:33
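As a quick check of what trim() does with the two byte patterns reported in the question, here is a small sketch. The strings are simulated, not read from a real clipboard, and `TrimCheck` is a made-up name. `String.trim()` strips leading and trailing characters up to and including U+0020, so trailing NULs go away, but a 7f stops the trimming.

```java
class TrimCheck {

  // Simulated clipboard strings matching the two byte patterns in the question.
  static final String FROM_XCLIP = "Plot\u0000\u0000\u0000\u0000";
  static final String FROM_JAVA = "Plot\u0000\u007f\u0000\u0000";

  public static void main(String[] args) {
    System.out.println(FROM_XCLIP.trim().length()); // prints 4: all trailing NULs removed
    System.out.println(FROM_JAVA.trim().length());  // prints 6: the 7f and the NUL before it remain
    System.out.println("  Plot  ".trim().length()); // prints 4: real whitespace is removed as well
  }
}
```

With the content Java actually reported, trim() alone would therefore still leave two unwanted characters behind.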
0

I guess it's a zero-terminated "C-style" string and there is some misunderstanding about it between Mathematica and Java. I'd ask on a Linux forum how the clipboard is supposed to work.

As a workaround, I'd suggest

text = text.replaceFirst("\u0000(?s:.*)", "");
maaartinus