I'm using Hamcrest for assertions in my tests. The snippet below works for other string comparisons, but this statement fails because of an unexpected character at index 0 of the object's value array, as shown in the image below.

assertThat("failure, Publication did not match", book.getPublication(), is("Bloomsbury Publishing"));

Here is the result:

java.lang.AssertionError: failure, Publication did not match
Expected: is "Bloomsbury Publishing"
     but: was "‎Bloomsbury Publishing"
Expected :Bloomsbury Publishing
Actual   :‎Bloomsbury Publishing

(Screenshot: value array of the returned object)

If it helps, Book is a JPA entity that extends the Product entity, and Product is annotated with @Inheritance( strategy = InheritanceType.JOINED ).

Product Class

private long id;
private String prodName;
private BigDecimal price; 

Book Class

private String genre;
private String author;
private String publication;

In my test data in data.sql I have:

INSERT INTO PRODUCT(ID, PROD_NAME, PRICE) VALUES (1, 'Harry Potter', 200.55);
INSERT INTO PRODUCT(ID, PROD_NAME, PRICE) VALUES (2, 'Chhawa', 450.45);
INSERT INTO PRODUCT(ID, PROD_NAME, PRICE) VALUES (3, 'Chatrapati Shivaji Maharaj', 1000.00);
INSERT INTO PRODUCT(ID, PROD_NAME, PRICE) VALUES (4, 'Asa Mi Asami', 99.99);
INSERT INTO BOOK(ID, GENRE, AUTHOR, PUBLICATION) VALUES (1, 'Contemporary Fantasy', 'J. K. Rollings', '‎Bloomsbury Publishing');
INSERT INTO BOOK(ID, GENRE, AUTHOR, PUBLICATION) VALUES (2, 'Action', 'Shivaji Savant', 'Mehta Publishing House');
INSERT INTO BOOK(ID, GENRE, AUTHOR, PUBLICATION) VALUES (3, 'Action', 'Krishanrao Arjun Kelusakar', 'Saraswati Publishing Co.Pvt.Ltd');
INSERT INTO BOOK(ID, GENRE, AUTHOR, PUBLICATION) VALUES (4, 'Comedy', 'Pu La Deshpande', 'SANSKRUTI BOOK HOUSE');

And I'm unmarshalling the JSON returned by @GetMapping(path = "/products/{id}") like this:

ResponseEntity<Book> response = restTemplate.exchange(
        productBaseUrl,
        HttpMethod.GET,
        null,
        Book.class);

Book book = response.getBody();

Mysteriously, I get this '\u200E' (decimal 8206) Unicode character only for ID=1.

Here is the link to the whole code base: https://bitbucket.org/tyro_02/demo.cart/

tyro

1 Answer

8206 (U+200E), the first character, is the Unicode LEFT-TO-RIGHT MARK.
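To confirm which character is hiding in the value, one option is to print every code point of the string. This is only a minimal sketch: the class and method names are made up for illustration and are not part of the question's code base.

```java
public class CodePointDump {

    // Hypothetical helper: renders each code point as U+XXXX so that
    // invisible characters such as U+200E become visible in test output.
    static String describe(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // Shortened stand-in for the value returned by book.getPublication().
        String actual = "\u200EBloom";
        System.out.println(describe(actual));
        // prints: U+200E U+0042 U+006C U+006F U+006F U+006D
    }
}
```

Calling describe(book.getPublication()) in a failing test would show whether the first code point really is U+200E.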

You can remove the character with Java's regex support, using the Unicode category class:

"\\p{C}"

See the java.util.regex.Pattern documentation for the supported character classes.

That is, if you believe this test should pass. If, after your analysis, you think it should fail, then it already fails as it stands. If you can modify the Book class, its getPublication() getter could also return the String with such invisible characters stripped, using replaceAll with that regex.
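A minimal sketch of that replaceAll call, assuming you want the invisible characters dropped before comparing. PublicationCleaner and stripInvisible are hypothetical names, not part of the asker's Book class. Note that \p{C} also removes tabs and line breaks; \p{Cf} is a narrower choice that still matches U+200E.

```java
public class PublicationCleaner {

    // Hypothetical helper: removes every character in Unicode category C
    // ("Other": control, format, unassigned, private use, surrogate).
    static String stripInvisible(String value) {
        return value == null ? null : value.replaceAll("\\p{C}", "");
    }

    public static void main(String[] args) {
        String fromJson = "\u200EBloomsbury Publishing";
        String cleaned = stripInvisible(fromJson);
        System.out.println(cleaned);                                 // prints: Bloomsbury Publishing
        System.out.println(cleaned.equals("Bloomsbury Publishing")); // prints: true
    }
}
```

In the test, the assertion could then compare stripInvisible(book.getPublication()) against the expected literal, or the getter itself could apply the same replaceAll.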

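See also the Wikipedia article on Unicode control characters, which covers U+200E. (I made an edit, by the way: this is a control character in the broad sense, Unicode general category Cf, not punctuation.)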

user176692
  • Thanks @user176692. I'm just unmarshalling the JSON response body into a Book object. The character is in the JSON response, which is why it ends up in the Book object too. Now I'm looking into why that Unicode character gets into the JSON response in the first place. – tyro Nov 03 '19 at 19:06
  • Looked through your repo a bit. What is the datatype of PUBLICATION? Even if it were a Unicode type, e.g. nchar, I don't see why it would insert a left-to-right mark on its own. Curious whether this character is in the DB. The mark can be useful if you are mixing left-to-right languages with right-to-left ones (like Hebrew and Persian, see https://dba.stackexchange.com/questions/76025/how-to-transfer-right-to-left-text-from-excel-to-sql-server). – user176692 Nov 04 '19 at 14:45
  • It's all English characters, so I don't see any reason for that character to be there. PUBLICATION is of type String. I'm using an H2 database for testing, and I can't see that character there. – tyro Nov 04 '19 at 14:58
  • Not sure if this matters, but data.sql is saved in UTF-8 encoding. I tried copying the Bloomsbury Publishing text with the enclosing quotes into Visual Studio and got the "Some Unicode characters in this file could not be saved..." warning. Maybe re-saving in ANSI format or rewriting it in ANSI will prevent them from entering H2? Just a thought. – user176692 Nov 04 '19 at 18:10
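
One way to check whether the character is already sitting in data.sql is to scan the file for Unicode format characters. This is a rough sketch only: InvisibleCharScan and findFormatChars are hypothetical names, and the default path data.sql is an assumption about where the script lives.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class InvisibleCharScan {

    // Hypothetical helper: reports every character whose Unicode general
    // category is Cf (format), which includes U+200E.
    static List<String> findFormatChars(List<String> lines) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            for (int j = 0; j < line.length(); j++) {
                char c = line.charAt(j);
                if (Character.getType(c) == Character.FORMAT) {
                    hits.add(String.format("line %d, column %d: U+%04X", i + 1, j + 1, (int) c));
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws Exception {
        // Assumed location; pass the real path as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "data.sql");
        if (!Files.exists(path)) {
            System.out.println("File not found: " + path);
            return;
        }
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        for (String hit : findFormatChars(lines)) {
            System.out.println(hit);
        }
    }
}
```

If the scan reports a hit on the BOOK insert for ID=1, retyping that string literal by hand should remove the character at its source.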