3

I work with an SAP system with an IBM DB2 database attached to it. I created a transparent table in the SAP system and then checked how it looked at the database level. It turned out that the character fields (CHAR, DATS, CUKY, NUMC) are three times longer than the length specified in SE11. For example, the CLIENT field of type MANDT (3 characters in SE11) has the type VARCHAR(9).

I could understand multiplying the length by 2, since SAP is a Unicode system. But why multiply by 3? Can anybody explain this to me?

Sandra Rossi
Jagger

1 Answer

2

This effect does not depend on the DBMS used (I see the same thing on Oracle-based systems). It really is a Unicode vs. non-Unicode (NUC) issue: on a NUC system, the client field is a VARCHAR2(3); on a Unicode system with otherwise identical software components, it is a VARCHAR2(9). I can only guess that this is due to the use of some CESU-8 variant.
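
To make the factor of 3 concrete, here is a minimal Python sketch, assuming the guess above is right and the column length is counted in bytes of a CESU-8-style encoding. The helper `cesu8_len` is hypothetical (Python has no built-in CESU-8 codec); it sizes each UTF-16 code unit the way UTF-8 would, which is how CESU-8 is defined. Nothing here queries SAP, DB2, or Oracle.

```python
def cesu8_len(text: str) -> int:
    """Byte length of `text` in CESU-8 (hypothetical helper, not a library call).

    CESU-8 encodes every UTF-16 code unit on its own, so a surrogate pair
    costs 3 + 3 bytes instead of the 4 bytes that plain UTF-8 would use.
    """
    data = text.encode("utf-16-le")
    total = 0
    for i in range(0, len(data), 2):
        unit = int.from_bytes(data[i:i + 2], "little")
        if unit < 0x80:
            total += 1      # ASCII
        elif unit < 0x800:
            total += 2      # e.g. Latin letters with diacritics
        else:
            total += 3      # rest of the BMP, including surrogates
    return total


def worst_case_bytes(char_length: int) -> int:
    """Assumed sizing rule: 3 bytes for every UTF-16 code unit."""
    return 3 * char_length


samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]
for s in samples:
    units = len(s.encode("utf-16-le")) // 2
    print(f"U+{ord(s):04X}: utf16_units={units} "
          f"utf8={len(s.encode('utf-8'))} cesu8={cesu8_len(s)}")

# A 3-character field such as MANDT would need at most this many bytes.
print(worst_case_bytes(3))
```

Under this assumption no UTF-16 code unit ever needs more than 3 bytes, so a field of n characters (counted as UTF-16 code units) always fits into 3n bytes. That would explain VARCHAR(9) for a 3-character field without any need for a 4x reserve.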

vwegert
  • 1
    UTF-8 can use up to 4 bytes per code point, but 4-byte characters seem to be uncommon in practice, so using 3x is a reasonable compromise. – Ian Bjorhovde Oct 18 '12 at 20:29
  • 2
    I have also found a very interesting article on this topic: [Forms of Unicode](http://www.icu-project.org/docs/papers/forms_of_unicode/) – Jagger Oct 18 '12 at 21:00