ISO8583 Field lengths with unicode characters

Question

ISO8583 messages can contain variable-length fields, such as field 44 (an ..25, Additional response data). Currently we calculate the length of these fields from the character count. However, we have added support for the right double quote (”) and right single quote (’), which are not in ASCII and which we encode as UTF-8. Each of these characters is encoded as three bytes, so the byte length and the character (string) length are no longer equal, and this breaks some of our processes.

My question is: does the ISO8583 standard require the field length to be in bytes or in characters?

Wikipedia is inconsistent and most other sources I've found aren't really explicit.

I guess you need to calculate the length in bytes of the transferred string. Unicode/UTF is not really a good fit for ISO8583 unless it is specifically allowed by your specification. — iso8583.info support, Jul 04 '19 at 09:42
We support FPS, which allows those two UTF/Unicode characters, so we have to support them. In this case, is the field length definitely bytes rather than characters? It would make sense given it's a binary format. — appalling22, Jul 04 '19 at 09:48
Do `”` and `’` actually add any business value? If not it might make more sense to replace them with `"` and `'`, which both circumvents this issue and also increases the likelihood that other systems will be able to process the data. — Jeroen Mostert, Jul 04 '19 at 12:36
" is 0x22 and ' is 0x27; both are in ASCII. In all 8583 variants I'm aware of, the 'AN' specifier restricts you to single-byte character sets. You said it's a binary format. Does the field in question use the 'B' specifier? In that case the length is almost definitely specified in bytes/octets. If some provider requires bastardized AN qualifiers with UTF-8/16, UCS ..., you really should ask the provider, because we can only guess unless we know which 8583 variant you are talking about. — a2800276, Jul 04 '19 at 12:39
@JeroenMostert `'` and `"` are not permitted in the FPS spec, only the unicode characters. Alas we have to support them because one of our clients supports them and they are allowed as part of the spec. Also we cannot replace them with safe characters as we should not be making any changes to their request. — appalling22, Jul 04 '19 at 14:23
Reasoning purely from the viewpoint that UTF-8 should be a transparent extension for systems that are used to processing single-byte ASCII data (which is the era from which ISO 8583 originally stems), the length should probably be counted in bytes (UTF-8 code units), not characters, so existing systems would see `”` as three "characters" that they can pass on without having to understand them. However, if you are implementing this specifically for a client, you should probably check with them, because it will do you no good to reject requests that are "too long" if you must process them, for example. — Jeroen Mostert, Jul 04 '19 at 14:40

Kevin Reeves · Answer 1 · 2023-02-15T10:03:43.210

The lengths are in bytes, and technically nothing should stop you from using Unicode characters as long as the encoded value fits into the number of bytes. However, messages often get translated between character sets such as ASCII and EBCDIC as they traverse the payment system(s). In these translations, Unicode characters do not always translate, are not supported, or get lost. In the worst case it breaks the process. This being a financial format, losing data is often not acceptable.
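
To make the byte/character mismatch and the translation loss concrete, here is a minimal C# sketch. The class and method names (`LengthDemo`, `ThroughAscii`, and so on) are made up for illustration, and the ASCII round trip is only a stand-in for what a real ASCII or EBCDIC hop in a payment network might do; actual gateways may behave differently.

```csharp
using System;
using System.Text;

public static class LengthDemo
{
    // Number of UTF-16 code units in the string (what string.Length reports).
    public static int CharCount(string value) => value.Length;

    // Number of bytes the value occupies once encoded as UTF-8.
    public static int Utf8ByteCount(string value) => Encoding.UTF8.GetByteCount(value);

    // Stand-in for a hop through an ASCII-only system:
    // .NET's ASCII encoder replaces anything outside ASCII with '?'.
    public static string ThroughAscii(string value) =>
        Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(value));

    public static void Main()
    {
        string text = "It\u2019s OK"; // contains U+2019 RIGHT SINGLE QUOTATION MARK

        Console.WriteLine(CharCount(text));     // 7
        Console.WriteLine(Utf8ByteCount(text)); // 9
        Console.WriteLine(ThroughAscii(text));  // It?s OK
    }
}
```

A length prefix of 7 on this value would leave two bytes unread by a receiver that counts bytes, which is the kind of breakage described in the question.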

You should avoid non-printable ASCII characters and anything above decimal 127 (hex 7F). A number of characters also do not translate cleanly between ASCII and EBCDIC (see https://www.ibm.com/docs/en/iis/11.7?topic=tables-conversion-table-irregularities). For example, EBCDIC | translates to ASCII !, and ASCII | does not translate well using standard translation tables.

In my experience, the following ASCII characters are safe:

Character(s)	Decimal	HEX
Space	32	20
&	38	26
'	39	27
*	42	2A
-	45	2D
.	46	2E
/	47	2F
0 - 9	48 - 57	30 - 39
A - Z	65 - 90	41 - 5A
a - z	97 - 122	61 - 7A
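
If you want to enforce this list before building a message, a check along these lines may help. It is only a sketch: `SafeCharacters`, `IsSafeChar` and `IsSafeText` are hypothetical names, and the allowed set is copied directly from the table above rather than from any specification.

```csharp
using System;
using System.Linq;

public static class SafeCharacters
{
    // Punctuation from the table above: space & ' * - . /
    private const string Punctuation = " &'*-./";

    // True if the character is a digit, an ASCII letter, or listed punctuation.
    public static bool IsSafeChar(char c) =>
        (c >= '0' && c <= '9') ||
        (c >= 'A' && c <= 'Z') ||
        (c >= 'a' && c <= 'z') ||
        Punctuation.IndexOf(c) >= 0;

    // True only if every character in the value is in the safe set.
    public static bool IsSafeText(string value) => value.All(c => IsSafeChar(c));

    public static void Main()
    {
        Console.WriteLine(IsSafeText("O'Neil & Co."));  // True
        Console.WriteLine(IsSafeText("It\u2019s OK"));   // False (U+2019 is not ASCII)
        Console.WriteLine(IsSafeText("A|B"));            // False ('|' is not in the table)
    }
}
```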

score 0 · Answer 2 · answered Jun 27 '22 at 11:34

If you want to have Unicode characters in a particular field, the ISO8583 standard does not stop you from doing so.

You can develop a new field parser that takes the byte stream/array, converts it to a string, and then sets that string as the field data. In C#, converting to a string is enough, because .NET stores strings as Unicode (UTF-16) internally.

Here is an example that uses the OpenIso8583Net library:

using System.Text;

namespace OpenIso8583Net.Formatter
{
    /// <summary>
    ///   UTF8 Field Formatter
    /// </summary>
    public class UTF8Formatter : IFormatter
    {
        #region IFormatter Members

        /// <summary>
        ///   Format the string and return as an encoded byte array
        /// </summary>
        /// <param name = "value">value to format</param>
        /// <returns>Encoded byte array</returns>
        public byte[] GetBytes(string value)
        {
            return Encoding.UTF8.GetBytes(value);
        }

        /// <summary>
        ///   Left-pad the string with '0' to the given length, then return it as an encoded byte array
        /// </summary>
        /// <param name = "value">value to format</param>
        /// <param name="length">Padded length in characters (UTF-16 code units), not UTF-8 bytes</param>
        /// <returns>Encoded byte array</returns>
        public byte[] Pack(string value, int length)
        {
            return Encoding.UTF8.GetBytes(value.PadLeft(length, '0'));
        }

        /// <summary>
        ///   Format the string and return as an encoded byte array.
        ///   Length handling for UTF-8 is not fully implemented: the length parameter is ignored.
        /// </summary>
        /// <param name = "value">value to format</param>
        /// <param name="length">Currently unused</param>
        /// <returns>Encoded byte array</returns>
        public byte[] GetBytes(string value, int length)
        {
            return Encoding.UTF8.GetBytes(value);
        }

        /// <summary>
        ///   Takes the byte array and converts it to a string for use
        /// </summary>
        /// <param name = "data">Data to convert</param>
        /// <returns>Converted data</returns>
        public string GetString(byte[] data)
        {
            return Encoding.UTF8.GetString(data);
        }

        /// <summary>
        ///   Gets the packed length of the data given the unpacked length
        /// </summary>
        /// <param name = "unpackedLength">Unpacked length</param>
        /// <returns>Packed length of the data</returns>
        public int GetPackedLength(int unpackedLength)
        {
            return unpackedLength;
        }

        #endregion
    }
}
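
Since the formatter above leaves length handling unfinished, here is one possible way to deal with a variable-length field by counting bytes. This is a standalone sketch and not part of OpenIso8583Net: `Utf8VarField`, `Pack` and `Unpack` are hypothetical names, and it assumes the length prefix is sent as ASCII digits (2 for LLVAR, 3 for LLLVAR). Some ISO8583 variants use BCD or binary prefixes instead, so check your own specification.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class Utf8VarField
{
    // Packs a field as <ASCII length prefix><UTF-8 data>.
    // lengthDigits: 2 for LLVAR, 3 for LLLVAR. maxBytes: the field's declared maximum,
    // assumed to fit within lengthDigits decimal digits.
    public static byte[] Pack(string value, int lengthDigits, int maxBytes)
    {
        byte[] data = Encoding.UTF8.GetBytes(value);
        if (data.Length > maxBytes)
            throw new ArgumentException($"Field is {data.Length} bytes; maximum is {maxBytes}.");

        // The prefix carries the byte count, not the character count.
        string digits = data.Length.ToString(CultureInfo.InvariantCulture).PadLeft(lengthDigits, '0');
        byte[] prefix = Encoding.ASCII.GetBytes(digits);

        byte[] packed = new byte[prefix.Length + data.Length];
        Buffer.BlockCopy(prefix, 0, packed, 0, prefix.Length);
        Buffer.BlockCopy(data, 0, packed, prefix.Length, data.Length);
        return packed;
    }

    // Reads the prefix, then decodes exactly that many bytes as UTF-8.
    public static string Unpack(byte[] packed, int lengthDigits)
    {
        string digits = Encoding.ASCII.GetString(packed, 0, lengthDigits);
        int byteCount = int.Parse(digits, CultureInfo.InvariantCulture);
        return Encoding.UTF8.GetString(packed, lengthDigits, byteCount);
    }

    public static void Main()
    {
        byte[] packed = Pack("It\u2019s OK", 2, 25);

        Console.WriteLine(Encoding.ASCII.GetString(packed, 0, 2)); // 09
        Console.WriteLine(packed.Length);                          // 11
        Console.WriteLine(Unpack(packed, 2) == "It\u2019s OK");     // True
    }
}
```

The maximum is checked against the encoded byte count, so a 25-character value containing one of the three-byte quotes would be rejected for an an ..25 field rather than silently overflowing it.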
