1

There is Firebird table with 2 blob fields- blob_binary field(subtype=0) and blob_Text field(subtype=1,utf-8). DB has utf-encoding. Connection has utf encoding. Version of Delphi is 10.2.3. I use FireDac components for data access. Server is Firebird 3.

enter image description here

App must write data from text-file(utf-8) to both blob-fields of "Content" table. Text file, what I must write in blobs, contains text on English, Russian and Georgian languages(see image).

Project and DB files, with editing permission

Code below writes text in binary blob field but characters are strange(not ??? simbols. Maybe Ansi characters?).

Code for save text-file in Blob_Binary field:

ID:=Query1.FieldByName('Content_id').asInteger;
OpenDialog1.Execute;
Query1.Close;
Query1.SQL.Text := 'SELECT * FROM content where Content_id=:id';
Query1.Params[0].AsInteger:=ID;
Query1.open;

Query1.Edit;
(Query1.FieldByName('BLOB_BINARY') as TBlobField).LoadFromFile(OpenDialog1.FileName);
Query1.Post;

When I save text file in binary blob field then:

1) if I saved text file in encoding utf-BOM I get in binary blob normal text and

2) strange characters if I choose for text file encoding utf.

But when I use the same code for writing data in text blob field data appears strange like chinese characters (see image).

enter image description here

What do I wrong? How to correct this code to write in both fields utf characters?

I tried another solutions but result is the same. For example:

ID:=Query1.FieldByName('Content_id').asInteger;
OpenDialog1.Execute;
Query1.Close;
Query1.SQL.Text := 'Update content set Blob_Text=:Blob_Text where 
Content_id=:id';
Query1.Params[0].DataType := ftBlob;
Query1.Params[0].AsStream := TFileStream.Create(OpenDialog1.FileName, fmOpenRead);
Query1.Params[1].AsInteger:=ID;
Query1.ExecSQL;

Update1: As I realised, if I save txt-file as "unicode" in noteped(or ucs-2 LE BOM in noteped++) it is saved fine in text blob, chines characters disappeared. Similarly, txt-file in binary blob is saved fine if it is in utf-BOM encoding. Although it's very uncomfortable not be able to save file in utf-8.

enter image description here

Arioch 'The
  • 15,799
  • 35
  • 62
basti
  • 399
  • 2
  • 10
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/188471/discussion-on-question-by-basti-how-to-write-txt-file-in-blob). – Samuel Liew Feb 15 '19 at 10:32
  • More screenshots and discussion of this question by this topicstarter: https://www.sql.ru/forum/1308971-a/krakozyabry-i-ieroglify-pri-chtenii-i-zapisi-iz-tekstovogo-fayla-ili-memo-v-blob-i-obratno – Arioch 'The Feb 18 '19 at 10:04

2 Answers2

1

What you're seeing is known as mojibake, caused by interpreting text in a different encoding than the one it was originally written in. When you get random CJK (Chinese/Japanese/Korean) characters, it usually comes from incorrectly interpreting 8-bit (ASCII, ANSI, UTF-8, etc) encoded text as UTF-16. Have a look at your string types and the string types going into and coming out of the database, and check for compiler warnings about ANSI and Unicode string type mismatches, and you should be able to get to the bottom of this fairly quickly.

Mason Wheeler
  • 82,511
  • 50
  • 270
  • 477
  • Mason Wheeler. Thank you for you answer but I tried all what I know: DB is created as utf, tables fields are in utf, firedac connection definition is in utf. text file is created as utf. I haven't slept for 3 days. What may be the reason? – basti Feb 13 '19 at 00:56
  • 1
    @basti If you haven't slept in three days, maybe you need to step away, get some sleep and look back at the problem with fresh eyes. – Mark Rotteveel Feb 13 '19 at 14:14
  • 1
    `8-bit (ASCII, ANSI, UTF-8, etc) encoded text as UTF-16` - should be typical exactly with 2009 and newer versions of Delphi (aka "Unicode Delphi"), and also AFAIR with Java and C#, when one does not specify charset connection or specifies OCTET binary one. I once even persuaded Mark to prohibit implicit NONE-charset connections, sadly not for long. – Arioch 'The Feb 13 '19 at 15:51
0

I have a same bug with ADOQuery and Firebird 2.5 Blob Field Sub_Type 1 (Text) String fields are converted fine, blobs are not.

If I change connection to IBX, all works fine

Solved by:

SettingsTEXT.AsString := UTF8Decode(SettingsTEXT.AsString)

AdoDBBug