I'm a novice perl programmer trying to use DBI to write a buffer of text that contains an email with umlauts and other non-ASCII characters to a joomla database and having a problem.
DBD::mysql::st execute failed: Incorrect string value: '\xD6sterl...' for column `lsv5webstage`.`xuxgc_content`.`fulltext` at row 1 at /home/alerts/scripts_linstage/AdvisoryTest.pm line 373.
I'm not familiar enough with how encoding works to fully understand what the problem is. This is a fedora29 system with mariadb-10.3.12 and joomla-3.9.
Apparently the '\xD6' is an O with an umlaut in "Sebastian �sterlund". I read something about utf8 not being able to handle 4-char, but I don't fully understand.
I found the following reference online which talks about changing the encoding type from utf8 to utf8mb4, but the tables all appear to already be using that encoding:
> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR
Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name | Value |
+--------------------------+--------------------+
| character_set_client | utf8mb4 |
| character_set_connection | utf8mb4 |
| character_set_database | utf8mb4 |
| character_set_filesystem | binary |
| character_set_results | utf8mb4 |
| character_set_server | utf8mb4 |
| character_set_system | utf8 |
| collation_connection | utf8mb4_unicode_ci |
| collation_database | utf8mb4_unicode_ci |
| collation_server | utf8mb4_unicode_ci |
+--------------------------+--------------------+
I'm not sure it's helpful, but this is the insert statement I'm using in my perl code:
my $sql = <<EOF;
INSERT INTO xuxgc_content (title, alias, introtext, `fulltext`, state, catid, created, created_by, created_by_alias, modified, modified_by, checked_out, checked_out_time, publish_up, publish_down, images, urls, attribs, version, ordering, metakey, metadesc, metadata, access, hits, language)
VALUES ($title, "$title_alias", $introText, $fullText, $state, $catid, $created, $created_by, $created_by_alias, $modified, $modified_by, $checked_out, $checked_out_time, $publish_up, $publish_down, $images, $urls, $attribs, $version, $ordering, $metakey, $metadesc, $metadata, $access, $hits, $language);
EOF
my $sth = $dbh->prepare($sql);
$sth->execute();
db_disconnect($dbh);
The $fullText variable is populated from a buffer that contains the body of the email. I'm running it through quote() before performing the INSERT.
$fullText = $dbh->quote($fullText);
I also tried using "SET NAMES utf8mb4;INSERT INTO Mytable ...;" and it just didn't like the format.
Here's the full function that's used to connect to the database:
sub db_connect () {
my %DB = (
'host' => 'myhost',
'db' => 'mydb',
'user' => 'myuser',
'pass' => 'mypass',
);
return DBI->connect("DBI:mysql:database=$DB{'db'};host=$DB{'host'}", $DB{'user'}, $DB{'pass'}, { mysql_enable_utf8mb4 => 1 });
}
I don't recall having this problem in the past, and this script has been in use for quite a while.