I have a SQL dump via DBeaver that ends up looking like this, but with 200k+ rows in 10 row segments:
INSERT INTO addresses (address_uuid,parent_address_uuid,`address.building`,`address.house_num`,`address.predir`,`address.qual`,`address.pretype`,`address.name`,`address.suftype`,`address.sufdir`,`address.ruralroute`,`address.extra`,`address.city`,`address.state`,`address.country`,`address.postcode`,`address.box`,`address.unit`,location,tenancy,units,created_at,updated_at) VALUES (...)
I'm attempting to utilize this file in a seeder and having all kinds of string formatting issues, but I've narrowed it down to only having a small diamond with a question mark: � (thought it might be a utf-8 encoding isssue, but that is seeming less and less likely) before INSERT INTO addresses
that is throwing a PDOException::("SQLSTATE[42601]: Syntax error: 7 ERROR: syntax error at or near "INSERT"
. Here's my code thus far:
// Get the file and remove line breaks and back ticks...
$sql = preg_replace("/\r|\n|`/", "", file_get_contents('some-file.sql'));
// Create array of INSERTS with 10 rows at a time...
$statements = array_filter(array_map('trim', explode(';', $sql)));
// Iterate and execute per segment
foreach ($statements as $stmt) {
\DB::statement($stmt);
}
As you can see I'm already removing line breaks and back ticks, but not sure what is the best way to remove the �. If I call utf8_encode()
it adds some other garbage before the INSERT
: 
I'm very open to "better" solutions to this whole process i.e., flysystem/filesystem
, fopen
, etc. The resources on it are sparse and mixed, despite it seeming to be a fairly straightforward and common need.
Also, using postgres and have the charset to utf8 in my config:
'pgsql' => [
'driver' => 'pgsql',
'host' => env('DB_HOST', '127.0.0.1'),
'port' => env('DB_PORT', '5432'),
'database' => env('DB_DATABASE', 'forge'),
'username' => env('DB_USERNAME', 'forge'),
'password' => env('DB_PASSWORD', ''),
'charset' => 'utf8',
'prefix' => '',
'prefix_indexes' => true,
'schema' => 'public',
'sslmode' => 'prefer',
],
And SHOW client_encoding;
verifies that is is UTF8
. The encoding of my file is "UTF-8"
when I dump $enc = mb_detect_encoding($file, mb_list_encodings(), true);
. Not sure if this is relevant, but \Illuminate\Support\Facades\Input::get('encoding');
dumps null
;
Oh and DBeaver's bloody Encoding
is set to UTF-8
as well: