I am able to write utf8mb4 characters (e.g. emoji) into a MySQL table if I first run the command SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci' on the connection:
import java.sql.{DriverManager, PreparedStatement}

val connection = DriverManager.getConnection(url, prop)
// Switch the session character set and collation before inserting.
val config: PreparedStatement = connection.prepareStatement("SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci'")
config.executeUpdate()
val insert: PreparedStatement = connection.prepareStatement("INSERT INTO test (ID, NAME) VALUES (?, ?)")
insert.setInt(1, 20)
insert.setString(2, "test ")
insert.executeUpdate()
However, I need to write a DataFrame to the database using Spark, and its API doesn't give me access to the connection. I can only pass Properties that are used to establish the connection (or pass the same settings in the URL):
val prop = new Properties()
prop.put("user", username)
prop.put("driver", "com.mysql.cj.jdbc.Driver")
prop.put("characterEncoding", "UTF-8") // doesn't accept utf8mb4
prop.put("connectionCollation", "utf8mb4_unicode_ci")
df.write.mode("overwrite").jdbc(s"jdbc:mysql://$dbUrl/$dbName", dbTable, prop)
This doesn't work. I can't pass utf8mb4 as the encoding because the Java library throws an error. With the settings above the app no longer crashes, but it saves ??? to the database instead of the characters. Any idea how to solve this?
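For reference, the URL form mentioned above might look roughly like the sketch below. It only shows the mechanics of appending the same two settings as query parameters. The object name `JdbcUrlSketch`, the helper `withParams`, and the host and database name are hypothetical placeholders, not part of my actual setup.

```scala
import java.net.URLEncoder
import java.nio.charset.StandardCharsets

object JdbcUrlSketch {
  // Hypothetical helper: appends connection settings to a JDBC URL as a query string.
  def withParams(baseUrl: String, params: Seq[(String, String)]): String = {
    def enc(s: String): String = URLEncoder.encode(s, StandardCharsets.UTF_8.name())
    val query = params.map { case (k, v) => s"${enc(k)}=${enc(v)}" }.mkString("&")
    if (query.isEmpty) baseUrl else s"$baseUrl?$query"
  }

  // Placeholder host and database name (assumptions for illustration only).
  val url: String = withParams(
    "jdbc:mysql://localhost:3306/mydb",
    Seq(
      "characterEncoding" -> "UTF-8",
      "connectionCollation" -> "utf8mb4_unicode_ci"
    )
  )
}
```

The resulting string would then be passed as the first argument of df.write.mode("overwrite").jdbc(...). Since it carries the same settings as the Properties object, I would not expect it to behave any differently.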