I am trying to read from an oracle db which stores data in Windows-1252 encoding. I am reading that data using jdbc and writing to an xml file with UTF-8 encoding.
while writing to these files, I am getting '?' characters instead of the latin characters e.g. instead of í, i get a ?
'Coquí' is being written to XML as 'Coqu?'
I use this file to upload to solr later on. I have only put the relevant code here and not the whole code since its a long method (legacy code that i have inherited) which is complicated.
BufferedWriter result = new BufferedWriter(new FileWriter(OUTPUT_FILE));
stmt = conn.createStatement(ResultSet.TYPE_SCROLL_SENSITIVE, ResultSet.CONCUR_READ_ONLY);
rst = stmt.executeQuery(sql);
if (rst.getFetchSize() < 1)
return;
rst.beforeFirst();
while (rst.next()) {
Profile p = new Profile();
p.business_name = rst.getString("business_name");
p.business_name_sort = rst.getString("business_name_sort");
result.write(p.business_name;
result.write(p.business_name_sort);
}