I am using the AWS Polly service for text to speech. But if the text contains some special characters, it is returning the wrong start and end numbers.
For example if the text is : "Böylelikle" it returns : {"time":6,"type":"word","start":0,"end":11,"value":"Böylelikle"}
But it should start from 0 and end to 10.
I've searched AWS Documentation and they say for the start and end values, the offset in bytes not characters.
My question is how can I convert this byte value to the character.
My code is:
builder.continueOnSuccessWith { (awsTask: AWSTask<NSURL>) -> Any? in
if builder.error == nil {
if let url = awsTask.result {
do {
let txtData = try Data(contentsOf: url as URL)
if let txtString = String(data: txtData, encoding: .utf8) {
let lines = txtString.components(separatedBy: .newlines)
for line in lines {
let jsonData = Data(line.utf8)
let pollyVoiceSentence = try JSONDecoder().decode(PollyVoiceSentence.self, from: jsonData)
voiceSentences.append(pollyVoiceSentence)
}
}
} catch {
print("Could not parse TXT file")
}
}
} else {
print("ParseJSON: \(builder.error!)")
}
completionHandler(voiceSentences)
return nil
}
And to highlight words:
let start = pollyVoiceSentence.start
var end = pollyVoiceSentence.end
let voiceRange = NSRange(location: start, length: end - start)
print("RANGE: \(voiceRange) - Word: \(pollyVoiceSentence.value)")
Thanks.