import scala.io.Source

def checkCodec(filename: String): String = {
  val bufferedSource = Source.fromFile(filename)
  val codec: String = bufferedSource.codec.toString
  println("bufferedSource.codec - " + bufferedSource.codec)
  bufferedSource.close()
  if (codec.equalsIgnoreCase("UTF-8")) {
    filename + " " + codec
  } else {
    "CodecErrorDetected"
  }
}

val validFile = checkCodec(fileName)
println("The file is - " + validFile)

This function runs fine but reports "UTF-8" even when the file is a .zip, has an incorrect file format, or is corrupted (I used https://pinetools.com/corrupt-file-generator to corrupt it). How can I detect at least the corrupted files? For example, I renamed a PDF file to a nonexistent .pddssee extension, and it was still recognized as a UTF-8 file. I need help understanding how to detect a corrupted file in Scala. Is this the correct way to check for a corrupt file?
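As far as I can tell, bufferedSource.codec only echoes the implicit Codec that was passed to Source.fromFile (the platform default unless one is supplied), so it is never derived from the file's bytes. One alternative I am considering is to actually decode the bytes with a strict UTF-8 decoder and treat any decoding error as a failure. This is only a minimal sketch: the names isValidUtf8 and checkUtf8 are hypothetical helpers of my own, the whole file is read into memory, and passing the check only shows that the bytes are well-formed UTF-8, not that the file is uncorrupted or of the expected type.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, CodingErrorAction, StandardCharsets}
import java.nio.file.{Files, Paths}

// Hypothetical helper: true only if the bytes decode as strict UTF-8.
def isValidUtf8(bytes: Array[Byte]): Boolean = {
  // REPORT makes the decoder throw instead of silently substituting U+FFFD.
  val decoder = StandardCharsets.UTF_8.newDecoder()
    .onMalformedInput(CodingErrorAction.REPORT)
    .onUnmappableCharacter(CodingErrorAction.REPORT)
  try {
    decoder.decode(ByteBuffer.wrap(bytes))
    true
  } catch {
    case _: CharacterCodingException => false
  }
}

// Hypothetical wrapper that mirrors checkCodec but inspects the file contents.
// Assumption: the file is small enough to be read into memory in one go.
def checkUtf8(filename: String): String = {
  val bytes = Files.readAllBytes(Paths.get(filename))
  if (isValidUtf8(bytes)) filename + " UTF-8" else "CodecErrorDetected"
}
```

Binary formats such as .zip or .pdf would normally contain byte sequences that are not valid UTF-8, so I would expect them to be rejected by this check, but a binary file that happens to contain only ASCII bytes would still pass.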
I would appreciate your input.