I am thinking about how to make this code more efficient. I am using Discord JDA and the Microsoft Azure Speech service. Is it possible to recognize speech directly from bytes rather than from a file? That is, skipping the step of writing the bytes to a temporary file and then recognizing that file. Or is there some other, better way to do this? The current approach seems wrong to me.
My AudioReceiveHandler:
@Override
public void handleUserAudio(@NotNull UserAudio userAudio) {
    // Buffer each user's 20ms audio packets (JDA delivers 48kHz, 16-bit, stereo, big-endian PCM)
    User user = userAudio.getUser();
    BYTES.computeIfAbsent(user, key -> new ArrayList<>())
            .add(userAudio.getAudioData(1));
}
Converting speech to text:
private void read(ArrayList<byte[]> userBytes) {
    // Concatenate the buffered packets into one contiguous PCM byte array
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    for (byte[] bytes : userBytes) {
        buffer.write(bytes, 0, bytes.length);
    }
    byte[] decodedData = buffer.toByteArray();

    // JDA's receive format: 48kHz, 16-bit, stereo, signed, big-endian
    AudioFormat format = new AudioFormat(48000f, 16, 2, true, true);
    String filePath = "[...]/temp.wav";
    try {
        // Note: the last AudioInputStream argument is the length in frames,
        // not bytes (one frame = 2 channels * 2 bytes = 4 bytes here)
        AudioSystem.write(new AudioInputStream(new ByteArrayInputStream(decodedData),
                format, decodedData.length / format.getFrameSize()),
                AudioFileFormat.Type.WAVE,
                new File(filePath));
    } catch (IOException exception) {
        exception.printStackTrace();
    }

    SpeechConfig speechConfig = SpeechConfig.fromSubscription("-", "-");
    speechConfig.setSpeechRecognitionLanguage("pl-PL");
    AudioConfig audioConfig = AudioConfig.fromWavFileInput(filePath);
    SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioConfig);
    Future<SpeechRecognitionResult> task = recognizer.recognizeOnceAsync();
    try {
        SpeechRecognitionResult result = task.get();
        Logger.info("RECOGNIZED: " + result.getText());
    } catch (Exception exception) {
        exception.printStackTrace();
    } finally {
        // The SDK objects hold native resources and should be closed
        recognizer.close();
        audioConfig.close();
        speechConfig.close();
    }
}
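For context, the stream-based approach I am asking about might look something like the sketch below, assuming the Speech SDK's push-stream API (AudioStreamFormat.getWaveFormatPCM, AudioInputStream.createPushStream, AudioConfig.fromStreamInput) accepts raw PCM this way. The byte swap is my assumption that the service wants little-endian samples while JDA delivers big-endian; I have not verified this end to end.

import com.microsoft.cognitiveservices.speech.SpeechConfig;
import com.microsoft.cognitiveservices.speech.SpeechRecognitionResult;
import com.microsoft.cognitiveservices.speech.SpeechRecognizer;
import com.microsoft.cognitiveservices.speech.audio.AudioConfig;
import com.microsoft.cognitiveservices.speech.audio.AudioInputStream;
import com.microsoft.cognitiveservices.speech.audio.AudioStreamFormat;
import com.microsoft.cognitiveservices.speech.audio.PushAudioInputStream;

public class StreamRecognitionSketch {

    public static String recognize(byte[] decodedData) throws Exception {
        // Assumption: the service expects little-endian PCM, but JDA delivers
        // big-endian samples, so swap the two bytes of each 16-bit sample
        for (int i = 0; i + 1 < decodedData.length; i += 2) {
            byte tmp = decodedData[i];
            decodedData[i] = decodedData[i + 1];
            decodedData[i + 1] = tmp;
        }

        SpeechConfig speechConfig = SpeechConfig.fromSubscription("-", "-");
        speechConfig.setSpeechRecognitionLanguage("pl-PL");

        // Describe the raw PCM so no WAV header or temp file is needed
        AudioStreamFormat format =
                AudioStreamFormat.getWaveFormatPCM(48000L, (short) 16, (short) 2);
        PushAudioInputStream pushStream = AudioInputStream.createPushStream(format);
        AudioConfig audioConfig = AudioConfig.fromStreamInput(pushStream);

        pushStream.write(decodedData);
        pushStream.close(); // signals end of audio to the recognizer

        try (SpeechRecognizer recognizer = new SpeechRecognizer(speechConfig, audioConfig)) {
            SpeechRecognitionResult result = recognizer.recognizeOnceAsync().get();
            return result.getText();
        }
    }
}

If something like this works, the whole write-to-disk step in read() above could be dropped.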