I have an audio file with two speakers: the first recites a sentence, and then the second translates it.

I want to save a separate audio file for each sentence spoken by speaker A and speaker B.

Example: file -> book_translation.mp3 (1-minute audio)

Speaker A: "How are you"

Speaker B: "Wie geht's"

Speaker A: "I'm good"

Speaker B: "Mir geht's gut"

Expected output: 4 mp3 files ->
A_01.mp3, B_01.mp3, A_02.mp3, B_02.mp3

Lakhani Aliraza
  • You want to separate the audio of both speakers? – Raj May 01 '20 at 06:40
  • Yes, I want to split each sentence from both speakers into separate files. – Lakhani Aliraza May 01 '20 at 06:41
  • Seems like a cocktail party problem. But if the speakers do not speak simultaneously, you can just split the audio file when the second person starts speaking. – Raj May 01 '20 at 06:45
  • Yes, the two speakers don't speak simultaneously. – Lakhani Aliraza May 01 '20 at 06:48
  • The problem is that I have more than 20 hours of such audio, and I can't listen to all of it and split it by hand. – Lakhani Aliraza May 01 '20 at 06:50
  • Take a look at this: https://datascience.stackexchange.com/questions/33291/audio-analysis-segment-audio-based-on-speaker-recognition – Raj May 01 '20 at 06:59
  • In general this is called Speaker Diarization, and it can be complicated. But if there are periods of silence in between, you can cut the audio up based on the volume. Once you have those clips, you can probably use clustering on Audio Embeddings to put them into two categories. – Jon Nordby May 01 '20 at 19:02
  • Here is a list of resources for Speaker Diarization: https://github.com/wq2012/awesome-diarization – Jon Nordby May 01 '20 at 19:10
  • @jonnor I am new to this. I tried using pyannote-audio but ran out of patience before I got a result. I am looking for an easier way in; any tutorial on this would also help. – Lakhani Aliraza May 02 '20 at 02:58
  • @Raj Thanks for the resource, will look into this – Lakhani Aliraza May 02 '20 at 03:01
  • Hey @Raj, thanks a lot for your help. At least I'm done with prototyping; I now have two problems to work on: 1) low accuracy, 2) noise added in the sub-audio files. – Lakhani Aliraza May 02 '20 at 05:23
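
The volume-based cutting suggested in the comments could be sketched roughly as follows. This is a minimal illustration, not a tested solution for the real recordings: it assumes the audio has already been decoded to a list of float samples in [-1, 1], that the speakers strictly alternate, and that every sentence is followed by a pause. The function names `find_segments` and `label_turns`, the RMS threshold, and the minimum silence length are all made-up values for illustration and would need tuning. Decoding the mp3 and writing the clips back out are left out.

```python
import math

def find_segments(samples, rate, frame_ms=20, threshold=0.02, min_silence_ms=300):
    """Return (start, end) sample indices of the non-silent stretches.

    A frame is treated as silent when its RMS falls below `threshold`
    (an assumed value); a stretch is closed once at least
    `min_silence_ms` of consecutive silent frames follow it.
    """
    frame = max(1, rate * frame_ms // 1000)
    min_silent_frames = max(1, min_silence_ms // frame_ms)
    segments = []
    start = None       # first sample of the current stretch, or None
    voiced_end = 0     # end of the last non-silent frame seen
    silent_run = 0     # consecutive silent frames inside a stretch
    for pos in range(0, len(samples), frame):
        chunk = samples[pos:pos + frame]
        rms = math.sqrt(sum(x * x for x in chunk) / len(chunk))
        if rms >= threshold:
            if start is None:
                start = pos
            voiced_end = min(pos + frame, len(samples))
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_silent_frames:
                segments.append((start, voiced_end))
                start = None
    if start is not None:
        # The recording ended before a long enough pause was seen.
        segments.append((start, voiced_end))
    return segments

def label_turns(segments, speakers=("A", "B")):
    """Name the segments A_01, B_01, A_02, ... assuming strict turn-taking."""
    names = []
    for i in range(len(segments)):
        speaker = speakers[i % len(speakers)]
        turn = i // len(speakers) + 1
        names.append(f"{speaker}_{turn:02d}.mp3")
    return names

# Synthetic stand-in for the recording: four 0.5 s tone bursts, each
# followed by 0.5 s of silence.
rate = 8000
tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) for n in range(rate // 2)]
gap = [0.0] * (rate // 2)
signal = (tone + gap) * 4

segments = find_segments(signal, rate)
print(label_turns(segments))  # ['A_01.mp3', 'B_01.mp3', 'A_02.mp3', 'B_02.mp3']
```

The alternating labels only hold if nobody pauses mid-sentence for longer than the silence limit; when that assumption breaks, the clips would have to be assigned to speakers some other way, for example by the embedding clustering mentioned above.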
