
I'm trying to create an Android app that reads the lyrics of an MP3 from its ID3v2 tag. My question is: is it possible to have the lyrics highlighted automatically as the song plays, for example by using speech processing or something similar? I've looked at previous similar questions, but all of them require manual input. I'd appreciate feedback as soon as possible. Thank you.

2 Answers


This kind of thing is possible on Hollywood movie sets, using technology similar to those image enhancements that reconstruct a face using a 4-pixel square as input.

Okay, so your request is theoretically more feasible, but no current phone technology I know of could do this on the fly. You might need a DeLorean, a flux capacitor, and some plutonium.

Also, recognizing sung vocals over music is a much harder problem than recognizing a text message spoken into your phone:

  1. Sung lyrics do not usually follow natural speech rhythm;
  2. The frequency spectrum of music tends to conflict with the frequency spectrum of voice;
  3. The voice varies in pitch, making it much harder to isolate and detect phonetic features;
  4. The vocals are often mixed at a level equal to all other musical instruments;
  5. IwannahuhIwannahuhIwannahuhIwannahuhIwannaReallireallirealliwannaZigaZiggUHH.
paddy
  • I'd love to see a speech-to-text engine do #5 – siebz0r Aug 13 '12 at 03:09
  • The [ID3V2 spec](http://www.id3.org/id3v2.3.0) referenced by the poster does allow sync'ed lyrics embedding in the MP3. (Not sure if that falls in the "speech processing or things like that" category, though.) – martijno Aug 13 '12 at 15:25
  • @paddy How about feeding the program with the actual lyrics of the song and then expecting the program to synchronize the lyrics with the music? Is there a library that you know of, available for doing that? – Kevin May 26 '14 at 10:02
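
If the MP3 already carries synchronized lyrics (the sync'ed lyrics embedding mentioned in the comment above), highlighting needs no speech processing at all: it comes down to looking up which timestamped line covers the current playback position. Below is a minimal sketch of that lookup. It assumes the timestamps have already been parsed out of the tag into milliseconds and sorted in ascending order; `LyricHighlighter`, `indexAt`, and `lineAt` are hypothetical names made up for this illustration, not part of any ID3 or Android library.

```java
import java.util.Arrays;

/** Hypothetical helper: picks which synced-lyrics line to highlight at a playback position. */
class LyricHighlighter {
    private final long[] startMs;  // start time of each line in milliseconds, strictly ascending (assumed)
    private final String[] lines;  // lyric text, same order as startMs

    LyricHighlighter(long[] startMs, String[] lines) {
        if (startMs.length != lines.length) {
            throw new IllegalArgumentException("one timestamp per line is required");
        }
        this.startMs = startMs.clone();
        this.lines = lines.clone();
    }

    /** Index of the line active at positionMs, or -1 if the first line has not started yet. */
    int indexAt(long positionMs) {
        int i = Arrays.binarySearch(startMs, positionMs);
        // Exact hit: that line just started. Otherwise binarySearch returns
        // -(insertionPoint) - 1, and the active line is the one before the insertion point.
        return i >= 0 ? i : -i - 2;
    }

    /** Text of the active line, or an empty string before the first line. */
    String lineAt(long positionMs) {
        int i = indexAt(positionMs);
        return i < 0 ? "" : lines[i];
    }
}
```

On Android, the position could be polled from `MediaPlayer.getCurrentPosition()` (which reports milliseconds) a few times per second, and the returned index used to restyle the matching line in the view.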

You might take a look at the paper "LyricSynchronizer: Automatic Synchronization System Between Musical Audio Signals and Lyrics" for a possible solution. Nothing is implemented in Java for Android, but with the NDK you might take any C code and finagle it to work. ;-)

This paper describes a system that can automatically synchronize polyphonic musical audio signals with their corresponding lyrics. Although methods for synchronizing monophonic speech signals and corresponding text transcriptions by using Viterbi alignment techniques have been proposed, these methods cannot be applied to vocals in CD recordings because vocals are often overlapped by accompaniment sounds. In addition to a conventional method for reducing the influence of the accompaniment sounds, we therefore developed four methods to overcome this problem: a method for detecting vocal sections, a method for constructing robust phoneme networks, a method for detecting fricative sounds, and a method for adapting a speech-recognizer phone model to segregated vocal signals. We then report experimental results for each of these methods and also describe our music playback interface that utilizes our system for synchronizing music and lyrics.
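
To give a feel for the Viterbi alignment step the abstract mentions, here is a toy sketch. It is not the paper's system: it leaves out vocal detection, accompaniment reduction, and the phone model entirely, and simply assumes some acoustic model has already produced a score `logLik[t][s]` saying how well audio frame `t` matches the `s`-th unit (phoneme, word, or line) of the known lyrics. `ForcedAligner` and `align` are hypothetical names for this illustration.

```java
import java.util.Arrays;

/** Toy forced alignment: assigns each audio frame to one unit of a known lyric sequence. */
class ForcedAligner {
    /**
     * @param logLik logLik[t][s] = log-likelihood that frame t matches the s-th lyric unit
     *               (assumed non-empty and rectangular, with at least as many frames as units)
     * @return for each frame, the index of the unit it is aligned to; the path starts at
     *         unit 0, ends at the last unit, and never moves backwards or skips a unit
     */
    static int[] align(double[][] logLik) {
        int frames = logLik.length;
        int units = logLik[0].length;
        if (frames < units) {
            throw new IllegalArgumentException("need at least one frame per lyric unit");
        }
        double[][] best = new double[frames][units];
        boolean[][] advanced = new boolean[frames][units]; // true if best path entered (t, s) from s - 1
        for (double[] row : best) {
            Arrays.fill(row, Double.NEGATIVE_INFINITY);
        }
        best[0][0] = logLik[0][0]; // the song must start on the first unit

        for (int t = 1; t < frames; t++) {
            for (int s = 0; s < units; s++) {
                double stay = best[t - 1][s];
                double move = s > 0 ? best[t - 1][s - 1] : Double.NEGATIVE_INFINITY;
                advanced[t][s] = move > stay;
                best[t][s] = Math.max(stay, move) + logLik[t][s];
            }
        }

        // Backtrack from the last unit at the last frame.
        int[] path = new int[frames];
        int s = units - 1;
        for (int t = frames - 1; t >= 0; t--) {
            path[t] = s;
            if (advanced[t][s]) {
                s--;
            }
        }
        return path;
    }
}
```

The first frame at which each unit index appears in the returned path gives that unit's start time (frame index times frame duration), which is exactly the timestamp a highlighter needs. The hard part, as the abstract makes clear, is getting usable scores out of vocals mixed with accompaniment.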

Best of luck in your implementation!

Glorfindel
Norman H