-1

I have a not-so-simple question about Java Sound (the javax.sound package).

I am implementing an MP3 player with cross-fade and smooth volume and seek controls.

I am reading the sound as a stream in 4096-byte chunks and calculating the position in milliseconds manually.

When I seek() (change the base position from which the stream is read), I hear a really ugly "jump" in the sound wave. I examined JLayer and other MP3 APIs, but they either have no seek() function at all or have this "ugly sound jump" too.

My question is: how can I make the jump from one chunk of the sound wave to the next smoother? I tried interpolation, but the transition has to last about 300 ms before the jump stops being audible, and that is too long for a seek() function.

Have you encountered this problem?

Do you know the solution?

I will paste a code sample here just to be sure.

public void seek( long pPosition )
{
  sourceDataLine.flush();

  seekIndex = ( sourceDataLine.getMicrosecondPosition() / 1000 ) - currentPositionInMilliseconds;

}

public long getPositionInMilliseconds()
{ return ( sourceDataLine.getMicrosecondPosition() / 1000 ) - seekIndex; }

The "position in milliseconds" is needed because of the DataLine API of javax.sound.

Thanks, I'm frustrated...

Danubian Sailor
Jan Cajthaml

2 Answers

0

You can't really create a smooth transition if the chunks you want to transition are too short for cross-fading, but you can eliminate the worst of the artifacts from the boundaries.

The bad artifact I'm referring to often sounds like a click or pop, but if there are many in short succession it might sound like thrashing, or it may even introduce a specific pitch of its own if the intervals are regular. This kind of artifact is a result of creating arbitrary blocks of audio: the amplitude of the audio at the boundaries may jump from one block to the next, or from the end of a block to silence. There are a few ways to eliminate it, the most common of which is to move the boundary from the arbitrary location to the nearest 'zero crossing' so that there is no longer a jump or discontinuity. Alternatively, since your blocks are right on top of each other, you could look for a place where the values of the two blocks cross each other, preferably while moving in the same direction.
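
As a rough illustration of the zero-crossing idea, here is a minimal sketch that searches outward from a requested cut point for the closest sign change. It assumes the audio has already been decoded to mono 16-bit samples in a `short[]`; the class name `ZeroCrossing`, the method names, and the `maxDistance` limit are hypothetical, not part of any Java Sound or JLayer API.

```java
/** Hypothetical helper: moves a cut point to the nearest zero crossing. */
final class ZeroCrossing {

    /**
     * Returns the index closest to start where the signal changes sign
     * (or is exactly zero), searching at most maxDistance samples in
     * either direction. Falls back to start if nothing is found.
     */
    static int nearest(short[] samples, int start, int maxDistance) {
        for (int d = 0; d <= maxDistance; d++) {
            int right = start + d;
            if (isCrossing(samples, right)) {
                return right;
            }
            int left = start - d;
            if (isCrossing(samples, left)) {
                return left;
            }
        }
        return start; // no crossing nearby, keep the original boundary
    }

    /** True if the sign differs between samples[i - 1] and samples[i], or samples[i] is zero. */
    static boolean isCrossing(short[] s, int i) {
        if (i < 1 || i >= s.length) {
            return false; // out of range, nothing to compare
        }
        return s[i] == 0 || (s[i - 1] < 0) != (s[i] < 0);
    }
}
```

For stereo data the same check would have to run per channel, and the two channels will rarely cross zero at the same frame, so this is only a starting point.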

Bjorn Roche
  • I tried that. I tried overlapping and "averaging" the transition between chunks, and I even tried leveling the "pitch" jump with master gain modulation, but the core of the problem is that the DataLine is an InputStream pipe (that's how Java made it, and since the 3 native libraries in JDX13 are signed, the JVM would not let me access the sound card any other way). So I cannot interpolate, because "I don't know what will come next": it's a stream. And if I "read that, do the interpolation and then drain the source", it takes time (about 10 ms), which causes really heavy sound jumps when fast-forwarding... – Jan Cajthaml Jul 14 '13 at 20:44
  • And that is also why I don't "outsource" the seek function to a native function written in C: the JVM won't let me access these addresses for safety reasons. – Jan Cajthaml Jul 14 '13 at 20:45
  • Maybe there is some noise modulation I don't know about that would cancel out waves that are "too jumpy", or something like that. Something smart that would keep the API itself lightweight. I know it could be done in a heavyweight way, but that's not the solution I seek... – Jan Cajthaml Jul 14 '13 at 21:01
  • I am not suggesting interpolation: I am suggesting you redefine the boundaries of the block you play back based on their amplitude. – Bjorn Roche Jul 16 '13 at 14:25
  • what are the methods to search for the zero crossing? – nutella_eater Oct 03 '20 at 22:11
  • @nutella_eater: a zero crossing is where the signal goes from positive to negative or vice versa, so to find the zero crossing, you simply need to iterate through the values — every time the sign changes, that’s a zero crossing. – Bjorn Roche Oct 04 '20 at 02:31
0

The only way I know to do this is to work directly on the data at the per-frame level. You have to "open up" the sounds to get at the bytes and do your computations on them directly. Most built-in Java controls have a granularity that is limited by the size of the buffer, i.e., they can process only one volume change, in effect, per sound data buffer.
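
To make "working at the per-frame level" concrete, here is a minimal sketch that changes the gain on every sample of a buffer instead of once per buffer. It assumes the bytes are 16-bit signed little-endian mono PCM; the class name `PerFrameGain` and the method `applyRamp` are hypothetical and only illustrate the idea.

```java
/** Hypothetical helper: per-sample gain on raw 16-bit little-endian PCM. */
final class PerFrameGain {

    /**
     * Scales the buffer in place with a gain that moves linearly from
     * startGain (first sample) to endGain (last sample).
     */
    static void applyRamp(byte[] pcm, double startGain, double endGain) {
        int frames = pcm.length / 2; // two bytes per mono 16-bit sample
        for (int i = 0; i < frames; i++) {
            // Decode: low byte is unsigned, high byte carries the sign.
            int lo = pcm[2 * i] & 0xFF;
            int hi = pcm[2 * i + 1];
            int sample = (hi << 8) | lo;

            // Gain for this particular sample.
            double t = frames > 1 ? (double) i / (frames - 1) : 1.0;
            double gain = startGain + (endGain - startGain) * t;

            // Scale and clamp to the 16-bit range to avoid wrap-around.
            int scaled = (int) Math.round(sample * gain);
            scaled = Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, scaled));

            // Encode back to little-endian bytes.
            pcm[2 * i] = (byte) scaled;
            pcm[2 * i + 1] = (byte) (scaled >> 8);
        }
    }
}
```

The processed buffer can then be written to the line as usual; other sample formats (big-endian, 8-bit, stereo) would need a different decode step.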

Even when you are working at the per-frame level, there are problems to surmount with Java's lack of real time guarantees. But they are surmountable.

I made a "clip slicer," for example, that uses the equivalent of a clip as source sound. It takes random slices of the sample and strings them together. As little as 16 frames of overlapping interpolation works to keep the sound flowing smoothly. Using 1/10th of a second slices with 16-frame overlaps worked well for making an endlessly streaming brook from a 4-second recording.
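
A short overlap of this kind could look roughly like the sketch below. It assumes mono 16-bit samples and a linear blend; the answer does not say which interpolation curve was used, so the linear weights, the class name `ShortCrossfade`, and the method `join` are assumptions for illustration.

```java
/** Hypothetical helper: joins two slices with a short linear crossfade. */
final class ShortCrossfade {

    /**
     * Returns a followed by b, where the last `overlap` samples of a are
     * blended with the first `overlap` samples of b.
     */
    static short[] join(short[] a, short[] b, int overlap) {
        if (overlap < 0 || overlap > a.length || overlap > b.length) {
            throw new IllegalArgumentException("overlap does not fit in both slices");
        }
        short[] out = new short[a.length + b.length - overlap];
        int head = a.length - overlap;

        // Untouched start of the first slice.
        System.arraycopy(a, 0, out, 0, head);

        // Blend region: the weight of b rises step by step towards 1.
        for (int i = 0; i < overlap; i++) {
            double w = (i + 1) / (double) (overlap + 1);
            out[head + i] = (short) Math.round(a[head + i] * (1.0 - w) + b[i] * w);
        }

        // Untouched remainder of the second slice.
        System.arraycopy(b, overlap, out, head + overlap, b.length - overlap);
        return out;
    }
}
```

With `overlap` set to 16, the blend covers well under a millisecond at 44.1 kHz, which is far shorter than a 300 ms fade.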

I made a Theremin that takes mouse-motion listener locations for volume and pitch. I got it to work quite smoothly with about 30 or 40 frame latency. The trick was time-stamping the mouse-motion-listener outputs, and basing the controls on the calculations made on that data, as the events do not arrive or get processed smoothly in real time, creating zippering or other discontinuities.

Another thing to consider: the range of the data does not map well to decibels. A small volume differential at the low end is therefore much more discontinuous (and prone to clicks) than the same volume interval at the high end. I solved this by making a mapping of the audio data to decibel volumes, and powering the amount of volume change based on the amplitude mapping. I hope some of these ideas prove helpful!
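
One common way to get a decibel-based volume scale is sketched below. This is not necessarily the mapping described above: it simply converts a linear control position into an amplitude factor through the standard relation gain = 10^(dB/20). The class name `DecibelVolume`, the method `sliderToGain`, and the -60 dB floor used in the usage line are assumptions.

```java
/** Hypothetical helper: maps a linear control position to an amplitude gain via decibels. */
final class DecibelVolume {

    /**
     * position is expected in [0, 1]; minDb is the (negative) level at the
     * bottom of the range. Position 0 is treated as full silence.
     */
    static double sliderToGain(double position, double minDb) {
        if (position <= 0.0) {
            return 0.0; // mute instead of staying at the dB floor
        }
        double p = Math.min(position, 1.0);
        double db = minDb * (1.0 - p);    // position 1 gives 0 dB, i.e. unity gain
        return Math.pow(10.0, db / 20.0); // decibels to linear amplitude factor
    }
}
```

Calling `DecibelVolume.sliderToGain(0.5, -60.0)` gives the gain for -30 dB, so equal steps of the control correspond to equal steps in decibels rather than in raw amplitude.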

Phil Freihofner