I think you might try looking into "granular synthesis" for some basic concepts.
If you can break the core sound into a "granule" and place copies of it end-to-end (probably overlapping them a bit and interpolating across the overlap to smooth the joins), this might be "good enough" for what you are trying to do. Doing this yourself would probably require knowing enough to edit the PCM data of your audio files directly.
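To make the overlap idea concrete, here is a minimal sketch, assuming the grain has already been decoded to mono float samples in the range -1 to 1. The class and method names (GrainTiler, tile) are made up for illustration, and the linear crossfade is just one simple choice of interpolation, not necessarily the best one.

```java
// Sketch only: tiles one grain end-to-end with a linear crossfade at each join.
// GrainTiler and tile are hypothetical names, not part of any library.
final class GrainTiler {

    /**
     * Repeats the grain `repeats` times. Consecutive copies overlap by
     * `overlap` samples; the outgoing copy fades out while the incoming
     * copy fades in, and the two gains always sum to 1.
     */
    static float[] tile(float[] grain, int repeats, int overlap) {
        if (repeats < 1) {
            throw new IllegalArgumentException("repeats must be >= 1");
        }
        if (overlap < 0 || overlap * 2 > grain.length) {
            throw new IllegalArgumentException("overlap must be between 0 and half the grain length");
        }
        int hop = grain.length - overlap;               // distance between grain starts
        float[] out = new float[hop * (repeats - 1) + grain.length];
        for (int r = 0; r < repeats; r++) {
            int start = r * hop;
            for (int i = 0; i < grain.length; i++) {
                float gain = 1f;
                if (r > 0 && i < overlap) {
                    // fade in at the head of every copy except the first
                    gain = (i + 1) / (float) (overlap + 1);
                } else if (r < repeats - 1 && i >= grain.length - overlap) {
                    // fade out at the tail of every copy except the last
                    gain = (grain.length - i) / (float) (overlap + 1);
                }
                out[start + i] += grain[i] * gain;      // overlapping regions are summed
            }
        }
        return out;
    }
}
```

Calling GrainTiler.tile(grain, 50, 200) would stretch one grain into 50 overlapped copies. How well the joins hide depends a lot on where the grain is cut, so expect to experiment with the grain boundaries and the overlap length.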
Looping a Java "Clip" will probably not work, as there will most likely be either clicks or silences at the joins. However, I wrote a clip looper that allows overlaps, as part of a simplistic mixer I'm working on. You are welcome to try using that: http://www.java-gaming.org/topics/simple-audio-mixer-2nd-pass/27943/view.html The key tool that would help is the PFClipLooper. The PFClipShooters could also work nicely for the short sounds, because they allow playback at different pitches, which permits some crude approximations of inflection.
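For a rough idea of what playing a clip at a different pitch involves, here is a sketch that reads through the samples at a variable rate with linear interpolation. This is not the code from my mixer; the names (PitchSketch, resample) are hypothetical, and it assumes mono float samples.

```java
// Sketch only: variable-rate playback with linear interpolation.
// PitchSketch and resample are hypothetical names.
final class PitchSketch {

    /**
     * Steps through `src` in increments of `rate` samples. A rate of 2.0
     * plays an octave up at half the duration; 0.5 plays an octave down
     * at twice the duration.
     */
    static float[] resample(float[] src, double rate) {
        if (rate <= 0) {
            throw new IllegalArgumentException("rate must be positive");
        }
        if (src.length == 0) {
            return new float[0];
        }
        int outLen = (int) Math.floor((src.length - 1) / rate) + 1;
        float[] out = new float[outLen];
        for (int n = 0; n < outLen; n++) {
            double pos = n * rate;                           // fractional read position
            int i = Math.min((int) pos, src.length - 1);     // sample at or before pos
            double frac = pos - i;
            float a = src[i];
            float b = (i + 1 < src.length) ? src[i + 1] : src[i];
            out[n] = (float) (a + (b - a) * frac);           // blend the two neighbours
        }
        return out;
    }
}
```

Changing the rate this way shifts pitch and duration together, which is why it only gives a crude approximation of inflection.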
Are you working with Audacity or another DAW? You might be able to take your sound and edit it into something very short that can be looped there, to make a long "uuuuuu". Alternatively, given long vowel recordings, you could have the playback of that vowel check a boolean that is set by the keystrokes. (Are you familiar with Java's playback code?) You'd have to write a clean volume taper for when the sound is stopped, so that it does not cut off with a click. Also, this approach would only work up to the length of the recorded vowel.
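Here is a rough sketch of the boolean-plus-taper idea, assuming the vowel recording is already decoded to mono float samples and that some audio thread repeatedly asks for the next block of output. All the names (HeldVowelVoice, setKeyDown, render) are hypothetical, and writing the rendered blocks to an actual output line is left out.

```java
// Sketch only: plays a long vowel while a key is held, then tapers the
// volume to zero on release. All names here are hypothetical.
final class HeldVowelVoice {
    private final float[] vowel;        // decoded mono samples of the recording
    private final int taperSamples;     // length of the release fade, in samples
    private volatile boolean keyDown;   // set by the keystroke handler
    private boolean wasDown;            // the fields below are used by the audio thread only
    private boolean active;
    private int pos;
    private int taperLeft;

    HeldVowelVoice(float[] vowel, int taperSamples) {
        if (taperSamples < 0) {
            throw new IllegalArgumentException("taperSamples must be >= 0");
        }
        this.vowel = vowel;
        this.taperSamples = taperSamples;
    }

    /** Call from the key-pressed and key-released handlers. */
    void setKeyDown(boolean down) {
        keyDown = down;
    }

    /** Fills `buffer` with the next block; returns false once the voice is silent. */
    boolean render(float[] buffer) {
        for (int n = 0; n < buffer.length; n++) {
            boolean down = keyDown;
            if (down && !wasDown) {             // new key press: restart the vowel
                pos = 0;
                taperLeft = taperSamples;
                active = true;
            }
            wasDown = down;
            float gain = 1f;
            if (active && !down) {              // key released: linear volume taper
                if (taperLeft == 0) {
                    active = false;
                } else {
                    gain = taperLeft-- / (float) (taperSamples + 1);
                }
            }
            if (active && pos >= vowel.length) {
                active = false;                 // ran out of recorded vowel
            }
            buffer[n] = active ? vowel[pos++] * gain : 0f;
        }
        return active;
    }
}
```

At 44100 samples per second, a taper of a few hundred samples is only a few milliseconds long, which may or may not be enough to hide the click; that is something to tune by ear. As noted, the voice simply goes silent if the key is held longer than the recording.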
Overall, it would take a bit of work, that's for sure. If you are looking for an existing library or an "easy way", I don't know of one. I'm also not familiar with all the ins and outs of Processing or how easy it is to use Java libraries from it. Can it tap into a tool like libpd (Pure Data) or csgrain or another synth tool? Something like that would be a lot more polished than the raw data manipulation I am brainstorming. But maybe my suggestions would work. I wonder if my library can be called from Processing? I have never tested that. The source code is included, so you can look at the logic there for looping clips and for playing them back at different speeds.