A possible approach is:
- Take a single period of the sample (identified visually). It is important that it begins and ends at a value of 0 (or very close to it) to avoid crackling noises caused by discontinuities at the endpoints.
- Upsample or downsample the period extracted in step 1 as necessary to get the desired pitch. Upsampling decreases pitch, downsampling increases it. You can use the `resample` function, for instance.
- Concatenate many copies of that period one after the other until you reach the desired duration of 5 seconds. You can use the `repmat` function for that.
- Multiply that 5-second waveform element-by-element by a time envelope with the desired shape. The envelope will typically consist of a fast attack in the form of a linear ramp from 0 to 1, then a long constant section, and finally a decreasing ramp towards 0.
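
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample rate, the period length, the pitch ratio and the ramp durations are made-up values, and a synthetic waveform stands in for the period you would cut from your own recording. `resample` requires the Signal Processing Toolbox.

```matlab
fs  = 44100;                 % sample rate in Hz (assumed)
dur = 5;                     % target duration in seconds

% Step 1: one period of the sample. A synthetic stand-in is used here;
% in practice, cut it from the recording so it starts and ends near zero.
N0 = 300;                    % samples per period (assumed), i.e. 147 Hz
n  = (0:N0-1)';
period = sin(2*pi*n/N0) + 0.3*sin(4*pi*n/N0);

% Step 2: change the pitch. resample(x, p, q) returns about numel(x)*p/q
% samples, so p/q < 1 (downsampling) shortens the period and raises the pitch.
p = 2; q = 3;                % assumed ratio: pitch goes up by a factor of 1.5
period2 = resample(period, p, q);

% Step 3: repeat the period until the duration is covered, then trim
nTotal = dur*fs;
nReps  = ceil(nTotal/numel(period2));
y = repmat(period2, nReps, 1);
y = y(1:nTotal);

% Step 4: envelope with a linear attack, a constant section and a linear release
nAtt = round(0.02*fs);       % 20 ms attack (assumed)
nRel = round(0.5*fs);        % 0.5 s release (assumed)
env  = [linspace(0, 1, nAtt)'; ones(nTotal-nAtt-nRel, 1); linspace(1, 0, nRel)'];
y = y .* env;

y = y / max(abs(y));         % normalize to avoid clipping on playback
% sound(y, fs)               % uncomment to listen
```

Note that `resample` filters the signal as if it were zero outside the given samples, so the ends of a single resampled period may be slightly distorted. Resampling after step 3 instead of before it is one way to reduce that effect.
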
For increased realism you could introduce slow amplitude variations in the "constant" part of the envelope (tremolo effect). You could also extract in step 1 a piece of signal containing not one but several periods of the waveform. Those periods will not be exactly the same, and that will add "warmth" to the sound.
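
As a minimal sketch of the tremolo idea, the envelope can be multiplied by a slow oscillation before it is applied to the waveform. The tremolo rate and depth below are assumed values, and a flat envelope is used as a placeholder for the attack/sustain/release envelope from step 4.

```matlab
fs  = 44100;                 % sample rate in Hz (assumed)
dur = 5;                     % duration in seconds
t   = (0:dur*fs-1)'/fs;      % time axis as a column vector

rate  = 5;                   % tremolo rate in Hz (assumed)
depth = 0.1;                 % amplitude dips by up to 10 percent (assumed)

% Slow gain oscillating between 1-depth and 1, starting at 1
trem = 1 - depth*(1 - cos(2*pi*rate*t))/2;

env = ones(size(t));         % placeholder for the envelope from step 4
envTrem = env .* trem;       % use envTrem in place of env in step 4
```

Since the gain starts at 1 and never exceeds 1, it does not introduce clipping or a jump at the start of the sound.
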