Recently I had to do same: synthesize recordings at specific positions in space with a moving sound source. I was able to simulate the system in Python using NumPy (and a tiny bit of SciPy which can easily be done in NumPy if you wish).
In my case, the sound source moves with constant velocity in a straight line. The basic concept of the simulation is that for each sample of the original recording, I calculate how much time that sample / part of the sound wave would take to reach each of the microphones. Taking into account both the speed of sound and velocity of the sound source, this should account for both the Doppler effect and the delay in time for the sound reaching each microphone.
When we have arrays describing the arrival times for each sample of the original recording at each of the synthetic microphone locations, we must resample the original recording at a fixed sample rate, which we can do with NumPy's interp
function.
The code looks something like this:
# Read the original recording and create an array with the time of each sample
sample, sample_rate, sample_duration = read_wav(my_file)
sample_length = int(sample_duration * sample_rate)
sample_time = np.linspace(sample_start_secs, sample_end_secs, sample_length)
# Configure our soundscape
speed_of_sound = 343
source_start = np.asarray([1, 0.5, 0])
# We could model this with an acceleration if we wanted to
velocity = np.broadcast_to([[-3, -4, 0]], (sample_length, 3))
position = source_start + velocity * (sample_time - sample_time[0])[:, np.newaxis]
mic_locs = np.asarray([
[0, 3, 0],
[3, 0, 0],
[0, -3, 0],
[-3, 0, 0]
])
num_mics = mic_locs.shape[0]
# Calculate the arrival times for each microphone
unit_vectors = mic_locs - position[:, np.newaxis, :]
unit_vectors /= np.linalg.norm(unit_vectors, axis=2, keepdims=True)
# This einsum is basically a dot product between the velocity and the unit vectors
# for each point in time
approach_speeds = np.einsum('ij,ikj->ik', velocity, unit_vectors)
# cdist is from scipy.spatial.distance, but could easily be implemented in NumPy
distances = cdist(position, mic_locs)
arrival_times = sample_time[:, np.newaxis] + distances / (approach_speeds + speed_of_sound)
# Resample the original recording to get the synthetic recordings
recordings_start = arrival_times[0, :].max()
recordings_end = arrival_times[-1, :].min()
recording_length = recordings_end - recordings_start
recording_samples = int(recording_length * sample_rate)
recordings = np.zeros((num_mics, recording_samples))
recordings_time = np.linspace(recordings_start, recordings_end, recording_samples)
for i in range(num_mics):
recordings[i, :] = np.interp(recordings_time, arrival_times[:, i], sample, left=0, right=0)