I have spectrograms which I acquired without the original sound files. They are greyscale images in which the x axis represents time, the y axis represents frequency, and each pixel value represents volume (or so I believe).

I am pretty certain the files are those of a few songs and I need to be able to identify which songs those are. There are many files like these, so I need to be able to convert them in bulk.

Is there a way to convert them back to an mp3? How would this be done? I understand that the result won't contain all the original information, but for my purposes any conversion will do.

Gilthans

1 Answer

The answer is: it depends on your needs and resources. It is possible, but you may not be satisfied with the result. I understand that you have the spectrograms as image files. Ideally you would have separate real and imaginary spectra; otherwise all the phase information is missing. The recording should still be intelligible, though. A linear frequency scale is preferable. Another problem is resolution.
For audible data you need at least 4k samples/s, so each second of your recording should span at least 4000/Fpx pixels along the time axis, where Fpx is the number of pixels along the frequency axis. Assuming Fpx is 400, each second of your recording should be at least 10 px wide. For hi-fi it's about 10 times more.
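
The resolution estimate above is simple arithmetic, so here is a minimal sketch of it. The function name is made up for illustration, and the 40000 samples/s figure is just "10 times more" than the 4k baseline, not a quoted standard.

```python
def min_columns_per_second(sample_rate_hz, freq_pixels):
    """Minimum image width (in pixels) per second of audio.

    Follows the rule of thumb above: every pixel carries one value, so an
    image needs at least sample_rate_hz values per second, spread over
    freq_pixels rows.
    """
    return sample_rate_hz / freq_pixels


print(min_columns_per_second(4000, 400))   # 10.0  (barely audible)
print(min_columns_per_second(40000, 400))  # 100.0 (about hi-fi)
```

If your images are narrower than this per second of audio, expect the result to sound correspondingly band-limited or smeared.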

I doubt that the amplitude information, mapped to RGB (or black and white), is reliable. You will probably get only a few bits per sample, whereas 'nice' quality starts at 12 bits per sample.
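
One rough way to see how many bits per sample an image can carry at most is to count the distinct grey levels it actually uses. This is only an upper bound and a sketch: `effective_bits` is a hypothetical helper, and `pixels` is assumed to be a list of rows of integer grey values that you have already loaded with an image library of your choice.

```python
import math


def effective_bits(pixels):
    """Upper bound on bits per sample: log2 of the distinct grey levels used."""
    distinct = {value for row in pixels for value in row}
    return math.log2(len(distinct))


# A tiny made-up image that uses only four grey levels.
print(effective_bits([[0, 64], [128, 255]]))  # 2.0
```

An 8-bit greyscale file can never exceed 8 bits by this measure, which is below the 12 bits mentioned above.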

  • I understand the quality will not match the original; I only need to be able to identify the song from which the spectrogram was extracted, so low quality should be good enough. Is there a way to circumvent the phase information problem? – Gilthans Apr 20 '16 at 18:57
  • So, yes, it's possible. It is much easier if the frequency axis of these spectrograms is linear; a logarithmic scale would be harder. Another thing: is the applied color map easy to revert to amplitudes? If it's greyscale, it's easy. But if some sophisticated color map has been used, it may be a problem. What is the file format of these images? – Mikaelblomkvistsson Apr 20 '16 at 22:26
  • It is greyscale. Pretty sure it is linear too. – Gilthans Apr 21 '16 at 04:52
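
For a linear, greyscale spectrogram like the one described here, one common way around the missing phase is the Griffin-Lim iteration. It is not named anywhere in this thread, so treat the following as a sketch under assumptions rather than the method anyone above had in mind. It assumes NumPy is installed, that pixel value maps linearly to magnitude (pass `db_range` if you suspect a decibel scale), that the top image row is the highest frequency, and that you guess `n_fft`, `hop` and the sample rate yourself, since the images do not record them. All function names are made up for illustration.

```python
import wave

import numpy as np


def image_to_magnitude(pixels, db_range=None):
    """Map 8-bit grey values to magnitudes, with row 0 becoming the lowest frequency."""
    levels = np.asarray(pixels, dtype=float)[::-1] / 255.0  # flip: top row = highest freq
    if db_range is None:
        return levels  # assumption: linear amplitude
    return 10.0 ** ((levels - 1.0) * db_range / 20.0)  # assumption: white = 0 dB


def stft(x, n_fft, hop, window):
    """One-sided FFT of each windowed frame; shape (n_fft // 2 + 1, n_frames)."""
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1).T


def istft(spec, n_fft, hop, window):
    """Weighted overlap-add inverse of stft()."""
    n_frames = spec.shape[1]
    length = n_fft + hop * (n_frames - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1)
    for i in range(n_frames):
        start = i * hop
        out[start:start + n_fft] += frames[i] * window
        norm[start:start + n_fft] += window ** 2
    # Floor the normaliser so the weakly covered edges fade out instead of blowing up.
    return out / np.maximum(norm, 0.1 * norm.max())


def griffin_lim(magnitude, n_fft, hop, n_iter=50, seed=0):
    """Estimate a signal whose STFT magnitude approximates `magnitude`."""
    rng = np.random.default_rng(seed)
    window = np.hanning(n_fft)
    phase = np.exp(2j * np.pi * rng.random(magnitude.shape))  # random starting phase
    for _ in range(n_iter):
        signal = istft(magnitude * phase, n_fft, hop, window)
        phase = np.exp(1j * np.angle(stft(signal, n_fft, hop, window)))
    return istft(magnitude * phase, n_fft, hop, window)


def write_wav(path, signal, sample_rate):
    """Save a mono 16-bit WAV file, normalised to full scale."""
    peak = float(np.max(np.abs(signal))) or 1.0
    pcm = (signal / peak * 32767).astype("<i2")
    with wave.open(path, "wb") as f:
        f.setnchannels(1)
        f.setsampwidth(2)
        f.setframerate(int(sample_rate))
        f.writeframes(pcm.tobytes())
```

For an image with `rows` pixel rows, `n_fft = 2 * (rows - 1)` makes the shapes line up, and `hop` is roughly the assumed sample rate divided by the number of pixel columns per second. Something like `write_wav("out.wav", griffin_lim(image_to_magnitude(pixels), n_fft, hop), 8000)` then gives a file you can listen to or feed to a song-recognition tool; converting WAV to mp3 needs a separate encoder. If the pitch or speed sounds wrong, the guessed sample rate or `hop` is the first thing to adjust.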