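   function [ samples,y, energies] = energy( speech, fs )
   window_ms = 200;     % window length in milliseconds
   threshold = 0.75;    % energy threshold

   window = window_ms*fs/1000;   % window length in samples
   % keep only as many samples (from column 1) as fill whole windows
   speech = speech(1:(length(speech) - mod(length(speech),window)),1);
   % one column per window
   samples = reshape(speech,window,length(speech)/window);
   energies = sqrt(sum(samples.*samples))';   % one value per window

   vuv = energies > threshold;   % logical vector: true where the value exceeds the threshold
   y=vuv;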

I have this MATLAB code and I need to rewrite it in C#. However, I couldn't understand the last part of the code. I also think speech corresponds to a data List or array, judging by the first part of the code. If it does not, can someone please explain what this code is doing? I just want to know the logic. fs = 1600 or 3200.

Amro
Blast

3 Answers


The code takes an array representing a signal. It then breaks it into pieces according to a window of specified length, computes the energy in each segment, and finds out which segments have energy above a certain threshold.

Let's go over the code in more detail:

speech = speech(1:(length(speech) - mod(length(speech),window)),1);

The above line makes sure that the input signal's length is a multiple of the window length. So if speech were an array of 11 values and the window length were 5, the code would keep only the first 10 values (from 1 to 5*2), removing the one remaining value.
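As a rough illustration of that trimming step outside MATLAB, here is a minimal Python sketch. The function name `trim_to_window` is made up for this example, and note that Python (like C#) indexes from 0 while MATLAB indexes from 1.

```python
def trim_to_window(speech, window):
    """Drop trailing samples so that len(result) is a multiple of window."""
    # Same arithmetic as length(speech) - mod(length(speech), window)
    usable = len(speech) - (len(speech) % window)
    return speech[:usable]


speech = list(range(1, 12))         # 11 values: 1..11
print(trim_to_window(speech, 5))    # keeps only the first 10 values
```

In C#, the same idea is a `GetRange(0, usable)` call on a `List<T>`.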

The next line is:

samples = reshape(speech,window,length(speech)/window);

Perhaps it is best explained with a quick example:

>> x = 1:20;
>> reshape(x,4,[])
ans =
     1     5     9    13    17
     2     6    10    14    18
     3     7    11    15    19
     4     8    12    16    20

So it reshapes the array into a matrix of "k" rows (k being the window length), and as many columns as needed to hold the whole array. The first "k" values form the first segment, the next "k" values form the second segment, and so on.
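Without a matrix type at hand, the same segmentation can be sketched with plain lists. This is only an illustration: `split_into_segments` is a hypothetical helper, and each inner list plays the role of one column of the MATLAB matrix.

```python
def split_into_segments(speech, window):
    """Return consecutive, non-overlapping segments of `window` samples each.

    Assumes len(speech) is already a multiple of window.
    """
    return [speech[i:i + window] for i in range(0, len(speech), window)]


x = list(range(1, 21))              # 1..20, as in the MATLAB example
for segment in split_into_segments(x, 4):
    print(segment)                  # [1, 2, 3, 4], then [5, 6, 7, 8], ...
```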

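Finally, the next line computes the signal energy of each segment in a vectorized manner: `sum` adds up the squared samples of each column, and the trailing `'` transposes the result into a column vector. Strictly speaking, because of the `sqrt`, each value is the square root of the segment's energy, i.e. its Euclidean norm.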

energies = sqrt(sum(samples.*samples))';
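
Putting the steps together, including the final comparison `vuv = energies > threshold` from the question, a loop-based version might look like the following Python sketch. The function names and the sample values are invented for illustration; only the window logic and the 0.75 threshold come from the original code.

```python
import math


def segment_energies(speech, window):
    """Square root of the sum of squares for each full window of samples."""
    usable = len(speech) - (len(speech) % window)   # drop the incomplete tail
    energies = []
    for start in range(0, usable, window):
        segment = speech[start:start + window]
        energies.append(math.sqrt(sum(s * s for s in segment)))
    return energies


def above_threshold(energies, threshold=0.75):
    """Equivalent of `vuv = energies > threshold`: one boolean per segment."""
    return [e > threshold for e in energies]


signal = [0.0, 0.1, 0.0, 0.1, 0.6, 0.8, 0.6, 0.8, 0.3]   # made-up samples
energies = segment_energies(signal, 4)
print(energies)
print(above_threshold(energies))    # [False, True]
```

The last sample (0.3) does not fill a whole window of 4, so it is ignored, just as in the MATLAB code.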
Amro
  • Thanks a lot Amro, but I have two more questions. x is one segment of the array, isn't it? And what does the `[]` mean? I know it is equal to the speech length divided by the window, but what exactly is it doing? – Blast Sep 15 '13 at 20:23
  • That was just an example to show you how the array is reshaped into a matrix, with each 4 values in a column; `x` is supposed to be `speech`. When using `reshape`, if you specify `[]` as one of the dimensions, you are telling MATLAB to compute it automatically (after all, the number of elements will not change when reshaping). – Amro Sep 16 '13 at 03:20
  • I have read the bytes from the file and stored them in a list, then I used `list.GetRange(0,(list.Count - (list.Count % window)));` in C#. Now I am going to determine the `samples` using a method that is equivalent to reshape. I hope I can handle this. :) Thank you so much for your answer once again. – Blast Sep 16 '13 at 07:32
1
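List<int> speech = new List<int>();   // the signal samples

int window = 320;   // window length in samples, e.g. 200 ms at fs = 1600; must be non-zero

int length = speech.Count;

int result = length % window;   // leftover samples that do not fill a whole window

int r = length - result;

// speech = speech(1:r, 1): keep only the first r samples
speech = speech.GetRange(0, r);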
Sam Leach
  • Thank you for your reply. It looks correct, but why did you write //speech = speech(1:r,1)? What does it mean? – Blast Sep 13 '13 at 12:29
  • To run this code I have to open a .wav file in MATLAB. Using `[s,fs]=audioread('1_h_1.wav');` I open the 1_h_1.wav file. Then I have two outputs, s and fs. fs takes the value 1600, and when I plot s I see a waveform. – Blast Sep 14 '13 at 07:26
0

This:

(length(speech) - mod(length(speech),window))

is a formula:

([length of speech] - [remainder of (length of speech / window)])

so try

(length(speech) - (length(speech) % window))

In C#, % is the operator equivalent to mod(..) (for non-negative operands such as lengths).

EDIT: I should say that I assume that is what mod(..) is in your code :)

iabbott
  • Out of context I don't know. It looks like `speech` is a method, and those are the arguments being passed to it... based on your comment under the question, it is assigning one of the list of integers to a variable – iabbott Sep 13 '13 at 12:06