Vectorization of MATLAB code for faster execution

Question

My code works in the following manner:

1. First, it loads several images from the training set.

2. After loading these images, it computes the normalized faces and the mean face and performs several calculations.

3. Next, it asks for the name of an image to recognize.

4. It then projects the input image into the eigenspace and makes a decision based on the distance from the eigenfaces.

5. Using the eigen weight vector of each input image, it forms clusters with the kmeans command.

Source code I tried:

clear all
close all
clc
% number of images on your training set.
M=1200;

%Chosen std and mean. 
%It can be any number that it is close to the std and mean of most of the images.
um=60;
ustd=32;

%read images (jpg)
S=[];   %img matrix

for i=1:M
    str=strcat(int2str(i),'.jpg');   %concatenates two strings that form the name of the image
    img=imread(str);                 %direct call; eval is not needed here


    [irow icol d]=size(img); % get the number of rows (N1), columns (N2) and channels (d)
    temp=reshape(permute(img,[2,1,3]),[irow*icol,d]);     %creates a (N1*N2)xd matrix (d=1 for grayscale)
    S=[S temp];         %S is a N1*N2xM matrix after finishing the sequence
end


%Here we change the mean and std of all images. We normalize all images.
%This is done to reduce the error due to lighting conditions.
for i=1:size(S,2)
    temp=double(S(:,i));
    m=mean(temp);
    st=std(temp);
    S(:,i)=(temp-m)*ustd/st+um;
end

%show normalized images

for i=1:M
    str=strcat(int2str(i),'.jpg');
    img=reshape(S(:,i),icol,irow);
    img=img';

end


%mean image;
m=mean(S,2);   %obtains the mean of each row instead of each column
tmimg=uint8(m);   %converts to unsigned 8-bit integer. Values range from 0 to 255
img=reshape(tmimg,icol,irow);    %takes the N1*N2x1 vector and creates a N2xN1 matrix
img=img';       %creates a N1xN2 matrix by transposing the image.

% Change image for manipulation
dbx=[];   % A matrix
for i=1:M
    temp=double(S(:,i));
    dbx=[dbx temp];
end

%Covariance matrix C=A'A, L=AA'
A=dbx';
L=A*A';
% vv are the eigenvector for L
% dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx';
[vv dd]=eig(L);
% Sort and eliminate those whose eigenvalue is zero
v=[];
d=[];
for i=1:size(vv,2)
    if(dd(i,i)>1e-4)
        v=[v vv(:,i)];
        d=[d dd(i,i)];
    end
 end

 %sort,  will return an ascending sequence
 [B index]=sort(d);
 ind=zeros(size(index));
 dtemp=zeros(size(index));
 vtemp=zeros(size(v));
 len=length(index);
 for i=1:len
    dtemp(i)=B(len+1-i);
    ind(i)=len+1-index(i);
    vtemp(:,ind(i))=v(:,i);
 end
 d=dtemp;
 v=vtemp;


%Normalization of eigenvectors
 for i=1:size(v,2)       %access each column
   kk=v(:,i);
   temp=sqrt(sum(kk.^2));
   v(:,i)=v(:,i)./temp;
end

%Eigenvectors of C matrix
u=[];
for i=1:size(v,2)
    temp=sqrt(d(i));
    u=[u (dbx*v(:,i))./temp];
end

%Normalization of eigenvectors
for i=1:size(u,2)
   kk=u(:,i);
   temp=sqrt(sum(kk.^2));
    u(:,i)=u(:,i)./temp;
end


% show eigenfaces;

for i=1:size(u,2)
    img=reshape(u(:,i),icol,irow);
    img=img';
    img=histeq(img,255);

end


% Find the weight of each face in the training set.
omega = [];
for h=1:size(dbx,2)
    WW=[];    
    for i=1:size(u,2)
        t = u(:,i)';    
        WeightOfImage = dot(t,dbx(:,h)');
        WW = [WW; WeightOfImage];
    end
    omega = [omega WW];
end


% Acquire new image
% Note: the input image must have a bmp or jpg extension. 
%       It should have the same size as the ones in your training set. 
%       It should be placed on your desktop
ed_min=[];

srcFiles = dir('G:\newdatabase\*.jpg');  % the folder in which your images exist
for b = 1 : length(srcFiles)
    filename = strcat('G:\newdatabase\',srcFiles(b).name);
    Imgdata = imread(filename);

        InputImage=Imgdata;

InImage=reshape(permute((double(InputImage)),[2,1,3]),[irow*icol,1]);
temp=InImage;
me=mean(temp);
st=std(temp);
temp=(temp-me)*ustd/st+um;
NormImage = temp;
Difference = temp-m;

p = [];
aa=size(u,2);
for i = 1:aa
    pare = dot(NormImage,u(:,i));
    p = [p; pare];
end


InImWeight = [];
for i=1:size(u,2)
    t = u(:,i)';
    WeightOfInputImage = dot(t,Difference');
    InImWeight = [InImWeight; WeightOfInputImage];
end
noe=numel(InImWeight);


% Find Euclidean distance
e=[];
for i=1:size(omega,2)
    q = omega(:,i);
    DiffWeight = InImWeight-q;
    mag = norm(DiffWeight);
    e = [e mag];

end

MinimumValue=min(e);   % smallest distance to any training image
ed_min=[ed_min MinimumValue];

theta=6.0e+03;
%disp(e)

z(b,:)=InImWeight;

end


IDX = kmeans(z,5);
clustercount=accumarray(IDX, ones(size(IDX)));

disp(clustercount);

Running time for 100 images: Elapsed time is 103.947573 seconds.

QUESTIONS:

1. It works fine for M=50 (i.e. the training set contains 50 images) but not for M=1200 (i.e. the training set contains 1200 images). It does not show any error, but there is no output. I waited for 10 minutes and there was still no output. What is the problem? Where did I go wrong?

Can you be more specific? What do you mean by **not working fine**? What errors do you get? To answer your second question, you can save the variable `S` to file using the `save` command. That way all you have to do is load it up using `load` before you start your analysis. — rayryeng, May 31 '14 at 15:14
@rayryeng It is not showing any error. There is no output. I waited for 10 minutes and there is still no output. I think it is going into an infinite loop. What is the problem? Where did I go wrong? — prashanth, May 31 '14 at 15:46
That isn't an error. That just means your code is taking too long to compute. Do you have to push Ctrl+C / Command+C to exit your code? — rayryeng, May 31 '14 at 15:51
Yes, I am using Ctrl+C to stop. What is the solution to execute faster? — prashanth, May 31 '14 at 15:54
You can't accelerate the first `for` loop as you have to serially read all of the images one at a time. You can **definitely** accelerate the second `for` loop. Replace your `for` loop with this: `tempS = double(S); meanS = mean(S,1); stdS = std(S,0,1); S = (tempS - meanS) ./ stdS;` This will vectorize that loop so that you're standard normalizing your rows of `S` rather than looping through each column one at a time. You can apply the same principle in other areas of your code. — rayryeng, May 31 '14 at 16:00
@rayryeng I have 2989 images in total in my database, so I have to use 40% of them, i.e. 1200 images, in the training set. Is there any solution for this? — prashanth, May 31 '14 at 16:03
As I said, you need to go through areas of your code that need vectorizing. I'll make an answer that will modify areas of your code that can definitely be vectorized. — rayryeng, May 31 '14 at 16:03
Sorry about that. I forgot to come back during the StackOverflow outage. Looking through this code.... it's not worth my time and effort. There is **too** much to modify. What I'll leave for you is to go through the code like I did and modify it with the same principles that I have mentioned earlier. Good luck. — rayryeng, Jun 01 '14 at 16:20
@rayryeng I tried the code you suggested: `tempS = double(S); meanS = mean(S,1); stdS = std(S,0,1); S = (tempS - meanS) ./ stdS;` But I am getting an error. The error I got: `y = sqrt(var(varargin{:}));` — prashanth, Jun 01 '14 at 20:05
That error has nothing to do with my code. It says that it has an error in the `sqrt` function. I didn't invoke that function in the code I wrote. — rayryeng, Jun 01 '14 at 20:54

Adarsh Chavakula · Accepted Answer · 2014-06-03T11:55:55.040

To answer your second question, you can simply 'save' any generated variable as a .mat file in your working directory (Current folder) which can be accessed later. So in your code, if the 'training eigenfaces' is given by the variable 'u', you can use the following:

save('eigenface.mat','u')

This creates a .mat file with the name eigenface.mat which contains the variable 'u', the eigenfaces. Note that this variable is saved in your Current Folder.

In a later instance when you are trying out with your test data, you can simply 'load' this variable:

load('eigenface.mat')

This automatically loads 'u' into your workspace. You can also save additional variables in the same .mat file if necessary:

save('eigenface.mat','u','v'...)
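
A minimal round-trip sketch of the save/load pattern above. The matrix is small stand-in data, not real eigenfaces, and the file is written to a temporary path (via `tempname`) instead of the Current Folder purely so that the example cleans up after itself; both choices are assumptions made for illustration.

```matlab
% Stand-in for the eigenface matrix (5 rows x 4 columns); hypothetical data.
u = reshape(1:20, 5, 4) / 4;

% Write to a temporary .mat file; in practice this would be 'eigenface.mat'.
matFile = [tempname '.mat'];
save(matFile, 'u');

% Later session: load into a struct so existing workspace variables
% are not overwritten silently.
loaded = load(matFile);
uRestored = loaded.u;

% Compare the restored matrix with the one that was saved.
disp(isequal(u, uRestored));

% Remove the temporary file again.
delete(matFile);
```

Loading into a struct (`loaded = load(...)`) instead of calling `load` bare is a design choice: it makes it explicit which variables came from the file.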

The answer to the first question is that the code is simply not done running yet. Vectorizing the code (instead of using a for loop), as suggested in the comment section above can improve the speed significantly.
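
As one concrete instance, the nested loops that build `omega` in the question compute every dot product between an eigenface and a training image, which amounts to a single matrix product. The sketch below uses random stand-in data with hypothetical sizes and checks that the loop and the matrix product agree; it also shows the Euclidean distance loop written as one expression.

```matlab
% Hypothetical sizes: 2500 pixels, 40 training images, 12 eigenfaces.
nPix = 2500; nImg = 40; nEig = 12;
dbx = rand(nPix, nImg);      % stand-in for the training image matrix
u   = rand(nPix, nEig);      % stand-in for the eigenface matrix

% Loop version, as in the question (grows omega one column at a time).
omegaLoop = [];
for h = 1:nImg
    WW = [];
    for i = 1:nEig
        WW = [WW; dot(u(:,i), dbx(:,h))];
    end
    omegaLoop = [omegaLoop WW];
end

% Vectorized version: entry (i,h) is the dot product of eigenface i and image h.
omega = u' * dbx;

% Distances from one input weight vector to every training weight vector.
InImWeight = u' * rand(nPix, 1);
e = sqrt(sum(bsxfun(@minus, omega, InImWeight).^2, 1));

% Largest disagreement between the two forms (floating-point rounding only).
disp(max(abs(omega(:) - omegaLoop(:))));
```

With 1200 training images and up to 1200 eigenfaces, the loop form performs on the order of a million separate `dot` calls while repeatedly reallocating `WW` and `omega`, so this is one of the places where vectorizing is likely to pay off most.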

[EDIT]

Since the images are not very big, the code is not significantly slowed down by the first for loop. You can improve the performance of the rest of the code by vectorizing the code. Vectorized operations are faster than for-loops. This link might be useful in understanding vectorization:

http://www.mathworks.com/help/matlab/matlab_prog/vectorization.html
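
For instance, the loops in the question that filter, sort and normalize the eigenvectors can be collapsed into logical indexing, `sort` with `'descend'`, and `bsxfun`. This is a hedged sketch on a small random matrix: the `1e-4` threshold mirrors the question, while the sizes, the random data and the explicit symmetrization step are assumptions made for illustration.

```matlab
dbx = rand(100, 8);              % stand-in: 100 pixels x 8 images
L = dbx' * dbx;                  % small M-by-M matrix, as in the question
L = (L + L') / 2;                % guard against rounding asymmetry
[vv, dd] = eig(L);
ev = diag(dd);                   % eigenvalues as a column vector

keep = ev > 1e-4;                % drop (near-)zero eigenvalues
v = vv(:, keep);
d = ev(keep);

[d, order] = sort(d, 'descend'); % largest eigenvalue first
v = v(:, order);

% Map to eigenvectors of dbx*dbx' and scale every column to unit length.
u = dbx * v;
u = bsxfun(@rdivide, u, sqrt(sum(u.^2, 1)));
```

The question divides each column by `sqrt(d(i))` before normalizing; since every column is rescaled to unit length afterwards anyway, the sketch skips that intermediate division.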

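For example, the second for-loop can be replaced by the following vectorized form, adapted from the suggestion in the comments. Note that std is applied to the double copy tempS rather than to S itself: std does not accept integer input such as the uint8 data returned by imread, which is the likely cause of the sqrt(var(varargin{:})) error reported in the comments. bsxfun is used so that the column-wise subtraction and division also work on MATLAB releases without implicit expansion, and the ustd and um rescaling from the original loop is kept:

tempS = double(S);
meanS = mean(tempS,1);
stdS = std(tempS,0,1);
S = bsxfun(@rdivide, bsxfun(@minus, tempS, meanS), stdS) * ustd + um;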
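
A quick check, on random stand-in data, that a vectorized normalization of this kind reproduces the per-column loop from the question, including its `ustd` and `um` rescaling. The image count and pixel values are hypothetical; only `um`, `ustd` and the loop body come from the question.

```matlab
um = 60; ustd = 32;                        % target mean and std from the question
S = uint8(randi([0 255], 2500, 6));        % stand-in: six 50x50 images as columns

% Loop version from the question, writing into a double matrix.
Sloop = zeros(size(S));
for i = 1:size(S, 2)
    temp = double(S(:, i));
    Sloop(:, i) = (temp - mean(temp)) * ustd / std(temp) + um;
end

% Vectorized version: convert once, then operate on whole columns.
tempS = double(S);                         % MATLAB's std rejects uint8 input
meanS = mean(tempS, 1);
stdS  = std(tempS, 0, 1);
Svec  = bsxfun(@rdivide, bsxfun(@minus, tempS, meanS), stdS) * ustd + um;

% Largest difference between the two results (rounding level).
disp(max(abs(Svec(:) - Sloop(:))));
```

One difference from the question remains: there the loop writes back into the uint8 matrix `S`, which rounds and saturates the values, whereas both versions here keep the result in double precision.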

Use MATLAB's timer functions tic and toc for finding how long the first for-loop alone is taking to execute. Add tic before the for loop and toc after it. If the time taken for 50 images is about 104 seconds, then it would be significantly more for 1200 images.
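
A small sketch of timing two variants of a loop with `tic` and `toc`. Here `rand` stands in for `imread` plus `reshape` so that the example runs without image files, and the image count is hypothetical; the second variant preallocates the matrix instead of growing it, which is an assumption about where time may be going, not a measured result.

```matlab
M = 200;                      % hypothetical image count for the timing run
nPix = 2500;                  % 50x50 pixels per image

% Growing the matrix on every iteration, as in the question.
tic
S1 = [];
for i = 1:M
    S1 = [S1 rand(nPix, 1)];  % rand stands in for imread + reshape
end
tGrow = toc;

% Preallocating once and filling columns in place.
tic
S2 = zeros(nPix, M);
for i = 1:M
    S2(:, i) = rand(nPix, 1);
end
tPrealloc = toc;

fprintf('growing: %.4f s, preallocated: %.4f s\n', tGrow, tPrealloc);
```

The absolute numbers depend on the machine and MATLAB release, so the useful output is the comparison between sections of your own code, wrapped in `tic`/`toc` one at a time.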

Thanks for your answer. I need an answer to the 1st question. Please help me. — prashanth, Jun 03 '14 at 08:15
What are the typical sizes of your images? If they are very large, then your imread itself will take a lot of time to read so many large images... — Adarsh Chavakula, Jun 03 '14 at 09:31
@Adarsh Chavakula The image file size is 946 bytes. The image height and width are 50x50 pixels. — prashanth, Jun 03 '14 at 14:25
Anyone who knows MATLAB thoroughly, please help me. Since I am new to MATLAB, I don't know how to do this. — prashanth, Jun 03 '14 at 14:34
