There are 4 possible colors, so you need 2 bits per pixel.
Your original thought is correct: 2^2 = 4, so 2 bits can represent 4 different values:
In binary: 00, 01, 10, 11 (in decimal: 0, 1, 2, 3).
Each value represents an index in a color map.
The color map maps each index to a color, for example 0 to (255, 255, 255), 1 to (0, 255, 0), 2 to (255, 255, 0) and 3 to (255, 120, 0); the order is arbitrary.
The image is 8x8 pixels, so the total number of bits needed is 8*8*2 = 128 bits (16 bytes).
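The arithmetic above can be sketched in a few lines of MATLAB. The variable names are only illustrative, and the sketch assumes every pixel uses the same fixed number of bits:

```matlab
numColors = 4;                          % number of distinct colors in the image
bitsPerPixel = ceil(log2(numColors));   % smallest number of bits that can index 4 colors
imgRows = 8;                            % image height in pixels
imgCols = 8;                            % image width in pixels
totalBits = imgRows * imgCols * bitsPerPixel;   % 8*8*2
totalBytes = totalBits / 8;
fprintf('%d bits per pixel, %d bits (%d bytes) in total\n', ...
    bitsPerPixel, totalBits, totalBytes);
```

The same formula gives 3 bits per pixel for 5 to 8 colors, 4 bits for 9 to 16 colors, and so on.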
When you need to decompress the image, there are two possibilities:
- First option: the decoder knows the 4 colors in advance, so there is no need to store the color map with the compressed image.
- Second option: you need to store the color map with the compressed image, so additional bits are needed for storing the color map.
The way your question is formulated, I am sure the first option applies (you don't need to store additional bits for the color map).
To make things more interesting, I coded the following MATLAB sample:
RGB = imread('peppers.png'); %Read input RGB image.
RGB = imresize(RGB, [64, 64]); %Reduce size to 64x64 (just for the example).
%Convert image to indexed image with 4 colors.
[X, cmap] = rgb2ind(RGB, 4);
J = ind2rgb(X, cmap); %Convert back to RGB using the computed color map.
%Replace the entries of the color map:
cmap(1, 1:3) = [255, 255, 255]/255; %First color is white (255, 255, 255)
cmap(2, 1:3) = [0, 255, 0]/255; %Second color is green (0, 255, 0)
cmap(3, 1:3) = [255, 255, 0]/255; %Third color is yellow (255, 255, 0)
cmap(4, 1:3) = [255, 120, 0]/255; %Fourth color is orange (255, 120, 0)
K = ind2rgb(X, cmap); %Convert back to RGB using the replaced color map.
figure;imshow(RGB); %Show the input image.
figure;imshow(J); %Show the indexed image with its original 4 colors.
figure;imshow(K); %Show the indexed image with the replaced colors.
RGB input image (true color):

Indexed image with only 4 colors ("compressed" image):

Image after replacing the colors with white, green, yellow and orange (in arbitrary order):

Illustration of an 8x8 image with 4 colors, as an 8x8 matrix of color indices:
3,3,3,3,3,3,3,3
3,3,3,3,3,0,3,3
3,0,0,0,0,0,0,0
3,0,0,2,2,2,1,1
1,2,1,2,2,1,1,1
1,1,2,1,0,1,1,2
1,0,0,3,3,3,3,2
3,3,3,3,3,3,3,3
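To see that this matrix really fits in 16 bytes, here is a minimal sketch that packs four 2-bit indices into each byte and unpacks them again. The row-by-row order and the choice of putting the first pixel in the most significant bits are assumptions made for the example, not something the question specifies:

```matlab
% 8x8 index matrix from the illustration (values 0..3).
X = uint8([3,3,3,3,3,3,3,3;
           3,3,3,3,3,0,3,3;
           3,0,0,0,0,0,0,0;
           3,0,0,2,2,2,1,1;
           1,2,1,2,2,1,1,1;
           1,1,2,1,0,1,1,2;
           1,0,0,3,3,3,3,2;
           3,3,3,3,3,3,3,3]);

% Group the pixels row by row, 4 pixels per column (one column per output byte).
v = reshape(X.', 4, []);                       % 4x16

% Pack: first pixel goes to bits 7-6, last pixel to bits 1-0.
packed = bitor(bitor(bitshift(v(1,:), 6), bitshift(v(2,:), 4)), ...
               bitor(bitshift(v(3,:), 2), v(4,:)));

% Unpack: shift each pair of bits back down and mask with binary 11.
u = zeros(4, numel(packed), 'uint8');
u(1,:) = bitand(bitshift(packed, -6), uint8(3));
u(2,:) = bitand(bitshift(packed, -4), uint8(3));
u(3,:) = bitand(bitshift(packed, -2), uint8(3));
u(4,:) = bitand(packed, uint8(3));
Y = reshape(u, 8, 8).';                        % back to an 8x8 matrix

fprintf('Packed size: %d bytes, round trip ok: %d\n', numel(packed), isequal(X, Y));
```

Any other packing order works just as well, as long as the encoder and the decoder agree on it.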
Illustration of color map:
0 --> (255, 255, 255)
1 --> (  0, 255,   0)
2 --> (255, 255,   0)
3 --> (255, 120,   0)
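Decoding is then just a table lookup. The sketch below applies the color map above to the 8x8 index matrix with plain indexing (the names cmap255 and img are only illustrative); it assumes zero-based indices, as in the illustration:

```matlab
% 8x8 index matrix from the illustration (values 0..3).
X = uint8([3,3,3,3,3,3,3,3;
           3,3,3,3,3,0,3,3;
           3,0,0,0,0,0,0,0;
           3,0,0,2,2,2,1,1;
           1,2,1,2,2,1,1,1;
           1,1,2,1,0,1,1,2;
           1,0,0,3,3,3,3,2;
           3,3,3,3,3,3,3,3]);

% Color map from the illustration: row k holds the color of index k-1.
cmap255 = uint8([255, 255, 255;    % 0: white
                   0, 255,   0;    % 1: green
                 255, 255,   0;    % 2: yellow
                 255, 120,   0]);  % 3: orange

idx = double(X) + 1;                        % MATLAB rows are one-based
img = reshape(cmap255(idx(:), :), 8, 8, 3); % 8x8x3 true-color image (uint8)

disp(squeeze(img(1, 1, :)).');              % color of the top-left pixel (index 3)
```

Using ind2rgb(X, double(cmap255)/255) should give the same picture, scaled to the range [0, 1].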
For storing the color map, you need 4*3*8 = 96 bits (assuming a value like 255 requires 8 bits).
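For comparison, the two options can be put side by side. This is only a sketch of the arithmetic; the uncompressed size assumes a plain 24-bit RGB image, which is my assumption and not something stated in the question:

```matlab
numPixels = 8 * 8;                 % 8x8 image
indexBits = numPixels * 2;         % 2 bits per pixel for the indices
cmapBits  = 4 * 3 * 8;             % 4 colors, 3 channels, 8 bits per channel

bitsOption1 = indexBits;              % decoder already knows the colors
bitsOption2 = indexBits + cmapBits;   % color map stored with the image

rawBits = numPixels * 3 * 8;       % assumed uncompressed 24-bit RGB image

fprintf('Option 1: %d bits (%d bytes)\n', bitsOption1, bitsOption1 / 8);
fprintf('Option 2: %d bits (%d bytes)\n', bitsOption2, bitsOption2 / 8);
fprintf('Uncompressed RGB: %d bits (%d bytes)\n', rawBits, rawBits / 8);
```

For such a small image the color map is a noticeable part of the total; for larger images its 96 bits quickly become negligible.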