There's a way to make your index addressing static for synthesis.
First, based on the loop we can tell position must have a value within the range of shift_aux, otherwise you'd end up with null slices (IEEE Std 1076-2008 8.5 Slice names).
That can be shown in the entity declaration:
library ieee;
use ieee.std_logic_1164.all;

entity shift_register is
    generic (
        N: integer := 6;
        M: integer := 6
    );
    port (
        en_s: in std_logic;
        cod_result: in std_logic_vector (N + M - 1 downto 0);
        position: in integer range 0 to N + M - 1 ; -- range ADDED
        shift_result: out std_logic_vector(N + M - 1 downto 0)
    );
end entity shift_register;
What's changed is the addition of a range constraint to the port declaration of position. The idea is to support simulation, where the default value of an integer is integer'left. Simulating your shift_register would fail on the rising edge of en_s if position (the actual driver) did not provide an initial value in the index range of shift_aux.
From a synthesis perspective an unbounded integer requires you to take both positive and negative integer values into account. Your for loop only uses non-negative integer values.
The same can be done in the declaration of the variable i in the process:
variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
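The effect of the range constraint on default values can be checked in simulation. The following is a minimal sketch, not part of your design: the entity name default_value_demo and the signal names unbounded and bounded are made up for illustration, and it assumes the same N and M as the generics above.

```vhdl
entity default_value_demo is
end entity default_value_demo;

architecture sim of default_value_demo is
    constant N: integer := 6;  -- assumed, mirrors the generic default
    constant M: integer := 6;  -- assumed, mirrors the generic default
    signal unbounded: integer;                       -- no range, no initial value
    signal bounded:   integer range 0 to N + M - 1;  -- constrained like position
begin
    process
    begin
        -- A scalar signal without an explicit initial value starts at T'left
        -- of its subtype, so only the constrained one starts inside 0 to N + M - 1.
        assert unbounded = integer'left
            report "unexpected default for unbounded" severity failure;
        assert bounded = 0
            report "unexpected default for bounded" severity failure;
        report "unbounded starts at " & integer'image(unbounded) &
               ", bounded starts at " & integer'image(bounded);
        wait;
    end process;
end architecture sim;
```

The unconstrained signal starts at a large negative value that can never index shift_aux, while the constrained one starts at 0.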
To address the immediate synthesis problem we look at the for loop.
Xilinx support issue AR# 52302 tells us the issue is using dynamic values for indexes.
The solution is to modify what the for loop does:
architecture shift_loop of shift_register is
begin
    process (en_s)
        variable shift_aux: std_logic_vector(N + M - 1 downto 0);
        -- variable i: integer range 0 to N + M - 1 := 0; -- range ADDED
    begin
        if en_s'event and en_s = '1' then
            -- i := position;
            shift_aux := (others => '0');
            for i in 0 to N + M - 1 loop
                -- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
                if i = position then
                    shift_aux(N + M - 1 downto i)
                        := cod_result(N + M - 1 - i downto 0);
                end if;
            end loop;
            shift_result <= shift_aux;
        end if;
    end process;
end architecture shift_loop;
If i becomes a static value when the loop is unrolled in synthesis it can be used in calculation of indexes.
Note this gives us an N + M input multiplexer where each input is selected when i = position.
This construct can actually be collapsed into a barrel shifter by optimization, although for large values of N and M the number of variables involved might take a prohibitive synthesis effort or simply fail.
When synthesis is successful you'll collapse each output element in the assignment into a separate multiplexer that will match Patrick's barrel shifter.
For sufficiently large values of N and M we can define the depth of the barrel shifter, in number of multiplexer layers, from the number of bits in a binary expression of the integer range of the shift distance.
That requires either a declared integer type or subtype for position, or finding the log2 value of N + M. We can use the log2 value because it would only be used statically. (XST supports log2(x), where x is a real, for determining static values; the function is found in IEEE package math_real.) This gives us the binary length of position (how many bits are required to describe the shift distance, which is also the number of levels of multiplexers).
architecture barrel_shifter of shift_register is
begin
    process (en_s)
        use ieee.math_real.all;   -- log2, ceil [real return real]
        use ieee.numeric_std.all; -- to_unsigned, unsigned
        -- Binary length of position. ceil is needed because integer() rounds to
        -- nearest: N + M = 9 would otherwise give 3 bits, too few for position = 8.
        constant DISTLEN: natural := integer(ceil(log2(real(N + M))));
        type muxv is array (0 to DISTLEN - 1) of
            unsigned (N + M - 1 downto 0);
        variable shft_aux: muxv;
        variable distance: unsigned (DISTLEN - 1 downto 0);
    begin
        if en_s'event and en_s = '1' then
            distance := to_unsigned(position, DISTLEN); -- position in binary
            shft_aux := (others => (others =>'0'));
            for i in 0 to DISTLEN - 1 loop
                if i = 0 then -- first layer takes its input from cod_result
                    if distance(i) = '1' then
                        shft_aux(i) := SHIFT_LEFT(unsigned(cod_result), 2 ** i);
                    else
                        shft_aux(i) := unsigned(cod_result);
                    end if;
                else -- later layers take the previous layer's output
                    if distance(i) = '1' then
                        shft_aux(i) := SHIFT_LEFT(shft_aux(i - 1), 2 ** i);
                    else
                        shft_aux(i) := shft_aux(i - 1);
                    end if;
                end if;
            end loop;
            shift_result <= std_logic_vector(shft_aux(DISTLEN - 1));
        end if;
    end process;
end architecture barrel_shifter;
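The bit length can also be computed without real arithmetic, for tools or coding guidelines that avoid math_real. A minimal sketch follows; the package name shift_util and the function name bits_for are hypothetical, not something from your code or from XST.

```vhdl
package shift_util is
    -- Number of bits needed to represent every value in 0 to count - 1.
    function bits_for (count: positive) return natural;
end package shift_util;

package body shift_util is
    function bits_for (count: positive) return natural is
    begin
        -- 2 ** 30 is the largest power of two that fits a 32 bit integer,
        -- so the search stops there and anything larger needs 31 bits.
        for len in 0 to 30 loop
            if 2 ** len >= count then
                return len;
            end if;
        end loop;
        return 31;
    end function bits_for;
end package body shift_util;
```

With use work.shift_util.all in scope, DISTLEN could be declared as bits_for(N + M), which gives 4 for N + M = 12.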
XST also supports ** if the left operand is 2, and the value of i is treated as a constant in the sequence of statements found in a loop statement.
This could be implemented with signals instead of variables or structurally in a generate statement instead of a loop statement inside a process, or even as a subprogram.
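As a rough sketch of the signal and generate statement variant, assuming the entity declaration with the range constraint shown earlier: the architecture name barrel_generate, the type stage_array and the signals stage and distance are my own names, and this has not been run through XST.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;   -- unsigned, to_unsigned, shift_left
use ieee.math_real.all;     -- log2, ceil

architecture barrel_generate of shift_register is
    -- Number of multiplexer layers, one per bit of the shift distance.
    constant DISTLEN: natural := integer(ceil(log2(real(N + M))));
    -- stage(0) is the input, stage(i + 1) is the output of layer i.
    type stage_array is array (0 to DISTLEN) of unsigned(N + M - 1 downto 0);
    signal stage:    stage_array;
    signal distance: unsigned(DISTLEN - 1 downto 0);
begin
    distance <= to_unsigned(position, DISTLEN);  -- position in binary
    stage(0) <= unsigned(cod_result);

    LAYERS:
    for i in 0 to DISTLEN - 1 generate
        -- Layer i either shifts by 2 ** i or passes its input through.
        stage(i + 1) <= shift_left(stage(i), 2 ** i) when distance(i) = '1'
                        else stage(i);
    end generate LAYERS;

    -- The output is registered on en_s, as in the process based versions.
    process (en_s)
    begin
        if rising_edge(en_s) then
            shift_result <= std_logic_vector(stage(DISTLEN));
        end if;
    end process;
end architecture barrel_generate;
```

Here i is a generate parameter, so 2 ** i is static by construction and each layer is visibly a separate two input multiplexer.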
The basic idea here with these two architectures derived from yours is to produce something synthesis eligible.
The advantage of the second architecture over the first is in reduction in the amount of synthesis effort during optimization for larger values of N + M.
Neither of these architectures has been verified, because the original came without a testbench. They both analyze and elaborate.
Writing a simple case testbench:
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shift_register_tb is
end entity;

architecture foo of shift_register_tb is
    constant N: integer := 6;
    constant M: integer := 6;
    signal clk: std_logic := '0';
    signal din: std_logic_vector (N + M - 1 downto 0)
        := (0 => '1', others => '0');
    signal dout: std_logic_vector (N + M - 1 downto 0);
    signal dist: integer := 0;
begin
DUT:
    entity work.shift_register
        generic map (
            N => N,
            M => M
        )
        port map (
            en_s => clk,
            cod_result => din,
            position => dist,
            shift_result => dout
        );
CLOCK:
    process
    begin
        wait for 10 ns;
        clk <= not clk;
        if now > (N + M + 2) * 20 ns then
            wait;
        end if;
    end process;
STIMULI:
    process
    begin
        for i in 1 to N + M loop
            wait for 20 ns;
            dist <= i;
            din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
        end loop;
        wait;
    end process;
end architecture;
And simulating reveals that the range of position and the number of loop iterations only need to cover the number of bits in the multiplier and not the multiplicand. We don't need a full barrel shifter.
That can be easily fixed in both shift_register architectures and has the side effect of making the shift_loop architecture much more attractive: it would be easier to synthesize based on the multiplier bit length (presumably M) and not the product bit length (N + M).
And that would give you:
library ieee;
use ieee.std_logic_1164.all;

entity shift_register is
    generic (
        N: integer := 6;
        M: integer := 6
    );
    port (
        en_s: in std_logic;
        cod_result: in std_logic_vector (N + M - 1 downto 0);
        position: in integer range 0 to M - 1 ; -- range ADDED
        shift_result: out std_logic_vector(N + M - 1 downto 0)
    );
end entity shift_register;

architecture shift_loop of shift_register is
begin
    process (en_s)
        variable shift_aux: std_logic_vector(N + M - 1 downto 0);
        -- variable i: integer range 0 to M - 1 := 0; -- range ADDED
    begin
        if en_s'event and en_s = '1' then
            -- i := position;
            shift_aux := (others => '0');
            for i in 0 to M - 1 loop
                -- shift_aux(N + M - 1 downto i) := cod_result(N + M - 1 - i downto 0);
                if i = position then -- This creates an M input MUX
                    shift_aux(N + M - 1 downto i)
                        := cod_result(N + M - 1 - i downto 0);
                end if;
            end loop; -- The loop is unrolled in synthesis, i is CONSTANT
            shift_result <= shift_aux;
        end if;
    end process;
end architecture shift_loop;
Modifying the testbench:
STIMULI:
    process
    begin
        for i in 1 to M loop -- WAS N + M loop
            wait for 20 ns;
            dist <= i;
            din <= std_logic_vector(SHIFT_LEFT(unsigned(din),1));
        end loop;
        wait;
    end process;
gives a result showing the shifts are over the range of the multiplier value (specified by M).

So the moral here is you don't need a full barrel shifter, only one that works over the multiplier range and not the product range.
The last bit of code should be synthesis eligible.
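If you'd rather not inspect waveforms by eye, a self-checking testbench is one option. The following is only a sketch under assumptions: it expects the final entity (position in 0 to M - 1) to be compiled into work, the name shift_register_check_tb and the constant PATTERN are made up for illustration, and it compares the output against numeric_std's shift_left as the reference.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity shift_register_check_tb is
end entity shift_register_check_tb;

architecture check of shift_register_check_tb is
    constant N: integer := 6;
    constant M: integer := 6;
    -- A single '1' in bit 0, the same starting pattern as the testbench above.
    constant PATTERN: unsigned(N + M - 1 downto 0) := to_unsigned(1, N + M);
    signal clk:  std_logic := '0';
    signal din:  std_logic_vector(N + M - 1 downto 0) := std_logic_vector(PATTERN);
    signal dout: std_logic_vector(N + M - 1 downto 0);
    signal dist: integer range 0 to M - 1 := 0;
begin
    DUT:
    entity work.shift_register
        generic map (N => N, M => M)
        port map (
            en_s => clk,
            cod_result => din,
            position => dist,
            shift_result => dout
        );

    STIMULI_AND_CHECK:
    process
        variable expected: std_logic_vector(N + M - 1 downto 0);
    begin
        for i in 0 to M - 1 loop
            dist <= i;
            wait for 10 ns;   -- let dist settle before the rising edge
            clk <= '1';
            wait for 10 ns;   -- shift_result has been updated by now
            expected := std_logic_vector(shift_left(PATTERN, i));
            assert dout = expected
                report "shift by " & integer'image(i) &
                       " produced an unexpected result"
                severity error;
            clk <= '0';
        end loop;
        report "all " & integer'image(M) & " shift distances checked";
        wait;
    end process;
end architecture check;
```

Driving the clock from the same process as the stimulus keeps the ordering of distance changes and clock edges explicit, and keeping dist within 0 to M - 1 avoids a bounds error on the position port.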