I have reached a point in my design where we need to massively increase parallelisation, and we have plenty of FPGA resources to spare.
To that end, I have the type defined as
type LargeByteArray is array(0 to 10000) of std_logic_vector(7 downto 0);
I have two of these that I want to average "byte-wise" in as few operations as possible, using a right shift to divide by two. For example, avg(0) should be an 8-bit std_logic_vector equal to (a_in(0) + b_in(0)) / 2, avg(1) should be (a_in(1) + b_in(1)) / 2, and so on. Assume for the moment that we don't care that the sum of two 8-bit numbers needs 9 bits. I want to be able to do all 10000 operations in parallel.
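To make the per-byte arithmetic concrete, here is a minimal sketch of a single average that keeps the ninth bit, in case the overflow does matter later. The package name "avg_pkg" and the function name "avg_byte" are made up for illustration and are not part of my design.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package avg_pkg is
  -- Hypothetical helper: average of two bytes, rounded down.
  function avg_byte (a, b : std_logic_vector(7 downto 0))
    return std_logic_vector;
end package avg_pkg;

package body avg_pkg is
  function avg_byte (a, b : std_logic_vector(7 downto 0))
    return std_logic_vector is
    variable sum : unsigned(8 downto 0);
  begin
    -- Widen both operands to 9 bits so the carry out of bit 7 is kept.
    sum := resize(unsigned(a), 9) + resize(unsigned(b), 9);
    -- Dropping bit 0 divides by two; bits 8 downto 1 form the 8-bit result.
    return std_logic_vector(sum(8 downto 1));
  end function avg_byte;
end package body avg_pkg;
```

With both inputs at 255, the 9-bit sum is 510 and the result is 255. The 8-bit version used in the rest of this question would wrap the sum to 254 and give 127 instead.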
I think I need an intermediate step to be able to bit-shift like this, using the signal "inter".
entity Large_adder is
Port ( a_in : in LargeByteArray;
b_in : in LargeByteArray;
avg_out : out LargeByteArray);
end Large_adder;
architecture arch of Large_adder is
SIGNAL inter : LargeByteArray;
begin
My current code looks a bit like this:
inter(0) <= std_logic_vector((unsigned(a_in(0)) + unsigned(b_in(0))));
inter(1) <= std_logic_vector((unsigned(a_in(1)) + unsigned(b_in(1))));
10000 lines later...
inter(10000) <= std_logic_vector((unsigned(a_in(10000)) + unsigned(b_in(10000))));
And it is a similar story for finally assigning the output with the bit shift:
avg_out(0) <= '0' & inter(0)(7 downto 1);
avg_out(1) <= '0' & inter(1)(7 downto 1);
All the way down to 10000.
Surely there is a more compact way to write this.
I have tried
inter <= std_logic_vector((unsigned(a_in) + unsigned(b_in)));
but I get an error saying it found '0' matching definitions for the "<=" operator.
Obviously the number could be reduced from 10000, in case what I'm trying to achieve looks unreasonable, but in general, how do you write this sort of operation elegantly without a line for every element of my type?
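One construct that looks like it might fit is a "for ... generate" loop, which replicates concurrent statements once per index. Below is a minimal sketch under some assumptions: the package name "large_types" and the label "gen_avg" are made up for illustration, and I have assumed the type lives in a package so the entity ports can see it.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package holding the array type from the question.
package large_types is
  type LargeByteArray is array (0 to 10000) of std_logic_vector(7 downto 0);
end package large_types;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
use work.large_types.all;

entity Large_adder is
  port (
    a_in    : in  LargeByteArray;
    b_in    : in  LargeByteArray;
    avg_out : out LargeByteArray
  );
end entity Large_adder;

architecture arch of Large_adder is
  signal inter : LargeByteArray;
begin
  -- One adder and one shift are elaborated per element of the array.
  gen_avg : for i in a_in'range generate
    -- 8-bit sum; the carry out of bit 7 is discarded, as in the question.
    inter(i)   <= std_logic_vector(unsigned(a_in(i)) + unsigned(b_in(i)));
    -- Shift right by one to divide by two.
    avg_out(i) <= '0' & inter(i)(7 downto 1);
  end generate gen_avg;
end architecture arch;
```

Using a_in'range rather than a literal 0 to 10000 means the loop should follow the type if the array size changes.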
If I had to guess, I would say we can tell the "<=" operator what to do when it meets LargeByteArray operands, but I do not know how to do so or where to define this behaviour.
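As far as I can tell, signal assignment itself cannot be overloaded in VHDL (the "<=" that can be overloaded is the less-than-or-equal comparison), but arithmetic operators and ordinary functions can be defined for a user type in a package. A minimal sketch of that idea follows; the package name "large_ops" and the helper "halve" are hypothetical names chosen for illustration.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

package large_ops is
  type LargeByteArray is array (0 to 10000) of std_logic_vector(7 downto 0);

  -- Element-wise 8-bit sum (carry discarded, as in the question).
  function "+" (l, r : LargeByteArray) return LargeByteArray;

  -- Hypothetical helper: shift every byte right by one bit.
  function halve (x : LargeByteArray) return LargeByteArray;
end package large_ops;

package body large_ops is
  function "+" (l, r : LargeByteArray) return LargeByteArray is
    variable result : LargeByteArray;
  begin
    for i in result'range loop
      -- The inner "+" resolves to the numeric_std operator for unsigned.
      result(i) := std_logic_vector(unsigned(l(i)) + unsigned(r(i)));
    end loop;
    return result;
  end function "+";

  function halve (x : LargeByteArray) return LargeByteArray is
    variable result : LargeByteArray;
  begin
    for i in result'range loop
      result(i) := '0' & x(i)(7 downto 1);
    end loop;
    return result;
  end function halve;
end package body large_ops;
```

With "use work.large_ops.all;" in front of the entity, the whole architecture body could then presumably shrink to the single line avg_out <= halve(a_in + b_in); without the intermediate signal.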
Thanks