I am trying to implement pipelined cache access as an optimization to increase the bandwidth of my L1 instruction cache (I-cache), and I need to do it in Verilog. The cache is 64 KB, two-way set-associative, with a block size of 4 words.
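For reference, here is how I would expect an address to split into tag, index, and offset for this configuration. This is only a rough sketch in Python (not the Verilog itself), and it assumes 32-bit byte addresses and 4-byte words, neither of which is fixed by the numbers above; all names are my own placeholders.

```python
# Assumptions (not part of the stated cache parameters):
# 32-bit byte addresses and 4-byte words.
ADDR_BITS = 32
WORD_BYTES = 4

# Parameters stated for the cache.
CACHE_BYTES = 64 * 1024
WAYS = 2
BLOCK_WORDS = 4

block_bytes = BLOCK_WORDS * WORD_BYTES    # bytes per block
num_blocks = CACHE_BYTES // block_bytes   # total blocks in the cache
num_sets = num_blocks // WAYS             # sets (each set holds WAYS blocks)

# All sizes are powers of two, so log2 is bit_length() - 1.
offset_bits = block_bytes.bit_length() - 1
index_bits = num_sets.bit_length() - 1
tag_bits = ADDR_BITS - index_bits - offset_bits


def split_address(addr):
    """Split a byte address into (tag, set index, byte offset within block)."""
    offset = addr & (block_bytes - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


print(offset_bits, index_bits, tag_bits)
print(split_address(0x00008010))
```

Under those assumptions this gives 4 offset bits, 11 index bits, and 17 tag bits, so each way would need a 2048-entry tag array and a 2048-entry data array of 16-byte blocks.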
I am still not clear on how pipelined cache access works. A theoretical explanation, or a link to a good reference, would be really helpful; I have searched online but could not find a good read. Specifically, what are the two stages in a pipelined cache access, and how does pipelining improve bandwidth?