How does one, for example, make the best use of retiming and/or c-slow to get the most out of a given pipeline?
With retiming, some modules get better results by putting the shift registers on the inputs (forward register balancing), while other modules do better with shift registers on the output (backward register balancing).
For now I use the following method:
- code the HDL (in Verilog)
- create timing constraints for the specific module
- synthesize, map, place & route (using ISE 13.1)
- look at post place & route timings for the module-to-be-improved, and at the maximum number of logic levels.
- take this number of logic levels, and make an educated guess for the number of flip-flops to insert.
- insert flip-flops, enable register balancing, hope for the best
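
To make the last two steps concrete, here is a minimal sketch of what "insert flip-flops, enable register balancing" can look like for the backward case. It assumes XST as the synthesis tool. The module name `retime_demo`, the ports, the `STAGES` parameter and the arithmetic are hypothetical placeholders, not the actual design. `register_balancing` and `shreg_extract` are XST constraints; whether the tool really moves the registers still depends on the logic and on the timing constraints.

```verilog
// Hypothetical example: names, widths and the logic cone are placeholders.
// Assumed toolchain: ISE / XST. The attributes below are XST constraints:
//   register_balancing = "backward" : allow flip-flops at the output to be
//                                     moved back into the combinational cone
//   shreg_extract      = "no"       : keep the register chain from being
//                                     mapped to an SRL, which cannot be retimed
(* register_balancing = "backward", shreg_extract = "no" *)
module retime_demo #(
    parameter WIDTH  = 16,
    parameter STAGES = 3          // the "educated guess" from the logic levels
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] a,
    input  wire [WIDTH-1:0] b,
    input  wire [WIDTH-1:0] c,
    output wire [WIDTH-1:0] y
);
    // Deep combinational cone that should end up split across the stages.
    wire [WIDTH-1:0] comb = ((a + b) ^ c) + ((a - c) & b);

    // Plain register chain on the output, deliberately without a reset,
    // on the assumption that resets can limit what the tool may move.
    reg [WIDTH-1:0] pipe [0:STAGES-1];
    integer i;

    always @(posedge clk) begin
        pipe[0] <= comb;
        for (i = 1; i < STAGES; i = i + 1)
            pipe[i] <= pipe[i-1];
    end

    // Result appears STAGES clock cycles after the inputs are applied.
    assign y = pipe[STAGES-1];
endmodule
```

For the forward case the same chain would sit on `a`, `b` and `c` instead of on `y`, with the attribute value set to `"forward"`. Keeping `STAGES` a parameter makes it cheap to rerun place & route with a different guess.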
As it stands, this method is hit & miss. Sometimes it gets pretty good results, sometimes it's crap. So, what is a good way to improve the success ratio of such retiming?
Are there any tools that can aid in this? Also, links, papers and book recommendations would be much appreciated.