USB, even at its slowest spec (1.1), can transfer data at up to 12MB/sec, provided you use the proper transfer mode. USB will process 1000 "frames" per second. The frames contain control and data information, and various portions of each frame are used for various purposes, and thus the total information content is "multiplexed" amongst these competing requirements.
Low speed devices will use just a few bytes in a frame to send or receive their data. Examples are modems, mice, keyboards, etc. The so-called Full Speed devices (in USB 1.1) can achieve up to 12 MB/sec by using isochronous mode transfers, meaning they get carved out a nice big chunk of each frame, and can send that much data (a fixed size) each time a frame comes along. This is the mode audio devices use to stream the relatively data-intensive music to USB speakers, for example.
If you are able to do a little bit of local buffering, you may be able to use isochronous mode to get your 64 bytes of data sent at 1 KHz, but with 2 or 3 periods (at 2.5KHz) worth of data in the USB frame transfer. You'd want to reserve 64 x 3 = 192 bytes of data (plus maybe a few extra bytes for control info, such as how many chunks are present: 2 or 3?). Then, as the USB frames come by, you'd put your 2 chunks or 3 chunks of data onto the wire, and the receiving end would then get that data, albeit in a more bursty way than just smoothly at a precise 2.5KHz rate. However, this way of transferring the data would more than keep up, even with USB 1.1, and still only use a fraction of the total available USB bandwidth.
The problem, as I see it, is whether your system design can tolerate a data delivery rate that is "bursty"... in other words, instead of getting 64 bytes at a rate of 2.5KHz, you'll be getting (on average) 160 bytes at a 1 KHz rate. You'll actually get something like this:

So, I think with USB that will be the best you can do -- get a somewhat bursty delivery of either 2 or 3 of your device's data packet per 1 mSec USB frame rep rate.
I am not an expert in USB, but I have done some work with it, including debugging a device-to-host tunneling protocol which used USB "interrupts", so I have seen this kind of implementation on other systems, to solve the problem of matching the USB frame rate to the device's data rate.