I just picked up a Google Coral USB Accelerator to learn ML without spending a lot of money on a new PC/GPU (I still have a 2014 MacBook Air, and it's laughably slow).
I'd like to build on the work done on this video shot-detection model and train something for a related use case: https://arxiv.org/pdf/1705.08214.pdf
That model is quite different from any of the Coral CV examples I've been playing around with so far. Everything I've run takes a single frame as input, but this approach requires a group of frames to be passed to the model together.
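For what it's worth, here's roughly how I've been imagining the input would have to look: a sliding window of frames stacked into one tensor. This is just my own sketch, not anything from the paper; WINDOW and the 224x224 resolution are made-up values for illustration.

```python
import cv2
import numpy as np

WINDOW = 8          # number of consecutive frames per input (hypothetical)
SIZE = (224, 224)   # model input resolution (hypothetical)

def frame_windows(video_path):
    """Yield (224, 224, WINDOW) uint8 tensors of stacked grayscale frames."""
    cap = cv2.VideoCapture(video_path)
    buf = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(cv2.resize(frame, SIZE), cv2.COLOR_BGR2GRAY)
        buf.append(gray)
        if len(buf) == WINDOW:
            # Channels-last stack: shape (H, W, WINDOW), which a plain
            # Conv2D input layer could consume like a WINDOW-channel image.
            yield np.stack(buf, axis=-1)
            buf.pop(0)  # slide the window forward by one frame
    cap.release()
```

Stacking frames along the channel axis like that ("early fusion") is the only way I can think of to give a 2D-conv-only model some temporal context, but I have no idea if it would work well for shot detection.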
I'm trying to figure out if this sort of thing is even supported on the Edge TPU. I found this page of supported operations to compare with the table on page 3 of the PDF:
https://coral.ai/docs/edgetpu/models-intro/#supported-operations
The "Conv2D" listed seems pretty explicit that it's 2D, and there is no equivalent 3D operation listed. So does that mean I'm out of luck here?
Does anybody have other ideas or prior art I should look into for this sort of video analysis that would take advantage of the Coral Edge TPU?
Thanks!