I am currently running a Triton server in production on AWS, on a standard GPU-enabled EC2 instance, which is very expensive.
I have seen that the new GPU-enabled Graviton instances can be 40% cheaper to run. However, they use ARM64 CPUs rather than x86-64 (AMD64). Does this mean I can run the standard version of Triton server on such an instance?
Looking at the Triton server release notes, I see that it can run on the Jetson Nano, which pairs an NVIDIA GPU with an ARM CPU: https://github.com/triton-inference-server/server/releases/tag/v1.12.0
Can I run Triton server on these Graviton instances, and would switching actually reduce my costs?
Does performance drop on these instances?
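
To make the cost question concrete, this is roughly how I would compare the two options once I have benchmark numbers. It is only a sketch: the function name `cost_per_million` is my own, and the prices and throughputs below are placeholders, not real AWS prices or measured Triton results.

```python
def cost_per_million(hourly_price_usd: float, throughput_per_sec: float) -> float:
    """Return the USD cost of serving one million inferences.

    Assumes the instance sustains `throughput_per_sec` inferences per second
    for the whole hour it is billed for.
    """
    if throughput_per_sec <= 0:
        raise ValueError("throughput_per_sec must be positive")
    inferences_per_hour = throughput_per_sec * 3600
    return hourly_price_usd / inferences_per_hour * 1_000_000


# Placeholder numbers only (hypothetical prices and throughputs).
current = cost_per_million(hourly_price_usd=1.00, throughput_per_sec=100.0)
graviton = cost_per_million(hourly_price_usd=0.60, throughput_per_sec=80.0)

print(f"current instance:  ${current:.2f} per million inferences")
print(f"Graviton instance: ${graviton:.2f} per million inferences")
```

With an instance that is 40% cheaper per hour, the cost per inference only improves if throughput stays above 60% of what I get today, which is why the performance question matters as much as the price.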