Provided your ETL jobs are not very long or complex (i.e., standard SQL is sufficient for the transformations needed), I think Presto can do a reasonable job. As you pointed out, there is no mid-query fault tolerance, so you need a mechanism to restart your queries upon failure. Hopefully, the speed of Presto will offset an occasional restart. One additional strategy is to break long, complex queries into a series of shorter/simpler ones and create temporary tables in between to effectively achieve manual checkpointing. Such a strategy was leveraged by Facebook when they migrated some of their batch Hive jobs to Presto: https://www.slideshare.net/kbajda/presto-at-hadoop-summit-2016
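To make that concrete, here is a minimal sketch of what such manual checkpointing could look like in Presto SQL. The schema, table, and column names (raw.orders, staging.orders_cleaned, dim.customers, analytics.daily_revenue, etc.) are hypothetical, just to illustrate one long pipeline split into independently restartable CREATE TABLE AS stages:

```sql
-- Hypothetical example: instead of one long INSERT ... SELECT with several joins,
-- materialize each stage so a failure only forces a re-run of the last step.

-- Stage 1: cleanse raw data into an intermediate table
CREATE TABLE staging.orders_cleaned AS
SELECT order_id, CAST(order_ts AS timestamp) AS order_ts, customer_id, amount
FROM raw.orders
WHERE order_ts IS NOT NULL;

-- Stage 2: join against dimensions, reading the checkpointed result from stage 1
CREATE TABLE staging.orders_enriched AS
SELECT o.order_id, o.order_ts, c.region, o.amount
FROM staging.orders_cleaned o
JOIN dim.customers c ON o.customer_id = c.customer_id;

-- Stage 3: final aggregation into the target table
CREATE TABLE analytics.daily_revenue AS
SELECT date_trunc('day', order_ts) AS day, region, sum(amount) AS revenue
FROM staging.orders_enriched
GROUP BY 1, 2;
```

If stage 3 fails, you only re-run that statement rather than the whole pipeline, and the staging tables can be dropped once the final table is in place.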
One additional recommendation I would make is to spin up a separate Presto cluster for ETL to avoid any resource contention with your interactive Presto workload.
In terms of instance types, it obviously depends on your queries. Most often you want a good balance of RAM and CPU, so starting with the R4 instance type is a good bet. Once you observe your workload at runtime, you can either add more nodes to speed up the ETL process or explore other instance types (e.g., if the CPU is fully loaded, moving to the C4/C5 family might be a better fit).
More generally, the Presto-Users mailing list is a good source of information: https://groups.google.com/group/presto-users.
You can also learn from community members at events such as Presto Summit (https://www.starburstdata.com/technical-blog/presto-summit-2018-recap/).