I'm trying to build a translator app that can replace foreign text in real time, but after exploring possible approaches I've ended up a bit cornered. Although I was able to extract word images using Vision, I couldn't replace them in place in an ARKit scene. Then I tried using ARReferenceImage and image tracking, but that needs to know the physical width of the target image, which I cannot guarantee, as the text could be on any surface from a book to a billboard. Am I missing something? What would you guys suggest?
-
Sneaky trick: you don't necessarily need an understanding of 3D space (that is, what ARKit provides) in order to do something like this. When you get a text rect from Vision, you have the perspective projection of a 3D rect into 2D. To render your (translated text) overlay, draw it into a rect of the same proportions and apply a perspective distortion to match the original projection. – rickster Sep 25 '18 at 20:52
-
@rickster you should write this as an answer - I'm sure it's something other people will find useful and your answers are always comprehensive. – Jordan Sep 26 '18 at 08:56
-
@rickster hmm, but what would I base the distortion on? I imagine it could work if the camera moves towards the text and I recalculate stuff. Not sure what will happen if I approach the text from an angle; Vision probably wouldn't even recognize it as text. – bitemybyte Sep 26 '18 at 09:13
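
For anyone wondering what the distortion in rickster's comment could be based on: one option is the four corner points of the detected text rect. Vision's rectangle-style observations expose `topLeft`, `topRight`, `bottomLeft` and `bottomRight`, and four point correspondences are enough to determine a perspective (projective) transform. Below is a minimal sketch of that math only, assuming you already have the four corners in some 2D coordinate space. `Point` and `UnitSquareToQuad` are hypothetical names made up for this illustration, not Vision or ARKit API, and the corner values in the usage lines are invented.

```swift
// Plain 2D point; in an app this would likely be a CGPoint.
struct Point {
    var x: Double
    var y: Double
}

// Projective transform that maps the unit square onto an arbitrary quad.
// Hypothetical helper for illustration, not part of Vision or ARKit.
struct UnitSquareToQuad {
    let a, b, c, d, e, f, g, h: Double

    // p00, p10, p11, p01 are where (0,0), (1,0), (1,1), (0,1) should land.
    // Returns nil if the corners are degenerate (for example collinear).
    init?(p00: Point, p10: Point, p11: Point, p01: Point) {
        let dx1 = p10.x - p11.x, dy1 = p10.y - p11.y
        let dx2 = p01.x - p11.x, dy2 = p01.y - p11.y
        let sx = p00.x - p10.x + p11.x - p01.x
        let sy = p00.y - p10.y + p11.y - p01.y

        let det = dx1 * dy2 - dx2 * dy1
        guard abs(det) > 1e-12 else { return nil }

        // Perspective terms; both are zero when the quad is a parallelogram.
        let gTerm = (sx * dy2 - dx2 * sy) / det
        let hTerm = (dx1 * sy - sx * dy1) / det

        a = p10.x - p00.x + gTerm * p10.x
        b = p01.x - p00.x + hTerm * p01.x
        c = p00.x
        d = p10.y - p00.y + gTerm * p10.y
        e = p01.y - p00.y + hTerm * p01.y
        f = p00.y
        g = gTerm
        h = hTerm
    }

    // Maps a point (u, v) of the flat overlay, with u and v in 0...1,
    // to its position inside the quad.
    func apply(u: Double, v: Double) -> Point {
        let w = g * u + h * v + 1
        return Point(x: (a * u + b * v + c) / w,
                     y: (d * u + e * v + f) / w)
    }
}

// Usage with made-up corner values:
if let warp = UnitSquareToQuad(p00: Point(x: 10, y: 20),
                               p10: Point(x: 110, y: 30),
                               p11: Point(x: 90, y: 90),
                               p01: Point(x: 0, y: 80)) {
    let center = warp.apply(u: 0.5, v: 0.5)
    print(center.x, center.y)
}
```

The idea would be to draw the translated text into a flat rect, treat that rect as the unit square, and warp it with this transform so it lands on the detected quad. Recomputing the transform every frame from fresh corners is what lets the overlay follow the camera without any 3D understanding. How well Vision detects text at steep angles is a separate question that this sketch doesn't address.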