I have a video with three people speaking, and I would like to annotate the location of their eyes throughout it. I know that the Google Video Intelligence API has object tracking functionality, but is it possible to handle such an eye-tracking process with the API?
2 Answers
There is a detailed (Python) example from Google on how to track objects and print out the detected objects afterward. You could combine this with the AIStreamer live object tracking feature, which lets you upload a live video stream and get results back.
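For illustration, here is a minimal sketch of that object tracking call with the google-cloud-videointelligence Python client (version 2.x assumed); the bucket URI is a placeholder, and the field names follow Google's published object tracking sample:

from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# Request object tracking on a video stored in Cloud Storage
# (the URI below is a placeholder for your own bucket/file).
operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.OBJECT_TRACKING],
        "input_uri": "gs://your-bucket/speakers.mp4",
    }
)
result = operation.result(timeout=300)

# Print each detected object and its per-frame bounding boxes.
for obj in result.annotation_results[0].object_annotations:
    print(obj.entity.description, obj.confidence)
    for frame in obj.frames:
        box = frame.normalized_bounding_box
        print(frame.time_offset, box.left, box.top, box.right, box.bottom)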
Some ideas/steps you could follow (sketched in code after this list):
- Recognize the eyes in the first frame of the video.
- Set/highlight a box around the eyes you are tracking.
- Track the eyes as an object in the next frames.
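As a rough local illustration of those three steps (not using the Video Intelligence API itself), here is a sketch with OpenCV; the Haar eye cascade and the CSRT tracker are stand-ins for whatever detector and tracker you end up using, and it assumes the opencv-contrib-python package:

import cv2

cap = cv2.VideoCapture("speakers.mp4")  # placeholder path
ok, frame = cap.read()

# Step 1: recognize the eyes in the first frame
# (Haar cascade shipped with OpenCV).
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
eyes = eye_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

# Step 2: set a box around each detected eye and
# initialize one tracker per box.
trackers = []
for (x, y, w, h) in eyes:
    tracker = cv2.TrackerCSRT_create()  # needs opencv-contrib-python
    tracker.init(frame, (x, y, w, h))
    trackers.append(tracker)

# Step 3: track the eyes as objects in the following frames.
while True:
    ok, frame = cap.read()
    if not ok:
        break
    for tracker in trackers:
        ok, box = tracker.update(frame)
        if ok:
            x, y, w, h = map(int, box)
            cv2.rectangle(frame, (x, y), (x + w, y + h), (0, 255, 0), 2)
    cv2.imshow("eye tracking", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()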

Cloudkollektiv
The Google Video Intelligence API provides a face detection feature, which lets you detect faces within video frames as well as extract specific facial attributes.
In general, you need to configure FaceDetectionConfig in the videos.annotate method, supplying the includeBoundingBoxes and includeAttributes arguments in the JSON request body:
{
  "inputUri": "string",
  "inputContent": "string",
  "features": [
    "FACE_DETECTION"
  ],
  "videoContext": {
    "segments": [
      "object (VideoSegment)"
    ],
    "faceDetectionConfig": {
      "model": "string",
      "includeBoundingBoxes": true,
      "includeAttributes": true
    }
  },
  "outputUri": "string",
  "locationId": "string"
}
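For completeness, the same request can be made from Python with the google-cloud-videointelligence client (version 2.x assumed); this sketch follows Google's published face detection sample, with a placeholder input URI:

from google.cloud import videointelligence_v1 as videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# Mirror the JSON body above: FACE_DETECTION with
# bounding boxes and attributes enabled.
config = videointelligence.FaceDetectionConfig(
    include_bounding_boxes=True, include_attributes=True
)
context = videointelligence.VideoContext(face_detection_config=config)

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.FACE_DETECTION],
        "input_uri": "gs://your-bucket/speakers.mp4",  # placeholder
        "video_context": context,
    }
)
result = operation.result(timeout=300)

# Each face annotation contains tracks of timestamped bounding boxes.
for face in result.annotation_results[0].face_detection_annotations:
    for track in face.tracks:
        for obj in track.timestamped_objects:
            box = obj.normalized_bounding_box
            print(obj.time_offset, box.left, box.top, box.right, box.bottom)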