Is there a way to use both image and text data for more precise image detection in TensorFlow?

Question

I am working on a basic image detection neural network in TensorFlow. It has been trained to identify foods with around 94% accuracy. I am wondering whether it is possible to supply text data along with the image to improve accuracy. For example, if a sugary sweet were to be identified, the text input could give information about the age group that typically eats that type of food (mainly children), any health effects it may cause (diabetes), and other general information. Is this possible in TensorFlow? If so, what libraries are used? I have searched online and found nothing.

Thank you

score 1 · Answer 1 · answered Jun 15 '21 at 03:20

Yes, it is possible. You will need to create a model with 2 inputs, one for the image and one for the text. The image will serve as input to a convolutional neural network. The text input needs to be processed using natural language processing. There are many tutorials available on that. Within your model you will then need to concatenate the last layer of your convolutional network with the last layer of your NLP network. Note that these must have compatible dimensions to do so: for example, both reduced to shape (batch, features), so they can be joined along the feature axis.
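As a rough illustration of the two-input idea, here is a minimal sketch using the Keras functional API that ships with TensorFlow. The image size, vocabulary size, sequence length, class count, and layer widths are all made-up placeholders, not values from the question, and the text branch assumes the text has already been converted to integer token IDs (for instance with a tokenizer or a `TextVectorization` layer).

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# All sizes below are hypothetical placeholders for illustration.
NUM_CLASSES = 10   # number of food classes
VOCAB_SIZE = 5000  # size of the text vocabulary
SEQ_LEN = 32       # tokens per text description


def build_model():
    # Image branch: a small CNN reduced to a (batch, 32) feature vector.
    image_in = layers.Input(shape=(64, 64, 3), name="image")
    x = layers.Conv2D(16, 3, activation="relu")(image_in)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(32, 3, activation="relu")(x)
    x = layers.GlobalAveragePooling2D()(x)

    # Text branch: embed token IDs and average them to a (batch, 16) vector.
    text_in = layers.Input(shape=(SEQ_LEN,), dtype="int32", name="text")
    t = layers.Embedding(VOCAB_SIZE, 16)(text_in)
    t = layers.GlobalAveragePooling1D()(t)

    # Both branches are now 2-D, so they can be joined on the feature axis.
    merged = layers.Concatenate()([x, t])  # (batch, 48)
    h = layers.Dense(64, activation="relu")(merged)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(h)

    model = tf.keras.Model(inputs=[image_in, text_in], outputs=out)
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_model()

# Random stand-in data, only to show the expected input shapes.
images = np.random.rand(4, 64, 64, 3).astype("float32")
tokens = np.random.randint(0, VOCAB_SIZE, size=(4, SEQ_LEN)).astype("int32")
probs = model.predict([images, tokens], verbose=0)
```

Training works the same way as with a single-input model, except that `model.fit` receives both arrays, e.g. `model.fit([images, tokens], labels)`. The two feature vectors do not need the same width; they only need to agree on everything except the axis being concatenated.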

Hi! Can you share a tutorial for that? – Ege Can, Jul 06 '22 at 13:48
