0

I am creating an app where the user can pick an image from the gallery and have its text recognized. After I added a zooming feature, the detected-text overlay stays static instead of following the image, so is there a way I can keep the text overlay aligned with the image as it zooms? I have been using google_ml_kit so far; any idea on how to do this is appreciated. Thank you.

import 'dart:async';
import 'dart:io';
import 'package:flutter/material.dart';
import 'package:image_picker/image_picker.dart';
import 'package:google_ml_kit/google_ml_kit.dart';
import 'package:photo_view/photo_view.dart';
import 'package:photo_view/photo_view_gallery.dart';

/// App entry point: boots a minimal [MaterialApp] hosting the demo screen.
void main() => runApp(MaterialApp(home: TextDetectionApp()));

/// Root screen of the demo: picks a gallery image, runs ML Kit text
/// recognition on it, and overlays the recognized text on a zoomable view.
class TextDetectionApp extends StatefulWidget {
  @override
  State<TextDetectionApp> createState() => _TextDetectionAppState();
}

/// State for [TextDetectionApp]: holds the picked image, the recognized
/// text elements, and the manual zoom level driven by the app-bar buttons.
class _TextDetectionAppState extends State<TextDetectionApp> {
  File? _pickedImage;
  List<TextElement> _detectedText = [];

  // Pushes freshly recognized elements to the StreamBuilder in [build].
  final StreamController<List<TextElement>> _streamController =
      StreamController<List<TextElement>>();

  // Manual zoom factor; handed to PhotoView as its initial scale only.
  double _zoomLevel = 1.0;

  @override
  void dispose() {
    _streamController.close();
    super.dispose();
  }

  /// Lets the user pick a gallery image, then kicks off text recognition.
  Future<void> _pickImage() async {
    final imagePicker = ImagePicker();
    final pickedImage =
        await imagePicker.pickImage(source: ImageSource.gallery);

    if (pickedImage != null) {
      // The picker is an async gap; the widget may have been disposed.
      if (!mounted) return;
      setState(() {
        _pickedImage = File(pickedImage.path);
        _detectedText.clear();
        _zoomLevel = 1.0; // Reset zoom level when a new image is selected.
      });

      await _detectText();
    }
  }

  /// Runs ML Kit text recognition on [_pickedImage] and streams every
  /// recognized [TextElement].
  ///
  /// Shows a snackbar when nothing is detected or recognition throws.
  Future<void> _detectText() async {
    if (_pickedImage == null) return;

    final textDetector = GoogleMlKit.vision.textRecognizer();
    final inputImage = InputImage.fromFile(_pickedImage!);

    try {
      final RecognizedText recognisedText =
          await textDetector.processImage(inputImage);

      // Guard context use after the await above.
      if (!mounted) return;

      if (recognisedText.blocks.isEmpty) {
        ScaffoldMessenger.of(context).showSnackBar(
          SnackBar(content: Text('No text detected')),
        );
      } else {
        // BUG FIX: the original read only blocks[0].lines[0].elements,
        // silently dropping every other recognized line and block.
        // Flatten all blocks and lines into a single element list.
        final List<TextElement> elements = [
          for (final block in recognisedText.blocks)
            for (final line in block.lines) ...line.elements,
        ];
        _streamController.sink.add(elements);
      }
    } catch (e) {
      if (mounted) {
        ScaffoldMessenger.of(context).showSnackBar(
          SnackBar(content: Text('An error occurred during text detection')),
        );
      }
      print(e);
    } finally {
      // Always release the native recognizer resources.
      textDetector.close();
    }
  }

  /// Increases the zoom level applied on the next [PhotoView] rebuild.
  void _zoomIn() {
    setState(() {
      _zoomLevel += 0.1;
    });
  }

  /// Decreases the zoom level, clamped so it never reaches zero.
  void _zoomOut() {
    setState(() {
      if (_zoomLevel > 0.1) {
        _zoomLevel -= 0.1;
      }
    });
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(
        title: Text('Text Detection App'),
        actions: [
          IconButton(
            icon: Icon(Icons.zoom_in),
            onPressed: _zoomIn,
          ),
          IconButton(
            icon: Icon(Icons.zoom_out),
            onPressed: _zoomOut,
          ),
        ],
      ),
      body: Column(
        children: <Widget>[
          ElevatedButton(
            onPressed: _pickImage,
            child: Text('Scan Text'),
          ),
          Expanded(
            // Rebuilds the photo view whenever a new element list arrives.
            child: StreamBuilder<List<TextElement>>(
              stream: _streamController.stream,
              builder: (context, snapshot) {
                if (snapshot.hasData) {
                  _detectedText = snapshot.data!;

                  return _buildPhotoView();
                } else {
                  return Center(
                    child: Text('No image selected'),
                  );
                }
              },
            ),
          ),
        ],
      ),
    );
  }

  /// Renders the image inside a [PhotoView] with the recognized text drawn
  /// on top.
  ///
  /// The overlays are children of PhotoView's customChild Stack, so
  /// PhotoView's pan/zoom transform is applied to them together with the
  /// image — this is what keeps the labels "attached" while zooming.
  Widget _buildPhotoView() {
    return PhotoView.customChild(
      backgroundDecoration: BoxDecoration(color: Colors.black),
      minScale: PhotoViewComputedScale.contained,
      maxScale: PhotoViewComputedScale.covered * 2,
      initialScale: _zoomLevel,
      child: Stack(
        children: <Widget>[
          if (_pickedImage != null)
            Image.file(
              _pickedImage!,
              fit: BoxFit.contain,
            ),
          for (var element in _detectedText)
            Positioned(
              // BUG FIX: do not multiply by _zoomLevel here. PhotoView
              // already scales this whole Stack, so the original code
              // applied the zoom twice and the labels drifted away from
              // the text as the user zoomed.
              // NOTE(review): boundingBox is in raw image-pixel
              // coordinates; if BoxFit.contain rescales the image inside
              // this Stack, an image→display scale factor is still needed
              // — TODO confirm by sizing the Stack to the image's
              // intrinsic dimensions.
              left: element.boundingBox.left,
              top: element.boundingBox.top,
              child: Text(
                element.text,
                style: TextStyle(
                  color: Colors.red,
                  fontSize: 20,
                ),
              ),
            ),
        ],
      ),
    );
  }
}
Christoph Rackwitz
  • 11,317
  • 4
  • 27
  • 36
  • this is purely a GUI issue. you need to know the transform from "image space" to viewport. then transform the bounding box likewise, then draw it. this has *nothing* to do with "tracking" of any kind. that term is reserved for actual computer vision. you don't do that. you already have your bounding box. – Christoph Rackwitz Aug 24 '23 at 18:06

0 Answers0