As mentioned above, I created a custom model in Form Recognizer Studio and am calling it from Python via the SDK. The model worked fine last week but suddenly started failing this week. The API calls still succeed; however, the key-value pairs they return are empty.

Logs for the calls and responses:

2023-03-17 19:08:17,558 |  INFO Request URL: 'https://pedimento.cognitiveservices.azure.com/formrecognizer/documentModels/PedimentoModelCopy:analyze?pages=1&locale=en&stringIndexType=unicodeCodePoint&api-version=2022-08-31'
Request method: 'POST'
Request headers:
    'Content-Type': 'application/octet-stream'
    'Accept': 'application/json'
    'x-ms-client-request-id': '0502a4f2-c4b4-11ed-b9d7-5414f3f87608'
    'User-Agent': 'azsdk-python-ai-formrecognizer/3.2.0 Python/3.9.1 (Windows-10-10.0.19041-SP0)'
    'Ocp-Apim-Subscription-Key': 'REDACTED'
A body is sent with the request
2023-03-17 19:08:19,520 |  INFO Response status: **202**
Response headers:
    'Content-Length': '0'
    'Operation-Location': 'https://pedimento.cognitiveservices.azure.com/formrecognizer/documentModels/PedimentoModelCopy/analyzeResults/590eb38b-87b0-44d5-bb5e-f719e94e52cb?api-version=2022-08-31'
    'x-envoy-upstream-service-time': '378'
    'apim-request-id': '590eb38b-87b0-44d5-bb5e-f719e94e52cb'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'x-ms-region': 'REDACTED'
    'Date': 'Fri, 17 Mar 2023 11:08:18 GMT'
2023-03-17 19:08:19,565 |  INFO Request URL: 'https://pedimento.cognitiveservices.azure.com/formrecognizer/documentModels/PedimentoModelCopy/analyzeResults/590eb38b-87b0-44d5-bb5e-f719e94e52cb?api-version=2022-08-31'
Request method: 'GET'
Request headers:
    'x-ms-client-request-id': '0502a4f2-c4b4-11ed-b9d7-5414f3f87608'
    'User-Agent': 'azsdk-python-ai-formrecognizer/3.2.0 Python/3.9.1 (Windows-10-10.0.19041-SP0)'
    'Ocp-Apim-Subscription-Key': 'REDACTED'
No body was attached to the request
2023-03-17 19:08:19,835 |  INFO Response status: **200**
Response headers:
    'Content-Length': '106'
    'Content-Type': 'application/json; charset=utf-8'
    'x-envoy-upstream-service-time': '20'
    'apim-request-id': '0cc1299e-0c64-4540-998d-250691b456e3'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'x-ms-region': 'REDACTED'
    'Date': 'Fri, 17 Mar 2023 11:08:19 GMT'
2023-03-17 19:08:24,847 |  INFO Request URL: 'https://pedimento.cognitiveservices.azure.com/formrecognizer/documentModels/PedimentoModelCopy/analyzeResults/590eb38b-87b0-44d5-bb5e-f719e94e52cb?api-version=2022-08-31'
Request method: 'GET'
Request headers:
    'x-ms-client-request-id': '0502a4f2-c4b4-11ed-b9d7-5414f3f87608'
    'User-Agent': 'azsdk-python-ai-formrecognizer/3.2.0 Python/3.9.1 (Windows-10-10.0.19041-SP0)'
    'Ocp-Apim-Subscription-Key': 'REDACTED'
No body was attached to the request
2023-03-17 19:08:25,136 |  INFO Response status: **200**
Response headers:
    'Content-Length': '106'
    'Content-Type': 'application/json; charset=utf-8'
    'x-envoy-upstream-service-time': '34'
    'apim-request-id': 'd51bd641-e9e1-4771-b408-ad2d0757e79a'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'x-ms-region': 'REDACTED'
    'Date': 'Fri, 17 Mar 2023 11:08:24 GMT'
2023-03-17 19:08:30,142 |  INFO Request URL: 'https://pedimento.cognitiveservices.azure.com/formrecognizer/documentModels/PedimentoModelCopy/analyzeResults/590eb38b-87b0-44d5-bb5e-f719e94e52cb?api-version=2022-08-31'
Request method: 'GET'
Request headers:
    'x-ms-client-request-id': '0502a4f2-c4b4-11ed-b9d7-5414f3f87608'
    'User-Agent': 'azsdk-python-ai-formrecognizer/3.2.0 Python/3.9.1 (Windows-10-10.0.19041-SP0)'
    'Ocp-Apim-Subscription-Key': 'REDACTED'
No body was attached to the request
2023-03-17 19:08:30,886 |  INFO Response status: **200**
Response headers:
    'Content-Length': '104423'
    'Content-Type': 'application/json; charset=utf-8'
    'x-envoy-upstream-service-time': '47'
    'apim-request-id': '7c29a2a6-ccf0-48aa-a072-af374bb32794'
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains; preload'
    'x-content-type-options': 'nosniff'
    'x-ms-region': 'REDACTED'
    'Date': 'Fri, 17 Mar 2023 11:08:29 GMT'

I've tested the same documents in the Test tab of Form Recognizer Studio, and it captures all the key-value pairs correctly, so I'm stumped as to why the poller result is empty when the model is called through the SDK. Has anybody encountered this before?

The analyzed-document part of poller.result() is below. Only the first field has a value, and it is the wrong one.

documents=[AnalyzedDocument(doc_type=******, bounding_regions=[BoundingRegion(page_number=1, polygon=[Point(x=0.0, y=0.0), Point(x=8.5, y=0.0), Point(x=8.5, y=11.0), Point(x=0.0, y=11.0)])], spans=[DocumentSpan(offset=0, length=3233)], fields={'Total Impuestos': DocumentField(value_type=string, value='8000 8036', content=8000
8036, bounding_regions=[BoundingRegion(page_number=1, polygon=[Point(x=3.0062, y=3.3237), Point(x=3.7402, y=3.3237), Point(x=3.7402, y=3.3762), Point(x=3.0062, y=3.3762)])], spans=[DocumentSpan(offset=1867, length=9)], confidence=0.246), 'CVE': DocumentField(value_type=string, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.619), 'Num Pedimento': DocumentField(value_type=string, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.255), 'Modalid': DocumentField(value_type=string, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.638), 'Aduana': DocumentField(value_type=string, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.488), 'Fecha': DocumentField(value_type=date, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.617), 'Valor Aduana': DocumentField(value_type=string, value=**None**, content=None, bounding_regions=[], spans=[], confidence=0.635)}, confidence=0.001)])
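
To make the dump above easier to compare against the Studio Test tab, the fields can be summarised one per line. This is a minimal sketch, assuming `fields` is the `fields` mapping of an `AnalyzedDocument`, where each entry exposes `value`, `content` and `confidence` as in the output above. The helper name `summarize_fields` and the stand-in objects in the usage lines are hypothetical and exist only for illustration:

```python
from types import SimpleNamespace


def summarize_fields(fields):
    """Return (name, value, confidence) rows plus the names of fields with no value."""
    rows = []
    empty = []
    for name, field in fields.items():
        rows.append((name, field.value, field.confidence))
        # In the output above, fields without an extracted value show value=None.
        if field.value is None:
            empty.append(name)
    return rows, empty


# Stand-in objects mirroring two of the fields from the output above;
# with the SDK this would be poller.result().documents[0].fields.
sample_fields = {
    "Total Impuestos": SimpleNamespace(
        value="8000 8036", content="8000\n8036", confidence=0.246
    ),
    "CVE": SimpleNamespace(value=None, content=None, confidence=0.619),
}

rows, empty = summarize_fields(sample_fields)
for name, value, confidence in rows:
    print(f"{name!r}: value={value!r}, confidence={confidence}")
print("Fields with no value:", empty)
```

Running the same helper on the SDK result and reading the values off the Test tab side by side should show which fields differ.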

Form Recognizer Studio Test tab: the model picks up the key-value pairs correctly. I've redacted the image because this is a proof of concept, but the confidence values are high and it shows the relevant data.

The get_document_model() response:

DocumentModelDetails(model_id=PedimentoModel, description=Mexcio Ops | Novum Imports | Pedimento Forms, created_on=2023-02-27 07:54:29+00:00, api_version=2022-08-31, tags={}, doc_types={'PedimentoModel': DocumentTypeDetails(description=None, build_mode=neural, field_schema={'Num Pedimento': {'type': 'string'}, 'CVE': {'type': 'string'}, 'Aduana': {'type': 'string'}, 'Modalid': {'type': 'string'}, 'Valor Aduana': {'type': 'string'}, 'Fecha': {'type': 'date'}, 'Total Impuestos': {'type': 'string'}}, field_confidence={})})
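
The request URLs in the log refer to `PedimentoModelCopy`, while the model details above are for `PedimentoModel`. Whether that difference is related to the empty fields is not established here, but it is cheap to check that the model being analyzed is the one being inspected. A minimal sketch, using only the standard library; the helper name `model_id_from_analyze_url` is hypothetical, and the URL is copied from the log above:

```python
from urllib.parse import unquote, urlparse


def model_id_from_analyze_url(url):
    """Extract the model ID from a .../documentModels/<id>:analyze or
    .../documentModels/<id>/analyzeResults/... URL, or None if absent."""
    path = unquote(urlparse(url).path)
    marker = "/documentModels/"
    if marker not in path:
        return None
    tail = path.split(marker, 1)[1]
    # The ID ends at ':analyze' on the POST URL or at '/' on the polling URL.
    return tail.split(":", 1)[0].split("/", 1)[0]


post_url = (
    "https://pedimento.cognitiveservices.azure.com/formrecognizer/"
    "documentModels/PedimentoModelCopy:analyze"
    "?pages=1&locale=en&stringIndexType=unicodeCodePoint&api-version=2022-08-31"
)
inspected_model_id = "PedimentoModel"  # model_id from the details above

analyzed_model_id = model_id_from_analyze_url(post_url)
if analyzed_model_id != inspected_model_id:
    print(f"Analyzed {analyzed_model_id!r} but inspected {inspected_model_id!r}")
```

If the two IDs differ, the details of the model that is actually being analyzed are the ones to compare against the Studio project.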
kim16
  • Hi @KimKim16 can you share the API version that you are using in the Form Recognizer Studio? You should find it on the upper left corner of the page in a dropdown menu. – Krista Mar 21 '23 at 00:04
  • Hi @Krista. API version is :: 2022-08-31 (General Availability). I'm using the DocumentAnalysisClient package from azure.ai.formrecognizer – kim16 Mar 22 '23 at 03:47
  • Thanks. Would you also be able to share the output from DocumentModelAdministrationClient.get_document_model()? – Krista Mar 22 '23 at 16:49
  • Hi @Krista, shared the response above. Thanks! – kim16 Mar 29 '23 at 07:04
  • I'm having trouble reproducing this in the SDK. What I tried: trained a neural model using FR studio (2022-08-31) with 5 documents and 6 fields labeled. I then used the model_id to analyze a document, but I'm seeing the fields and their values returned correctly. In your test tab screenshot, as far as I can tell, the values for the fields are not being returned either (similar to how you see `None` in the SDK). For example, the model I trained showed the values under the Field labels: https://user-images.githubusercontent.com/31998003/229250142-6bb92ab5-5b48-4635-aafe-fc12bc6c7ee6.png. – Krista Mar 31 '23 at 23:37
