
I am using Google Document AI to recognize several types of forms, such as U.S. government forms, W2 forms, W9 forms, invoices, and receipts.

I am getting this error from Google Document AI when I try to process a form:

Unable to find a document of type 'w2_', found 1 other document types

OR

Unable to find a document of type 'w9_', found 1 other document types

I got this error when I tried to process a W2 form PDF file, and I am also getting it on some other types of PDF.

Here's my code:


const { Storage } = require('@google-cloud/storage')

// Download a file from Cloud Storage into memory and return its contents as a Buffer.
const download_pdf = async (bucketName, fileName) => {
  const storage = new Storage()
  const file = storage.bucket(bucketName).file(fileName)
  try {
    // file.download() resolves to a one-element array: [Buffer]
    const [pdfBuffer] = await file.download()
    if (!pdfBuffer) {
      throw new Error('Something went wrong when downloading file!')
    }
    return pdfBuffer
  } catch (e) {
    console.error('Error when downloading file', e)
    throw e
  }
}

// `client`, `name`, `bucket_name` and `file_name` are defined elsewhere (not shown).
const encodedImage = await download_pdf(bucket_name, file_name)

const request = {
  name,
  document: {
    content: encodedImage,
    mimeType: 'application/pdf',
  },
}

// Recognizes text entities in the PDF document
const [result] = await client.processDocument(request)
document = result?.document

At first, I thought something was wrong with my file, but when I upload it directly to Google Cloud Platform, it works there. It only fails when I process it from my code. The error is not permanent: it happens randomly. Sometimes it works and I get the recognized data, but most of the time I get this error.

I have reviewed my code twice. It follows Google's documentation, and I couldn't find any mistake.

Thank you in advance!

PDF FILE: https://pdfhost.io/v/3~UcB6x0w_W9.pdf

UPDATE: It looks like the problem is on Google's side. As of now, I am getting the same error when uploading the file directly to Document AI on Google Cloud Platform. I am still waiting for a response from the Google team.

Kashan Haider
  • Your code looks fine. I was able to reproduce your error when I used a W2 processor with a non-W2 file as input. Did you use a W2 processor for a W9 file? Since your file is a W9, you should use a W9 processor. – Ricco D Aug 17 '21 at 06:04
  • I am using both the W2 and W9 parsers and uploading the correct files. I have attached an example file in my question. – Kashan Haider Aug 17 '21 at 07:43
  • So your use case is that you used a W9 processor for the W9 file you provided and got an error? – Ricco D Aug 17 '21 at 08:03
  • The error message is a bit confusing, since it shows that the W2 processor was used to process a non-W2 file. So I assumed that the W9 file was sent to a W2 processor. Can you clarify how you did the testing? – Ricco D Aug 17 '21 at 08:07
  • I uploaded a W2 form to a W2 parser. Earlier the error was random and it sometimes worked, but now (I don't know what changed) the error is permanent on both the W9 and W2 parsers. I am uploading the correct W9 file to the W9 parser and the correct W2 file to the W2 parser, and both parsers give me the same error. With the W9 parser, the error says that my form is not a W9 form. – Kashan Haider Aug 17 '21 at 08:50
  • I created a public issue tracker entry for your error. You can add details, such as sample files that trigger the error, so that the Document AI/Vision engineering team can reproduce it easily. See issue tracker https://issuetracker.google.com/197056017 – Ricco D Aug 18 '21 at 08:24
  • If you have a support package for GCP, you can also file a support ticket. – Ricco D Aug 20 '21 at 07:31

1 Answer


As mentioned in the comments above, this error occurs when you send a document to a specialized processor that doesn't support that specific type.

For example, sending a W2 file to a W9/1099 Parser, or vice versa.

The same message also appears if you send a document with an unsupported form version.

This is listed on the Error Messages page in the documentation.

According to the documentation, the W2 Parser supports years 2018, 2019, and 2020, and the W9 Parser supports form revisions Rev. 10-2018 and Rev. 11-2017.
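
One way to rule out a processor mismatch is to pick the processor from the form type explicitly instead of passing a resource name around by hand. The following is only a minimal sketch: `PROCESSOR_IDS` and `processorNameFor` are hypothetical names, and the IDs are placeholders that you would replace with the IDs of your own processors. The resulting string is what the question's code passes as `name`.

```javascript
// Hypothetical mapping from form type to Document AI processor ID.
// The IDs are placeholders, not real processors.
const PROCESSOR_IDS = new Map([
  ['w2', 'YOUR_W2_PROCESSOR_ID'],
  ['w9', 'YOUR_W9_PROCESSOR_ID'],
]);

// Build the processor resource name for a form type, or throw if no
// processor is configured for that type.
function processorNameFor(formType, { projectId, location }) {
  const processorId = PROCESSOR_IDS.get(String(formType).toLowerCase());
  if (!processorId) {
    throw new Error(`No processor configured for form type '${formType}'`);
  }
  return `projects/${projectId}/locations/${location}/processors/${processorId}`;
}
```

Failing early on an unknown form type keeps a W9 file from ever reaching the W2 parser. It does not help with unsupported form versions, which still have to be checked against the documentation.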

I wasn't able to reproduce the behavior with your sample document when sending it to a W9 parser. The models have been updated since this was originally posted, so this could have been a transient model issue originally.
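
If the failure really is intermittent, as the question describes, it may help to tell this specific error apart from other failures before deciding whether to retry. This is a rough sketch, not an official pattern: it assumes the text quoted in the question appears in the error's `message` property, and `expectedDocType` and `processWithRetry` are hypothetical helpers.

```javascript
// Matches the message quoted in the question, e.g.
// "Unable to find a document of type 'w2_', found 1 other document types"
const DOC_TYPE_ERROR = /Unable to find a document of type '([^']*)'/;

// Return the document type the processor expected, or null if the error is
// something else. Assumes the text appears in err.message.
function expectedDocType(err) {
  const match = DOC_TYPE_ERROR.exec(String(err?.message ?? ''));
  return match ? match[1] : null;
}

// Call processFn (any async function), retrying only on the document-type
// error. Any other error is rethrown immediately.
async function processWithRetry(processFn, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await processFn();
    } catch (err) {
      const docType = expectedDocType(err);
      if (docType === null) throw err;
      lastError = err;
      console.warn(`Attempt ${attempt}: processor expected type '${docType}'`);
    }
  }
  throw lastError;
}
```

With the question's code, the call would look like `await processWithRetry(() => client.processDocument(request))`. Retrying only makes sense if the file and processor are known to match. A genuine type or version mismatch will fail on every attempt.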

Holt Skinner