Wednesday, December 15, 2021

AI Based PDF OCR using Microsoft Azure Form Recognizer

In real-world web applications, we often need to read the content of many PDF files and use it to prefill forms. To read these PDFs automatically and predict field values, Microsoft offers a cognitive service called Form Recognizer. We pass a PDF file to the service and get the extracted OCR values back as JSON, along with bounding box coordinates.

Ref: https://azure.microsoft.com/en-in/services/form-recognizer/#features

We can use a client-side library to highlight the bounding boxes in the UI on top of the page image. For this, we convert each PDF page into an image and display that image in the UI.


For example, an HTML image map can define regions over the page image:

https://www.w3schools.com/tags/tag_map.asp
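
As a rough sketch of how a bounding box could feed an image map, the snippet below turns one Form Recognizer boundingBox into an HTML area element. It assumes the box is a list of eight numbers (four x,y corner pairs), which is the order a poly area expects. The function name bounding_box_to_area and the scale parameter are made up for this illustration; scale is meant for the case where the image is displayed at a different size than the analyzed page.

```python
from html import escape


def bounding_box_to_area(bounding_box, label, scale=1.0):
    """Build an <area shape="poly"> tag from an 8-number bounding box.

    Hypothetical helper: `scale` is the ratio between the displayed image
    size and the page size reported in the OCR result.
    """
    if len(bounding_box) != 8:
        raise ValueError("expected 8 numbers: x1,y1,x2,y2,x3,y3,x4,y4")
    # Scale every coordinate and round to whole pixels.
    coords = ",".join(str(round(value * scale)) for value in bounding_box)
    safe_label = escape(label, quote=True)
    return (
        f'<area shape="poly" coords="{coords}" '
        f'alt="{safe_label}" title="{safe_label}">'
    )


if __name__ == "__main__":
    print(bounding_box_to_area([115, 135, 333, 134, 333, 176, 115, 176], "CONTOSO"))
```

The generated area tags would go inside a map element that the page image references through its usemap attribute.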


Sample extracted JSON (truncated), including the bounding box coordinates for each line and word:

{"status":"succeeded","createdDateTime":"2021-02-23T05:09:00Z","lastUpdatedDateTime":"2021-02-23T05:09:11Z","analyzeResult":{"version":"2.1.0","readResults":[{"page":1,"angle":0,"width":1700,"height":2200,"unit":"pixel","lines":[{"text":"CONTOSO LTD.","boundingBox":[114,134,466,134,466,175,115,175],"words":[{"text":"CONTOSO","boundingBox":[115,135,333,134,333,176,115,176],"confidence":0.994},{"text":"LTD.","boundingBox":[357,134,465,134,465,176,358,176],"confidence":0.994}],"appearance":{"style":{"name":"other","confidence":0.878}}},{"text":"INVOICE","boundingBox":[1410,114,1601,115,1601,155,1410,155],"words":[{"text":"INVOICE","boundingBox":[1411,115,1593,115,1592,156,1411,155],"confidence":0.995}],"appearance":{"style":{"name":"other","confidence":0.878}}},.......}
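
To make the structure easier to follow, here is a minimal sketch that walks a result of this shape and collects each word with a simple rectangle around it. The SAMPLE string is a cut-down, complete document built from values in the extract above, not the full service response, and the iter_words helper is a name invented for this example. It assumes each boundingBox holds four x,y corner pairs.

```python
import json

# Cut-down sample using values from the extract above (not the full response).
SAMPLE = """
{"status": "succeeded",
 "analyzeResult": {"version": "2.1.0", "readResults": [
   {"page": 1, "angle": 0, "width": 1700, "height": 2200, "unit": "pixel",
    "lines": [
      {"text": "CONTOSO LTD.",
       "boundingBox": [114, 134, 466, 134, 466, 175, 115, 175],
       "words": [
         {"text": "CONTOSO",
          "boundingBox": [115, 135, 333, 134, 333, 176, 115, 176],
          "confidence": 0.994},
         {"text": "LTD.",
          "boundingBox": [357, 134, 465, 134, 465, 176, 358, 176],
          "confidence": 0.994}]}]}]}}
"""


def iter_words(result):
    """Yield every word with its page, confidence and enclosing rectangle."""
    for page in result["analyzeResult"]["readResults"]:
        for line in page.get("lines", []):
            for word in line.get("words", []):
                xs = word["boundingBox"][0::2]  # x1, x2, x3, x4
                ys = word["boundingBox"][1::2]  # y1, y2, y3, y4
                yield {
                    "page": page["page"],
                    "text": word["text"],
                    "confidence": word["confidence"],
                    # (left, top, right, bottom) in the page's unit
                    "rect": (min(xs), min(ys), max(xs), max(ys)),
                }


words = list(iter_words(json.loads(SAMPLE)))

if __name__ == "__main__":
    for w in words:
        print(w["page"], w["text"], w["confidence"], w["rect"])
```

The rectangle is only an approximation of the four-corner box; for rotated or skewed scans it may be better to draw the polygon itself.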


Monday, September 13, 2021

Browser - Change Current Geo Location

To test how a page behaves for different geolocations, the browser lets us override the current location. Follow the steps below.

Browsers used: Chrome, Microsoft Edge

  1. Open Developer Tools (press F12).
  2. Click the ... (settings) icon -> More Tools -> Sensors.
  3. In the Sensors window, change the location as needed. You can also add and manage custom locations.
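
For automated tests, the manual steps above could be approximated in code. The sketch below assumes a Selenium driver for Chrome or Chromium-based Edge, which exposes execute_cdp_cmd, and uses the DevTools Protocol command Emulation.setGeolocationOverride. The two helper function names are invented for this example, and the page may still need geolocation permission granted separately.

```python
def geolocation_override_params(latitude, longitude, accuracy=100.0):
    """Validate coordinates and build the parameters for the override."""
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude must be between -90 and 90")
    if not -180.0 <= longitude <= 180.0:
        raise ValueError("longitude must be between -180 and 180")
    if accuracy < 0:
        raise ValueError("accuracy must not be negative")
    return {
        "latitude": float(latitude),
        "longitude": float(longitude),
        "accuracy": float(accuracy),  # in meters
    }


def apply_geolocation_override(driver, latitude, longitude, accuracy=100.0):
    """Send the override to a Chromium-based Selenium driver (assumed)."""
    params = geolocation_override_params(latitude, longitude, accuracy)
    driver.execute_cdp_cmd("Emulation.setGeolocationOverride", params)
    return params


if __name__ == "__main__":
    # Example values only: coordinates roughly matching central London.
    print(geolocation_override_params(51.5074, -0.1278))
```

With a real driver, the call would look like apply_geolocation_override(driver, 51.5074, -0.1278) before loading the page under test.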