Thursday, April 9, 2020

Microsoft Azure Vision API - Computer Vision

Vision API is one of the Cognitive Services APIs provided by Microsoft Azure to help AI developers either build their own dedicated machine learning model or use a pre-canned, pre-trained version. Developers can add machine learning features to their applications without much direct AI or data science knowledge.

Vision API includes Computer Vision, Face, Content Moderator, Video Indexer, and Custom Vision. In this post, we explore how to use the Azure APIs to extract hidden data from images with the Computer Vision API and the Face API.

Prerequisites

Create a Microsoft account - Create one here

Create a valid subscription key for Computer Vision and Face detection. Create one for free; the free trial is valid for 30 days before you need to upgrade.

Create an Azure Cognitive Services resource - Create one here. The resource gives us a key and an endpoint URL that allow us to call the APIs.

Now, go to the Azure portal, log in with your Microsoft account, and list the created resources. You should see something similar to the screen below:

We are going to call those APIs with Python and a Jupyter Notebook.

Call Computer Vision Service

- Prepare an image: it can be located on your computer or on the internet. I chose a picture of the Eiffel Tower in Paris, which I pass by every morning.





- Get the API key and endpoint that you created in the previous step to authenticate your application and start sending calls to the service.
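In Python these are just two variables. The values below are placeholders, not real credentials; copy yours from the resource's Keys and Endpoint page in the Azure portal:

```python
# Placeholder credentials: replace with the key and endpoint shown on your
# Cognitive Services resource's "Keys and Endpoint" page.
subscription_key = "<your-subscription-key>"
endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"
```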





- Once we have the endpoint, we build the complete URI request to access the Computer Vision service:
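The analyze URL is the resource endpoint plus a versioned operation path. A minimal sketch, assuming API version v3.0 (current around the time of this post; check the docs for the version you use):

```python
# Placeholder endpoint from the previous step.
endpoint = "https://<your-resource-name>.cognitiveservices.azure.com/"

# Append the versioned path for the image-analysis operation.
# The "v3.0" segment is an assumption; adjust to the current API version.
analyze_url = endpoint + "vision/v3.0/analyze"
```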


- Set up the request header with the subscription key, along with the request URL, parameters, and data object. They are structured as dictionaries.
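A sketch of those dictionaries, with a placeholder key and a hypothetical image URL standing in for the Eiffel Tower photo:

```python
subscription_key = "<your-subscription-key>"            # placeholder
image_url = "https://example.com/eiffel-tower.jpg"      # hypothetical image URL

# The header carries the key, the params select which visual features to
# extract, and the JSON body points the service at the image.
headers = {"Ocp-Apim-Subscription-Key": subscription_key}
params = {"visualFeatures": "Categories,Description,Color"}
data = {"url": image_url}
```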






- Let's send the request to the server with the help of requests, a Python package.
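With the requests package the call is a single POST. The snippet below only prepares the request (no network traffic) so the final URL can be inspected without a valid key; the endpoint host is a hypothetical regional example, and the commented line shows the actual call you would make:

```python
import requests

analyze_url = "https://westeurope.api.cognitive.microsoft.com/vision/v3.0/analyze"  # example region
headers = {"Ocp-Apim-Subscription-Key": "<your-subscription-key>"}  # placeholder
params = {"visualFeatures": "Categories,Description,Color"}
data = {"url": "https://example.com/eiffel-tower.jpg"}  # hypothetical image URL

# Prepare (but do not send) the request so it can be inspected offline.
# With a valid subscription you would instead run:
#   response = requests.post(analyze_url, headers=headers, params=params, json=data)
#   response.raise_for_status()
prepared = requests.Request(
    "POST", analyze_url, headers=headers, params=params, json=data
).prepare()
print(prepared.method, prepared.url)
```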




- Notice that we got back a Response object with a status code of 200. This means the request was sent successfully and we received data in the response.
- We assign response.json() to a variable named analysis, print its type (which is a dictionary), and print the analysis itself as well.







- The returned data consists of the categories, tags, description, captions, and metadata of the image. The picture is categorized as a building with a confidence score of 36%. In the description, information such as outdoor, city, grass, sheep, view, standing, white, river, group, large, water, and building is extracted. From that information, a caption is generated: 'a large body of water with a city in the background', with a confidence of 90%. Sounds funny!
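Those fields can be pulled straight out of the analysis dictionary. The sample below hard-codes a trimmed stand-in for response.json(), using the values reported above (the metadata dimensions are hypothetical), so the lookups can be shown without calling the service:

```python
# Trimmed-down stand-in for response.json(), mirroring this post's results.
analysis = {
    "categories": [{"name": "building_", "score": 0.36}],
    "description": {
        "tags": ["outdoor", "city", "grass", "sheep", "water", "building"],
        "captions": [
            {"text": "a large body of water with a city in the background",
             "confidence": 0.90},
        ],
    },
    "metadata": {"width": 1280, "height": 853, "format": "Jpeg"},  # hypothetical dims
}

# The generated caption and its confidence live under description.captions.
caption = analysis["description"]["captions"][0]["text"]
confidence = analysis["description"]["captions"][0]["confidence"]
print(f"{caption} ({confidence:.0%})")
```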

- Let's use the metadata to plot the picture. We're going to see what this 'large body of water' in a city looks like.
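A plotting sketch with matplotlib, assuming the caption and metadata from the sample above. In the notebook you would download the image bytes and pass them to imshow; here a blank array with the reported (hypothetical) dimensions stands in for the photo so the snippet runs offline:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headless
import matplotlib.pyplot as plt
import numpy as np

caption = "a large body of water with a city in the background"
metadata = {"width": 1280, "height": 853}  # hypothetical dimensions

# Stand-in for the downloaded photo; replace with the real image array.
image = np.zeros((metadata["height"], metadata["width"], 3), dtype=np.uint8)

# Size the figure to the image's aspect ratio and use the caption as title.
fig, ax = plt.subplots(figsize=(8, 8 * metadata["height"] / metadata["width"]))
ax.imshow(image)
ax.axis("off")
ax.set_title(caption, fontsize=12)
```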

View the Python code and Jupyter Notebook on GitHub





