Mayflower May 7, 2016 at 17:41

We use the undocumented site API captionbot.ai

Tutorial

In this article we will discuss how to get and use the site API if there is no documentation on it or it is not yet officially open. The guide is written for beginners who have not yet tried to revert a simple API. For those who were engaged in similar things, there is nothing new here.

We will analyze using the API service https://www.captionbot.ai/ that Microsoft recently opened (thanks to them for this). Many could read about him in an article on Geektimes . The site uses ajax requests in JSON format, so copying them will be easy and pleasant. Go!

KDPV

We analyze requests

First of all, we open the developer tools and analyze the requests that the site sends to the server.

In our case, all requests that interest us have a base URL https://www.captionbot.ai/api

Initialization

When you first open the site there is a GET request for /api/initno parameters.
The answer is Content-Type: application/json, while in the body of the answer we get just a line of the form:

"54cER5HILuE"

Remember this and move on.

URL Submission

We have two ways to upload an image: through the URL and through the file upload. For the test, we take the URL of the Lena image from the wiki and send it. In network activity, a POST request appears /api/messagewith the following parameters:

{
    "conversationId": "54cER5HILuE",
    "waterMark": "",
    "userMessage": "https://upload.wikimedia.org/wikipedia/ru/2/24/Lenna.png"
}

Yeah, we tell ourselves, it means the method initreturned us a string for conversationId, and userMessageour link hit. What is waterMarkstill unclear. We look at the response data:

"{\"ConversationId\":null,\"WaterMark\":\"131071012038902294\",\"UserMessage\":\"I am not really confident, but I think it's a woman wearing a hat\\nand she seems . \",\"Status\":null}"

For some reason, they encoded JSON twice, but oh well. In human form, it looks like this:

{
    "ConversationId": null,
    "WaterMark": "131071012038902294",
    "UserMessage": "I am not really confident, but I think it's a woman wearing a hat\\nand she seems .",
    "Status": null
}

All parameters along the way changed the way of writing, but these are trifles of life. So, some value was returned to us, for some WaterMarkreason it’s empty ConversationId, actually the signature for the photo in the field UserMessageand some empty status.

Image upload

Further, without closing the tab, we try the same operation with loading photos from a local file. We see the POST request /api/uploadin the format multipart/form-datawith the field name file:

-----------------------------50022246920687
Content-Disposition: form-data; name="file"; filename="Lenna.png"
Content-Type: image/png

In response, we get the URL string of our downloaded file, we can go through it and verify this:

"https://captionbot.blob.core.windows.net/images-container/2ogw3q4m.png"

Then we send a request we are already familiar with /api/message:

{
    "conversationId": "54cER5HILuE",
    "waterMark": "131071012038902294",
    "userMessage": "https://captionbot.blob.core.windows.net/images-container/2ogw3q4m.png"
}

That came in handy waterMarkfrom the previous answer, and the URL is the one that the method returned to us upload. The response data is similar to the previous ones.

Writing a wrapper

To use this knowledge with convenience, we make a simple wrapper in your favorite programming language. I will do it in Python. For requests to the site I use requests, as it is convenient and it has sessions that store cookies for me. The site uses SSL, but by default requests will swear on the certificate:

hostname 'www.captionbot.ai' doesn't match either of '*.azurewebsites.net', '*.scm.azurewebsites.net', '*.azure-mobile.net', '*.scm.azure-mobile.net'

This is solved by setting the verify = False flag for each request.

Full source

import json
import mimetypes
import os
import requests
import logging
logger = logging.getLogger("captionbot")
class CaptionBot:
    BASE_URL = "https://www.captionbot.ai/api/"
    def __init__(self):
        self.session = requests.Session()
        url = self.BASE_URL + "init"
        resp = self.session.get(url, verify=False)
        logger.debug("init: {}".format(resp))
        self.conversation_id = json.loads(resp.text)
        self.watermark = ''
    def _upload(self, filename):
        url = self.BASE_URL + "upload"
        mime = mimetypes.guess_type(filename)[0]
        name = os.path.basename(filename)
        files = {'file': (name, open(filename, 'rb'), mime)}
        resp = self.session.post(url, files=files, verify=False)
        logger.debug("upload: {}".format(resp))
        return json.loads(resp.text)
    def url_caption(self, image_url):
        data = json.dumps({
            "userMessage": image_url,
            "conversationId": self.conversation_id,
            "waterMark": self.watermark
        })
        headers = {
            "Content-Type": "application/json"
        }
        url = self.BASE_URL + "message"
        resp = self.session.post(url, data=data, headers=headers, verify=False)
        logger.debug("url_caption: {}".format(resp))
        if not resp.ok:
            return None
        res = json.loads(json.loads(resp.text))
        self.watermark = res.get("WaterMark")
        return res.get("UserMessage")
    def file_caption(self, filename):
        upload_filename = self._upload(filename)
        return self.url_caption(upload_filename)

The source code is on Github , plus the finished package in pip .

Tags: