Ryder95 October 8, 2017 at 22:20

What to do if Instagram did not give access to the API? Addition

Hello again! I read the original article and thought it could be continued.

It's no secret that Instagram is the most popular and profitable platform for advertising, business and more. Why a service that at first only let you upload pictures with a fixed aspect ratio, and had none of the features of the social networks of the time, became so successful is completely unclear, but a fact is a fact. As a result, everyone tries to get onto the platform and capture as much of its audience as possible, and of course they do not do it by hand. In response, Instagram strictly blocks access for bots, spammers and the like, to keep the network clean.

The most useful functions (publishing and deleting posts) are available only from the Instagram mobile application. Emulating its requests is difficult, because you need to extract a key from the application, and the key changes with each new version.
The web version is cut down, but it is nice that it can like, comment and delete comments.
There is an API, but the procedure for getting access is depressingly long, and spammers and bots have no chance of getting through it. On top of that, the API's terms have changed many times, which is not always convenient.

Although I did not turn to Instagram to create yet another spam bot that follows and likes, I really did not want to bother with getting access to the Instagram API, so I had to write my own library to interact with Instagram.

I want to say that working with the Web version of Instagram is very pleasant for two reasons:

1. You can get brief information about any page if you send a GET request of the form:

https://instagram.com/zuck/?__a=1

The response is JSON with the available information, the first 10 posts of the page and more. Very nice.

2. If the brief information is not enough, there is more good news. You can load photos, followings and comments using a request of the form:

https://www.instagram.com/graphql/query/?query_id=17888483320059182&variables=...

where the variables parameter carries the query variables in JSON format. The response is also JSON. It is obvious that all of this runs on GraphQL, so you can even search for how such queries are processed.
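The two request forms above can be sketched with the standard library alone. This is only an illustration of how the URLs are assembled: the helper names `profile_url` and `graphql_url` are made up here, and the keys inside `variables` ("id", "first") are placeholders, since the text does not say which variables a given query_id expects.

```python
import json
from urllib.parse import urlencode, urlparse, parse_qs


def profile_url(username):
    """URL of the brief JSON description of a page (the ?__a=1 form)."""
    return "https://instagram.com/{}/?{}".format(username, urlencode({"__a": 1}))


def graphql_url(query_id, variables):
    """URL of a GraphQL query; `variables` is serialized to JSON."""
    params = {
        "query_id": query_id,
        "variables": json.dumps(variables, separators=(",", ":")),
    }
    return "https://www.instagram.com/graphql/query/?" + urlencode(params)


# Hypothetical variables: the keys "id" and "first" are placeholders only.
url = graphql_url(17888483320059182, {"id": "123", "first": 10})
print(profile_url("zuck"))
print(url)
```

Nothing is sent over the network here. The resulting strings would be passed to an HTTP client, and the JSON body of the response would then be parsed with `json.loads`.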

The entire library is built on this knowledge. I will briefly describe how to use it; maybe it will be useful to someone. By the way, I specified the BSD 3-Clause license in the repository. Tell me, should I change it to avoid any difficulties?

Installation

You do not need to install it. More precisely, I was too lazy to write a setup.py or package it when the library consists of only one file. So there is simply an instagram.py file that is imported like this:

import instaparser

How to use it?

You can interact with Instagram either with or without authorization. Without authorization you cannot view followings and followers and, obviously, you cannot like, comment and so on. With authorization the only restriction is that you cannot publish or delete posts.

I will give an example of interaction without authorization:


from instaparser.agents import Agent
from instaparser.entities import Account

agent = Agent()                # anonymous agent, no login
account = Account("zuck")
agent.update(account)          # load the page information first
media = agent.get_media(account, count=100)  # last 100 posts
for m in media:
    print(m)

As you can see, this script loads the information about Mark Zuckerberg’s page, downloads the last 100 posts from it and prints them.

Note that if I had not written

agent.update(account)

then downloading the posts would not work, because nothing would be known about Zuckerberg’s page yet.

And here is an example with authorization:


from instaparser.agents import AgentAccount
from instaparser.entities import Account

agent = AgentAccount("oleg_yurchik", "imasuperpassword")  # log in
agent.update()                 # load information about your own account
account = Account("zuck")
agent.update(account)
# and so on

This is the so-called "Hello, world!", or a quick start.

And now I’ll tell you more:

Instagram, in fact, has only 6 entities:

Account
Post
Location
Comment
Hashtag
Story

Everything else is just lists of these entities: likes, followings, followers and so on. There is a class for each entity: Account for accounts, Media for posts, Location for locations, Comment for comments, Tag for hashtags and Story for stories. Each of them (except comments) needs to be updated before you work with it. For example, if you want to download all your posts, like them and get a list of their locations, you need to do the following:


from instaparser.agents import AgentAccount

agent = AgentAccount('oleg.yurchik', 'anothersuperpassword')
agent.update()                 # load your own account, including media_count
media = agent.get_media(count=agent.media_count)
locations = []
for m in media:
    agent.like(m)
    agent.update(m)            # load the post details, including its location
    if m.location:
        locations.append(m.location)

And if later you need to get the last 10 posts on a specific geolocation, then you will need to do the following:


# location is one of the Location objects collected above
agent.update(location)
media = agent.get_media(location, count=10)

I had to remove the automatic update from the constructor. Otherwise, when fetching all followers for example, the program would update every one of those accounts, and that is not good.

The library is built on the requests library, and one feature I like is that the methods let you pass additional parameters through to requests. This idea came to me when I first received a 429 error from Instagram and needed to use a proxy.

For example, you can do this:


media = agent.get_media(count=agent.media_count, settings={'proxies': {'https': '127.0.0.1:80'}})

where 127.0.0.1:80 should be replaced with the address of your proxy.

Another feature that I think is worth mentioning is error handling.

The Agent and AgentAccount classes (the ones that communicate with Instagram) have a tree-like dictionary called exception_actions. Its keys are exception classes and its values are functions. If an error occurs, it is caught and the matching function from the dictionary is executed. The function receives the exception object and the parameters the request was made with. It can perform some action and return the request parameters, changed or not. The request is then repeated, as many times as the Agent.repeats parameter specifies. The default is 1.
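The mechanism described above can be illustrated with a small self-contained sketch. It is not the library's code: `RateLimited`, `switch_proxy`, `find_action`, `call_with_retries` and `fake_send` are all hypothetical names, the lookup through the exception's class hierarchy is just one possible reading of "tree-like", and `repeats` is assumed to mean the number of repeated attempts after the first failure.

```python
class RateLimited(Exception):
    """Stand-in for an HTTP 429 response (hypothetical)."""


def switch_proxy(exception, params):
    """Handler: return changed request parameters for the next attempt."""
    changed = dict(params)
    changed["proxies"] = {"https": "127.0.0.1:80"}
    return changed


def find_action(exception_actions, exception):
    """Pick the handler registered for the most specific matching class."""
    for cls in type(exception).__mro__:
        if cls in exception_actions:
            return exception_actions[cls]
    return None


def call_with_retries(send, params, exception_actions, repeats=1):
    """Call send(**params); on a handled error, adjust params and repeat."""
    attempts_left = repeats
    while True:
        try:
            return send(**params)
        except Exception as exception:
            action = find_action(exception_actions, exception)
            if action is None or attempts_left == 0:
                raise  # no handler, or no repeats left
            attempts_left -= 1
            params = action(exception, params)


def fake_send(url, proxies=None):
    """Pretend request: fails with "429" unless a proxy is set."""
    if proxies is None:
        raise RateLimited("429 Too Many Requests")
    return "ok via " + proxies["https"]


result = call_with_retries(
    fake_send,
    {"url": "https://instagram.com/zuck/?__a=1"},
    {RateLimited: switch_proxy},
    repeats=1,
)
print(result)
```

Here the first attempt fails, the handler adds a proxy to the parameters, and the single repeated attempt succeeds. With `repeats=0` the original exception would simply propagate.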

And you don’t have to worry about memory overflow.

Each entity class has a dictionary that stores all objects of that class (and even of its subclasses). So if you accidentally create, for example, an account that has already been created, the constructor returns a reference to the previously created account.
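One common way to get this behaviour in Python is to override `__new__`. The sketch below only illustrates the idea and is not taken from the library: the `Entity` base class and the `_cache` attribute are assumed names, and the real classes may be organized differently.

```python
class Entity:
    """Base class that keeps one object per key in a class-level dictionary."""
    _cache = {}

    def __new__(cls, key):
        cache = cls._cache
        if key not in cache:
            # First time this key is seen: create and remember the object.
            cache[key] = super().__new__(cls)
        return cache[key]

    def __init__(self, key):
        # Runs on every call, but only re-assigns the same key.
        self.key = key


class Account(Entity):
    _cache = {}  # accounts are stored separately from other entities


class Tag(Entity):
    _cache = {}


first = Account("zuck")
second = Account("zuck")
print(first is second)
```

Because both calls return the same object, data loaded into `first` is also visible through `second`, and no duplicate objects pile up in memory.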

If you missed the link to the repository in the text, here it is again.

And finally, I’ll say that because of some solutions, some problems appeared:

For example, there is a problem with re-creating an object. If you want to use your account as an agent and interact through it, but it was previously created as a regular account, then creating it again fails. I do not know how to solve this yet.
Error handling sometimes behaves very strangely and is not fully tested.

I really hope this solution will be useful to someone, and I would welcome any useful comments and help with finishing it. The article I mentioned gave an example of a similar script in PHP, but it only collected information and, in my opinion, worked only with the old version of the Instagram web interface.

Thanks for your attention.
