Export comments and ratings from Google Play for analysis

Not everyone knows this, but Google Play offers a completely legitimate way to download all the comments and ratings for your application as CSV files, which you can then analyze in ways that Google's own tools do not offer. The download is done with gsutil, an external utility written in Python. This post is a short guide on how to do it.

  1. Install Python 2.6 or 2.7 if it is not already on the system. Installation instructions are on the official website; the process is straightforward on any OS.

  2. Download gsutil and unpack it to any folder (direct link to zip, direct link to tar.gz).

  3. Open a console in the gsutil folder and keep working in it. (On Windows, Shift + right-click inside the folder and select “Open command window”; users of other operating systems already know how to do this, since they managed to install them.)

  4. Run the command
    gsutil.py update

    to update the utility. The archive usually already contains a fresh version, but you never know.

  5. Run the command
    gsutil.py config

    The utility prints a message containing a link. What to do next is, again, obvious: copy the link into the browser.

  6. Following the link takes us to a page where we allow the application to work with our account.

  7. Copy the code shown on that page into the console.

  8. In response, we get another message that asks us to select a default project from the Google Developer Console. This is not necessary, especially since the project we need may simply not be there. However, the script will not accept an empty string, so just enter something, for example “1”, and finish the setup.

  9. On the Ratings and Reviews page of your application, find the report bucket identifier, which starts with pubsite_prod_rev_, for example pubsite_prod_rev_1234567890123456789.

  10. Now nothing stops us from downloading all the reports. To do this, just run the command
    gsutil.py -m cp gs://pubsite_prod_rev_1234567890123456789/reviews/*.* destination_folder_path

    In our case:
    gsutil.py -m cp gs://pubsite_prod_rev_1234567890123456789/reviews/*.* C:\Python27\texts

    -m - flag to copy in several parallel threads
    cp - command to copy files.
    Details about the utility commands can be found in the gsutil documentation.

  11. As a result, the folder fills up with files named according to the scheme reviews_[application_package_name]_YYYYMM (Y - year, M - month), covering all applications linked to your account at once.

    Of course, if you need reports for only one application, you can download the data with a command of the form
    gsutil.py -m cp gs://pubsite_prod_rev_1234567890123456789/reviews/*com.new_program*.* C:\Python27\texts

    But I think the filtering principle is already clear.

  12. In principle, everything is already in place, but working with a pile of separate files is inconvenient, so we merge them into a single file with a simple Python script. The files could also be joined with a plain copy command, but then the header would be repeated for every file, which is unpleasant.

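    import os
    import codecs

    # Collect every report in the current folder, skipping the merged output file
    files = os.listdir(".")
    csvs = [x for x in files if x.endswith(".csv") and x != "all_csv.csv"]
    file_write = codecs.open('all_csv.csv', 'w', 'utf-16')
    header_written = False
    for file_name in csvs:
        file_read = codecs.open(file_name, 'r', 'utf-16')
        lines_count = 0
        for line in file_read:
            lines_count += 1
            if lines_count == 1:
                # The first line of each file is the header: write it only once
                if header_written:
                    continue
                header_written = True
            file_write.write(line)
        file_read.close()
    file_write.close()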
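
The naming scheme from step 11 can also be used to pick out the files for one application or one year before merging them. Below is a minimal sketch of that idea. It assumes the files end in .csv and follow the reviews_[application_package_name]_YYYYMM pattern exactly; the function names parse_report_name and select_reports are hypothetical helpers introduced only for this illustration.

```python
import re

# Assumed file-name scheme from step 11: reviews_<package_name>_YYYYMM.csv
REPORT_NAME = re.compile(
    r"^reviews_(?P<package>.+)_(?P<year>\d{4})(?P<month>\d{2})\.csv$"
)


def parse_report_name(file_name):
    """Return (package, year, month) for a report file name, or None if it does not match."""
    match = REPORT_NAME.match(file_name)
    if match is None:
        return None
    return match.group("package"), int(match.group("year")), int(match.group("month"))


def select_reports(file_names, package, year=None):
    """Keep only the reports of one package, optionally restricted to one year."""
    selected = []
    for name in file_names:
        parsed = parse_report_name(name)
        if parsed is None:
            continue  # not a report file, e.g. the merged all_csv.csv
        file_package, file_year, _month = parsed
        if file_package == package and (year is None or file_year == year):
            selected.append(name)
    return sorted(selected)
```

For example, select_reports(os.listdir("."), "com.new_program", year=2014) would return only that application's 2014 reports, which could then be fed to the merge loop instead of the full csvs list.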

Well, that seems to be all. The resulting file can be opened in Excel.

What kind of analytics to build on the resulting file is up to you. :)
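
As one possible starting point, here is a minimal sketch that counts how many reviews gave each star rating in the merged file and computes the average. It assumes Python 3, a UTF-16 encoded all_csv.csv as produced above, and a rating column named "Star Rating"; that column name is an assumption, so check the header of your own file and pass the real name if it differs. The functions rating_distribution and average_rating are hypothetical helpers, not part of gsutil.

```python
import csv
import io
from collections import Counter


def rating_distribution(csv_path, rating_column="Star Rating", encoding="utf-16"):
    """Count how many reviews gave each star rating in the merged report."""
    counts = Counter()
    with io.open(csv_path, "r", encoding=encoding, newline="") as handle:
        for row in csv.DictReader(handle):
            value = (row.get(rating_column) or "").strip()
            if value.isdigit():  # skip empty or malformed rating cells
                counts[int(value)] += 1
    return counts


def average_rating(counts):
    """Return the mean star rating, or None if there are no rated reviews."""
    total = sum(counts.values())
    if total == 0:
        return None
    return sum(stars * n for stars, n in counts.items()) / float(total)
```

Calling rating_distribution("all_csv.csv") returns a Counter keyed by star value, which can be passed straight to average_rating.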
