Backing up web projects to Yandex.Disk

In early childhood I did not understand the importance of backing up data. But as they say, understanding comes with experience, and the experience is often bitter. In my case, my hosting provider twice destroyed the database of the MathInfinity website, which I had created back in my student days.

Large projects can afford to set aside entire servers for backup. However, there are a huge number of small projects that run purely on their authors' enthusiasm. These projects need backups too.

The idea of keeping archives on services like Dropbox, Ubuntu One, Yandex.Disk, Google Drive, etc. has long attracted me. They offer dozens of gigabytes of free space that can, in theory, be used for backups.

Now I have implemented this idea for the first time. I chose Yandex.Disk as the service for storing the archives.

I make no claim that the idea is original. And, of course, before reinventing the wheel I searched the Internet for ready-made solutions. All the code I found either no longer worked or was completely unreadable, and I prefer to understand how my applications work.

I will not say that the Yandex services APIs have excellent documentation. However, there are examples and references to specific standards. That was enough.

After studying the problem, I broke the backup task down into the following steps:

  1. Application registration
  2. Authorization in Yandex with OAuth
  3. Operations with Yandex.Disk
  4. Creating and sending a backup to Yandex.Disk
  5. Running the backup from cron


The last two steps are routine, but I decided to include them in the description anyway.

I have been using the Limb framework for a long time. To avoid reinventing the wheel, the classes given below are built on this framework. All classes and functions with the lmb prefix are standard Limb classes and functions.

Application Registration


First you need to register your application. The process is very simple and is described in the Yandex documentation.
You fill out a short form in which, among other things, you must grant the application permission to access Yandex.Disk. Once the form is submitted, you receive an application ID and an application password, which are needed to obtain a token. The whole process took me 3 minutes.

Authorization in Yandex with OAuth


To perform disk operations, you must supply an OAuth token. The OAuth standard describes several ways of obtaining one. I decided to take the easiest route. According to section 4.3.2 of the OAuth standard, a token can be obtained by a direct request to the service using the username and password of a Yandex account (any account will do).
A short search through the documentation was enough to write the following class:

Token retrieval class code
class YaAuth
{
  protected $token;
  protected $error;
  protected $create_time;
  protected $ttl;
  protected $app_id;
  protected $conf;
  protected $logger;
  function __construct($conf,$logger)
  {
    $this->logger = $logger;
    $this->app_id = $conf->get('oauth_app_id');
    $this->clear();
    $this->conf = $conf;
  }
  function getToken()
  {
    if($this->checkToken())
      return $this->token;
    $url = $this->conf->get('oauth_token_url');
    $curl = lmbToolkit::instance()->getCurlRequest();
    $curl->setOpt(CURLOPT_HEADER,0);
    $curl->setOpt(CURLOPT_REFERER,$this->conf->get('oauth_referer_url'));
    $curl->setOpt(CURLOPT_URL,$url);
    $curl->setOpt(CURLOPT_CONNECTTIMEOUT,1);
    $curl->setOpt(CURLOPT_FRESH_CONNECT,1);
    $curl->setOpt(CURLOPT_RETURNTRANSFER,1);
    $curl->setOpt(CURLOPT_FORBID_REUSE,1);
    $curl->setOpt(CURLOPT_TIMEOUT,4);
    $curl->setOpt(CURLOPT_SSL_VERIFYPEER,false);
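    // http_build_query() URL-encodes every value, so special characters in the password are safe
    $post = http_build_query(array(
      'grant_type'    => 'password',
      'client_id'     => $this->conf->get('oauth_app_id'),
      'client_secret' => $this->conf->get('oauth_app_secret'),
      'username'      => $this->conf->get('oauth_login'),
      'password'      => $this->conf->get('oauth_password'),
    ), '', '&');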
    $header = array(/*'Host: oauth.yandex.ru',*/
                    'Content-type: application/x-www-form-urlencoded',
                    'Content-Length: '.strlen($post)
                   );
    $curl->setOpt(CURLOPT_HTTPHEADER,$header);
    $json = $curl->open($post);
    if(!$json)
    {
      $this->error = $curl->getError();
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    $http_code = $curl->getRequestStatus();
    if(($http_code!='200') && ($http_code!='400'))
    {
      $this->error = "Request Status is ".$http_code;
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    $result = json_decode($json, true);
    if (isset($result['error']) && ($result['error'] != ''))
    {
      $this->error = $result['error'];
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    $this->token = $result['access_token'];
    $this->ttl = (int)$result['expires_in']; 
    $this->create_time = (int)time();
    return $this->token;
  }
  function clear()
  {
    $this->token = '';
    $this->error = '';
    $this->create_time = 0;
    $this->ttl = -1;
  }
  function checkToken()
  {
    if ($this->ttl <= 0) return false;
    if (time()>($this->ttl+$this->create_time))
    {
      $this->error = 'token_outdated';
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    return true;
  }
  function getError()
  {
    return $this->error;
  }
}



All parameters required for authorization are kept in the config. Any object that supports get and set methods can act as the config.
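As a rough illustration of that requirement, a minimal array-backed config might look like the sketch below. The class name ArrayConfig and all sample values are hypothetical; only the key names are taken from the YaAuth class, and the token URL is an assumption that should be checked against the Yandex documentation.

```php
<?php
// Hypothetical array-backed config: anything with get() and set() would do.
class ArrayConfig
{
  protected $values;

  function __construct(array $values = array())
  {
    $this->values = $values;
  }

  // Returns the stored value, or $default when the key is missing.
  function get($name, $default = null)
  {
    return array_key_exists($name, $this->values) ? $this->values[$name] : $default;
  }

  function set($name, $value)
  {
    $this->values[$name] = $value;
  }
}

// Placeholder values; the keys are the ones YaAuth reads.
$conf = new ArrayConfig(array(
  'oauth_app_id'     => 'your-app-id',
  'oauth_app_secret' => 'your-app-password',
  'oauth_login'      => 'your-login',
  'oauth_password'   => 'your-password',
  'oauth_token_url'  => 'https://oauth.yandex.ru/token', // assumed endpoint
));
```

An object like this could be passed to the constructor in place of a Limb config, as long as it answers get() for every key the class asks for.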
To keep a log of the actions performed, a logger object is passed to the class constructor. Its code can be found in the archive with the example.
The class has two main methods, getToken and checkToken. The first performs a cURL request for a token, and the second checks whether the token has expired.
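The two checks can also be looked at in isolation from cURL and Limb. The sketch below is not part of the original class: the function names and the invalid_json and no_token labels are made up for illustration, while the field names error, access_token and expires_in are the ones getToken reads.

```php
<?php
// Turns the JSON body of a token response into a small result array.
// On failure the array has a single 'error' key.
function parseTokenResponse($json, $now)
{
  $result = json_decode($json, true);
  if (!is_array($result))
    return array('error' => 'invalid_json'); // hypothetical label
  if (!empty($result['error']))
    return array('error' => $result['error']);
  if (empty($result['access_token']))
    return array('error' => 'no_token');     // hypothetical label
  $ttl = isset($result['expires_in']) ? (int)$result['expires_in'] : 0;
  return array(
    'token'      => $result['access_token'],
    'ttl'        => $ttl,
    'expires_at' => $now + $ttl,
  );
}

// Same rule as checkToken(): the token needs a positive lifetime
// and is usable until create_time + ttl.
function isTokenFresh(array $parsed, $now)
{
  return isset($parsed['expires_at'])
      && $parsed['ttl'] > 0
      && $now <= $parsed['expires_at'];
}
```

Passing the current time in as an argument, rather than calling time() inside, makes the expiry rule easy to check with fixed numbers.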

Operations with Yandex.Disk


After receiving the token, you can perform operations with Yandex.Disk.
Yandex.Disk supports many different requests. The following operations are the ones I need:
  • Creating a folder
  • Uploading a file to Yandex.Disk
  • Deleting a file from Yandex.Disk
  • Downloading a file from Yandex.Disk
  • Getting a list of the objects contained in a folder
  • Determining whether an object exists on the disk and what type it is

All operations are performed using cURL. Of course, all of this could be done with sockets, but simplicity of the code matters to me. All operations with Yandex.Disk follow the WebDAV protocol. The Yandex.Disk API documentation contains detailed examples of requests and of the responses to them. The code of the class for working with the disk is shown below:
Disk class code
class YaDisk
{ 
  protected $auth;
  protected $config;
  protected $error;
  protected $token;
  protected $logger;
  protected $url;
  function __construct($token,$config,$logger)
  {
    // no auth object is passed in; the ready-made token is used instead, so $this->auth stays unset
    $this->config = $config; 
    $this->token = $token;
    $this->logger = $logger;
  } 
  function getCurl($server_dst)
  {
    $curl = lmbToolkit::instance()->getCurlRequest();
    $curl->setOpt(CURLOPT_SSL_VERIFYPEER,false);
    $curl->setOpt(CURLOPT_PORT,$this->config->get('disk_port'));
    $curl->setOpt(CURLOPT_CONNECTTIMEOUT,2);
    $curl->setOpt(CURLOPT_RETURNTRANSFER,1);
    $curl->setOpt(CURLOPT_HEADER, 0);
    $curl->setOpt(CURLOPT_HTTP_VERSION,CURL_HTTP_VERSION_1_1);
    $uri = new lmbUri($this->config->get('disk_server_url'));
    $uri = $uri->setPath($server_dst)->toString();
    $curl->setOpt(CURLOPT_URL,$uri);
    $header = array('Accept: */*',
                    "Authorization: OAuth {$this->token}"
                   );
    $curl->setOpt(CURLOPT_HTTPHEADER,$header);
    return $curl;
  }
  function getResult($curl, $codes = array())
  {
    if($curl->getError())
    {
      $this->error = $curl->getError();
      echo $this->error;
      $this->logger->log('','ERROR', $this->error);
      return false;
    } 
    else
    {
      if (!in_array($curl->getRequestStatus(),$codes))
      {
        $this->error = 'Response http error:'.$curl->getRequestStatus();
        $this->logger->log('','ERROR', $this->error);
        return false;
      }
      else
      {
        return true;
      }
    }
  }
  function mkdir($server_dst)
  {
    $curl = $this->getCurl($server_dst);
    $curl->setOpt(CURLOPT_CUSTOMREQUEST,"MKCOL");
    $response = $curl->open();
    return $this->getResult($curl, array(201,405)); // 405 is returned if the folder already exists on the server
  }
  function upload($local_src,$server_dst)
  {
    $local_file = fopen($local_src,"r");
    $curl = $this->getCurl($server_dst);
    //$curl->setOpt(CURLOPT_CUSTOMREQUEST,"PUT");
    $curl->setOpt(CURLOPT_PUT, 1);
    $curl->setOpt(CURLOPT_INFILE,$local_file);
    $curl->setOpt(CURLOPT_INFILESIZE, filesize($local_src));
    $header = array('Accept: */*',
                    "Authorization: OAuth {$this->token}",
                    'Expect: '
                   );
    $curl->setOpt(CURLOPT_HTTPHEADER,$header);
    $response = $curl->open();
    fclose($local_file);
    return $this->getResult($curl, array(200,201,204));    
  }
  function download($server_src,$local_dst)
  {
    $local_file = fopen($local_dst,"w");
    $curl = $this->getCurl($server_src);
    $curl->setOpt(CURLOPT_HTTPGET, 1);
    $curl->setOpt(CURLOPT_HEADER, 0);
    $curl->setOpt(CURLOPT_FILE,$local_file);
    $response = $curl->open();
    fclose($local_file);
    return $this->getResult($curl, array(200));    
  }
  function rm($server_src)
  {
    $curl = $this->getCurl($server_src);
    $curl->setOpt(CURLOPT_CUSTOMREQUEST,"DELETE");
    $response = $curl->open();
    return $this->getResult($curl, array(200));    
  }  
  function ls($server_src)
  {
    $curl = $this->getCurl($server_src);
    $curl->setOpt(CURLOPT_CUSTOMREQUEST,"PROPFIND");
    $header = array('Accept: */*',
                    "Authorization: OAuth {$this->token}",
                    'Depth: 1',
                   );
    $curl->setOpt(CURLOPT_HTTPHEADER,$header);
    $response = $curl->open();
    if($this->getResult($curl, array(207)))
    {
      $xml = simplexml_load_string($response,"SimpleXMLElement" ,0,"d",true);
      $list = array();
      foreach($xml as $item)
      {
        if(isset($item->propstat->prop->resourcetype->collection))
          $type = 'd';
        else
          $type = 'f';
        $list[]=array('href'=>(string)$item->href,'type'=>$type);
      }
      return $list; 
    }
    return false;    
  }
  //Ugly. 
  function exists($server_src)
  { 
    $path = dirname($server_src);
    $list = $this->ls($path);
    if($list === false)
    {
      $this->error = 'Cannot get the file list';
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    foreach($list as $item)
      if(rtrim($item['href'],'/')==rtrim($server_src,'/'))
        return true;
    return false;
  }
  //Ugly.
  function is_file($server_src)
  { 
    $path = dirname($server_src);
    $list = $this->ls($path);
    if($list === false)
    {
      $this->error = 'Cannot get the file list';
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    foreach($list as $item)
      if( (rtrim($item['href'],'/')==rtrim($server_src,'/') ) && ($item['type']=='f') )
        return true;
    return false;
  }
  //Ugly. 
  function is_dir($server_src)
  { 
    $path = dirname($server_src);
    $list = $this->ls($path);
    if($list === false)
    {
      $this->error = 'Cannot get the file list';
      $this->logger->log('','ERROR', $this->error);
      return false;
    }
    foreach($list as $item)
      if( (rtrim($item['href'],'/')==rtrim($server_src,'/') ) && ($item['type']=='d') )
        return true;
    return false;
  }
}


The class methods have self-explanatory names (mkdir, upload, download, ls, rm), so we will not dwell on them in detail. Each one comes down to building and executing a request with cURL. Every request must carry the token obtained above.
To be honest, I was too lazy to parse the responses in full. The code therefore only checks the HTTP status of the response: if it matches an expected one, the operation is considered successful. Otherwise the error is written to the log.
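The status check can be summed up as a lookup table. The sketch below simply collects the codes that the YaDisk methods pass to getResult; the function names are hypothetical and it is not part of the original class.

```php
<?php
// Expected HTTP status codes per operation, copied from the YaDisk methods.
function expectedStatuses()
{
  return array(
    'mkdir'    => array(201, 405), // 405: the folder already exists
    'upload'   => array(200, 201, 204),
    'download' => array(200),
    'rm'       => array(200),
    'ls'       => array(207),      // WebDAV Multi-Status
  );
}

// True when $status is one of the codes the given operation treats as success.
function isExpectedStatus($operation, $status)
{
  $map = expectedStatuses();
  return isset($map[$operation])
      && in_array((int)$status, $map[$operation], true);
}
```

Keeping the codes in one place makes it easier to adjust them if the service turns out to answer with a different success code for some operation.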
The implementation of the is_dir, is_file and exists methods is terrible, but I am not going to work with folders that contain more than 10 files. That is why they are built on top of the ls method.
Now I have a disk management tool at my disposal. It is somewhat flawed, but it is a tool nonetheless.
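Since everything above rests on ls, it may help to see the parsing of a 207 Multi-Status body on its own. The sketch below uses DOMXPath rather than the SimpleXML call in the class, and the function name parseMultiStatus is hypothetical; it returns the same href and type pairs that ls produces.

```php
<?php
// Parses a WebDAV 207 Multi-Status body into a list of
// array('href' => ..., 'type' => 'd' or 'f') entries.
// Returns false when the body is not well-formed XML.
function parseMultiStatus($body)
{
  if (!is_string($body) || $body === '')
    return false;
  $doc = new DOMDocument();
  if (!@$doc->loadXML($body))
    return false;
  $xpath = new DOMXPath($doc);
  $xpath->registerNamespace('d', 'DAV:');
  $list = array();
  foreach ($xpath->query('//d:response') as $response)
  {
    $href = $xpath->evaluate('string(d:href)', $response);
    // A resource is a folder when its resourcetype contains <collection/>.
    $collections = $xpath->query(
      'd:propstat/d:prop/d:resourcetype/d:collection', $response);
    $list[] = array(
      'href' => $href,
      'type' => ($collections->length > 0) ? 'd' : 'f',
    );
  }
  return $list;
}
```

Binding the prefix to the DAV: namespace explicitly means the code does not depend on which prefix the server happens to use in its response.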

Creating and sending a backup to Yandex.Disk


We will create a backup using the following algorithm:
  1. Remove excess backups from Yandex.Disk. If more than n backups have accumulated on the disk, delete the oldest ones. The number n is taken from the config.
  2. In a temporary folder, create a dump of the MySQL database. In my code this is done by calling the mysqldump command.
  3. Copy the files you want to save into the same folder.
  4. Archive the folder with the created files.
  5. Copy the resulting archive to Yandex.Disk.
  6. Delete the temporary files.
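Step 1 is the only part with real logic in it, so here is a standalone sketch of the rotation rule. It assumes archive names of the form Y-m-d-timestamp.tar.gz, as in the class below; the function name selectArchivesToDelete is hypothetical.

```php
<?php
// Given archive names of the form Y-m-d-timestamp.tar.gz, returns the names
// to delete so that at most $keep - 1 archives remain, leaving room for the
// one about to be uploaded. The oldest archives are selected first.
function selectArchivesToDelete(array $names, $keep)
{
  $by_time = array();
  foreach ($names as $name)
  {
    // The timestamp is the fourth dash-separated part, before the extension.
    if (preg_match('/^\d{4}-\d{2}-\d{2}-(\d+)\./', basename($name), $m))
      $by_time[(int)$m[1]] = $name;
  }
  ksort($by_time); // oldest first
  $excess = count($by_time) - max($keep - 1, 0);
  if ($excess <= 0)
    return array();
  return array_values(array_slice($by_time, 0, $excess, true));
}
```

Names that do not match the pattern are ignored, so unrelated files in the same folder are never selected for deletion.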

Variations on this sequence of actions are possible; imagination is the only limit. This set is enough for me.
These actions can be performed using the following class.

Create an archive and send it to disk
class YaBackup
{
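  protected $disk;
  protected $db;
  protected $logger;
  protected $backup_number;
  // declared explicitly because the constructor assigns all of these
  protected $folders, $tmp_dir, $project, $server_dir, $archive;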
  function __construct($backupconfig)
  {
    $config = lmbToolkit::instance()->getConf('yandex');
    $this->logger = YaLogger::instance();
    $auth = new YaAuth($config,$this->logger);
    $token = $auth->getToken();
    if(!$token) throw new Exception('Cannot get the token');
    $this->disk = new YaDisk($token,$config,$this->logger);
    $this->db = $backupconfig->get('db');
    $this->folders = $backupconfig->get('folders');
    $this->tmp_dir = $backupconfig->get('tmp_dir');
    $this->project = $backupconfig->get('project');
    $this->backup_number = $backupconfig->get('stored_backups_number');
    $this->server_dir = $backupconfig->get('dir');
    $time = time();
    $this->archive = date("Y-m-d",$time).'-'.$time;
  }
  function execute()
  {
    $this->logger->log("Backup of project ".$this->project." started","START_PROJECT");
    $this->_clean();
    $this->logger->log("Deleting old copies");
    $this->_deleteOld();
    $this->logger->log("Creating the database dump");
    $this->_makeDump();
    $this->logger->log("Copying the required files"); 
    $this->_copyFolders();
    $this->logger->log("Creating the archive"); 
    $this->_createArchive();
    $this->logger->log("Uploading to Yandex.Disk");
    $this->_upload();
    $this->logger->log("Deleting temporary files"); 
    $this->_clean();
    $this->logger->log("Backup of project ".$this->project." finished", "END_PROJECT");
  }
  protected function _clean()
  { 
    lmbFs::rm($this->getProjectDir());
  }
  protected function _deleteOld()
  {
    $list = $this->disk->ls($this->server_dir.'/'.$this->project);
    if($list === false) return; // nothing to clean up, e.g. the folder does not exist yet
    $paths=array();
    $n=0;
    foreach($list as $item)
    {
      // Archive names look like Y-m-d-timestamp.tar.gz; the timestamp is used as the array key.
      $parts = explode('-',basename(rtrim($item['href'],'/')));
      if(isset($parts[3]) && ($item['type']=='f'))
      { 
        $tm = explode('.',$parts[3]);
        $paths[(int)$tm[0]] = $item['href'];
        $n++;
      }
    }
    ksort($paths); // sort by key, oldest first
    for($i=$n;$i>$this->backup_number-1;$i--)
    {
      $item = array_shift($paths);
      $this->logger->log("Deleting ".$item);
      $this->disk->rm($item); 
    }    
  }
  protected function _upload()
  {
    $archive = $this->archive.'.tar.gz';
    // create the directories on Yandex.Disk
    $this->logger->log("Creating folders on Yandex.Disk"); 
    $this->disk->mkdir($this->server_dir);
    $this->disk->mkdir($this->server_dir.'/'.$this->project);
    // upload the archive; $res holds the result of the upload itself
    $this->logger->log("Uploading the archive to Yandex.Disk"); 
    $res = $this->disk->upload($this->getProjectDir().'/'.$archive,$this->server_dir.'/'.$this->project.'/'.$archive);
    if($res) 
      $this->logger->log("Upload to Yandex.Disk completed successfully"); 
    else
      $this->logger->log("Upload to Yandex.Disk failed"); 
  }
  protected function getProjectDir()
  {
    return $this->tmp_dir.'/'.$this->project;
  }
  protected function _copyFolders()
  {
    lmbFs:: mkdir($this->getProjectDir() . '/folders');
    $folders = $this->folders;
    foreach($folders as $key => $value)
    {
      lmbFs:: mkdir($this->getProjectDir() . '/folders/' . $key);
      lmbFs:: cp($value, $this->getProjectDir() . '/folders/' . $key);
    }
  }
  protected function _createArchive()
  {
    $archive = $this->archive;
    $dir = $this->getProjectDir();
    // TODO: rewrite using system()
    `cd $dir && find . -type f -exec tar rvf "$archive.tar" '{}' \;`;  
    `cd $dir && gzip $archive.tar`;
  }  
  protected function _makeDump()
  {
    $host = $this->db['host'];
    $user = $this->db['user'];
    $password = $this->db['password'];
    $database = $this->db['database'];
    $charset = $this->db['charset'];
    lmbFs:: mkdir($this->getProjectDir() . '/base');
    $sql_schema = $this->getProjectDir() . '/base/schema.mysql';
    $sql_data = $this->getProjectDir() . '/base/data.mysql';
    // create the dumps
    $this->mysql_dump_schema($host, $user, $password, $database, $charset, $sql_schema);
    $this->mysql_dump_data($host, $user, $password, $database, $charset, $sql_data);
  }
  // The following methods would be better moved to a separate file
  protected function mysql_dump_schema($host, $user, $password, $database, $charset, $file, $tables = array())
  {
    $password = ($password)? '-p' . $password : '';
    $cmd = "mysqldump -u$user $password -h$host " .
           "-d --default-character-set=$charset " .
           "--quote-names --allow-keywords --add-drop-table " .
           "--set-charset --result-file=$file " .
           "$database " . implode(' ', $tables); // table names must be separated by spaces
    $this->logger->log("Starting the schema dump into '$file'...");
    system($cmd, $ret);
    if(!$ret)
      $this->logger->log("Schema dump created (" . filesize($file) . " bytes)");
    else
      $this->logger->log("Schema dump failed");
  }
  protected function mysql_dump_data($host, $user, $password, $database, $charset, $file, $tables = array())
  {
    $password = ($password)? '-p' . $password : '';
    $cmd = "mysqldump -u$user $password -h$host " .
           "-t --default-character-set=$charset " .
           "--add-drop-table --create-options --quick " .
           "--allow-keywords --max_allowed_packet=16M --quote-names " .
           "--complete-insert --set-charset --result-file=$file " .
           "$database " . implode(' ', $tables); // table names must be separated by spaces
    $this->logger->log("Starting the data dump into '$file'...");
    system($cmd, $ret);
    if(!$ret)
      $this->logger->log("Data dump created (" . filesize($file) . " bytes)");
    else
      $this->logger->log("Data dump failed");
  }
}



I did not polish the code of this last class. I think an interested reader will be able to add, remove or change methods to suit their own needs. Working with it boils down to passing the config to the constructor and calling the execute method.
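One thing a reader may want to change first is how the mysqldump command line is assembled, since the values go into the shell unescaped. The sketch below is one possible variant, not the author's code: the function name buildSchemaDumpCommand is hypothetical, the $db keys mirror the db config entry read by YaBackup, and only long-form mysqldump options equivalent to the ones used above appear.

```php
<?php
// Builds a schema-only mysqldump command line with every value shell-escaped.
// $db is expected to have the keys host, user, password, database, charset.
function buildSchemaDumpCommand(array $db, $file, array $tables = array())
{
  $parts = array(
    'mysqldump',
    '--user=' . escapeshellarg($db['user']),
    '--host=' . escapeshellarg($db['host']),
    '--no-data', // long form of -d: dump the schema only
    '--default-character-set=' . escapeshellarg($db['charset']),
    '--result-file=' . escapeshellarg($file),
  );
  if ($db['password'] !== '')
    $parts[] = '--password=' . escapeshellarg($db['password']);
  // The database name comes after the options, followed by optional tables.
  $parts[] = escapeshellarg($db['database']);
  foreach ($tables as $table)
    $parts[] = escapeshellarg($table);
  return implode(' ', $parts);
}
```

The resulting string can be handed to system() exactly as in the class. Note that a password given on the command line is visible to other users in the process list, so this is a sketch of the escaping only.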

Running the backup from cron


As it happens, I implement all cron tasks as subclasses of the following class:

Cronjob
abstract class CronJob
{
  abstract function run();
}


No comment is needed here.
For each project, I create a class along these lines:
Scheduled task launch class
class YaBackupJob extends CronJob
{
  protected $conf;
  protected $conf_name = 'adevelop';
  function __construct()
  {
    $this->conf = lmbToolkit::instance()->getConf($this->conf_name);
  }
  function run()
  {
    $backup = new YaBackup($this->conf);
    $backup->execute();
  }
}



Here, as elsewhere, the standard Limb configuration file mechanism is used. In principle, this class could also be made abstract, but that is a matter of taste.
That leaves the question of launching. The task itself is started by the cron_runner.php script, which includes the file with the task class, creates an object of that class and ensures that the same task is not executed by two processes at once (the latter is implemented with file locks).
cron_runner.php
set_time_limit(0);
require_once(dirname(__FILE__) . '/../setup.php');
lmb_require('limb/core/src/lmbBacktrace.class.php');
lmb_require('limb/fs/src/lmbFs.class.php');
lmb_require('ya/src/YaLogger.class.php');
new lmbBacktrace;
function write_error_in_log($errno, $errstr, $errfile, $errline)
{
  global $logger;
  $back_trace = new lmbBacktrace(10, 10);
  $error_str = " error: $errstr\nfile: $errfile\nline: $errline\nbacktrace:".$back_trace->toString();
  $logger->log($error_str,"ERROR",$errno);
}
set_error_handler('write_error_in_log');
error_reporting(E_ALL);
ini_set('display_errors', true);
if($argc < 2)
  die('Usage: php cron_runner.php cron_job_file_path(starting from include_file_path)' . PHP_EOL);
$cron_job_file_path = $argv[1];
$logger = YaLogger::instance();
$lock_dir = LIMB_VAR_DIR . '/cron_job_lock/';
if(!file_exists($lock_dir))
  lmbFs :: mkdir($lock_dir, 0777);
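// array_shift() needs a variable, not a function result, so take the first part explicitly
$name_parts = explode('.', basename($cron_job_file_path));
$name = $name_parts[0];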
$lock_file = $lock_dir . $name;
if(!file_exists($lock_file))
{
  file_put_contents($lock_file, '');
  chmod($lock_file, 0777);
}
$fp = fopen($lock_file, 'w');
if(!flock($fp, LOCK_EX | LOCK_NB))
{
  $logger->logConflict();
  return;
}
  try {
    lmb_require($cron_job_file_path);
    $job  = new $name;
    if(!in_array('-ld', $argv))
      $logger->log('',"START");
    ob_start();
      echo $name . ' started' . PHP_EOL;
      $result = $job->run();
      $output = ob_get_contents();
    ob_end_clean();
    if(!in_array('-ld', $argv))
      $logger->log($output,"END",$result);
  }
  catch (lmbException $e)
  {
    $logger->logException($e->getNiceTraceAsString());
    throw $e;
  }
flock($fp, LOCK_UN);
fclose($fp);
if(in_array('-v', $argv))
{
  echo $output;
  var_dump($logger->getRecords());
}



The following command is added to crontab:
  php /path/to/cron_runner.php ya/src/YaBackupJob.class.php

The script takes as its argument the path to the class file, relative to include_path. It derives the name of the task class from the file name.
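The two mechanisms the runner relies on, deriving the class name and taking a non-blocking file lock, can be sketched without Limb as follows. The function names are hypothetical, and the sketch assumes a platform where flock() is supported.

```php
<?php
// Derives the job class name from the file name, as the runner does:
// 'ya/src/YaBackupJob.class.php' gives 'YaBackupJob'.
function jobClassFromPath($path)
{
  $parts = explode('.', basename($path));
  return $parts[0];
}

// Tries to take an exclusive, non-blocking lock on $lock_file.
// Returns the open handle on success (keep it open while the job runs)
// or false when the lock cannot be taken.
function acquireJobLock($lock_file)
{
  $fp = fopen($lock_file, 'c'); // 'c' creates the file if needed without truncating it
  if (!$fp)
    return false;
  if (!flock($fp, LOCK_EX | LOCK_NB))
  {
    fclose($fp);
    return false;
  }
  return $fp;
}

// Releases the lock and closes the handle once the job is done.
function releaseJobLock($fp)
{
  flock($fp, LOCK_UN);
  fclose($fp);
}
```

Because the lock is non-blocking, a second run that starts while the first is still working gives up immediately instead of queueing behind it.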

Conclusion


I would be glad if this code comes in handy. Links to a complete working example are provided below.
Constructive criticism is welcome. Waiting for your comments and feedback.

References and Sources


