Importing sites from other CMSs into Drupal

    I think everyone who works with Drupal periodically has to migrate a site from another CMS, or simply import data into Drupal.

    I run into such tasks from time to time. I used to handle every import with a PHP script that wrote the necessary information directly into the Drupal database. Of course, I knew there were mechanisms for adding data through the Drupal API, but I was too lazy to learn them, and a script that writes directly to the database is quick to write.

    When a Drupal site is fairly simple and uses few or no complex modules, writing directly to the database is justified. But what do you do when the data has to go into a very complex site with many modules, each with intricate settings?

    In that case, knowing the Drupal API helps a lot: Drupal itself does all the work of correctly updating the interrelated tables and honoring all the tricky settings.

    As it turned out, using the Drupal API is not just simple but very simple. That is what today's article is about.

    So, we have a Drupal site with several content types. Each type has an additional image field (via CCK), there are many views (via Views), and ImageCache is used to resize pictures (a very buggy thing, but so far there is nothing better). The site runs on Drupal 6. I think other versions will be similar, though you will most likely have to tweak the code slightly because their APIs differ somewhat.

    The import script lives in the site root and is called through an HTTP request, something like hxxp://site.ru/import.php. How you pass in the data to import (from other databases, from files on disk, or through POST data) is up to you; it does not change the essence.
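
    As one illustration of the "files on disk" option, here is a minimal sketch that reads import rows from a CSV file. The function name import_read_csv and the column names (title, body, image) are assumptions made for this example; they are not part of Drupal or of the original script.

```php
<?php
// Hypothetical helper (not part of Drupal): read import rows from a CSV file
// whose first line holds the column names, e.g. "title,body,image".
function import_read_csv($path) {
  $rows = array();
  if (!is_readable($path)) {
    return $rows;
  }
  $handle = fopen($path, 'r');
  if ($handle === FALSE) {
    return $rows;
  }
  // The first line is treated as the header.
  $header = fgetcsv($handle, 0, ',', '"', '\\');
  if (!is_array($header)) {
    fclose($handle);
    return $rows;
  }
  while (($line = fgetcsv($handle, 0, ',', '"', '\\')) !== FALSE) {
    // Skip blank lines and rows whose column count does not match the header.
    if (count($line) !== count($header)) {
      continue;
    }
    $rows[] = array_combine($header, $line);
  }
  fclose($handle);
  return $rows;
}
```

    Each returned row is an associative array keyed by column name, so the rest of the import script can work with $row['title'] and similar keys regardless of where the data came from.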

    First of all, we’ll place this piece of code at the very beginning of our script:

    require_once 'includes/bootstrap.inc';
    drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);


    This piece of code loads the Drupal core and makes all the necessary settings to connect to the site database.

    Next, we need to connect to our data sources and download the necessary data.

    Suppose we need to add a news item to the site. For this we write the following code:

    $node = new stdClass();
    $node->title = "News headline";
    $node->body = "HTML code of the news item";
    $node->teaser = $node->body;
    $node->type = "news";
    $node->created = time(); // creation date
    $node->changed = $node->created; // last update date
    $node->status = 1; // the node is published
    $node->format = 1; // use the Filtered HTML input format
    $node->comment = 2; // comments are allowed
    $node->uid = 0; // the node is added by "Guest"; with uid=1 it is added by the site admin
    $node->language = 'ru'; // the node is in Russian
    node_save($node);
    $new_id = $node->nid;


    To create a new node, we create an instance of stdClass and fill it with the necessary data. This example sets the news title, body, and teaser. For the content type (type) I specified "news", which is what the news type is called on my site. The type can be anything else, because almost everything in Drupal is built on the concept of a node.

    The actual use of the Drupal API comes down to a single line: a call to the node_save function, which receives the filled-in node object. This function writes to the node and node_revisions tables, and possibly to other related ones; you no longer need to think about that.

    If you want the identifier of the saved node, read $node->nid immediately after calling node_save (the function adds this property to the object and writes the value there).
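
    In a real import there is usually more than one record, so the same fields end up being filled in a loop. Below is a rough sketch of that idea. The helper import_build_news_node and the row keys title, body and created are hypothetical names for this example; the node properties are the ones used above.

```php
<?php
// Hypothetical helper: map one source row to a Drupal 6 node object.
// The keys 'title', 'body' and 'created' are assumed names for the source data.
function import_build_news_node(array $row) {
  $node = new stdClass();
  $node->type = 'news';
  $node->title = $row['title'];
  $node->body = $row['body'];
  $node->teaser = $row['body'];
  // Keep the original publication time when the source provides one.
  $node->created = isset($row['created']) ? (int) $row['created'] : time();
  $node->changed = $node->created;
  $node->status = 1;      // published
  $node->format = 1;      // Filtered HTML
  $node->comment = 2;     // comments allowed
  $node->uid = 0;         // added by "Guest"
  $node->language = 'ru';
  return $node;
}

// Sample rows standing in for whatever the old CMS exports.
$rows = array(
  array('title' => 'First news item', 'body' => 'First body', 'created' => 1262304000),
  array('title' => 'Second news item', 'body' => 'Second body'),
);

$nodes = array();
foreach ($rows as $row) {
  $node = import_build_news_node($row);
  // node_save() exists only after drupal_bootstrap(); the guard lets the
  // mapping be tried outside Drupal as well.
  if (function_exists('node_save')) {
    node_save($node);
  }
  $nodes[] = $node;
}
```

    Keeping the mapping in its own function means the Drupal-specific part stays a single node_save call per row, just as in the single-node example.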

    Now we need to fill an additional CCK field on our news item. In my case it is field_img, which holds the picture for the news item. ImageCache is used wherever a thumbnail or a slightly reduced copy of the picture has to be shown: since the pictures may come in different sizes, ImageCache scales them all to a given size.

    To fill this field on our node, first register the picture as a “file” in Drupal, then use the resulting “file” identifier to write the data for the CCK field.

    $file = new stdClass();
    $file->uid = 0;
    $file->filename = "newsimage.jpeg";
    $file->filepath = "files/newsimage.jpeg";
    $file->filemime = file_get_mimetype($file->filename);
    $file->filesize = filesize($file->filepath);
    $file->status = 1;
    $file->timestamp = time();
    $file->origname = "";
    drupal_write_record('files', $file);
    $file_id = $file->fid;


    Here filename is the file name and filepath is the file name with its path relative to the site root; in this case we assume that all our pictures are in the files folder. Next come the function calls that determine the picture's MIME type (you can simply specify “image/jpeg” by hand) and compute the file size. After that, the API function drupal_write_record is called, which simply writes the $file object to the files system table. It is essentially a wrapper around writing data to the Drupal database.
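
    When importing many pictures, some of them may be missing on disk, and filesize would then fail. The following sketch wraps the same fields in a helper that checks for the file first. The name import_build_file_record is hypothetical, and the fallback MIME type is an assumption used only so the helper can be tried outside Drupal.

```php
<?php
// Hypothetical helper: build the object for Drupal 6's files table, or
// return NULL when the picture is missing on disk.
function import_build_file_record($filepath, $uid = 0) {
  if (!is_file($filepath)) {
    return NULL;
  }
  $file = new stdClass();
  $file->uid = $uid;
  $file->filename = basename($filepath);
  $file->filepath = $filepath;
  // file_get_mimetype() is the Drupal 6 function used above; the generic
  // fallback is an assumption for running without Drupal.
  $file->filemime = function_exists('file_get_mimetype')
    ? file_get_mimetype($file->filename)
    : 'application/octet-stream';
  $file->filesize = filesize($filepath);
  $file->status = 1;
  $file->timestamp = time();
  $file->origname = '';
  return $file;
}
```

    Inside the import script this could be used as: $file = import_build_file_record('files/newsimage.jpeg'); followed by drupal_write_record('files', $file) only when $file is not NULL.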

    Now we add all the data for the CCK field to the node and save it:

    $node->field_img[0]['fid'] = $file->fid;
    $node->field_img[0]['data']['alt'] = $node->title;
    $node->field_img[0]['data']['title'] = $node->title;
    node_save($node);


    And finally, here is a workaround for a very strange and unpleasant ImageCache glitch. The module itself works well and is very convenient, but Drupal does not invoke it at the right moment: after importing news to the site and refreshing the page, we do not see the ImageCache thumbnails.

    The thing is that ImageCache is invoked only when a request for a picture file results in a 404 error. The idea is that ImageCache catches this error, generates the missing picture, and then serves it with a 200 status code. But this does not happen.

    I searched through a bunch of forums and tried to dig into the code myself, but never understood why the picture was not generated. So I decided to generate the images by calling the ImageCache functions directly at the time the news is imported.

    I call this piece of code immediately after saving the CCK field with the picture:

    // generate the thumbnail picture
    $preset = imagecache_preset_by_name("thumb");
    $dst = imagecache_create_path($preset['presetname'], $file->filepath);
    imagecache_build_derivative($preset['actions'], $file->filepath, $dst);
    // generate the preview picture
    $preset = imagecache_preset_by_name("preview");
    $dst = imagecache_create_path($preset['presetname'], $file->filepath);
    imagecache_build_derivative($preset['actions'], $file->filepath, $dst);


    After that, the right pictures are created in the right folders and this famous glitch disappears.
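
    If the site has more than two presets, the repeated three-line pattern can be folded into a loop. This is only a sketch: the helper name import_build_derivatives is hypothetical, and it assumes the ImageCache 6.x functions behave as they are used above, with imagecache_preset_by_name returning an empty value for an unknown preset and imagecache_build_derivative returning a true value on success.

```php
<?php
// Hypothetical helper: build ImageCache derivatives for several presets.
// Returns the destination paths that were generated successfully.
function import_build_derivatives($filepath, array $preset_names) {
  $built = array();
  // The ImageCache functions exist only when the module is enabled and
  // Drupal has been bootstrapped; otherwise there is nothing to do.
  if (!function_exists('imagecache_preset_by_name')) {
    return $built;
  }
  foreach ($preset_names as $name) {
    $preset = imagecache_preset_by_name($name);
    if (empty($preset)) {
      // Unknown preset name: skip it rather than fail the whole import.
      continue;
    }
    $dst = imagecache_create_path($preset['presetname'], $filepath);
    if (imagecache_build_derivative($preset['actions'], $filepath, $dst)) {
      $built[] = $dst;
    }
  }
  return $built;
}
```

    In the import script the call would look like import_build_derivatives($file->filepath, array('thumb', 'preview')), using the two preset names from this article.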

    Here is the complete code:

    require_once 'includes/bootstrap.inc';
    drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

    $node = new stdClass();
    $node->title = "News headline";
    $node->body = "HTML code of the news item";
    $node->teaser = $node->body;
    $node->type = "news";
    $node->created = time(); // creation date
    $node->changed = $node->created; // last update date
    $node->status = 1; // the node is published
    $node->format = 1; // use the Filtered HTML input format
    $node->comment = 2; // comments are allowed
    $node->uid = 0; // the node is added by "Guest"; with uid=1 it is added by the site admin
    $node->language = 'ru'; // the node is in Russian
    node_save($node);
    $new_id = $node->nid;

    $file = new stdClass();
    $file->uid = 0;
    $file->filename = "newsimage.jpeg";
    $file->filepath = "files/newsimage.jpeg";
    $file->filemime = file_get_mimetype($file->filename);
    $file->filesize = filesize($file->filepath);
    $file->status = 1;
    $file->timestamp = time();
    $file->origname = "";
    drupal_write_record('files', $file);
    $file_id = $file->fid;

    $node->field_img[0]['fid'] = $file->fid;
    $node->field_img[0]['data']['alt'] = $node->title;
    $node->field_img[0]['data']['title'] = $node->title;
    node_save($node);

    // generate the thumbnail picture
    $preset = imagecache_preset_by_name("thumb");
    $dst = imagecache_create_path($preset['presetname'], $file->filepath);
    imagecache_build_derivative($preset['actions'], $file->filepath, $dst);
    // generate the preview picture
    $preset = imagecache_preset_by_name("preview");
    $dst = imagecache_create_path($preset['presetname'], $file->filepath);
    imagecache_build_derivative($preset['actions'], $file->filepath, $dst);


    Materials used:
    api.drupal.org/api/drupal/6 - official documentation
    www.drupal.ru - search the site for “data import”

    PS: this approach is not limited to the Drupal API itself; you can also call the functions of enabled modules directly, as the ImageCache example shows.
