rekby July 18, 2014 at 09:50

Creating a script for publishing

From time to time the task comes up of writing a script for publication: one that will need to be updated but can no longer be changed once it is out. Examples are an initialization script baked into a virtual machine image, or an installer script for a site engine (published by the engine's developer).

In this article I describe the techniques I use to create such scripts. They help to avoid some common pitfalls and keep the scripts simple and flexible. The approach suits scripts whose behavior should change depending on the author's needs, updates, and so on. It is NOT suitable for scripts that must work autonomously (without contacting the author's system).

I use this approach in bash scripts, but the general principle can be applied regardless of the language.

Short background (can be skipped):

A few years ago I started making VDS server templates for mass use by clients. Once a template has been created it can no longer be changed, i.e. the only way to fix an error is to publish a new template, which costs time and many gigabytes of space (the template plus copies of it on all servers). Over the past few years I managed to make some awkward mistakes, and as a result templates with awkward startup behavior will have to be maintained for many years to come.

A similar situation can arise with scripts that install or configure control panels, site engines, or simply programs that should be downloaded, installed and configured with a single command. The program itself can be updated, but the installation script already in the hands of everyone who downloaded it can no longer be fixed.

Part of what I use I have seen in similar scripts, for example the ispsystem and brew installers. Some things were suggested by colleagues, and some come from my own bitter experience.

General point

Each time, download and execute all the code from your server.

General code structure

The published script only downloads the first executable code file; it does nothing else.
The first executable code file either does what is needed right away (in simple cases) or determines something general and downloads further files.
The remaining files are organized in any convenient way; the organization can be changed later, as work goes on.

What the published script should and should not do

In no case should the published script perform the task for which it is published and should not even have a hint of its solution. The only task of this script is to find a way to connect to the developer's server and download the code for execution from there. This code should be as simple as possible with a minimum of external dependencies, as they will have to be supported throughout the life of the script.

Here is the code I arrived at as a result:

function try()
{
# Run the command; if it returns an error, keep retrying up to the number of times given in the first parameter
# ... the code is somewhat bulky and is omitted here for simplicity
}
# Execute init.sh from panel
function execute_init()
{
    local EVAL_CODE=`curl http://panel.1gb.ru/minimal/init.sh`
    if [ "${EVAL_CODE#\#\!/bin/bash}" != "$EVAL_CODE" ] && [ "${EVAL_CODE%\#BashScriptEnd}" != "$EVAL_CODE" ]; then
        eval "$EVAL_CODE"
        return 0
    else
        return 1
    fi
}
try 100000 execute_init

I put this code into a template whose environment is known in advance; in particular, I know for sure that curl is there. Many attempts are needed because when the server starts the network may not be up yet, or the HTTP server may temporarily return an error. Other scripts may run in different environments where curl may be missing. This is the right place to try connecting to the server in several ways and, if necessary, many times.
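
The retry helper is omitted from the listing above, so the following is only a minimal sketch of what such a helper and a multi-tool download could look like, not the author's actual code. The names try_sketch, fetch_url and RETRY_DELAY are hypothetical, as is the choice of wget as the fallback.

```bash
#!/bin/bash
# Hypothetical retry helper: run a command until it succeeds or the
# attempts run out.
# Usage: try_sketch MAX_ATTEMPTS COMMAND [ARGS...]
# RETRY_DELAY (seconds between attempts) is an assumption of this sketch.
try_sketch() {
    local max_attempts=$1
    shift
    local attempt
    for ((attempt = 1; attempt <= max_attempts; attempt++)); do
        if "$@"; then
            return 0
        fi
        # No point in sleeping after the last failed attempt.
        if ((attempt < max_attempts)); then
            sleep "${RETRY_DELAY:-1}"
        fi
    done
    return 1
}

# Hypothetical fetch helper: print the body of a URL to stdout using
# whichever downloader is installed.
fetch_url() {
    local url=$1
    if command -v curl >/dev/null 2>&1; then
        # -f: treat HTTP errors as failures, -sS: quiet but show errors
        curl -fsS "$url"
    elif command -v wget >/dev/null 2>&1; then
        wget -q -O - "$url"
    else
        echo "fetch_url: neither curl nor wget is available" >&2
        return 1
    fi
}
```

With these in place, the download inside execute_init could call fetch_url instead of curl directly, and the last line of the published script would become try_sketch 100000 execute_init.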

It is essential to check that the code to be executed was downloaded in full. The long if line does this: it checks that the script starts with #!/bin/bash and ends with #BashScriptEnd. This way you can be sure that neither an HTML error page nor half a script will be executed, for example rm -rf /tmp/my-downloads truncated to rm -rf /
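
To make the check easier to see in isolation, here is a small sketch that applies the same start/end marker test to three sample responses. The function name is_complete_script and the sample strings are hypothetical; the test is written with case instead of the parameter expansions used above, which should be equivalent for these two markers. Nothing is passed to eval here, the samples are only classified.

```bash
#!/bin/bash
# Return 0 only if the text in $1 starts with the shebang line and ends
# with the end marker, i.e. it looks like a completely downloaded script.
is_complete_script() {
    case $1 in
        '#!/bin/bash'*'#BashScriptEnd') return 0 ;;
        *) return 1 ;;
    esac
}

# Sample responses (never executed, only classified):
full=$'#!/bin/bash\nrm -rf /tmp/my-downloads\n#BashScriptEnd'
cut=$'#!/bin/bash\nrm -rf /'
html='<html><body>502 Bad Gateway</body></html>'

for sample in "$full" "$cut" "$html"; do
    if is_complete_script "$sample"; then
        echo "accept"
    else
        echo "reject"
    fi
done
# prints:
# accept
# reject
# reject
```

Because command substitution strips trailing newlines, a file that ends with the marker followed by a newline still passes the check after being captured into a variable.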

I deliberately did not add more complex checks here: TCP provides acceptable protection against data corruption in this case, and any complication of the external interface would have to be maintained forever.

This script has only one external dependency, the URL. Later there was a need to change even that, as the organization of the code improved, but the dependency is so simple that it is easy to maintain. Moreover, new scripts also use this same historically established path rather than a new “correct” one, because otherwise any future change would mean supporting two URLs, and so on.

The published script should make no attempt to detect the environment and load, for example, init_linux.sh or init_freebsd.sh instead of init.sh. I tried that too; it turned out to be inconvenient, and now stubs have to be maintained for the older versions of the scripts.

What to put in a downloadable script

Here there is much more freedom, since things can be changed without touching the already published part. So if everything is simple, you can place the code to be executed right here. If things get more complicated later, it is easy to change.

If the executable code needs more than one file, I recommend placing the following here:
1. A function for downloading new files. It redefines exactly how to connect to the server, and all scripts must use it. It cannot live in a shared library, because the library has not been loaded yet.
2. One or several calls of this function to fetch and execute all the necessary files: a shared code library, code specific to the detected environment, and so on. Here you can also look at which command was passed in the arguments and load the code that executes that command.
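
The two points above could be sketched roughly as follows. This is only an illustration under stated assumptions: BASE_URL, the function names download and load, the variables LOAD_ATTEMPTS and LOAD_DELAY, and the file names in the commented calls are all hypothetical.

```bash
#!/bin/bash
# Sketch of a first downloaded file (e.g. init.sh).
# All names, URLs and file names below are hypothetical.

BASE_URL=${BASE_URL:-http://example.com/minimal}   # assumed code location

# The one place that knows how to reach the server. Every later script
# calls this function instead of invoking curl directly.
download() {
    curl -fsS "$BASE_URL/$1"
}

# Download a file, check that it arrived whole, then execute it.
# Retries up to LOAD_ATTEMPTS times (default 10).
load() {
    local name=$1 code attempt
    for ((attempt = 1; attempt <= ${LOAD_ATTEMPTS:-10}; attempt++)); do
        if code=$(download "$name"); then
            case $code in
                '#!/bin/bash'*'#BashScriptEnd')
                    eval "$code"
                    return 0
                    ;;
            esac
        fi
        sleep "${LOAD_DELAY:-1}"
    done
    echo "load: could not fetch $name" >&2
    return 1
}

# Typical calls, commented out so that this sketch has no side effects:
# load lib.sh                   # shared code library
# load "env_$(uname -s).sh"     # code specific to the detected environment
# load "cmd_$1.sh"              # code for the command given in the arguments
#BashScriptEnd
```

Keeping the connection logic in download means that a later change of transport or server only touches this one file, which is not part of the published script.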

What to watch out for

Always check downloaded code before executing it. Otherwise you may execute half a script or some error page.
The published script should have a minimum of external dependencies (ideally one): the address from which the main code is loaded. It is always the same.
Always make several attempts to download each file. Even on a local network there can be temporary server errors, when the server returns something wrong, for example an error page, or does not accept the connection at all. If you have one downloadable file and the commands are run by hand, this may be no big deal. But if the commands are built into automation and many files are loaded, the probability of hitting an error is quite real. A file that is only partially downloaded is just as real a possibility.
Before publishing a script, be sure to test it. It is frustrating when a single-character typo forces you to redo a lot of work preparing and checking the template, or to contact the people you have just given the script to.
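
As a cheap first step for the last point, a syntax check can catch some single-character typos before publication. The sketch below uses bash -n, which parses a file without executing it; the function name check_script is hypothetical. It only catches syntax errors, so it complements a real test run in the target environment rather than replacing it.

```bash
#!/bin/bash
# Hypothetical pre-publication check.
# Usage: check_script FILE
check_script() {
    local file=$1
    # bash -n reads and parses the file but does not execute any command.
    if ! bash -n "$file"; then
        echo "check_script: syntax error in $file" >&2
        return 1
    fi
    return 0
}
```

For example, check_script init.sh would return a non-zero status for a file with an unclosed if or a missing quote.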

Disadvantages and point of view on them

The need to connect to a network (or the Internet). Where this approach applies, something has to be downloaded from the developer's server anyway (including from your own server if the system is used internally), so the network is required regardless. The script cannot configure the local network if the network does not work, or set an access password if it cannot receive one over the network. Likewise, the script cannot deploy software or a site from the developer's server if that server is unavailable.
Executing code that is not known in advance. If it is your own code, everything is clear. If it is someone else's code, it will still run at the level of trust it is supposed to have. If this is the installation of a client application, it runs in the client's environment, and the developer can do anything inside their software anyway; no one is going to audit the entire code (and anyone who does is unlikely to use auto-deployment from third-party resources at all, only their own code from their own servers).
