Proxying files from AWS S3 with nginx

    Putting nginx in front of AWS S3 sounds like a typical Stack Overflow question: what could possibly go wrong with proxying files from S3? In practice, a ready-made solution turned out to be hard to find, and this article is meant to fill that gap.



    Why would you even need this?


    1. Access control lives in nginx, which fits the IaC (infrastructure as code) approach: all access-related changes are made only in the configs kept in the project.
    2. If files are served through nginx, they can be cached, which saves on requests to S3.
    3. Such a proxy hides the type of file storage from the application, so different installations can use different backends (after all, there are other solutions besides S3).

    Setting the constraints


    • The source bucket must be private: anonymous users must not be able to download files directly from S3. If this restriction does not apply in your case, just use proxy_pass and read no further.
    • Configuration on the AWS side should be a one-time, set-and-forget job to keep operations simple.

    Looking for a straightforward solution


    If your source bucket is public, there is nothing to worry about: proxy the requests to S3 and everything will work. If it is private, you will have to authenticate with S3 somehow. Here is what colleagues on the Internet suggest:

    1. There are examples of implementing the authentication protocol with nginx alone. The solution is good, but unfortunately it targets an outdated authentication protocol (Signature v2), which does not work in some AWS regions. If you try to use it in Frankfurt, for example, you will get the error "The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256." The newer version of the protocol (Signature v4) is much harder to implement, and there are no ready-made nginx solutions for it.
    2. There is a third-party module for nginx: ngx_aws_auth. Judging by the source code, it supports Signature v4. However, the project looks abandoned: the code base has not changed for more than a year, and there is a compatibility problem with other modules that the developer has not responded to. Besides, adding third-party modules to nginx is often a painful step in itself.
    3. You can use a separate S3 proxy, and quite a few of those have been written. Personally, I liked the Go solution, aws-s3-proxy: it has a ready-made and fairly popular image on DockerHub. But in this case the application gains one more component with its own potential problems.
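
    To give a feel for why Signature v4 is hard to express in plain nginx configuration, here is a minimal Python sketch of just one step of the protocol: deriving the signing key through a chain of HMAC-SHA256 calls. The function names and the secret key are made up for illustration. A full implementation would also have to build a canonical request and a string to sign for every request, which is not shown here.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """Return the raw HMAC-SHA256 digest of message under key."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the Signature v4 signing key from the secret access key.

    date is the request date in YYYYMMDD form, e.g. "20240131".
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder credentials: this secret key is invented for the example.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20240131", "eu-central-1", "s3")
print(key.hex())
```

    The key depends on the date, the region and the service, so it has to be recomputed regularly. That is easy in a general-purpose language and awkward in nginx configuration alone.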

    Applying an AWS bucket policy


    AWS tends to scare new users with its complexity and the sheer volume of documentation. But on closer inspection, it turns out to be designed logically and flexibly. Amazon already offers a solution to our problem: the S3 Bucket Policy. This mechanism lets you build flexible authorization rules for a bucket based on various parameters of the client or the request.


    The policy generator interface - AWS Policy Generator

    Here are some interesting condition keys a policy can match on:

    • IP address (aws:SourceIp)
    • Referer header (aws:Referer)
    • User-Agent header (aws:UserAgent)
    • the rest are described in the documentation.

    Binding to an IP address is a good option only if the application lives at a fixed address, which is rare nowadays. So we need to bind to something else. As a solution, I propose generating a secret User-Agent or Referer value and serving files only to clients that know the secret header. Here is what such a policy looks like:

    {
        "Version": "2012-10-17",
        "Id": "http custom auth secret",
        "Statement": [
            {
                "Sid": "Allow requests with my secret.",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::example-bucket-for-habr/*",
                "Condition": {
                    "StringLike": {
                        "aws:UserAgent": [
                            "xxxyyyzzz"
                        ]
                    }
                }
            }
        ]
    }

    A few explanations:

    • "Version": "2012-10-17" - the version of the AWS policy language, which you do not need to edit;
    • Principal - who is affected by this rule. You can restrict it to a specific AWS account, but in our case it is set to "*", which means the rule applies to everyone, including anonymous users;
    • Resource - the ARN (Amazon Resource Name) of the bucket plus a pattern for the files inside it. In our case, the policy applies to all files in the bucket example-bucket-for-habr;
    • Condition - the conditions that must hold for the policy to apply. In our case, the predefined aws:UserAgent key (the User-Agent header of the request) is compared with the string xxxyyyzzz.
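
    The secret itself should be long and random. As a minimal sketch, assuming Python is at hand, the snippet below generates such a value and renders the same policy with it. The helper name build_policy is hypothetical, and the bucket name is the example one from this article.

```python
import json
import secrets


def build_policy(bucket: str, secret: str) -> dict:
    """Return a bucket policy that allows GetObject only for the secret User-Agent."""
    return {
        "Version": "2012-10-17",
        "Id": "http custom auth secret",
        "Statement": [
            {
                "Sid": "Allow requests with my secret.",
                "Effect": "Allow",
                "Principal": "*",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
                "Condition": {"StringLike": {"aws:UserAgent": [secret]}},
            }
        ],
    }


# token_hex yields only the characters 0-9 and a-f, so the value contains
# no "*" or "?" that StringLike would treat as wildcards.
secret = secrets.token_hex(32)
policy = build_policy("example-bucket-for-habr", secret)
print(json.dumps(policy, indent=4))
```

    The printed JSON can be pasted into the bucket policy editor, and the same secret then goes into the nginx config.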

    And here is how this rule works from the point of view of an anonymous user:

    $ curl -I https://s3.eu-central-1.amazonaws.com/example-bucket-for-habr/hello.txt
    HTTP/1.1 403 Forbidden
    $ curl -I https://s3.eu-central-1.amazonaws.com/example-bucket-for-habr/hello.txt -H 'User-Agent: xxxyyyzzz'
    HTTP/1.1 200 OK
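
    The two responses follow from how the policy is evaluated: a request is allowed only if some Allow statement matches it, and everything else is denied by default. Below is a deliberately simplified Python model of that check for our single statement. The function names are made up, and the model ignores everything except the User-Agent condition.

```python
import re


def string_like(pattern: str, value: str) -> bool:
    """Case-sensitive match where "*" is any run of characters and "?" is exactly one."""
    regex = "".join(
        ".*" if ch == "*" else "." if ch == "?" else re.escape(ch)
        for ch in pattern
    )
    return re.fullmatch(regex, value, flags=re.DOTALL) is not None


def is_allowed(user_agent, patterns) -> bool:
    """Allow only if the User-Agent matches a pattern; otherwise deny by default."""
    if user_agent is None:
        return False
    return any(string_like(p, user_agent) for p in patterns)


patterns = ["xxxyyyzzz"]
print(is_allowed("curl/8.0.1", patterns))  # False, like the 403 above
print(is_allowed("xxxyyyzzz", patterns))   # True, like the 200 above
```

    Since the secret contains no wildcards, the pattern only matches the exact string.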

    It remains to configure nginx for proxying:

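      location /s3-media/ {
          # Allow only GET (and, implicitly, HEAD) requests.
          limit_except GET {
              deny all;
          }
          # proxy_pass with a variable resolves the host name at run time,
          # which needs a resolver unless one is already set at the http or
          # server level. The address below is an assumption; use your own DNS.
          resolver 8.8.8.8;
          set $aws_bucket "example-bucket-for-habr";
          set $aws_endpoint "s3.eu-central-1.amazonaws.com:443";
          set $aws_custom_secret "xxxyyyzzz";
          # Replace the client's User-Agent with the secret expected by the bucket policy.
          proxy_set_header User-Agent $aws_custom_secret;
          # /s3-media/KEY becomes /BUCKET/KEY (path-style S3 URL).
          rewrite ^/s3-media/(.*)$ /$aws_bucket/$1 break;
          proxy_buffering off;
          proxy_pass https://$aws_endpoint;
      }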
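
    The rewrite rule is what maps a local URL to the path-style S3 URL used in the curl example. As an illustration only, here is the same mapping in Python. The function upstream_path is hypothetical and mimics nothing but the regular expression from the config.

```python
import re

AWS_BUCKET = "example-bucket-for-habr"


def upstream_path(request_uri: str) -> str:
    """Mimic: rewrite ^/s3-media/(.*)$ /$aws_bucket/$1 break;"""
    match = re.match(r"^/s3-media/(.*)$", request_uri)
    if match is None:
        # The rule does not apply, so the URI is left unchanged.
        return request_uri
    return f"/{AWS_BUCKET}/{match.group(1)}"


print(upstream_path("/s3-media/hello.txt"))
# prints "/example-bucket-for-habr/hello.txt"
```

    So a request for /s3-media/hello.txt on the nginx host ends up as a request for /example-bucket-for-habr/hello.txt on the S3 endpoint, sent with the secret User-Agent.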

    Conclusion


    So, by writing one simple bucket policy, we can safely proxy files through nginx. At the same time, we are not tied to an IP address and do not depend on any additional software.
