VCStart: how we created the platform

    The development of our platform began back in 2013, when our team, full of inspiration and enthusiasm, took on this ambitious and interesting project: combining the funds of thousands of small private investors and startup enthusiasts to bring business ideas to life.


    For the investment platform we chose good old PHP together with Yii 1.x, a fast framework that had proven itself well in our previous projects.


    When developing a new system, you first of all think about the volume and format of the data that will need to be stored.
    Looking far ahead, we took a thorough approach to the platform architecture.
    Its extensibility and scalability targets were dictated by the peak figures of Kickstarter at the time:

    • 3,000,000 users
    • 100,000 projects
    • each user invests in 3-10 projects on average, with rare outliers backing 500-700 projects
    • each user subscribes to 20-50 projects
    • each project, in turn, may have 2,000-5,000 investors and 5,000-20,000 subscribers

    Multiplied out, this means querying tables with billions of records: 100,000 projects with up to 20,000 subscribers each already gives on the order of two billion subscriber rows. To avoid that, we decided to implement horizontal sharding of the database tables.

    Features of horizontal sharding:
    • Parts of the same table may reside on different physical servers
    • The web application chooses the connection to each database by a predetermined algorithm
    • The appropriate connection is selected before each table access

    All data entities on the platform hang off a startup, which is why the startup became the basis of the connection selection algorithm. We wrote a database class that, given a startup ID, selects the required connection and table without any explicit instructions in the rest of the code.

    It works like this: suppose we need the number of subscribers for a specific project, say id 600. While building the query, the database access class determines that spot number 2 should be used; the connection parameters and the table name subscriber2 are read from the configuration file and substituted into the query automatically.

    Much simplified, the configuration file linking spots, servers and database tables looks like this:

    Config for sharding
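The real configuration is a PHP/Yii array and is not reproduced here. As a rough, language-agnostic sketch of the same idea (the range size of 500 startups per spot, the hostnames, and the function names below are assumptions for illustration, not our production values), spot selection could look like this:

```python
# Hypothetical sharding config: spot number -> connection parameters.
# Hostnames and database names are illustrative placeholders only.
SPOTS = {
    1: {"host": "db1.internal", "dbname": "vcstart_shard1"},
    2: {"host": "db1.internal", "dbname": "vcstart_shard2"},
    3: {"host": "db2.internal", "dbname": "vcstart_shard3"},
}

SHARD_RANGE = 500  # assumed number of startups per spot


def spot_for_startup(startup_id: int) -> int:
    """Map a startup ID to a spot number (range-based sharding)."""
    return (startup_id - 1) // SHARD_RANGE + 1


def table_for_startup(base_table: str, startup_id: int) -> str:
    """Build the sharded table name, e.g. 'subscriber' -> 'subscriber2'."""
    return f"{base_table}{spot_for_startup(startup_id)}"
```

With these assumed values, startup id 600 lands in spot 2, so a subscriber query would be routed to table subscriber2 with the connection parameters of spot 2, matching the example above.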

    Reliable, high-performance PostgreSQL was chosen as the DBMS. So far our web servers make do with a single database server, since careful caching of queries, pages and page fragments has taken almost all of the load off it.
    Another database server is dedicated to replicas.

    Web servers

    As you know, vertical scaling (buying RAM and more powerful hardware) is expensive and hits its resource ceiling quickly, so we opted for horizontal scaling of the web servers, which delivers a bigger performance gain for less money.

    To process requests from users, it was decided to build the following scheme:

    Web server configuration

    Only one server faces the Internet: the nginx balancer, which users reach directly when they visit the site. All available backend servers are registered in the balancer's configuration; user requests are redirected to them, and they serve the content.

    At the moment user requests are handled by 5 backend servers, which are under almost no load but can withstand a thousand or more connections per second when necessary.

    To avoid discrepancies between the versions of content shown to users, the backends share a single cache and session store, kept in the high-performance key-value storage Redis.
    Besides the cache and sessions, Redis holds all the counters, some data lists, and statistics on visits to users, projects, and so on.
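The shared cache and counter patterns can be sketched as follows (a plain in-memory class stands in for Redis here; the key names and TTL handling are simplified assumptions, while the real setup relies on the Redis GET/SETEX/INCR commands):

```python
import time


class FakeRedis:
    """Minimal in-memory stand-in for the Redis commands used below."""

    def __init__(self):
        self._data = {}      # key -> value
        self._expires = {}   # key -> expiry timestamp

    def get(self, key):
        exp = self._expires.get(key)
        if exp is not None and time.time() >= exp:
            self._data.pop(key, None)
            self._expires.pop(key, None)
        return self._data.get(key)

    def setex(self, key, ttl, value):
        """Store a value with a time-to-live, like Redis SETEX."""
        self._data[key] = value
        self._expires[key] = time.time() + ttl

    def incr(self, key):
        """Atomically increment a counter, like Redis INCR."""
        self._data[key] = int(self._data.get(key, 0)) + 1
        return self._data[key]


cache = FakeRedis()


def cached_fragment(key, ttl, render):
    """Cache-aside: return a cached page fragment, or render and store it."""
    value = cache.get(key)
    if value is None:
        value = render()
        cache.setex(key, ttl, value)
    return value


# A visit counter shared by all backends (hypothetical key name):
cache.incr("project:600:views")
```

Because every backend talks to the same store, any of the 5 servers returns the same cached fragment and the same counter value for a given key.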

    The balancer also lets us set a weight (priority) for each backend server, as well as the number of failed connection attempts after which a backend is considered out of service and its requests are redirected to the remaining servers.
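In nginx terms, the weights and failure thresholds are set per server in the upstream block. A minimal sketch (the addresses, weights and thresholds below are hypothetical placeholders, not our production config):

```nginx
upstream backend {
    # weight sets each backend's share of requests;
    # max_fails/fail_timeout take a backend out of rotation after failures.
    server 10.0.0.11:80 weight=3 max_fails=3 fail_timeout=30s;
    server 10.0.0.12:80 weight=2 max_fails=3 fail_timeout=30s;
    server 10.0.0.13:80 weight=1 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

Adding a freshly deployed backend then amounts to adding one more server line and reloading nginx.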

    If load spikes sharply, we can easily add any number of backend servers: deploying a new server from a snapshot takes about three minutes.


    During development the question arose of where to store project files and images so that every server has access to them, for example when a user wants to crop an already uploaded image. Amazon Simple Storage Service (S3) came to the rescue.

    Incidentally, Amazon rescued us twice: we use Amazon Simple Email Service (SES) to send emails to users, which made mass mailings possible and ensured a high percentage of messages reaching users past the spam filters.

    Finishing touch

    After many rounds of functional testing, we ran a series of stress tests, relying primarily on a load-testing service.
    We tested mainly four pages: the platform's main page, open to all unauthorized visitors; the projects page, which serves as the main page for authorized users; the page of an individual project; and the promo page, where pretty girls hold a countdown to the platform's final launch.

    VCStart: Coming Soon!

    Careful work on caching took some time, after which the load-test results became more than satisfactory: the platform withstood up to 1,000 connections per second with an average response time under two seconds.

    Such thorough preparation for launch gave us a huge reserve of capacity for growing the audience, thanks to which the expected tsunami of the Habr effect felt like a mere breeze.

    We are waiting for you, investors, and you, startup authors, on the first crowdfunding platform of its kind in Russia!
    If you are interested in any details, I will be glad to answer them in the comments.
