Pinterest architecture - 18 million visitors, 10x growth, 12 employees, 410 TB of data

Original author: Todd Hoff
The Pinterest story closely resembles Instagram's: phenomenal growth, a huge number of users and a huge volume of stored data, yet remarkably few employees. And everything runs in the cloud.

Neither Pinterest nor Instagram has made any major scientific or technological breakthrough, but that says more about how easy cloud technology has become to use than about the end of innovation in Silicon Valley ("The Golden Age of Silicon Valley is over and we are dancing on its grave", translator's note). The figures in the headline and the valuations of these companies are so large that it seems some kind of technological revolution must be behind their rapid growth. The revolution is more subtle: it shows how easy such growth is to achieve if you can execute on a good idea. Get used to it. This is now the norm.

Here's what Pinterest is today:
  • 410 terabytes of user data, in 80 million objects, are stored in Amazon S3. This is 10 times more than in August 2011. Over the same period the number of Amazon EC2 instances tripled. Monthly costs are around $39K for S3 and $30K for EC2.
  • Only 12 employees as of December 2011. Thanks to the cloud, the project can keep growing while the team supporting it stays very small. Update: there now appear to be 31 employees.
  • Paying only for the resources used saves a lot of money. Traffic peaks in the afternoon and evening, so at night the number of EC2 instances is cut by 40%. At peak traffic EC2 costs about $52 per hour on average; at night, when the load drops, it costs only $15 per hour.
  • 150 EC2 instances as web servers.
  • 90 instances are used to cache data in memory to offload the database.
  • 35 instances for internal use.
  • 70 master database servers, with a parallel set of backup database servers in several regions around the world for redundancy.
  • Written in Python and Django.
  • Sharding is used: a database is split when it reaches 50% of capacity. This makes the database easy to scale while keeping enough headroom for I/O.
  • Amazon ELB is used for load balancing between EC2 instances. The ELB API makes it easy to bring instances into and out of service.
  • One of the fastest-growing sites in history. Using Amazon Web Services, with minimal IT infrastructure, it handled requests from 18 million visitors in March, 1.5 times as many as a month earlier.
  • The cloud allows you to easily and cheaply experiment with the service, without having to purchase new expensive servers.
  • Elastic MapReduce, which is based on Apache Hadoop, is used for data analysis and costs only a few hundred dollars a month.
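
To get a feel for what the nightly scale-down is worth, the hourly EC2 rates from the list above can be turned into a rough monthly estimate. This is only a back-of-the-envelope sketch: the two rates come from the article, but the split of the day into peak and night hours (`peak_hours_per_day`) and the 30-day month are assumptions, and the function name is made up for illustration.

```python
# Hourly EC2 rates quoted in the list above (USD).
PEAK_RATE = 52.0
NIGHT_RATE = 15.0


def monthly_ec2_cost(peak_hours_per_day, days=30):
    """Estimate a month's EC2 bill from a daily peak/night split.

    peak_hours_per_day is an assumption: the article gives only the
    two hourly rates, not how long each one applies.
    """
    if not 0 <= peak_hours_per_day <= 24:
        raise ValueError("peak_hours_per_day must be between 0 and 24")
    night_hours = 24 - peak_hours_per_day
    daily = peak_hours_per_day * PEAK_RATE + night_hours * NIGHT_RATE
    return daily * days


always_peak = monthly_ec2_cost(24)   # fleet never scaled down
with_scaling = monthly_ec2_cost(16)  # assumed: 16 peak hours, 8 night hours

print(f"always at peak:          ${always_peak:,.0f}")
print(f"with nightly scale-down: ${with_scaling:,.0f}")
print(f"saved per month:         ${always_peak - with_scaling:,.0f}")
```

With the assumed 16 peak hours a day, the estimate comes to $28,560 a month, against $37,440 for a fleet that never scales down. That is in the same range as the roughly $30K monthly EC2 figure quoted above, although the real split of hours is not given in the article.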
