yoihj September 18, 2008 at 08:05

Such a pain! Crowds vs. Web - 2-0. Episode Two - Clones Come in at Noon

We continue the warstory optimization of PHP + mySQL site of the rarest words. One day in May, delving into the words, the idea occurred to me - to place textboxes under these words - use [crowded for Web 2.0] crowdsourcing. Fill in whatever you want, only about this particular word. But collaborative was supposed to work - like Google Docs - if you and someone else are editing words on the same page - the changes will be displayed at the same time. Or, for example, someone edits the word “google” on the Google page, and on the TechMeme page, for example, there is also the word “google” - and they will see these changes from the Google page in real-time [this is not necessary to understand]. I would know what consequences this fun idea on mySQL will have later ...

It was not difficult to do this: prototype.js - in hand, ajax - every 10 seconds, look in the database what has changed over the last 10 seconds, make an intersection with the words on the current page, send js highlight effect and new text - voila. There were people, but few people introduced anything, then I came up with the “just said” block in the corner of the page. For example: “The page 'fibonacci.com' just said that 'fibonacci' → 'crazy math scientist'.” This has already significantly interested people and occupancy has increased significantly. Everyone saw the traffic on the site and got involved.

Actually, I made enough functionality for a rather funny toy - "site wars with rare words", auto-categorizer, synonymizer, etc. I sat peacefully editing a “live” site, uploading a new file, if I saw an error, I corrected it without worrying that someone could see it. I did not suspect that already thousands of people see this. After another mistake in the name of the function, something happened ... I could not upload to index.php FTP ... it was TechCrunch ....

One of the site visitors wrote an article for them, they published it (“TheRarestWords: Intriguing Semantic SEO Project from Russia” ) and a million of their subscribers went to the site [well, it got to the site much less].

Server load quickly increased. I was not suspecting anything, at that moment I edited index.php, and let's make a mistake in it. Many already sat on open pages, and worked only ajax'om. The consequences were already inevitable - the point of no return was passed ... LoadAvg steadily grew. And coupled with a database of 72 million records in MySQL, this did not give any hope of survival ... In about five to ten to fifteen minutes I managed to download index.php anyway - 18 hours of “optimization in combat conditions” had already begun.

Temporary poultices

First, index.php was replaced by:

print "I, for one, welcome our TechCrunch overlords, but we're kind of overloaded"; 
  exit ();

which knocked down the load and allowed me to fix it on:

if (uptime ()> 50) {
  print "I, for one, welcome our TechCrunch overlords :) but we're kind of overloaded"; 
  exit ();
};
function uptime () {
	$ fp = @ popen ('uptime', 'r');
	$ s = @ fgets ($ fp);
	@fclose ($ fp);
	@preg_match ('# load average: ([0-9 \.] +) #', $ s, $ m);
	return $ m [1];
};

That is - if loadavg became more than 50 - the site simply turned off. This allowed me to put a ceiling above which the server did not rise anymore. At first he jumped, in my opinion, to almost a hundred, as if I had seen 70.

Next, I began to playfully think which blocks of my site took the most time to generate. I knew that the auto-categorizer took a good half second for each page, so it was immediately taken in if uptime ()> 5 ... else print “block is not available now.”- loadavg decreased significantly, but still went under the ceiling of 50. I continued to assign the load blocks to the other blocks at which they would be turned off. This greatly relieved the load, but it continued to be under the ceiling, causing the site to turn off - after all, a million TechCrunch readers did their job, given that many of them were sitting on the site and filling in the cells.

We quickly think what to do next. I remembered that I played with memcached a bit and decided that it should be applied.

memcached

yum install memcached

Firstly, when I thought about it, I realized that the top statistics of word occupancy: “We have a total of 17 million words in the database, now we have filled in 12321 words” took a lot of time to calculate, because I took the log table in which all definitions were recorded and made GROUP BY word to calculate the number of unique words. Moreover, for each user this happened every 10 seconds. But I did not want to simply cache it, I wanted almost real-time statistics at all costs. The solution came obvious - we use memcached as a second database. So memcached_set ('count', real_account_of_mysql (), 0), that is, set the key, actually counting the words, and now this key was given to everyone. And if someone entered a word - I used memcached INCREMENTon this key. This did not solve the problem of unique words, so by rand (0,1000) <= 1 this key was replaced with a real account from mysql. That is, approximately during these 1000 requests a small error (error) occurred, but it returned on its way with a real account [every 1000 requests].

Actually, this is what Google developers say, aggregated data should be considered iteratively.

By the way, if you had a question why I didn’t just make a table of words, but made a log, where I kept all the historical definitions of each word - this is because of trolls . At first, there was a lot of trash like “fsdhjfsdfsd” - registration is not required, write whatever you want. Therefore, in order to quickly roll back this, checkmate, advertisement (which was just a lot!), I needed such a table, but in real time I could not redo it to a new table.

Live vs beta

Actually, at that moment I realized that you can’t work with a “live” website anymore, so in httpd.conf I made a full copy of the VirtualHost block for the beta .site.com subdomain , where I copied all the scripts and worked with them so that for now I’m debugging the site in real time under load - people didn’t fall off for a hundred or two hundred people because I was sealed in the name of the function.

It is clear that it was necessary to continue to optimize the piece that is done most often - namely ajax. Okay, I really liked memcached as an alternative database, because it guaranteed no I / O, so I started to think about where to use it again. [loadavg less and less reached 50, but it was almost impossible to work with the site anyway]

My attention was drawn to the block: “the last thing that was said.” After all, he had to find the last 5 records in the mysql table, and this, of course, was not a joke, it was done on a 100,000 table using the ORDER BY time1 DESC method ... every 10 seconds ... for each of the thousands who simultaneously sat on the site ... that is, about a hundred times per second. memcached the block and voila ... the load was off!

Hallelujah! stop ... but where is collaborative editing? I don’t see how hundreds of people edit rare words on google.com ... your division, of course ... I cached this ajax block, and it caused a change in the fields. He cached once for a page and tried to insert the same changes on others, only there were no such words.

Array in memcached

So, I needed to do something to temporarily store what people did on the site in the last 10 seconds! I organized something like an array in memcached - keys with a name from ' tmp_cache_0 ' to ' tmp_cache_20 ' [20 - picked up experimentally] and when someone entered something - I went through them all, looking for a free cell, and when I found her - put this key for 10 seconds in the denormalized value of "word | page | that the person entered." After all, once every 10 seconds, all users on the site turned to ajax and in the next 10 seconds, these values are guaranteed not needed - the expiration of the keys in memcached played clearly for me here, because I did not need to give DELETE so as not to clog the database. As, for example, would be the case with mySQL.

And now, with every ajax call, I did not need to access mysql at all , except when someone introduced something new - checking 20 memcached keys, which is very fast and another key is updating real-time stats. Voila, the load dropped to 10-15! I was pleased as the Postgres logo and went to sleep - by this time I had been coding for 18 hours.

It should be noted that I understood that memcached is not a permanent storage and the keys can drop out, but I did not fallback in mySQL, because it is not a banking application and the fact that someone will not see one or two updates does not make a disaster. I’m an engineer in the brain and if the application works reliably in 95% of cases, I won’t spend another 50% of the time to complete the remaining 5%. Although, in my case, I did not see a single lost key, but someone quite possibly could.

Yoi Haji,
view from Habr

Tags:

website optimization

Such a pain! Crowds vs. Web - 2-0. Episode Two - Clones Come in at Noon

Temporary poultices

memcached

Live vs beta

Array in memcached

Also popular now: