MIT course "Security of computer systems". Lecture 14: "SSL and HTTPS", part 2

Original author: Nikolai Zeldovich, James Mykens
  • Transfer
  • Tutorial

Massachusetts Institute of Technology. Lecture course # 6.858. "Security of computer systems." Nikolai Zeldovich, James Mykens. year 2014


Computer Systems Security is a course on the development and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and security methods based on the latest scientific work. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.

Lecture 1: “Introduction: threat models” Part 1 / Part 2 / Part 3
Lecture 2: “Control of hacker attacks” Part 1 / Part 2 / Part 3
Lecture 3: “Buffer overflow: exploits and protection” Part 1 /Part 2 / Part 3
Lecture 4: “Privilege Separation” Part 1 / Part 2 / Part 3
Lecture 5: “Where Security System Errors Come From” Part 1 / Part 2
Lecture 6: “Capabilities” Part 1 / Part 2 / Part 3
Lecture 7: “Native Client Sandbox” Part 1 / Part 2 / Part 3
Lecture 8: “Network Security Model” Part 1 / Part 2 / Part 3
Lecture 9: “Web Application Security” Part 1 / Part 2/ Part 3
Lecture 10: “Symbolic execution” Part 1 / Part 2 / Part 3
Lecture 11: “Ur / Web programming language” Part 1 / Part 2 / Part 3
Lecture 12: “Network security” Part 1 / Part 2 / Part 3
Lecture 13: “Network Protocols” Part 1 / Part 2 / Part 3
Lecture 14: “SSL and HTTPS” Part 1 / Part 2 / Part 3

The first reason is that the OCSP protocol adds a delay to every request you make. Every time you want to connect to the server, you first need to connect to OCSP, wait for it to respond, and then do something else. So connection delays do not contribute to the popularity of this protocol.

The second reason is that you do not want OCSP to affect your ability to browse the web. Suppose that the OSCP server has disconnected, and then you can lose the Internet altogether, because the protocol considers that it cannot verify someone’s certificate, it is possible that all sites on the Internet are bad and you cannot be allowed there. But nobody needs that, so most customers view the non-interference of the OCSP server as a positive event.



This is really bad in terms of security. Because if you are an attacker and want to convince someone that you have a legitimate certificate, but in fact this certificate has been revoked, all you need to do is somehow prevent the client from communicating with the OCSP server.

The client will say this: “I am trying to request verification of the certificate of the site I need, but this OCSP does not seem to be around, so I’ll just go to this site.” So using OCSP is not a good plan.

In practice, people try to create this alternative, because customers simply tend to make serious mistakes. For example, the Chrome web browser is delivered to the client, having already inside itself a list of certificates that Google really wants to revoke. So if someone incorrectly issues a certificate for Gmail or another important site, for example, Facebook or Amazon, then the next version of Chrome will already contain this information in the built-in verification list. Thus, you do not have to contact the CRL server and communicate with OCSP. If the browser has verified that the certificate is no longer valid, the client rejects it.

Student: let's say I stole the secret key of the CA certificate, because not all public keys are encrypted?

Professor:yes, it will have bad consequences. I do not think there is any solution to this problem. Of course, there were situations when certificate authorities were compromised, for example, in 2011 there were two compromised CAs that in some way fraudulently issued certificates for Gmail, Facebook, and so on. It is not quite clear how this happened, perhaps someone really stole their secret key. But regardless of the reasons for the compromise, these CAs were removed from the list of trusted certificate authorities that are built into the browsers, so that in the next release of Chrome they were no longer there.

In fact, it caused trouble for the legal holders of certificates issued by these centers, because their previous certificates became invalid and they had to get new certificates. So, in practice, all this fussing with certificates is a rather complicated matter.

So, we have considered the general principle of validity of certificates. They are better than Kerberos in the sense that you no longer need this guy to be on the Internet all the time. In addition, they are more scalable, because you can have several KDCs and you don’t need to communicate with them every time you connect.

Another interesting feature of this protocol is that, unlike Kerberos, you are not required to authenticate both sides of the connection. You can connect to the web server without having a certificate for yourself, and this happens all the time. If you visit amazon.com, then you’re going to check that Amazon is the right site, but Amazon has no idea who you are and won’t know about it until you authenticate to the site. Thus, at the level of the encryption protocol, you do not have a certificate, and Amazon has one.

This is much better than Kerberos, because you must have an entry in its database before connecting to Kerberos services. The only inconvenience of using this protocol is that the server must have a certificate. So you can't connect to the server and say, “Hey, let's just encrypt our stuff. I have no idea who you are, and you have no idea who I am, but let's encrypt it anyway. ” This is called opportunistic encryption, and of course, it is vulnerable to man-in-the-middle attacks. You can encrypt common things with someone, while not knowing him, then an attacker preparing to attack you, can also later encrypt their packets and protect themselves from spying.

So it is a pity that these protocols that we are considering here — SSL, TLS — do not offer this kind of opportunistic encryption. But such is life.



Student: I'm just curious. Let's just say, once a year, they create pairs of keys with new names. Why not try using this particular key for a whole year?

Professor:I think they do. But it seems that with this scheme something goes wrong. Here, as in the case of Kerberos, people start with the use of strong encryption, but over time it gets worse and worse. Computers are becoming faster, new algorithms are being developed that successfully break this encryption. And if people do not care about improving reliability, problems grow. This is the case, for example, when a large number of certificates are signed.

There are two nuances here. There is a public key signature scheme. Further, given that the encrypted public key has some limitations, you, signing the message, in fact, only the hash of this message is signed, because it is difficult to sign the giant message, but it is easy to sign the compact hash.

The problem arose because people used MD5 as a hash function, turning the signing of a huge message into a 128-bit thing that was encrypted. Perhaps 20 years ago, MD5 was good, but over time, people found weaknesses in it that could be exploited by an attacker.

Suppose at some point someone actually asked for a certificate with a specific MD5 hash, and then carefully disassembled another message that was hashed with the same MD5 value. As a result, he had a hashed CA signature, and then another message appeared, or another key, or another name, and now he can convince someone that it is signed with the correct certificate. And this really happens. For example, if you spend a lot of time trying to crack one key, you will eventually succeed. If this certificate uses encryption, it can be hacked using the brute-force method.

Another example of unsuccessful use of encryption is the RSA algorithm. We did not talk about RSA, but RSA is one of these public-key cryptographic systems that allows you to encrypt and sign messages. Nowadays, you can spend a lot of money, but in the end, hack 1000-bit RSA keys. You may have to do a huge amount of work, but this is easily done during the year. You can ask the certificate authority to sign a message or even take someone’s existing public key, try to find the corresponding private key for it, or crack it manually.
Thus, you must keep up with the attacker, you must use larger RSA keys or use another encryption scheme.

For example, now people do not use MD5 hashes and certificates. They use the SHA-1 cryptographic hashing algorithm. For some time he provided the necessary security, but today it is a weak defense. Now Google is actively trying to force web and browser developers to abandon the use of SHA-1 and use another hash function, because it’s quite clear that in 5 or 10 years it’s possible to successfully attack SHA-1 without any difficulty. His weakness has already been proven.

So, I suppose, the magic bullet as such does not exist. You just have to make sure that you continue to grow in parallel with the hackers. Of course, the problem exists. Therefore, all the things we talked about should be based on the correct encryption, or on the fact that it is very difficult to hack. Therefore, you must select the appropriate parameters. At least, there is a shelf life here, so it’s better to choose the parameters for a shelf life of 1 year, rather than 10 years.



This key CAs creates a more serious problem, since it does not have a mandatory shelf life. Therefore, you should choose more aggressive security options, for example, 4000 or 6000 RSA bit keys, or something else. Or another encryption scheme, or all together, but do not use SHA-1 here.

And now let's see how we integrate this protocol into a specific application, namely into a web browser. If you want to communicate online or communicate with sites using cryptography, there are three things in the browser that we need to protect.

The first thing, A - is the protection of data on the network. This is relatively easy, because we are just going to start a protocol that is very similar to the one I described so far. We will encrypt all messages, we will sign them, we will be convinced that they were not forged, in general, we will do all these wonderful things. This is how we will protect the data.

But there are two more things in the web browser that we really should worry about. So, the first, B - is the code that is used in the browser, for example, JavaScript or important data that is stored in the browser, your cookies, or local storage, all this should be somehow protected from hackers. In a second I will tell you how to protect them.

Last, C, what you often do not think about, but what can be a real problem in practice is the protection of the user interface. And the reason for this is that, ultimately, most of the confidential data we are concerned about protecting comes from the user. So, the user prints the data on some site, and he probably has several tabs of different sites open at the same time, so you need to be able to distinguish which site he actually interacts with, at any given time.



If he accidentally enters an Amazon password on some web forum, it will not be disastrous, depending on how much he cares about his password, but it will still be unpleasant. Therefore, you really want to have a good user interface that helps the user understand what he is doing, whether he prints sensitive data on the correct website and whether something will happen to this data after he sends it. So this turns out to be quite an important issue for protecting web applications.

So let's talk about how A, B and C modern browsers do these things. As I mentioned earlier, we’ll simply use this protocol, called SSL or TLS, to protect data on the network if we use data encryption and authentication.



This is very similar to what we discussed, and includes certificate authorities, and so on. And then, of course, there are many more details. For example, TLS is extremely complicated, but we will not consider it from this point of view. We will focus on browser protection, which is much more interesting. We need to make sure that any code or data delivered over unencrypted connections cannot change the code and data received from an encrypted connection, because our threat model is such that everything unencrypted can be faked by the attacker over the network.

So if we have some kind of unencrypted JavaScript code running in our browser, we have to assume that it could have been tampered with by an intruder, because it was not encrypted. It did not pass network authentication. And, therefore, we must prevent it from interfering with any page that was delivered via an unencrypted link.

Thus, the general plan is that for this we are going to introduce a new URL scheme, which we will call HTTPS. You often see this in URLs. The new URL scheme is that now these URLs are simply different from HTTP addresses. So if you have a URL with this HTTPS: //, then it has a different origin origin than the usual HTTP URLs, because the latter go through unencrypted fixes, they go through SSL / TLS. Thus, you will never confuse these types of addresses if the same origin policy works correctly.

So this is one piece of the puzzle. But then you should also make sure that you correctly distinguish the encrypted sites from each other, as for historical reasons they use different cookie policies. So let's first talk about how we will distinguish different encrypted sites from each other.

The plan is that the hostname via the URL should be the name in the certificate. In fact, it turns out that CAs are going to sign the host name, which appears in the URL as the name of the public key of the web server. Thus, Amazon allegedly has a certificate for www.amazon.com . This is the name in our table that has a public key corresponding to their private key.



This is what the browser will look for. So if he gets a certificate, if he tries to connect or get the URL of foo.com , it means that the server exactly represents the authentic certificate of foo.com. Otherwise, let's say, we tried to contact one guy, and contacted another, because his certificate has a completely different name to which we are connected. This will be a mismatch of certificates.

This is how we will distinguish different sites from each other: we will attract CAs to help them distinguish these sites from each other, because CAs promise to issue certificates only to the correct members of the network. So this is part of the same origin policy, according to which we divide the code into parts. As you remember, cookies have a slightly different policy. They are almost the same origin, but not quite, cookies have a slightly different plan. Hooks have a so-called security flag, Secure Flag. The rule is that if a cookie has such a flag, then they are sent only in response to HTTPS requests or with HTTPS requests. Kukiz with the security flag and without such a flag correspond with each other as https requests and http.



It is a bit difficult. It would be simpler if the cookie simply indicated that this is a cookie for the HTTPS host, and this is the cookie for the HTTP host, and they are completely different. This would be very clear in terms of isolating secure sites from unsafe sites. Unfortunately, for historical reasons, cookies use this strange kind of interaction.

Therefore, if a cookie is marked as secure, it applies only to HTTPS sites, that is, it has the correct host. Secure cookies apply only to HTTPS host URLs, and unsafe cookies apply to both kinds of addresses, both for https and for http, so in just a second this will be a source of problems for us.

And the final touch that web browsers put to try to help us in this plan is an aspect of the user interface in which they are going to enter some kind of lock icon for users to see. Thus, you should pay attention to the lock icon in the address bar of your browser and the URL to find out which site you are on.

Web browser developers expect that you will behave this way: when you hit a website, you first look at the URL and make sure that this is the name of the host you want to talk to, and then find the lock icon and understand that all is well. This is an aspect of the browser user interface.

However, this is not enough. It turns out that many phishing sites will simply include a lock icon in the site itself, but use a different URL. And if you do not know what the address of this site should be, you can be deceived. In this sense, this side of the user interface is a bit confused, in part because users themselves are often confused. So it's hard to say what is right here. Therefore, we focus mainly on the second aspect, B, which is definitely much easier to discuss. Any questions about this?



Student: I noticed that some sites eventually turn from HTTP to HTTPS.

Professor:Yes, browsers evolve over time, and this is confirmed by the fact that they get the lock icon. Some browsers set the lock icon only if all the content or all resources of your page are also transmitted via https. So one of the problems that HTTPS is trying to solve forcibly is the mixed content or the problems of unsafe kinds of content embedded in the page. Therefore, sometimes you will not be able to get a lock icon because of this check. If the Chrome browser believes that the site certificate is not good enough and uses weak cryptography, then it will not give you a lock icon. However, different browsers come in different ways, and if Chrome does not give you a lock icon, then Firefox can give. Thus, again, there is no clear definition of what this lock icon means.

Let's see what problems may arise during the implementation of this plan. In normal HTTP, we are used to relying on DNS, which should give us the correct IP address on the server. How much should we trust the DNS in providing these HTTPS URLs? Do DNS servers deserve trust, or is DNS more important for us?

Student: I think you should trust, because the certificate signs the domain name, not the IP address.

Professor: Exactly, the certificate signs a domain name, such as amazon.com.

Student: suppose someone steals a private key amazon.com, connects it with another server and another IP address.

Professor:quite right, you say that both cases are possible - when someone steals a private key and redirects the DNS to itself. So, the DNS itself is quite a delicate thing, so you need to take care of its security. You are right that DNS is needed to determine the IP address we need, otherwise you can lose the host. But what happens if someone hacks the DNS server and gives us another IP address? How bad is it?

Student: maybe he will just compromise HTTPS?

Professor: in principle, it is dangerous, because in this case, the browser can refuse to establish a connection at all.

Student: No, the hacker will simply redirect you to an HTTP URL.

Professor:The fact is that if you connect to the site via HTTPS, they cannot redirect you to an unprotected site.

Student: you can sign your certificate and try to trick the user.

Professor: yes it is true. You can try to use another certificate. This is possible if you compromised the CA, or just signed your own certificate, or maybe you have some old certificate from the guy who got the secret key.

It turns out that most web browsers to force https use ask the user if something is wrong with the certificate, and this seems rather strange, because there are rules according to which the host name must match the certificate name, the certificate must be valid, it should not be expired, and the rules make it clear.

But historically, HTTPS has been implemented in such a way that web server operators have incorrectly configured it. Perhaps they simply forget to renew their certificates. Everything was going fine, you did not notice that your certificate expired, and you simply forgot to renew it. However, this does not suit the developers of web browsers. "This is just an expired certificate, so let's allow the user to use it." Therefore, they offer the user a dialog box in which they indicate that the site has a certificate, but it looks somehow wrong, but you can still continue to work with this site. Thus, web browsers seem to allow users to cancel the current rules regarding the validity of certificates.

They do the same with host names, because maybe your site has several correct names. For example, you can connect to amazon.com or www.amazon.com , or perhaps another hostname.

If you are not careful enough as a web operator, you may not even know that you need to get certificates for all possible names that your site has. In this case, the user thinks, saying to himself: “Well, yes, the host name does not look quite right, but maybe I’ll come here anyway.” This is the reason why web browsers allow users to accept a wider range of certificates than rules dictate. So to a certain extent, this choice is also becoming a security issue.

In this case, if you captured DNS, you could redirect the user to one of these sites with the wrong certificate, and if the user is not careful and his browser accepts such a certificate, then he will have problems.

So the question of how much you really should trust the DNS remains open. Of course, you don’t want to give arbitrary users control over your DNS name, but the goal of SSL / TLS and HTTPS is not to trust the DNS at all. If everything works correctly here, then DNS should not be trusted. You should not be able to undergo a DoS attack, or intercept your data, damage it, and so on.

Another interesting question to talk about is how can you attack a user who incorrectly validates a certificate? We said that if a user accepts a certificate from the wrong host or an expired certificate, what can go wrong? What should we worry about if the user makes mistakes of this kind?



Student: for example, he may perform some actions not on the site that he intended to visit. Thus, attackers can hide behind the user name.

Professor:correctly, the user can be deceived, thinking that he has all this money, but it turns out that he has no money at all, because the result page returns and says: “here is your balance”! Therefore, it is possible that the user will take some action, implying that the bank has what it does not really have, on the basis of the result obtained. It all seems bad, but not necessarily so disastrous.

Student: I think a hacker can get all the user's cookies and use them for his own purposes.

Professor:sure, here it is to be feared. This may have a more lasting effect on you. The reason why it works is that the browser, when deciding who is allowed to receive a certain set of cookies and who is not allowed, simply looks at the host name in the URL to which you should have been connected, and is governed by the origin policy. Therefore, if you connect to the web server of attackers and simply accept their certificate for amazon.com as the real thing, the browser will think that the entity with which it speaks is the real amazon.com. Therefore, it will treat it as a real server at amazon.com, which means that this server should access all the cookies that you have for this host, and presumably on the same principle of the same origin,

You may have a tab in your browser connected to a real website. Then you close your laptop and open it in another place to continue using the site. At this point, someone intercepts your connection with amazon.com and “injects” your answer. If you approve it, the hacker will be able to access the old amazon.com page, which is open in your browser, and the browser will consider that it has the same origin, because the pages have the same host name. It will be quite problematic. So if the user makes the wrong choice, approving someone else's wrong certificate, he has a chance to undergo a similar attack.

52:10 min.

The course MIT "Security of computer systems." Lecture 14: "SSL and HTTPS", part 3


Full version of the course is available here .

Thank you for staying with us. Do you like our articles? Want to see more interesting materials? Support us by placing an order or recommending to friends, 30% discount for Habr's users on a unique analogue of the entry-level servers that we invented for you: The whole truth about VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps from $ 20 or how to share the server? (Options are available with RAID1 and RAID10, up to 24 cores and up to 40GB DDR4).

VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps until December for free if you pay for a period of six months, you can order here .

Dell R730xd 2 times cheaper? Only here2 x Intel Dodeca-Core Xeon E5-2650v4 128GB DDR4 6x480GB SSD 1Gbps 100 TV from $ 249 in the Netherlands and the USA! Read about How to build an infrastructure building. class c using servers Dell R730xd E5-2650 v4 worth 9000 euros for a penny?

Also popular now: