MIT course "Security of computer systems". Lecture 13: "Network Protocols", part 1
Massachusetts Institute of Technology. Lecture course # 6.858. "Security of computer systems." Nikolai Zeldovich, James Mykens. year 2014
Computer Systems Security is a course on the development and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and security methods based on the latest scientific work. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.
Lecture 1: “Introduction: threat models” Part 1 / Part 2 / Part 3
Lecture 2: “Control of hacker attacks” Part 1 / Part 2 / Part 3
Lecture 3: “Buffer overflow: exploits and protection” Part 1 /Part 2 / Part 3
Lecture 4: “Privilege Separation” Part 1 / Part 2 / Part 3
Lecture 5: “Where Security System Errors Come From” Part 1 / Part 2
Lecture 6: “Capabilities” Part 1 / Part 2 / Part 3
Lecture 7: “Native Client Sandbox” Part 1 / Part 2 / Part 3
Lecture 8: “Network Security Model” Part 1 / Part 2 / Part 3
Lecture 9: “Web Application Security” Part 1 / Part 2/ Part 3
Lecture 10: “Symbolic execution” Part 1 / Part 2 / Part 3
Lecture 11: “Ur / Web programming language” Part 1 / Part 2 / Part 3
Lecture 12: “Network security” Part 1 / Part 2 / Part 3
Lecture 13: “Network Protocols” Part 1 / Part 2 / Part 3
So today we will talk about Kerberos, a cryptographically secure protocol designed for mutual authentication of computers and applications on the network. This is a protocol for authenticating the client and server before establishing a connection between them.
So now, finally, we will use cryptography, unlike the last lecture, where we looked at security only with the help of TCP SYN sequence numbers.
So let's talk about Kerberos. What is trying to support this protocol? It was created at our institute 25 or 30 years ago as part of the Athena project to ensure the interaction of multiple server computers and many client computers.
Imagine that you have a file server somewhere. This could be a network-connected mail server or other network services, such as printers. And all this is simply connected to a network, and not processes on the same computer.
The prerequisite for the creation of Athens and Kerberos was the fact that you had a machine for simultaneous sharing, where everything was a separate process, and everyone could just enter the same system and store their files there. Therefore, the developers wanted to create a more convenient distributed system.
Thus, this meant that on one side you would have these servers, and on the other - a bunch of workstations that users will use themselves and on which applications will run. These workstations will connect to these servers and store user files, receive their mail, and so on.
The problem they wanted to solve was how to authenticate users who use these workstations for all of these different computers in the server side, without having to trust the network and verify its correctness. This in all respects was a reasonable design requirement. I have to mention that at that time the alternative to Kerberos was the R login commands, discussed in the last lecture, which seemed like a bad plan, as they simply use their IP addresses to authenticate users.
Kerberos was quite successful, it is actually still used on the MIT network and is the core of Microsoft’s Active Directory server. Almost every Microsoft product based on the kind of Windows Server uses Kerberos in one form or another.
But this protocol was developed 25 or 30 years ago, and since then changes have been needed, because today people understand security much more. Thus, today's version of Kerberos is noticeably different in many respects from the version described in the materials for this lecture. We will consider which assumptions are not good enough today and what was wrong with the first version. This is unavoidable for the first protocol that actually used cryptography to authenticate network participants in a full-scale system.
In any case, the diagram shown on the board is a kind of setup for creating Kerberos. It is interesting to find out what the model of trust is here. Therefore, an additional structure is introduced into our scheme - the Kerberos server located here at the side.
Thus, our third model is in some sense based on the fact that the network is unreliable, as we mentioned in the last lecture. Who should we trust in this Kerberos scheme? Of course, all members of the network must trust the Kerberos server. Thus, the creators of the system at one time assumed that the Kerberos server would be responsible for all checks of network authentication in one form or another. What else do we have on this network, what can we trust?
Student: users can trust their own machines.
Professor:Yes, this is a good argument. There are users I didn't draw here. But these guys use some kind of workstation, and in fact, in Kerberos it’s very important that the user trusts his workstation. What happens if you do not trust your workstation? Because if the user does not trust the workstation, then you can simply "smell out" your password and act on your behalf.
Student: An attacker can do much more, for example, by finding out your ticket to the Kerberos server.
Professor:Yes exactly. When you log in, you enter your password, which is even worse than a ticket. So actually, here there is a small problem with Kerberos, if you do not trust the workstation. If you use your own laptop, it is not so bad, but the security of a public computer is in doubt. We will look at what exactly might go wrong in this case.
Student: you must trust server administrators and be sure that they can have privileged access to each other’s servers.
Professor: I think that the machines themselves do not have to trust each other, for example, the mail server does not have to trust the print server or the file server.
Student:not to trust, but to be able to access a server that is not supported through another server.
Professor: yes it is true. If you establish a trust relationship between the mail server and the print server, but at the same time just for convenience, give the mail server access to your files on the file server, then this can be misused. So you need to be careful about introducing additional levels of trust or redundant trust relationships here.
What else matters here? Should servers somehow trust users or workstations? I guess not. Kerberos’s global goal was that the server should not, a priori, know all these users or workstations, or know how to authenticate them, until these users can cryptographically prove that they are legitimate users and should have access to their data or something. more than the server manages.
Let's take a look at how Kerberos works and what its overall architecture is. Let's draw the Kerberos server on a larger scale. Nowadays, it is called the KDC - Key Distribution Center, or Key Grant Center. Somewhere here are located users and services to which you can connect. The plan is that the Kerberos server will be responsible for storing a shared key for communication between the Kerberos server and each computer entity in the world around it. Thus, if a user has some client key Kc, then the Kerberos server remembers this key and stores it somewhere inside of it. Similarly, the Ks key for the service will be known only to this service itself, the Kerberos server, and no one else. So you can think of it as a general use of passwords, when you know the password and Kerberos knows it,
So you're going to prove to each other that "I am that guy." Of course, the Kerberos server will have to track who owns this key, so it must have a table in which user names and services will be stored, for example, serv afs (this is a file server), and the corresponding keys.
In this case, the KDC is responsible for storing a giant table, which is not very large in terms of the number of bytes, but very voluminous in terms of the number of records, because it takes into account any computer entity that lives in the MIT network, which the Kerberos server should know about. Thus, we have two kinds of interface.
In the materials of the lecture this is not clearly stated, that is, the existence of these 2 interfaces is simply implied. In fact, there are actually two interfaces for one machine. One of them is called Kerberos, and the second is TGS, Ticket Granting Service, or Ticket Service.
In fact, in the end, these are just two ways to talk about the same thing, and the protocol is only slightly different for these two things. Therefore, initially, when a user logs in, he “speaks” with the upper interface, Kerberos and sends him his client name C, this can be your username on the Athena university network.
The server responds to this request with a tgs ticket or ticket information; we will look at the details of this information a little later. Then, when you want to talk to a service, you must first access the TGS interface and tell it: “I have already logged into the system through the Kerberos interface and now I want to talk to the S server, which will provide me with a certain service.”
So you tell TGS about the server you want to talk with, after which it will return you something like a ticket to talk to server S. Then you can finally talk to the server you need, using the received ticket for server S.
This is a kind of high level plan. So why is it using 2 interfaces? On this occasion, you can ask a lot of questions. In the case of the Ks server, this service is likely to be stored on disk. And what happens to this Kc on the user's side? Where does this Ks come from in Kerberos?
Student: this Kc should be in the database, in the table of the KDS server.
Professor: yes, well, the C key is here in the table, in this gigantic database. But it must also be known to the user, because the user must prove that he is a user.
Student: is it a one-way function that then requires a password?
Professor:Yes, they actually have such a smart plan, where Kc is obtained by hashing a user password or some kind of key generation function, for this there are several different techniques. But basically we take the password, convert it in some way, and get this key Kc. So it seems like a good way.
But why do we need two protocols? After all, you can imagine that you simply request a ticket directly from the first Kerberos interface, telling him: “hey, I want a ticket for this particular name!”, He will send you the ticket back, and you can decrypt it using Kc.
Student: maybe they don’t want the user to re-enter their password every time they want to access another service?
Professor:That's right, the reason for the difference between these two interfaces is that from the first interface all the answers are returned encrypted with your Kc key, and the creators of Kerberos were worried about the possibility of saving this Kc for a long time. Because either you have to ask the user to enter the password every time, which is just annoying, or he constantly “sits” in the memory. Basically, it’s as good as a user’s password, because someone with access to Kc can retain access to the user's files as long as the user may not change their password, or even longer. Later we will take a closer look at this issue.
So leaking this Kc key is a very dangerous thing. So the whole point of using the first and then the second interface first for all subsequent requests is that you can actually forget Kc as soon as you decrypt the response from the TGS interface of the Kerberos server. From now on, even in the event of a key leak, the functionality will depend on the ticket received. So in the worst case, someone will get access to your account for a couple of hours, and not for an unlimited amount of time. This is what explains this scheme with two ways of accessing the same resources.
So, before we delve into the mechanics of how these protocols actually look online, let's talk a bit about the aspect of names in Kerberos. In a sense, Kerberos can be considered a registry of names. He is responsible for displaying these cryptographic keys as string names. This is a fundamental type of operation that Kerberos performs. You will see in the next lecture why we need a similar function. It can be implemented differently than in Kerberos, but it is fundamentally very important to have such a thing in almost any distributed security system. So let's see how Kerberos acts with names.
Kerberos has some sort of system call for each computer entity in the database of network participants, and the main type of this data is just a string. So you can have some basic names like nickolai. This is the name string.
It is the main parameter in some Kerberos realm, in fact, this thing is in the left column of the KDC table. And there are also some additional parameters that the protocol supports. I could, for example, enter another name like nickolai.extra sec that would be used in addition to the name nickolai to access resources that need extra security. So maybe I will have one password for really safe things and another password for my regular account.
This aspect is mentioned in the Kerberos work. Therefore, one may wonder - where does the impact come from? The Kerberos service matches names for certain keys for you, but how do you know what name you need to ask or what name to expect in response when you communicate with some computer? That is, I ask, what kind of names appear outside the Kerberos server, or where exactly do these user names appear? Do you have any ideas?
Student: Presumably, you can ask for usernames from the MIT server.
Professor: yes, of course. This is how you can list these things. In addition, users simply enter them when they log into the system, this is where they originally come from. Do usernames appear anywhere else? Should they appear somewhere else?
Student: perhaps, user access is indicated in the lists in various services.
Professor: yes, this is a really important point, right? The purpose of Kerberos is simply to associate keys with names. But that does not tell you what this name should have access to.
In fact, the way applications typically use Kerberos is that one of these servers uses Kerberos to figure out which line name it is talking to. When the mail server receives a connection from some workstation, it receives a Kerberos ticket, which proves that this user is called Nikolay. After that, the internal mail server finds out what this user has access to. Similarly comes the file server.
Thus, within all of these servers there are access control lists, possibly lists of groups or other things that authorize. So Kerberos provides authentication, which shows you who this person you are talking to is. The service itself is responsible for implementing the part of the authorization that decides what level of access you should have based on your username. So we figured out where the usernames appear. There are other basic names that Kerberos supports to interact with services.
According to the materials of the lecture, the services look like this: rcmd.hostname. The reason why you need a name for one of these services is that you want, for example, when connecting to a file server, to perform mutual authentication. This means that in this procedure, not only the final server finds out who I am, but I, the user or the workstation, are convinced that I am talking to the correct file server, and not to some fake file server that has forged my files. Because, perhaps, I want to see the file with my estimates and send it to the registrar. So it would be too bad if some other file server could play the role of the right server and give me the wrong rating file.
Therefore, services also need their own name, and workstations must figure out which name I expect to see when I connect to the service.
As a rule, at some level it comes from the user. So, for example, if I type ssh.foo, it means that I should expect the appearance of the main Kerberos name like rcmd.foo at the other end of this connection. And if someone else is there, then the SSH client should break the connection and not allow me to connect, because then I will be misled and talk to some other machine.
This raises an interesting question. When can we reuse names in Kerberos? For example, you all have accounts in the Athena institute system. When you graduate, can MIT delete your record in the database and allow someone else to register the same username? Would that be a good idea?
Student: but, after all, not only the Kerberos database, but also the services have a list of user names?
Professor: yes, because these names are in fact simply represented by string entries somewhere in the ACL on a file or mail server. If we erase your entry in the Kerberos server database, this does not mean that your entry has disappeared altogether. These entries are version independent.
For example, the record says that Alice has access to some Athens locker. Then Alice finishes the institute, and her record is deleted, but some new Alice enters the institute and is going through the registration process in the Kerberos database. At the same time, she gets the main name, completely identical to the name of the old Alice, so the file server can give the new Alice access to the files of the old Alice.
Thus, in Kerberos, there is a complicated process of restoring the principal names of the participants, because there is no real connection between the Kerberos server and the servers of the services or the checking of the relevance of the versions. Therefore, reusing basic names is quite difficult, and once you register that name, you probably don’t want to reuse it too often.
The same is true for the main service names. As long as this hostname is associated with some well-known service that people expect certain functions from, you probably don’t want to get rid of your key, even if that service stops working. Because maybe a year later, some guy will try to connect to this name and will expect from him a certain kind of things. And if this name was reused for another service, this guy might be deceived by his expectations. There is probably nothing dramatic about it, but you should still be careful about re-using basic names in a protocol of this kind.
Now let's see how the protocol itself works. We will start with this step, where you initially get a ticket with a password, and then consider how the TGS interface works.
So, there is a basic data structure that Kerberos uses, and it is called a “ticket”. And this ticket between client and server looks like this: it contains server and client s names and s, client's addr IP address, time stump time stamp, life, indicating how long this ticket is valid, and Kc key , s, which is common to the client and server. All this is recorded in the ticket.
So that's what is written in the ticket.
We also have another original data structure, which Kerberos calls the “Authenticator”. This Ac authenticator is associated with a specific client and includes only the client name, the client’s IP address and a time stamp indicating when the client created this authenticator. Both of these things are usually encrypted. The authenticator is encrypted with the key Kc, s, which is common for the client and the server, and the Kerberos ticket itself is encrypted with the key for the Ks service. Thus, the subscript here refers to encryption using a specific key.
Let's try to figure out what the protocol is with which the user initially logs into Kerberos and gets a TGS ticket. As we saw earlier, the plan is that the client is going to send his username to the Kerberos server, or its interface, and in return receive a ticket. In this case, the client transmits both main names: the name of the client of a particular service C, for which he wants to get a ticket, and the actual name of service S, which is contained in the table of the TGS database. So you can get a ticket to any service.
The answer will be a ticket between the client and the server Tc, s, encrypted with the key Ks, which is shown above in the figure, also containing the common key Ks, s, and all this is encrypted with the key Kc. This will be a wired protocol.
Let's try to figure out a couple of things. First of all, how does the Kerberos server authenticate the client here? How does he know that the request is made by the correct user?
Student: he can make sure that this is the same ticket that he sent because he has Kc.
Professor: I think what happens is that the Kerberos server really does not know whether this is the correct client or not. But he thinks like this: “well, for me it doesn’t matter who makes this request. I just send him this thing, because only the person who should be able to use it knows this Kc key. ” This is really cool because the client doesn't need to send his password over the network at all.
Thus, in a sense, it is actually better than if the client sent its password to the Kerberos server, because even if the Kerberos server tried to write down the passwords, it would never have received your password. Or, maybe, if someone pretended to be a Kerberos server, he would not get a copy of your password, so there is nothing wrong with security.
Student: and if the attacker wants to get your password offline without ...
Professor:yes, in fact, this is not the best aspect of Kerberos, right? Do you see what the problem is? The problem is that the way in which the client can tell if he has received the correct password or not, or the workstation can say that the client is using the correct password, is that they are trying to decrypt this ticket, and only then they see he works or not.
Decryption is pretty cheap here, it’s symmetric encryption, and you can probably do millions of decryptions per second on modern machines, if you try hard. This means that you can try millions of potential passwords per second and guess what kind of password this person has. And it can be done for the password of any person. You can simply send his primary name to the Kerberos server, and he will happily return you the answer, encrypted with the user's password. Then you can just try to pick up different passwords and see which one works and which one doesn't.
Student: Is decoding the content an advantage to authenticate? How can we be sure that you are directly ...
Professor:Yes, this is actually another interesting aspect. It suggests that the developers of Kerberos, when creating it, did not quite understand that they must very carefully separate encryption from authentication. There is a hint in the lecture materials that whenever you encrypt a piece of data and send it to someone else and this person can decipher the data and make sure that everything is fine with them, this means that he has the right key and they were not damaged by data transfer. After 30 years, it does not look like a good plan, although at the time it was not so obvious.
Now in version Kerberos 5, it does this: both the server and the client both encrypt all parts of the data and authenticate the message by calculating the hash using the key. In this case, the result actually tells you that yes, this part of the data is not damaged, it is correctly signed with this key, and so on.
In the Kerberos 4 version of the ticket, there were some extra bits that were encrypted and were supposed to be certain patterns, such as zeros. If you got the key wrong, this template would not look like solid zeros. This mechanism was not guaranteed cryptographically, but in most cases the pattern did not look like solid zeros and therefore you could understand whether you received the right key or not.
So, this is the plan, how to inform the client whether his ticket is valid. They are just trying to decipher it and see how it works. Another interesting question - why is this key Kс, s in some form included in the ticket twice? It is present in the ticket separately as the key Kc, s and is present implicitly in the ticket Tc, s itself. Why do we have two copies of the key Kc, s?
Course MIT "Computer Security". Lecture 13: "Network Protocols", part 2
Full version of the course is available here .
Thank you for staying with us. Do you like our articles? Want to see more interesting materials? Support us by placing an order or recommending to friends, 30% discount for Habr's users on a unique analogue of the entry-level servers that we invented for you: The whole truth about VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps from $ 20 or how to share the server? (Options are available with RAID1 and RAID10, up to 24 cores and up to 40GB DDR4).
VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps until December for free if you pay for a period of six months, you can order here .
Dell R730xd 2 times cheaper? Only here2 x Intel Dodeca-Core Xeon E5-2650v4 128GB DDR4 6x480GB SSD 1Gbps 100 TV from $ 249 in the Netherlands and the USA! Read about How to build an infrastructure building. class c using servers Dell R730xd E5-2650 v4 worth 9000 euros for a penny?