MIT course "Security of computer systems". Lecture 12: "Network Security", part 1

Original authors: Nickolai Zeldovich, James Mickens

Massachusetts Institute of Technology. Lecture course 6.858, "Security of computer systems." Nickolai Zeldovich, James Mickens. 2014


Computer Systems Security is a course on the design and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and techniques for achieving security, based on recent research. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware security, and security in web applications.

Lecture 1: “Introduction: threat models” Part 1 / Part 2 / Part 3
Lecture 2: “Control of hacker attacks” Part 1 / Part 2 / Part 3
Lecture 3: “Buffer overflow: exploits and protection” Part 1 / Part 2 / Part 3
Lecture 4: “Privilege Separation” Part 1 / Part 2 / Part 3
Lecture 5: “Where Security System Errors Come From” Part 1 / Part 2
Lecture 6: “Capabilities” Part 1 / Part 2 / Part 3
Lecture 7: “Native Client Sandbox” Part 1 / Part 2 / Part 3
Lecture 8: “Network Security Model” Part 1 / Part 2 / Part 3
Lecture 9: “Web Application Security” Part 1 / Part 2 / Part 3
Lecture 10: “Symbolic execution” Part 1 / Part 2 / Part 3
Lecture 11: “Ur / Web programming language” Part 1 / Part 2 / Part 3
Lecture 12: “Network security” Part 1 / Part 2 / Part 3

Today we will talk about network security. In particular, we will discuss Steven Bellovin's paper "A Look Back at 'Security Problems in the TCP/IP Protocol Suite'". He used to work at AT&T and is now at Columbia. What is interesting about this paper is that it is relatively old, more than 10 years old, and it is in fact a commentary on a paper that came out a decade before that, in 1989.

Many of you ask why we study this if many of the problems described there have already been fixed in today's versions of TCP.



That is true: some of the problems Steven describes have since been fixed, while others are still problems today. We will go through them and see what happened. You may also wonder why people did not solve all these problems in the first place, when they were designing TCP. What were they thinking?

And it is not really clear. What do you think? Why was TCP not designed with the necessary security, given all these concerns? Any thoughts?

Student: at that time, the Internet was a much more trusting place.

Professor: yes, that is literally a quote from his paper. At that time... the Internet protocol suite was designed, I think, about 40 years ago, and the requirements were completely different. The goal was simply to connect a bunch of relatively trusting sites that knew each other by name.

I think this often happens to any system that becomes successful: its requirements change. It used to be a protocol for a small number of sites, and now it covers the whole world. You no longer know everyone connected to the Internet by name. You cannot call them on the phone if they do something bad, and so on.

So I think the story is the same for many of the protocols we look at. Many of you wonder: "What the hell were these guys thinking? This is so flawed!" But in reality they were designing a completely different system, which was later adapted to modern needs.

The same goes for the Internet, which, as we have seen over the last couple of weeks, was designed for a very different purpose. But it has grown, and we now have new concerns about how to adapt these protocols to modern requirements.

One more thing happened, somewhat suddenly: people had to reassess how serious the security problem was. Before that, you did not really understand everything you should worry about, because you did not know what an attacker could do to your system.



Partly for this reason, I think it will be interesting to see what happened with TCP security, what went wrong, how we can fix it, and so on. From that we want to work out which kinds of problems to avoid when designing our own protocols, and what the right way of thinking about such attacks is. How do you figure out what an attacker might be able to do to your own protocol while you are still designing it, so that you can avoid such pitfalls?

Okay, let's leave the preamble aside and talk about this article.

So how should we think about network security? I think we can start from first principles and try to work out what our threat model is. What can an attacker do in our network?

He can probably intercept packets, and perhaps modify them. So if you send a packet over the network, it is reasonable to assume that some bad guy will see your packet and may be able to change it before it reaches its destination. He may also be able to drop it, and to inject his own packets, with arbitrary content, that you never sent.



But the ability of bad guys to take part in your protocols is more dangerous. An attacker has his own computer, which he fully controls. Even if all the computers you trust behave properly, a bad guy with his own computer can interfere with your protocol or with the operation of your system.

So if you have a routing protocol that involves a lot of parties talking to each other, at some scale it is probably impractical to keep the bad guys out. If a routing protocol runs among 10 participants, then perhaps you can just call all of them and say, "well, yes, guys, I know all of you."

But at the scale of today's Internet it is impossible to know directly who all the other participants in the protocol are. So some bad guy is probably going to take part in your protocols or distributed systems. It is therefore important to design distributed systems that can nevertheless do something reasonable in that situation.

OK, so what are the consequences of all this? Let us go through the list. Packet interception is generally easy to understand: you should not send any important data over the network if you expect a bad guy to intercept it, or at least you should not send it in clear text. Perhaps you should encrypt your data.



This seems relatively easy to understand, although you still need to keep it in mind when designing protocols. Packet injection leads to a wider range of interesting problems, which this paper discusses. In particular, an attacker can inject packets that pretend to come from any other sender. Data travels over IP, and each packet carries a header with the packet's source IP address and destination IP address. No one checks that the source is necessarily correct. Nowadays there is some filtering, but it is not perfect and it is hard to rely on.

So, to a first approximation, an attacker can put any IP address in the source field, and the packet will still be delivered to the right destination. It is interesting to work out what an attacker can do with the ability to send arbitrary packets.
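To make the point about the unchecked source field concrete, here is a minimal sketch that only packs and unpacks a 20-byte IPv4 header in memory; it sends nothing. The function names and the documentation-range addresses are illustrative assumptions, not anything from the lecture or the paper.

```python
import socket
import struct

IPV4_HEADER = "!BBHHHBBH4s4s"  # standard 20-byte IPv4 header without options


def checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by the IPv4 header."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(src: str, dst: str, payload_len: int) -> bytes:
    """Pack a header; `src` is simply whatever the sender chooses to write."""
    fields = [
        (4 << 4) | 5,            # version 4, header length 5 words
        0,                       # type of service
        20 + payload_len,        # total length
        0,                       # identification
        0,                       # flags and fragment offset
        64,                      # time to live
        socket.IPPROTO_TCP,      # protocol
        0,                       # checksum placeholder
        socket.inet_aton(src),   # source address: not verified by the format
        socket.inet_aton(dst),   # destination address
    ]
    fields[7] = checksum(struct.pack(IPV4_HEADER, *fields))
    return struct.pack(IPV4_HEADER, *fields)


def parse_source(header: bytes) -> str:
    """What a receiver reads as the sender's address."""
    return socket.inet_ntoa(header[12:16])


header = build_ipv4_header("192.0.2.1", "198.51.100.7", payload_len=0)
print(parse_source(header))
```

The checksum covers the source address, but the sender computes it, so it protects against corruption and not against a sender who lies. Nothing in the header format ties the source field to the machine that actually built the packet.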

In previous weeks we looked at things like buffer overflows and web security, and at how an attacker could exploit an implementation bug such as a buffer overflow. Interestingly, the author of this paper is not really interested in implementation bugs; he is more interested in flaws in the protocol itself.

So what's so special about it? Why did he not pay attention to implementation errors, although we spent several weeks on their consideration? Why does it matter?

Student: because we have to eliminate these errors when writing the protocol.

Professor: yes, a bug in the design of the protocol is a really big problem, because it is hard to change. If you have an implementation bug, say a memcpy or a printf that did not check the memory range, that is fine: you add a range check, everything still works, and the buffer overflow is gone. So that is great.

But if the bug is in the protocol specification, in how the protocol is supposed to work, then fixing it means fixing the whole protocol, which potentially affects every system that speaks it. So a problem found in TCP is potentially quite disruptive: every machine that uses TCP would have to change, and it may be very hard to make the modified protocol compatible with old machines.

The TCP flaws that Steven was worried about are fundamental, which is why he decided to write about them. In the first example, he looks at how TCP sequence numbers (SN) work.

Student: it's a little off topic, but I'm just curious. Suppose you find a bug in TCP. How do you change it? How do you tell all the computers in the world to change?

Professor: yes, I think this is a huge problem. What if you find a bug in TCP? It is not clear what to do, and I think the author is struggling with this here. If you could redesign TCP, many of these bugs would be relatively easy to fix, once you know what to look for.

But since TCP is rather difficult to fix or change, what ends up happening is that developers look for backward-compatible tweaks that either let old implementations work alongside the new one, or add some extra field that makes the connection somewhat more secure.



But this is a big problem. If some security problem is deeply rooted in TCP, it becomes a huge problem for everyone, because it is very difficult to simply upgrade everyone to, say, TCP version n + 1.

IPv6 is an example of such an upgrade not happening, even though we have known for 15 or 20 years that the problem was coming. IPv6 has been around for more than 10 years, but it is hard to convince people to move away from IPv4. IPv4 is good enough for them, it seems to work, and they think that switching to a new Internet protocol would be too expensive. They think: "no one else speaks IPv6, so why should I start speaking this strange protocol that I have no one to speak to in?" There is some forward movement, but I think it will take a lot of time. There really has to be some motivation to migrate, and backward compatibility helps a lot here.

IPv6 has many backward compatibility features; for example, you can talk to an IPv4 host from IPv6. So developers try to design in this kind of support, but it is still difficult to convince people to upgrade.

So, considering the TCP sequence numbers, we’re actually going to look at two issues that are related to how the TCP handshake works. So let's spend some time looking at how a TCP connection is initially established.

Three packets are sent to establish a new TCP connection. The client generates a packet to connect to the server: its source is the client's IP address C and its destination is the server S. The packet header has a number of fields, but the one we care about is the sequence number. The packet carries a SYN flag, which says "I want to synchronize state and establish a new connection", and it includes the client's sequence number, SNc.



Then, when the server receives this packet, it says: "a client wants to connect to me, so I will send a packet back to that address, whoever it is that claims to be contacting me." So the server sends the client a packet that carries its own sequence number, SYN(SNs), and acknowledges the client's, ACK(SNc). Finally, in the third packet, the client replies to the server and acknowledges the server's sequence number, ACK(SNs). Now the client can start sending data.
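The three packets described above can be sketched as a small model. It follows the lecture's simplified notation, in which ACK(SNc) echoes the number itself (real TCP acknowledges SNc + 1); the `Packet` type, the function names, and the sample numbers are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class Packet(NamedTuple):
    src: str
    dst: str
    syn: Optional[int] = None  # sequence number announced with the SYN flag
    ack: Optional[int] = None  # sequence number being acknowledged


def client_syn(client_ip: str, server_ip: str, sn_c: int) -> Packet:
    """Packet 1: C -> S, SYN(SNc)."""
    return Packet(src=client_ip, dst=server_ip, syn=sn_c)


def server_syn_ack(syn_packet: Packet, sn_s: int) -> Packet:
    """Packet 2: S -> C, SYN(SNs), ACK(SNc).

    The reply goes to whatever source address the SYN claimed.
    """
    return Packet(src=syn_packet.dst, dst=syn_packet.src,
                  syn=sn_s, ack=syn_packet.syn)


def client_ack(syn_ack_packet: Packet, sn_c: int) -> Packet:
    """Packet 3: C -> S, ACK(SNs)."""
    if syn_ack_packet.ack != sn_c:
        raise ValueError("server acknowledged an unexpected sequence number")
    return Packet(src=syn_ack_packet.dst, dst=syn_ack_packet.src,
                  ack=syn_ack_packet.syn)


def server_accepts(ack_packet: Packet, sn_s: int) -> bool:
    """The server considers the connection established only if SNs is echoed."""
    return ack_packet.ack == sn_s


def handshake(client_ip="C", server_ip="S", sn_c=1000, sn_s=7000):
    p1 = client_syn(client_ip, server_ip, sn_c)
    p2 = server_syn_ack(p1, sn_s)
    p3 = client_ack(p2, sn_c)
    return [p1, p2, p3], server_accepts(p3, sn_s)


packets, established = handshake()
for p in packets:
    print(p)
print("established:", established)
```

The only thing the server checks in the third packet is that it echoes SNs, a value the server sent to the address the first packet claimed to come from.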

So, to send data, the client puts the data in a packet and attaches its sequence number, data(SNc), to show that this really is legitimate client data from the start of the connection. It shows, for example, that this is not data from later in the stream that just happens to arrive now because the server missed some of the initial pieces.



So, as a rule, all these sequence numbers are meant to ensure in-order delivery of packets. If the client sends two packets, the one with the initial sequence number is the first chunk of data, and the one with the next sequence number is the next chunk. But sequence numbers also turn out to be useful for providing some security properties.

This is an example of what I said earlier about requirements changing. Initially, no one thought TCP should provide any security properties. But then applications started using TCP and came to rely on TCP connections, believing that an attacker could not break them or inject malicious data into an existing connection. All of a sudden, a mechanism originally intended only for ordering packets was being used to guarantee some kind of security for these connections.

So in this case, I think the problem comes down to what the server can assume about this TCP connection. Typically the server assumes, implicitly, that the connection was established with the right client at IP address C, and that seems a natural thing to assume. But is there any basis for this assumption? If the server receives a message carrying data on this client-server connection with sequence number SNc, why does the server conclude that the data was sent by the real client?

Student: because the sequence number is difficult to guess.
Professor: correct, so this is the implicit assumption: the data has to carry a valid sequence number SNc. And for the connection to be established, the client must have acknowledged the server's sequence number SNs, which the server sent only to the client's IP address.

Student: How many bits are available for the sequence number?

Professor: TCP sequence numbers are 32 bits. If the number were completely random, it would not be easy to guess; trying would take a lot of bandwidth.

Student: is the sequence number higher than the initial sequence number?

Professor: yes, in principle these things increase. Sending a SYN counts as one byte of sequence space. So if the first line had SYN(SNc), then the data in the fourth line starts at SNc + 1, and the numbering continues from there. If you then send 5 bytes, the next sequence number is SNc + 6. It simply counts the bytes you send, with the SYN counting as 1 byte. The TCP specification recommends choosing initial sequence numbers so that they increase at some roughly fixed rate. The original RFC suggested incrementing them at a rate of about 250,000 units per second.
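The byte counting in this answer can be checked with a few lines of arithmetic. This is a small sketch; the helper name `next_seq` and the starting value 1000 are illustrative assumptions.

```python
SEQUENCE_SPACE = 2 ** 32  # sequence numbers are 32 bits and wrap around


def next_seq(seq: int, payload_len: int = 0, syn: bool = False) -> int:
    """Sequence number expected after a segment starting at `seq`.

    Each payload byte consumes one sequence number, and a SYN consumes one too.
    """
    return (seq + payload_len + (1 if syn else 0)) % SEQUENCE_SPACE


sn_c = 1000                                       # stand-in for the client's SNc
after_syn = next_seq(sn_c, syn=True)              # SNc + 1: where the data starts
after_data = next_seq(after_syn, payload_len=5)   # SNc + 6 after five bytes
print(after_syn - sn_c, after_data - sn_c)
```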



The reason they were not completely random is that sequence numbers are also used to keep out-of-order packets, or packets from previous connections, from being mixed into new connections. If you chose a completely random sequence number every time you established a new connection, there would be some chance that, when you open connections over and over, a packet from a previous connection would have a sequence number close enough to that of your new connection to be accepted by the server as valid data for the new connection.

So this is what the TCP designers were very worried about: out-of-order or delayed packets. As a result, they wanted these sequence numbers to advance fairly monotonically over time, even across connections.

If I open a connection, it may have the same source and destination port numbers, IP addresses, and so on as an earlier one. But because I established it now rather than earlier, packets from the earlier connection will, I hope, not match the sequence numbers of my new connection. So this was a mechanism to prevent confusion between repeated connections.

Student: if you do not know exactly how far the sequence numbers will advance, how do you know that the packet you receive is the next packet and not part of a previous one you ...

Professor: as a rule, you remember the last packet you received, and the next sequence number is exactly the next packet in the sequence. So, for example, the server knows it has seen exactly SYN(SNc), the packet that opened the connection, so the next thing has to be data(SNc + 1).

Student: so, you say that when you set a sequence number, even after that you ...

Professor: well, of course, the initial sequence numbers are chosen according to some scheme when the connection is set up. We will talk about that scheme. You might think of them as random, but over time the initial sequence numbers for new connections should advance in some consistent way.

But within a single connection, once it is established, that is it: the starting sequence numbers are fixed, and they simply advance as data is sent over the connection.

There were schemes proposed for managing these sequence numbers. They were in fact reasonable schemes for avoiding duplicate packets in the network, which caused problems. But of course the trouble was that attackers could guess these sequence numbers, because there was not much randomness in how they were chosen.

So the host would keep a running counter in memory that it incremented by 250,000 every second. And whenever a new connection came in, it would also bump the counter by some constant, such as 64k or 128k, I forget the exact number. So it was relatively easy to guess the number: you simply send a connection request and see which sequence number comes back.

Then you know that the next one will be 64k higher. So there was not enough randomness in this protocol.
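A toy model of such a counter shows why one probe is enough to predict the next value. This is only a sketch of the scheme as described in the lecture: the class and function names are made up, 64,000 stands in for the "64k or 128k" constant, and time is passed in whole milliseconds to keep the arithmetic exact.

```python
SEQUENCE_SPACE = 2 ** 32
TICKS_PER_MS = 250             # 250,000 increments per second
PER_CONNECTION_BUMP = 64_000   # assumed value for the per-connection constant


class OldStyleIsnCounter:
    """Global counter driven by the clock and bumped on every new connection."""

    def __init__(self, start: int = 0) -> None:
        self.counter = start % SEQUENCE_SPACE

    def advance_clock(self, milliseconds: int) -> None:
        self.counter = (self.counter + milliseconds * TICKS_PER_MS) % SEQUENCE_SPACE

    def new_connection(self) -> int:
        isn = self.counter
        self.counter = (self.counter + PER_CONNECTION_BUMP) % SEQUENCE_SPACE
        return isn


def predict_next_isn(observed_isn: int, elapsed_ms: int = 0) -> int:
    """Guess the server's next ISN from one observed value and the elapsed time."""
    return (observed_isn + PER_CONNECTION_BUMP
            + elapsed_ms * TICKS_PER_MS) % SEQUENCE_SPACE


server = OldStyleIsnCounter(start=123_456)
probe = server.new_connection()    # the attacker's own, legitimate connection
server.advance_clock(3)            # a few milliseconds pass
guess = predict_next_isn(probe, elapsed_ms=3)
print(guess == server.new_connection())
```

In practice the attacker would not know the elapsed time exactly, so the guess would be a small range of candidates rather than a single value; the model ignores that.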

We can just sketch out what it looks like. Imagine that I am an attacker who wants to connect to the server, but at the same time pretend that the request comes from a specific IP address.
What I can do is send a request to the server, just like the first step of the handshake above, with some initial sequence number of my own choosing. At this point any sequence number is as good as any other, because the server has no expectations about what it should look like.

So what does the server do? It receives the same packet as before and does the same thing as before: it sends a packet back to the client with some server sequence number and acknowledges SNc. Now, to complete the connection, the attacker needs to synthesize a packet that looks exactly like the real client's third packet, the one sent from the client to the server.

That is simple enough, you just fill in these values in the header. But you also have to acknowledge the server's sequence number: ACK(SNs).

This is where the problems begin.



If the value of SNs is relatively easy to guess, the attacker is good to go, because now the server thinks it has established a connection with the client at IP address C.

Now the attacker can inject data into this connection, just as before. He simply synthesizes a packet that carries the data and the client's sequence number, which the attacker himself chose: a message of the form data(SNc + 1).



But it all hinges on being able to guess this particular server sequence number, SNs. Is that clear?
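The whole exchange can be put together in a toy, in-memory model. Nothing here touches a network: `ToyServer`, the labels "A", "C", and the per-connection step of 64,000 are illustrative assumptions, and ACKs follow the lecture's simplified notation of echoing SNs.

```python
import itertools

SEQUENCE_SPACE = 2 ** 32
ISN_STEP = 64_000  # assumed fixed increment between consecutive server ISNs


class ToyServer:
    def __init__(self, first_isn: int) -> None:
        self._isn = itertools.count(first_isn, ISN_STEP)
        self.half_open = {}     # claimed source -> (SNc, SNs)
        self.established = {}   # claimed source -> next expected client seq
        self.sent = []          # (destination, SNs) of every SYN-ACK sent

    def on_syn(self, claimed_src: str, sn_c: int) -> None:
        sn_s = next(self._isn) % SEQUENCE_SPACE
        self.half_open[claimed_src] = (sn_c, sn_s)
        # The SYN-ACK is addressed to the claimed source, not the real sender.
        self.sent.append((claimed_src, sn_s))

    def on_ack(self, claimed_src: str, acked_sn_s: int) -> bool:
        pending = self.half_open.get(claimed_src)
        if pending is None or pending[1] != acked_sn_s:
            return False
        del self.half_open[claimed_src]
        self.established[claimed_src] = (pending[0] + 1) % SEQUENCE_SPACE
        return True

    def on_data(self, claimed_src: str, seq: int, payload: bytes):
        if self.established.get(claimed_src) != seq:
            return None
        self.established[claimed_src] = (seq + len(payload)) % SEQUENCE_SPACE
        return payload


def run_attack():
    server = ToyServer(first_isn=5_000)
    # Step 1: attacker A opens a normal connection and reads SNs from the reply.
    server.on_syn("A", 1)
    _, probe_sn_s = server.sent[-1]
    guess = (probe_sn_s + ISN_STEP) % SEQUENCE_SPACE
    # Step 2: attacker sends a SYN claiming to be C; the SYN-ACK goes to C.
    server.on_syn("C", 42)
    reply_destination = server.sent[-1][0]
    # Step 3: attacker never sees that SYN-ACK but acknowledges the guessed SNs.
    accepted = server.on_ack("C", guess)
    # Step 4: data(SNc + 1) is now accepted as if it came from C.
    delivered = server.on_data("C", 43, b"hello")
    return reply_destination, accepted, delivered


print(run_attack())
```

A wrong guess makes `on_ack` return `False` and the spoofed data is never accepted, which is why everything depends on how predictable SNs is.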

Student: What is the reason that the server sequence numbers are not entirely random?

Professor: there are two reasons. One, as I described earlier, is that the server wants to make sure that packets from different connections do not get confused with each other over time. If you open a connection from a source port to a destination port, close it, and then open another connection between the same source and destination ports, you want to make sure that packets from the first connection do not show up in the second.

Student: so the server sequence number is incremented, that is, it increases for each packet?

Professor: the sequence numbers within a connection track all the data in the connection. But the question here is how the initial sequence number is chosen.

It is chosen anew every time a connection is established, from a counter that keeps increasing. So the hope is that by the time it wraps around the 32-bit space and comes back, enough time will have passed that all the old packets in the network have been dropped and will no longer show up as duplicates.

So this is why you do not just choose random numbers, or rather why the designers initially did not choose random numbers.
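As a rough worked example of the wraparound argument, using only the 250,000-per-second rate mentioned earlier and ignoring the extra per-connection bumps:

```python
SEQUENCE_SPACE = 2 ** 32          # 32-bit sequence numbers
ISN_RATE_PER_SECOND = 250_000     # clock-driven increment rate from the lecture

wrap_seconds = SEQUENCE_SPACE / ISN_RATE_PER_SECOND
wrap_hours = wrap_seconds / 3600

print(f"{wrap_seconds:.0f} seconds, about {wrap_hours:.2f} hours")
```

So the clock-driven counter takes a little under five hours to return to the same value, which is the time old packets have to drain out of the network. Per-connection bumps shorten this on a busy host.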

Student: so this is a problem across connections, for connections between the same client and the same server, with the same source port and the same destination port. But we are worried about the old packets ...

Professor: yes, that is why the TCP developers were so worried about the method of choosing the initial sequence numbers.

Student: so if you have different new connections, then you can distinguish them from each other.

Professor: yes, that is true.

Student: then I do not understand why they used increasing sequence numbers instead of just picking them at random.

Professor: I think the reason they do not pick sequence numbers at random is this. If you pick them at random and set up, for example, 1000 connections in a short period of time from the same source port to the same destination, then each of those numbers is some random value modulo 2^32.

Then there is a nontrivial chance that some packet from one connection will be delayed in the network and eventually show up again, at which point it will be confused with a packet from another connection. This has nothing to do with security. It is simply part of their design, which was originally aimed at reliable delivery of packets.
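A back-of-the-envelope estimate gives a feel for how "nontrivial" that chance is. The model is an assumption made for illustration, not something stated in the lecture: a stray segment is accepted if its sequence number lands inside the new connection's receive window, the window is 65,535 bytes, and each of 1000 earlier connections leaves one stray segment behind.

```python
def stray_acceptance_probability(window_bytes: int, stray_segments: int) -> float:
    """Chance that at least one stray segment lands inside the receive window,

    assuming each stray segment's sequence number is uniform over 2**32.
    """
    per_segment = window_bytes / 2 ** 32
    return 1 - (1 - per_segment) ** stray_segments


# Assumed values: 65,535-byte window, one stray segment per old connection.
p = stray_acceptance_probability(window_bytes=65_535, stray_segments=1_000)
print(f"{p:.2%}")
```

Under these assumptions the chance is on the order of a percent, which is small per event but far too high for a protocol that is supposed to deliver data reliably.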



Student: the attacker is pretending to the server to be some other client, right?

Professor: yes, that is true. In fact, we have not said why the attacker does this, because he could have just connected from his own IP address, right?

Student: what happens in this case with the server?

Professor: this is a really interesting question: what happens here? After all, the packet in the second line of the diagram is not just discarded, it is sent to the real client's computer. And then what happens?

Student: they just mentioned that attackers are trying to do this when other computers have been updated, rebooted, turned off, or something like that.

Professor: yes, of course. If the computer is disconnected from the network, the packet is simply dropped and the attacker does not need to worry much about it. But if a computer really is listening at that IP address, then under TCP it is supposed to send a reset packet that tears down the connection, because this is not a connection that computer C knows about.

TCP assumes that this may be some old packet for a connection that was requested a long time ago and then forgotten. In that case computer C can send an RST (SN ...) packet to the server, saying that it wants to reset the connection.



In fact, I forget exactly which sequence number goes in there, but client C knows all the sequence numbers and can send whatever sequence number is needed to reset this connection.

So if computer C does this, it may interfere with your plan to establish the connection, because when S receives this packet, it says: "Oh, of course, if you do not want it, I will drop your connection."
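A tiny model of this race shows why the attacker cares whether C is reachable. It assumes the attacker's guess of SNs is correct and that C's RST arrives before the attacker's ACK; the function name and the numbers are hypothetical.

```python
def spoof_attempt(c_is_online: bool) -> bool:
    """Return True if the server ends up accepting the attacker's spoofed ACK."""
    sn_s = 7_000
    half_open = {"C": {"sn_c": 42, "sn_s": sn_s}}  # created by the spoofed SYN

    # The server's SYN-ACK is delivered to the real C, not to the attacker.
    if c_is_online:
        connections_c_knows_about = set()  # C never opened a connection to S
        if "S" not in connections_c_knows_about:
            # C replies with RST and the server forgets the half-open connection.
            half_open.pop("C")
    # If C is offline, the SYN-ACK is silently dropped and nothing happens.

    guessed_sn_s = sn_s  # assumption: the attacker guessed correctly
    pending = half_open.get("C")
    return pending is not None and pending["sn_s"] == guessed_sn_s


print(spoof_attempt(c_is_online=False), spoof_attempt(c_is_online=True))
```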

There are some implementation bugs that you could exploit, or at least that the paper describes as potentially exploitable, to prevent client C from responding to the server.

For example, if you flood computer C with a large number of packets, that is an easy way to make it drop this one. There are other, more interesting bugs that do not require flooding it with lots of packets but still cause it to drop this packet. At least, such bugs have been found in some TCP stack implementations.

Student: Presumably, most firewalls can also drop a packet. Suppose that the client did not initially send a SYN to this server, and the firewall is going to drop the incoming packet.

Professor: that is possible, but it depends on how sophisticated your firewall is. Certainly, if you have a very sophisticated stateful firewall that tracks all existing connections, or, for example, if you have NAT, this could happen. On the other hand, a NAT might actually send an RST on behalf of the client, so it is not clear cut. I think dropping such packets at a firewall is not that common; for example, on the Comcast network nobody intercepted these packets, kept connection state for me, or sent RSTs on my behalf.

Student: why can't the server have independent sequence numbers for each possible source?

Professor: yes, this is what TCP stacks do today. It is one example of a backward-compatible fix. If you look carefully, there is no need for the initial sequence number to be global. You simply scope it to each source/destination pair, and then you keep all the duplicate-avoidance properties and gain some security as well.

I will write on the board how the attacker gets this initial sequence number. Probably the attacker simply opens a connection from his own IP address to the server, saying: "I want to establish a new connection", and the server sends the attacker a response containing its own server sequence number, SNs.



If the SYN(SNs) for this connection in the last line and the (SNs) in the third line are related, that really is a problem. But as you say, let us make them unrelated: this number belongs to a different source IP address, so it is no longer a problem. You cannot guess the SNs of one connection from the SNs of another.
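One way to make the numbers unrelated across sources while keeping them increasing within each source/destination pair is to add a keyed, per-connection offset to the clock counter. The sketch below is in the spirit of RFC 6528 (ISN = M + F(localip, localport, remoteip, remoteport, secretkey)); the choice of HMAC-SHA-256 for F, the function names, and the addresses are assumptions for illustration and not what any particular TCP stack does.

```python
import hashlib
import hmac
import os
import socket
import struct

SEQUENCE_SPACE = 2 ** 32
SECRET_KEY = os.urandom(16)  # per-host secret, chosen once (assumption)


def isn_offset(local_ip: str, local_port: int, remote_ip: str, remote_port: int,
               secret: bytes = SECRET_KEY) -> int:
    """Keyed 32-bit offset that differs for every source/destination pair."""
    message = (socket.inet_aton(local_ip) + struct.pack("!H", local_port)
               + socket.inet_aton(remote_ip) + struct.pack("!H", remote_port))
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def initial_sequence_number(clock_ticks: int, local_ip: str, local_port: int,
                            remote_ip: str, remote_port: int,
                            secret: bytes = SECRET_KEY) -> int:
    """Clock counter plus per-connection offset, modulo 2**32."""
    offset = isn_offset(local_ip, local_port, remote_ip, remote_port, secret)
    return (clock_ticks + offset) % SEQUENCE_SPACE


# The same pair sees the counter advance with the clock ...
a = initial_sequence_number(1_000, "192.0.2.1", 80, "198.51.100.7", 40_000)
b = initial_sequence_number(1_250, "192.0.2.1", 80, "198.51.100.7", 40_000)
print((b - a) % SEQUENCE_SPACE)
# ... while a different remote address gets an offset it cannot relate to this one.
c = initial_sequence_number(1_250, "192.0.2.1", 80, "203.0.113.9", 40_000)
print(b != c)
```

An attacker who probes from his own address learns only his own offset. Without the secret key, that tells him nothing useful about the offset the server uses for connections that claim to come from C.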

25:50 min.

The course MIT "Security of computer systems." Lecture 12: "Network Security", part 2

