MIT course "Computer Systems Security". Lecture 6: "Capabilities", part 1
Massachusetts Institute of Technology. Course 6.858, "Computer Systems Security". Nickolai Zeldovich, James Mickens. 2014.
Computer Systems Security is a course on the design and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and techniques for achieving security, based on recent research papers. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware security, and security in web applications.
Lecture 1: “Introduction: threat models” Part 1 / Part 2 / Part 3
Lecture 2: “Control Hijacking Attacks” Part 1 / Part 2 / Part 3
Lecture 3: “Buffer Overflow: Exploits and Defenses” Part 1 / Part 2 / Part 3
Lecture 4: “Privilege Separation” Part 1 / Part 2 / Part 3
Lecture 5: “Where Security System Errors Come From” Part 1 / Part 2
Lecture 6: “Capabilities” Part 1 / Part 2 / Part 3
So, continuing the topic of privilege separation, today we will talk about capabilities. If you remember, last week we talked about how Unix provides some mechanisms that applications can use when they need to separate privileges within their internal structure.
Today we will talk about capabilities, which will make us think very differently about the privileges an application may have. There are two separate topics in today's lecture. The first is how to avoid confusion about authority and make the privileges you use when writing a program clear and unambiguous, so that you do not accidentally use the wrong ones.
The second topic is a sandboxing system called Capsicum. Like OKWS, it lets you run a piece of code with reduced privileges, so that if that code is compromised it cannot do much damage.
This approach lets you manipulate privileges differently than Unix allows. To begin with, we will look at the confused deputy problem that Norman Hardy, the author of the paper we are discussing, ran into, and find out why it puzzled him so.
This paper was written a long time ago, and the author uses a syntax for file names that looks a bit surprising. But we can at least try to transcribe his problem into the more familiar Unix-style path syntax.
As far as I can tell, he had a Fortran compiler that lived in /sysx/fort, and he wanted to change this compiler to keep statistics on what was compiled, which parts of the compiler were the most resource-intensive, and so on. So he wanted the Fortran compiler to write to a file /sysx/stat, recording information about the various invocations of the compiler.
Their operating system had something like the setuid mechanism that we talked about in Unix. They called it a home files license. This meant that if you ran /sysx/fort, and this program had the so-called home files license, then the process you just started had additional write permission for everything in /sysx/, that is, for every path matching /sysx/*. The specific problem they ran into was that a clever user could abuse this, because the compiler accepts many arguments, much like GCC does.
Such a user could, for example, compile something like foo.f, where .f marks Fortran source code, and add -o /sysx/stat to the command line, so that the compiler's output overwrites the statistics file.
They had another file in /sysx, a billing file for all the customers of the system. Damaging it would do even more harm. In the same way, a user could ask the compiler to compile a source file and place the output in /sysx/bill. And in their case, it worked. Even though the user himself had no write access to this file or directory, he used the compiler, which had this additional privilege, the home files license. Thanks to the compiler's privileges, he was able to overwrite files contrary to the developer's intentions.
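The attack can be played out in a toy model. The sketch below is purely illustrative: ToySystem, Process, run_compiler and the /home/alice path are hypothetical names invented here, and the write check is only a rough imitation of a home files license, not the real system Hardy used.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Process:
    user: str                          # who invoked the program
    license_dir: Optional[str] = None  # directory covered by a home files license


class ToySystem:
    """Hypothetical in-memory stand-in for the file system in Hardy's story."""

    def __init__(self):
        self.files = {"/sysx/bill": "billing records", "/sysx/stat": ""}

    def may_write(self, proc, path):
        # Ambient authority: any privilege the process holds can satisfy the check.
        if path.startswith(f"/home/{proc.user}/"):
            return True
        return proc.license_dir is not None and path.startswith(proc.license_dir + "/")

    def write(self, proc, path, data):
        if not self.may_write(proc, path):
            raise PermissionError(path)
        self.files[path] = data


def run_compiler(system, user, source_path, output_path):
    # The compiler runs on behalf of `user` but also carries the /sysx license.
    proc = Process(user=user, license_dir="/sysx")
    system.write(proc, "/sysx/stat", f"compiled {source_path}")         # intended use
    system.write(proc, output_path, f"object code for {source_path}")  # user-chosen name


def demo():
    system = ToySystem()
    try:
        system.write(Process(user="alice"), "/sysx/bill", "junk")
        direct = "allowed"
    except PermissionError:
        direct = "denied"
    run_compiler(system, "alice", "/home/alice/foo.f", "/sysx/bill")
    return direct, system.files["/sysx/bill"]
```

In this model alice cannot write /sysx/bill directly, yet the same write succeeds when she routes it through the compiler, because the check never asks which of the two privileges the compiler meant to use.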
Who should they blame for what happened? What did they think went wrong? Could something have been done differently to avoid such problems? Their view was that the Fortran compiler has to be very careful when it uses its privileges. At some level, the Fortran compiler has two kinds of privileges at its disposal.
The first comes from the fact that the user invoked the compiler, so it should be able to access that user's source file, such as foo.f. If some other user, one who did not invoke the compiler, tried to do the same, he would not be able to access the source code of the invoking user.
The second kind of privilege comes from the license itself, which allows the compiler to write these special files. Internally, the compiler's source code should have stated clearly which of these privileges it wants to use when opening a file or performing some other privileged operation. Instead, it simply opened, read and wrote files like any other program, implicitly using all the privileges it had, which in their system design was a combination of the user's privileges and the home files license.
So these guys were really interested in solving this problem. They called the compiler a "confused deputy", because it had to keep apart the several privileges it held and use each one carefully only when appropriate.
So let's consider how we would build a similar compiler in Unix. In their system, everything was tied to this home files license. They later introduced other mechanisms, capabilities, which we will talk about shortly. But can we solve this problem on a Unix system?
Suppose you need to write this Fortran compiler in Unix, have it write this special file, and avoid the problems we just saw. What would you do? Any ideas? I suppose you could just declare it a bad plan and, for example, not keep statistics. Or not support the -o option. On the other hand, the user still specifies which source file to compile, so he could perhaps read the billing file or the statistics file, which may need to be secret.
Or perhaps the source language lets one source file include other source files, so this gets a bit tricky.
Audience: you could split up the privileges of the compiler.
Professor: yes, another potentially good design is to split up its authority. We know that the Fortran compiler does not actually need both privileges at the same time. So, in Unix terms, we could create a compiler such as /bin/fortcc that is just an ordinary program without additional privileges. Then we would create /bin/fortlog, a special program with some additional privileges that collects statistics about what happens in the compiler, and fortcc would invoke fortlog. So what privileges would we give this fortlog?
Audience: if you make fortlog setuid or something like that, then any other user will also be able to log arbitrary data through it.
Professor: yes, so this is not so great. The only way in Unix to give fortlog additional privileges is to give it a different owner, say by creating a fort UID and making the binary setuid to it. Every time you run fortlog, it switches to this fort UID, and presumably the statistics file is writable only by that UID. But then everyone can invoke fortlog.
And this is not good, because then anyone can write to the statistics file. For the statistics file this is a minor security issue, but what if, instead of stat, it were the billing file? Then the problem would be much more serious.
Unix also has a rather complex mechanism that we skipped in the last lecture on Monday. It allows an application to switch between multiple UIDs, so that it can perform different operations under different user IDs. It is a bit difficult to use correctly, but doable. So this mechanism is another potential design for our system.
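As a rough illustration of what switching between UIDs could look like, here is a minimal Python sketch. It assumes the program is installed setuid, so that the real UID is the invoking user and the effective UID is the special owner; the helper names effective_uid and open_as_invoking_user are made up for this example. It deliberately ignores group IDs and supplementary groups, which a careful implementation would also have to switch.

```python
import os
from contextlib import contextmanager


@contextmanager
def effective_uid(uid):
    """Temporarily switch the effective UID, then switch back."""
    saved = os.geteuid()
    os.seteuid(uid)
    try:
        yield
    finally:
        # Allowed in a setuid program because the saved set-user-ID still
        # holds the original effective UID.
        os.seteuid(saved)


def open_as_invoking_user(path, flags):
    # In a setuid program the real UID is still the user who ran it, so the
    # kernel checks this open against the user's authority, not the owner's.
    with effective_uid(os.getuid()):
        return os.open(path, flags)
```

A compiler built this way would call open_as_invoking_user for every name the user supplied and use its own effective UID only for the statistics file. Getting every call site right is exactly the part that is hard.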
I think you could also use another trick: make the fortlog binary executable only by a specific group, and make the fortcc binary setgid to that group. However, this is not great either, since it wipes out whatever group list the user originally had. But who knows, maybe it is better than nothing.
In any case, this is a rather complicated problem, though it can probably be solved with Unix mechanisms. Or perhaps you should rethink the problem and not care so much about the security of the stat file in the first place. So what goes wrong in this design?
There are two ways to look at what went wrong. The first is called ambient authority. Does anyone understand what they mean by it? The paper never gives a precise definition.
Audience: it means the authority that you are given by your environment. For example, you are running as some user, and you act with everything that user is allowed to do.
Professor: yes, you perform an operation, and you can say which operation you want to perform, but the decision on whether it succeeds comes from some additional implicit parameters of your process.
In Unix, you can work out what ambient authority looks like. When you make a system call, you probably pass it some name. Inside the kernel, this name is mapped to some object. And this object presumably carries some kind of access control list, for example the permissions on the file.
So there are some permissions that you get from this object, and the kernel has to decide whether the operation that the application requested on this name is allowed. That gives the chain Name -> Object -> Permission. This is the part that the application gets to see.
Also inside the kernel is the current user ID of the process that makes the call, the current process UID. It too takes part in deciding whether to allow a specific operation. This current process UID is the ambient privilege. Any operation you attempt will be checked by the kernel against your current UID, your current GID, and any other privileges you may have. As long as some set of those privileges allows the operation, it goes through, even though you may not have wanted to use all of those privileges to open a particular file or perform some other operation.
Is it clear what ambient privileges are? In the case of the operating system, it means the process has some user ID. Can you give examples of ambient authority outside the operating system, where you request an operation and whether it succeeds depends on something implicit about who is asking? A firewall is one example: if you are inside the network, or have an internal IP address, you are allowed some operation, and if you are outside the network, the same operation is denied.
Say you visit a website that contains a link to another server, and perhaps you do not want to use the privileges you have when following that link. Following it might let someone reach your internal network printer and use it. The party that supplied the link should not be able to reach your printer, because it is outside the network. Yet your browser, sitting behind the firewall, can be tricked into doing it on their behalf by following the link.
This is a kind of moral equivalent of the confused deputy problem in the network setting.
Audience: existing permissions also affect this.
Audience: because Capsicum essentially assumes DAC, discretionary access control.
Professor: yes, this is largely because the guys from Capsicum use something like discretionary access control. This means that the user or owner of the object decides what the security policy will look like for this object. For Unix, this is very natural - these are my files, and I can decide what I want to do with them, I can give them to you or keep them. Thus, almost all DAC systems look like this because they need some permissions that the user can modify to manage the security policy of their files.
The opposite of DAC is mandatory access control, or MAC. We will talk about it later, but at some level these systems take a very different view of the world. They assume that you are merely a user of the computer, and someone else sets the security policy for using it. This view comes from the 70s or 80s, when the military really wanted classified computer systems in which the things you work on are marked with a classification level. If you work on things marked “secret” and I work on things marked “top secret”, then my data simply cannot flow to you. It is not up to me to set permissions on a file; some authority higher up just does not allow it.
So mandatory access control really tries to put a different kind of policy first, one where there is a user, there is an application developer, and, apart from them, there is someone else who sets the policy. As you might guess, it does not always work out. We will talk about this a little later. But that, roughly, is what discretionary access control means.
There are many other examples of ambient authority. It is not necessarily a bad thing, but you must be very careful when using it. If you have ambient privileges, you need to be very careful when performing privileged operations. You have to make sure that you are really using the right authority and will not be tricked by accident, as this Fortran compiler was almost 25 years ago.
So this is one interpretation of what is going on. It is not necessarily the only way to think about what goes wrong, right? Another view is that it would be nice if the application itself could tell whether it should access a file on behalf of some principal. So problem number 2 is the difficulty of performing access control checks.
In a sense, when the Fortran compiler opens a file on behalf of the user, it needs to repeat the same logic we drew in the diagram, except that it has to plug in something else for the current process UID. Instead of using its current privileges, it should simply repeat the Name -> Object -> Permission check with a different set of privileges in place of the current process UID.
In Unix this is quite hard to do, because security checks happen in many places. If there are symbolic links, the link is followed and the resulting path name is also evaluated with some privileges, and so on. But perhaps in some other system you could simplify access control checks enough that the application could do them itself. Does that seem like a reasonable plan? Would you agree with it? Is there a danger in replicating this check?
Audience: if you do the checks in the application, you could easily miss some of them.
Professor: yes, you can easily miss some checks, that is absolutely true. In a sense, the Fortran compiler here did not even try to do any checks, and that is why it failed. Another consequence, besides missing checks, is that the kernel changes over time and may end up with slightly different checks. It may introduce some additional security requirements, while the application does not change and keeps implementing the old style of checks. So this is not a good plan.
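To make the danger concrete, here is what a replicated check might look like if an application tried to redo Name -> Object -> Permission itself. This is a deliberately naive sketch (the function name naive_may_read is invented here), and the comments list what it misses compared with the kernel.

```python
import os
import stat
import tempfile


def naive_may_read(path, uid, gids):
    """Application-level guess at whether (uid, gids) could read path.

    Misses, among other things: search permission on parent directories,
    ACLs and other kernel policy, what happens to symbolic links along the
    way, and the race between this check and the later open().
    """
    st = os.stat(path)  # Name -> Object
    mode = st.st_mode   # Object -> Permission bits
    if uid == st.st_uid:
        return bool(mode & stat.S_IRUSR)
    if st.st_gid in gids:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "foo.f")
        with open(path, "w") as f:
            f.write("PROGRAM FOO\n")
        os.chmod(path, 0o600)
        owner = os.stat(path).st_uid
        stranger = owner + 1  # some other UID, chosen only for illustration
        private = (naive_may_read(path, owner, set()),
                   naive_may_read(path, stranger, set()))
        os.chmod(path, 0o604)
        public = naive_may_read(path, stranger, set())
        return private, public
```

The function gives plausible answers for simple mode bits, which is what makes it tempting. Every item in the docstring is a place where it can silently disagree with the kernel.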
Recall one good idea in security: economy of mechanism, that is, reducing the number of mechanisms involved. You want only a small number of places where security policy is enforced. You probably do not want to replicate the same functionality in applications, in the kernel, and so on. You really want to concentrate these checks in one place. So what should the solution to this problem of granting authority look like?
Unix file descriptors come closest to solving it. In the world of capabilities, the alternative to this picture is that, instead of following the chain Name -> Object -> Permission and deciding whether to allow the operation based on the ambient authority of the current process UID, you use a very simple scheme.
Suppose you have a capability that refers to a particular object. The capability may carry some restrictions on what you can do with the object.
But in principle, if you hold a capability for an object, you can access it. It is really that simple. There is no ambient authority deciding whether an operation is possible and whether it should be allowed.
The only catch is that a capability must carry a few extra bits that say what you may do with the file: whether you are limited to reads, or to writes, or to appends. This makes the security decision very easy. If you hold the capability, you can do the thing; if you do not, you cannot.
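Ordinary Unix file descriptors already behave a little like this: the mode chosen at open time travels with the descriptor and limits what can be done through it. The sketch below uses that open mode as a coarse stand-in for the rights bits just described; it is not Capsicum's own rights mechanism, and the helper names are invented for the example.

```python
import errno
import os
import tempfile


def attempt(operation, fd, *args):
    """Run os.read or os.write on fd and report "ok" or the errno name."""
    try:
        operation(fd, *args)
        return "ok"
    except OSError as e:
        return errno.errorcode.get(e.errno, "unknown")


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "stat")
        # Two descriptors for the same file, each carrying different rights.
        append_fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
        read_fd = os.open(path, os.O_RDONLY)
        try:
            return {
                "append via append_fd": attempt(os.write, append_fd, b"entry\n"),
                "write via read_fd": attempt(os.write, read_fd, b"junk\n"),
                "read via append_fd": attempt(os.read, append_fd, 16),
                "read via read_fd": attempt(os.read, read_fd, 16),
            }
        finally:
            os.close(append_fd)
            os.close(read_fd)
```

Both descriptors name the same file, and the process holding them has the same UID throughout. What differs is only which descriptor is presented, which is the sense in which the decision depends on the capability rather than on ambient authority.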
One important property of capabilities is that they must be unforgeable. So what does unforgeability mean in the world of capabilities, and why is it needed?
It is fairly obvious once you consider what happens if I could create any capability I wanted. I could make a capability for any of your files and access it. Nothing in the security design would stop me from accessing an object if I can mint a capability for it.
So it is important that an application can never create these capabilities out of thin air. How is that enforced if we use file descriptors as capabilities on a system like Capsicum? How do file descriptors prevent an application from synthesizing capabilities?
Audience: there is probably some structure in the kernel that records which file descriptors the process has capabilities for.
Professor: yes, it is actually quite easy to see once you remember what a file descriptor is. A file descriptor is just an integer. In Unix, this integer is 0 for standard input and 1 for standard output. But these are just integers in user space, and the application can presumably pick any integer it wants. However, whenever you try to do something with a file descriptor, the kernel always interprets the integer against the file descriptor table of the current process. That is, every process, say the one with PID 57, has its own open file table, and every integer supplied from user space refers to some entry in this table. Of course, the kernel must check that the integer is within the table: that it is not negative, does not run past the end of the table, and so on. Otherwise we get an ordinary buffer overflow.
But if the kernel implementation checks the bounds carefully, the application can only refer to file descriptors that are recorded in this table. So the kernel just has to make sure that you obtained each capability in the table legitimately.
So when you open a file in the ordinary Unix way, outside of this capability model, the kernel, after a successful open call, updates an entry in the file descriptor table to point to the specific open file, for example the password file /etc/pwd.
Now that slot in the table points to the open file. Some of the slots may actually be empty: you may have no open file at a particular index in the table. So what would it mean to forge a capability in this setting?
The only thing you can do in user space is make up an integer. But the only integers worth making up are those that point at non-empty entries in the table, and those entries are exactly the capabilities you already hold.
So that is why, in this world of file descriptors, capabilities are hard to forge in the first place. It's kind of cool, isn't it? You can only operate on the files that you have open. There is nothing else you can touch or affect.
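This is easy to observe from user space. In the minimal sketch below (the helper name refers_to_open_file is made up), the same integer is accepted by the kernel while its slot in the table is occupied and rejected with EBADF once the slot is empty.

```python
import errno
import os


def refers_to_open_file(fd):
    """Ask the kernel whether this integer names a live entry in our fd table."""
    try:
        os.fstat(fd)
        return True
    except OSError as e:
        if e.errno == errno.EBADF:
            return False
        raise


def demo():
    read_end, write_end = os.pipe()
    before = refers_to_open_file(read_end)   # slot is occupied
    os.close(read_end)
    os.close(write_end)
    after = refers_to_open_file(read_end)    # same integer, empty slot
    return before, after
```

Guessing integers can therefore only ever land on something the process already holds.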
So how do you think capabilities would help with the ambient authority problem that troubled Norman Hardy with his Fortran compiler? Could file descriptors be the answer to the /sysx/fort problem? Do you think they could really solve it?
Audience: you simply use the appropriate capability when it is needed. For example, when you need to write to the statistics file, you use the capability for that file. But when you access the file the user asked you to read or write, you do not use that capability.
Professor: yes, exactly. So I think the Fortran compiler should simply hold an open file descriptor for the file /sysx/stat. The short paper does not really describe how the compiler would obtain this capability, though.
Basically, this means you do not pass file names around; you pass file descriptors. This gives us a more elegant design for the Fortran compiler on Unix, using capabilities. Perhaps the plan is to have a front end for the Fortran compiler that has no extra privileges, takes all the arguments you give it, and converts every path name among them into an open file descriptor.
So an alternative design might look like this. We have a program fort1, which is the front end, and it accepts a file such as foo.f and all the other arguments, such as -o xy and the like. It does not implement any of the compiler logic. It only looks for path names in these arguments, opens them, and obtains file descriptors for them.
The nice thing here is that fort1 has no additional privileges, so if the user does not have access to a file name, the open fails. That is exactly what we want! Once the front end has opened all these file descriptors, it can run some privileged additional component, such as a setuid Fortran compiler.
For example, it might be setuid to some special user ID that has access to the statistics file. But it does not take path names as input; it takes file descriptors. And a file descriptor already proves that the user who made the call had the access needed to open the file.
Of course, this does not settle every issue that comes up in such a system; it is just an outline of how to approach the problem. But it is a rough plan for how you can demonstrate that you have access to a particular name: you open it and pass along the capability, instead of passing the name and having someone else open the file, possibly using additional privileges by accident.
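The split can be sketched in a few lines of Python. This is only an outline under several assumptions: front_end and back_end are hypothetical names standing in for fort1 and the privileged compiler, both run in one process here so there is no real privilege boundary, and the "compilation" is a placeholder. In an actual design the back end would be a separate setuid program that inherits the descriptors.

```python
import os
import tempfile


def front_end(source_path, output_path):
    """Unprivileged step: resolve user-supplied names with the user's own authority."""
    src_fd = os.open(source_path, os.O_RDONLY)
    try:
        out_fd = os.open(output_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    except OSError:
        os.close(src_fd)
        raise
    return src_fd, out_fd


def back_end(src_fd, out_fd, stat_path):
    """Privileged step: receives descriptors only, never a user-supplied path.

    stat_path is the back end's own fixed file (standing in for /sysx/stat).
    """
    with os.fdopen(src_fd, "rb") as src, os.fdopen(out_fd, "wb") as out:
        source = src.read()
        out.write(b"; compiled\n" + source)  # placeholder for real compilation
    with open(stat_path, "ab") as stat:
        stat.write(b"compiled %d bytes\n" % len(source))


def demo():
    with tempfile.TemporaryDirectory() as d:
        src, out, stat = (os.path.join(d, n) for n in ("foo.f", "foo.out", "stat"))
        with open(src, "wb") as f:
            f.write(b"PROGRAM FOO\n")
        back_end(*front_end(src, out), stat)
        with open(out, "rb") as f, open(stat, "rb") as s:
            return f.read(), s.read()
```

The only path the back end ever opens is its own statistics file. A user who names /sysx/bill as the output is stopped in front_end, where the open runs with nothing but the user's own authority.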