MIT course "Security of computer systems". Lecture 11: "Ur / Web programming language", part 3
Massachusetts Institute of Technology. Lecture course # 6.858. "Security of computer systems." Nikolai Zeldovich, James Mykens. year 2014
Computer Systems Security is a course on the development and implementation of secure computer systems. Lectures cover threat models, attacks that compromise security, and security methods based on the latest scientific work. Topics include operating system (OS) security, capabilities, information flow control, language security, network protocols, hardware protection and security in web applications.
Lecture 1: “Introduction: threat models” Part 1 / Part 2 / Part 3
Lecture 2: “Control of hacker attacks” Part 1 / Part 2 / Part 3
Lecture 3: “Buffer overflow: exploits and protection” Part 1 /Part 2 / Part 3
Lecture 4: “Privilege Separation” Part 1 / Part 2 / Part 3
Lecture 5: “Where Security System Errors Come From” Part 1 / Part 2
Lecture 6: “Capabilities” Part 1 / Part 2 / Part 3
Lecture 7: “Native Client Sandbox” Part 1 / Part 2 / Part 3
Lecture 8: “Network Security Model” Part 1 / Part 2 / Part 3
Lecture 9: “Web Application Security” Part 1 / Part 2/ Part 3
Lecture 10: “Symbolic execution” Part 1 / Part 2 / Part 3
Lecture 11: “Ur / Web programming language” Part 1 / Part 2 / Part 3
The last thing we need to do and what will be very instructive is to change this code, specifying the room module there, and then try to access the room table, as in the previous example. This option is not allowed, because this is not allowed.
It would be like being able to read and write private class fields in Java. And indeed, we get a fairly simple message, basically saying that we have here an unlinked variable, an unknown expression for the room parameter.
We could mention this add-on module, which we created just for fun.
But then it will be different tables. We can easily access it. Therefore, I will break it into two parts, we will start by simply calling the room method, and then doing a slightly different thing to read its elements.
Let's display a list of results, and not vice versa, this is roughly equivalent to how the program worked before, except for using different types of data. Let's see what happens. We now return to our chat in room 1 and enter any messages into the line. You can see that all of them are displayed without errors.
So now we have this encapsulation, so you can think of the structure of this room as a library, but you don’t need to worry about it.
There are different places that can damage the internal invariance of the system. Maybe you want to, after the message is added, it would never disappear. All this is in magazines. This structure provides such properties independently of other code, such as the one on which the chat room module is written.
Student: so you changed the definition of a room, what happens to the database table in this case?
Professor: you will have to manually run the alter table command if you want to save the old data. But when the application starts, it queries the system database directory and checks that the schema still meets expectations. Then you get a static error. Hope this gives you a hint on what you need to change in the database.
Student: but it will not automatically delete the database or something like that?
Professor: I hope not. I don't think the compiler should do this. You can imagine setting up a compiler to understand the evolution of a database. I think you need to write the alter table commands to run, because the compiler is not doing this right now.
Let's now talk about cross-site request forgery and its prevention. In fact, before we do this, let's look at the code on this page. We have the traditional HTML form that is generated here. And, of course, there is no protection against cross-site requests forgery, and I think this is good. Because, as I understand it, the problem of counterfeit cross-site requests is in an implicit context, which is sent by the application with each request.
Suppose there is some intruder who does not know your implicit context. Let's say your password is stored in a cookie, for a very simple example. And when an attacker deceives you when clicking on a link to an application he needs, the browser automatically sends an implicit context and forces the application to do what the attacker could not do directly.
In this case, there is no implicit context, so there is no risk of cross-site request forgery. Does anyone want to challenge this feature of the system before I continue? This can be quite informative for me. If not, then add an implicit context here. In this case, the system is automatically going to take the right countermeasures, based on the analysis of the program, which understands that now there is an implicit context.
Now I will insert cookies here. As another example of encapsulating a module, in fact, I will put here a whole user authentication system, in which we have user accounts and abstract types of identifiers and passwords. Thus, you cannot simply create the value of any of these data types directly. You will have to go through some approved method for constructing values of these types.
I'm going to place the table right in the signature. And I will also impose a restriction on it, indicating that the form ID is the key to it.
But the fact is that in this user table, the ID and password are abstract data types. Therefore, the code cannot see the password and cannot consistently generate all identifiers and try them in this table. Because it uses an abstract type, because of which it is impossible to establish what the ID looks like and it is impossible to find the password. They simply come from this table, and these are opaque tokens.
But we could allow them to be waste data. You might want to allow one direction of conversion between the lines and these types. Now I will do something here, mainly the details, and I will not try to explain them. But it looks like a declaration that you are allowed to convert strings to IDs. For those familiar with Haskell, this is an instant type class, this is the permission to turn strings into IDs.
We are not going to use a different resolution, because we don’t want to be able to turn the ID back into something else. Let's do the same for the password. We want to be able to read the user's password, but we are not going to accept the password and turn it into a string that will tell us that the user has entered the chat.
Thus, other parts of the code will be able to accept the input of the password from the user, convert it to this type, and transport it to the user’s module for verification. But what they cannot do is query the user table and get all the passwords in a form from which they can extract their text expression.
Then we can have a login method that accepts these two components, an ID and a password, and just works as a side effect, which is actually said in the code. We will also need a way to find out which users are registered. This is the code that performs the transaction that creates the ID.
The first step is to simply copy this definition. And then comes a surprise. It turns out that user IDs and passwords are strings, but outside the module this circumstance will not be disclosed.
Now we are going to create cookies. A cookie is another thing that is built into the language. In fact, they act as mutable global variables that have one copy for each client that uses your application.
Thus, we are going to create cookies, which for each user will store just a copy of the same two fields that we have here.
These cookies are private to this module. Other parts of the code will not be able to read the cookie, because they simply do not have this private field. So no one can directly see the ID and password that are stored for this user. But they will be saved when browsing various pages, as is the case with ordinary cookies.
Now I’ll insert a login function that will start the process to check the database and see if this is really the correct pair of username and password. This process just checks to see if we can find a row in the database that has this user ID and password.
If we find it, then good, then this is the correct value. Let's just keep it in cookies. We use a cookie altering method. And we have to put some things here, for simplicity, I will say that this cookie does not expire. I do not want to run SSL here, so I’ll say that in this case we don’t need security, and set the parameter Secure = false.
But if you really care about security, obviously you will write Secure = true. If the check fails and the module signals an error, the program execution will stop, giving a description of this error.
Finally, we can create this function, which tells you which particular user is logged in, receiving the current cookie value. This parameter can also be set to none if the user has not yet logged in, in which case we will receive another error message. Or it could be some kind of record of the type that we used above. So I just copy it here and run the same check. If this works, then we will simply return that part of the ID record that we have just checked in the database.
So let me just check it out. We start the compiler, and you see the result on the screen.
Main here are all these implementation details. But outside of this module we think about it from the point of view of the interface. There are some unknown types of IDs and passwords. This user table expresses terms that allow strings to be turned into identifiers and passwords, but not vice versa. We have these two login methods in the first place and there is a check on which user is logged in at this stage. Any questions about this?
Student: Do you need to open a user table?
Professor:I do this because I want to use it later as a foreign key, so this is not an important reason. So, we are almost at the stage where I can show you the CSRF protection in action.
Let's start with the fact that log in to the system, it is quite easy to do. Well, what else can we do in this place of the program? Let's just add here another part of the page that says that this is the place where you enter the login. Here you must enter your username and password and then click on the submit action button. This action will provide a call to the login function.
Let's define login as the function that does these things. In fact, this is just a wrapper around the call to the login function from this module, in which we take each of the components and convert it from a string to an abstract type.
She checks for a read error. The error means that if the login does not work, the operation will be interrupted in this place and the function will return to the main part of the program.
Now we can log in. We will probably want to create an account that will allow us to log in, so let me create a user named a and a password a.
Now I can log in as user, take my word for it. We have a set of cookies to record this information, so go to the chat room and send a message, for example, asfasf. You see that after clicking on the Add button, it appeared in the chat.
In fact, we have not added any access control here, so nothing special happens here. But we can check.
Student: What is cookie content?
Professor: this is the content that you intend to receive from the code. In other words, a cookie is declared as having the type of this entry - an identifier and password.
So let's do this very simply and call the whoami method. In fact, I'm just going to ignore it, or vice versa, to allow him to do something. Let's decide that the user we created is really special, and only this user can post anything. If this is not the case, we will receive an error message.
I’ll add an ID to the custom module, and then it should work, because the ID type supports equality checking.
Now we have to be fine, and we can do more things with this ID, which can cause some security problems.
This allows us to add an access control check, so let's see how it works and go back to the main page.
Now these four letters "a" appeared in the chat.
In the interface console, we see that the form now automatically received a hidden input name sig, which is a cryptographic signature of the values of all cookies. This sig signed cookies using a key that is secret for the server. And when the form is ready, the application knows - because the compiler told it to it - that the application should check the signatures for the next set of operations. To do this, we have a say operation.
Student: Do the signatures have something like a time stamp?
Professor: No, signatures do not have timestamps.
Student: in this case, if the attacker managed to “peep” this data, he could pretend to be a user, because the validity of these cookies will never expire.
Professor:Yes, it never expires. This is something that can be changed by simply changing the implementation of the language without changing the application, and then quickly deployed. But now it is not here. And I understand why it would be useful to add.
Student: you can also fix this by simply putting a timestamp on the signature.
Professor: yes, you are right, you can change the application so that you intentionally change the cookie data often enough, that is, make the signature expire.
Student: can you reassign urls?
Professor: Yes, what reassignments would you like to see?
Student: any, I just want to see how this is done.
Professor:So, the compiler assigns ... as we see, we called the say function, and this function call is serialized as a specific form of the URL. Suppose we do not like this form.
We decided that we were going to rewrite the URL, so to speak, inside the room module, inside the demo. Better to put it up. So, we want to reassign url Demo / Room / say to Demo / Room / Speak.
Run the compiler and go to the main screen of the application. Let's see what happens. Everything is in order, we can also enter any text messages and they appear in the chat line. In these rules, you can use unpredictable characters to replace one prefix with another, while the compiler will ensure that each function has a separate URL scheme, but by default the URL is automatically generated.
Student:You mentioned that HTML is not a compiler specific, it’s just a library. Are there other libraries for other formats?
Professor: there are other libraries that do not check the type with the same functional completeness, but, for example, there is a library for serializing and deserializing JSON, and a large number of automated ways to manage the type structure. That way you can do things that are not integrated into the compiler.
Professor:Let me download an Ajax version of this that will answer your question. This version has client side code. Let's just go to the version of the program called demo3. I entered the data on the main page in the chat line, and they also appeared at the bottom of the chat. Believe it or not, this time the add-in worked with an Ajax call. This is due to the button tag button value. It has an onclick attribute, which, when the user clicks a button, all this code below the line with this attribute is triggered on the client side.
You must understand the realization of some of these abstractions in order not to interfere with them. As long as I have this code, let me show you something else.
We still have the same say function as before. But now, instead of calling it by reference, we simply take a function call, which is filled with the arguments that come from the onclick handler context.
We simply wrap this function inside the rpc syntax. This means that this is a client-side call function, but the call itself is launched on the server with access to the database and other server resources, and then transfers the result back to here.
The client can make a call anywhere in the scope of the function, with the exception of hidden private fields. I mean, since the function is located in a given place of the program, it can make calls to all whose names are in its scope.
This name, it turns out, is not in scope, so we could not make the call right here. But everything that is in the scope of a function can be called.
Let's see if there is anything else interesting in this version that I wanted to mention. It includes the implementation of a GUI widget using a functional reactive style, which is great in terms of modularity of programming, but perhaps less healthy in terms of security.
Let me give an example of calling the method of this abstraction from the part of the page that displays a list of lines of text that can only be added but not deleted. You can really provide a podomnoe, because we don't have a DOM here. This is not at all when any part of the code can penetrate the document tree and modify it, change the log and destroy previously added lines.
Here, a more functional style means that you can actually have a GUI widget that owns part of the page and controls exactly what is shown there, and other code cannot interfere with the calculations that appear there.
So, if there are no more questions, then we will end the lecture.
Professor:I don't think we have enough time to properly demonstrate the channels. But in the lecture summary there is a code, and on the site all sorts of demo versions and educational materials for this project are posted.
Student: It is very difficult to write the right libraries and compilers. How do you mitigate the problems that the layers of abstraction themselves create?
Professor: The best way is to get people to use these things and report bugs. This is the best advice I can give you. The idea is that such compilers should be written much less frequently than new applications. Because he collects in one place all the errors and tries to eliminate them, even if it does not happen in the most effective way.
Student: why did you choose such a name for your language - Ur?
Professor: Ur is a concept from linguistics, the so-called language, which is the ancestor of the modern language. And the idea is that this language allows you to insert into it all sorts of other languages. So this is a kind of progenitor of all languages.
Full version of the course is available here .
Thank you for staying with us. Do you like our articles? Want to see more interesting materials? Support us by placing an order or recommending to friends, 30% discount for Habr's users on a unique analogue of the entry-level servers that we invented for you: The whole truth about VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps from $ 20 or how to share the server? (Options are available with RAID1 and RAID10, up to 24 cores and up to 40GB DDR4).
VPS (KVM) E5-2650 v4 (6 Cores) 10GB DDR4 240GB SSD 1Gbps until December for free if you pay for a period of six months, you can order here .
Dell R730xd 2 times cheaper? Only here2 x Intel Dodeca-Core Xeon E5-2650v4 128GB DDR4 6x480GB SSD 1Gbps 100 TV from $ 249 in the Netherlands and the USA! Read about How to build an infrastructure building. class c using servers Dell R730xd E5-2650 v4 worth 9000 euros for a penny?