Facebook Osquery Introduction
This post is a fairly free translation of the article "Introducing osquery", published by Facebook.
Real-time monitoring of your system's state is very important. At Facebook, we developed a framework called osquery, which takes a fresh approach to low-level operating system monitoring.
Osquery presents the operating system as a high-performance relational database. This approach allows you to write SQL queries to easily and efficiently get information about your system. With osquery, the current state of the OS is represented as SQL tables from which you can get information about:
- running processes;
- loaded kernel modules;
- open network connections.
SQL tables are created through an easily extensible API. Several tables already exist and many more are being developed. The following examples will help you better understand the simplicity and expressiveness that osquery provides.
The first example shows how to use osquery to inspect the processes running on the current system. The WHERE clause restricts the result to processes whose original binary no longer exists on the file system. Starting a process and then deleting its binary is a common attacker technique, so on a system that has not been compromised this query should return no results.
SELECT name, path, pid FROM processes WHERE on_disk = 0;
Working with the OS through SQL is simple and even enjoyable. One advantage of SQL is the ability to join different tables together to analyze the system. The following example uses two tables at once, listening_ports and processes, to find every process that is listening on a network port. Both tables contain the process PID, which is the column the join is performed on.
SELECT DISTINCT process.name, listening.port, listening.address, process.pid FROM processes AS process JOIN listening_ports AS listening ON process.pid = listening.pid;
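The two queries above can be tried without osquery installed by mimicking the tables in an in-memory SQLite database. The following is only a sketch: the table definitions are simplified stand-ins for osquery's processes and listening_ports tables, and every row is invented for illustration.

```python
import sqlite3

# Hypothetical stand-ins for osquery's tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE processes (pid INTEGER, name TEXT, path TEXT, on_disk INTEGER);
CREATE TABLE listening_ports (pid INTEGER, port INTEGER, address TEXT);
INSERT INTO processes VALUES
    (101, 'sshd',    '/usr/sbin/sshd',  1),
    (202, 'nginx',   '/usr/sbin/nginx', 1),
    (303, 'dropper', '/tmp/dropper',    0);
INSERT INTO listening_ports VALUES
    (101, 22,  '0.0.0.0'),
    (202, 80,  '0.0.0.0'),
    (202, 443, '0.0.0.0');
""")

# Query 1: processes whose binary is no longer on disk.
missing_binary = conn.execute(
    "SELECT name, path, pid FROM processes WHERE on_disk = 0;"
).fetchall()

# Query 2: processes joined to their listening ports on the shared pid column.
# ORDER BY is added here only to make the output order predictable.
listeners = conn.execute(
    """
    SELECT DISTINCT process.name, listening.port, listening.address, process.pid
    FROM processes AS process
    JOIN listening_ports AS listening ON process.pid = listening.pid
    ORDER BY listening.port;
    """
).fetchall()

print(missing_binary)
print(listeners)
```

In this mock data only the invented "dropper" process has on_disk = 0, so it is the only row the first query returns; the second query returns one row per listening port.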
Osquery includes many tables, and we create more every day. Tables are easy to write, so we welcome and encourage new tables from third-party developers. Detailed information is available in our wiki.
Osquery is the framework we use to create new products and tools. Its modular source code lets us build on existing concepts in new ways. We are releasing several tools as part of the open source release, and there is much more to come. We also look forward to seeing the osquery-based tools our community will build.
The osqueryi interactive console is an SQL interface where you can execute your queries and explore your OS. With all the expressive power of SQL and the many useful tables built into osquery, the console is an invaluable tool for diagnosing OS problems, solving performance problems, and much more.
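Beyond interactive use, the console can also be driven from a script. The sketch below assumes that osqueryi is installed and on PATH, and that it accepts a --json flag followed by a single query, printing the rows as a JSON array; check both assumptions against your installed version. The helper names run_query and parse_rows are hypothetical.

```python
import json
import subprocess


def parse_rows(raw_json):
    """Turn JSON text (an array of row objects) into a list of dicts."""
    rows = json.loads(raw_json)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows


def run_query(sql, binary="osqueryi"):
    """Run one query through the osqueryi shell and return its rows.

    Assumption: `binary` is on PATH and supports `--json <query>`.
    """
    completed = subprocess.run(
        [binary, "--json", sql],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the shell exits non-zero
    )
    return parse_rows(completed.stdout)
```

With osquery installed, calling run_query("SELECT name, path, pid FROM processes WHERE on_disk = 0;") would return one dict per matching process.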
More information on using osquery is available on our wiki.
To monitor large infrastructures, we provide a daemon, osqueryd. It lets you schedule queries for execution across your infrastructure. The daemon aggregates query results over time and generates logs that reflect state changes in your infrastructure. You can use it to stay informed about the security, performance, configuration, and integrity of your infrastructure. Through a robust plugin architecture, osqueryd can also integrate with your internal logging systems.
If you are interested in using osqueryd in your infrastructure, see the wiki again, as well as the internal deployment guide.
Osquery is cross-platform. Although osquery relies on very low-level operating system APIs, you can build and use it on Ubuntu, CentOS, and Mac OS X. This has an added benefit: you can monitor corporate Macs in the same way as corporate Linux servers.
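To make the idea of logging state changes concrete, here is a minimal sketch of comparing two consecutive result sets of the same scheduled query. It is not osqueryd's actual implementation: the function diff_snapshots and the sample rows are hypothetical and only illustrate the general technique of reporting what was added and what was removed between runs.

```python
def diff_snapshots(previous, current):
    """Return the rows added and removed between two runs of one query.

    Each snapshot is a list of dicts (one dict per result row).
    """
    # Convert each row to a hashable, order-independent form.
    prev_set = {tuple(sorted(row.items())) for row in previous}
    curr_set = {tuple(sorted(row.items())) for row in current}
    added = [dict(row) for row in sorted(curr_set - prev_set)]
    removed = [dict(row) for row in sorted(prev_set - curr_set)]
    return {"added": added, "removed": removed}


# Invented results of a listening-ports query at two points in time.
previous = [{"pid": 101, "port": 22}, {"pid": 202, "port": 80}]
current = [{"pid": 101, "port": 22}, {"pid": 303, "port": 4444}]

changes = diff_snapshots(previous, current)
print(changes)
```

Logging only the differences keeps the output small when the system is stable, and makes a new listener such as the invented port 4444 stand out immediately.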
Native builds and documentation
To simplify deployment, osquery ships as a native package for every supported operating system. There is also detailed documentation on building your own packages, so building and deploying your own osquery-based tools should be as easy as possible.
Osquery was designed with environment-specific customization in mind, so plugins can be swapped at runtime in an already running system. The provided interfaces let you integrate osquery more deeply into your infrastructure when one or more of the plugins in use no longer meet your needs.
More details here.
Modular source code
Osquery consists of high-performance modular components with a well-documented public API. These components can easily be combined to create interesting new applications and tools. Details about the API are here.
After talking with several external companies, we realized that monitoring the low-level behavior of operating systems is not a problem unique to Facebook. A few months later, we released osquery as binaries to a limited number of companies. They successfully deployed and tested osquery on their own systems, and we received great feedback from them.
We are now pleased to announce that the time has come to open-source osquery. You can find all the code and documentation on GitHub.
We look forward to hearing feedback from the community. All work on osquery will take place on GitHub, which should make it easier for third-party developers to contribute. We hope you see the potential in osquery and build great things with us.