Palantir 101: what mere mortals are allowed to know about the second most valuable private company in Silicon Valley

(Thanks to Alexei Vorsin for the translation.)
Good morning and welcome to GovCon7. My name is Sod Abdulli, I am a lead implementation engineer at Palantir Technologies, and this is Palantir 101. Over the next half hour to forty-five minutes I'd like to talk about who we are, what Palantir is, and what it does for the organizations we work with. Towards the end, we will also give a short demonstration.
Before moving on to all this, I want to start with a couple of stories that should shed light on how we at Palantir think about the problem of analysis in the Big Data world.

The first story is a story about chess.
Many of you know that in 1997 IBM's Deep Blue supercomputer defeated Garry Kasparov, who at that time was the best chess player in the world. Today a tournament-level chess program can run on an ordinary mobile phone, and the question of who is stronger at chess, a person or a computer, is no longer relevant.
A more interesting new question is: "What happens if a person and a computer play chess together as a team?"
First, such teams turn out to be very effective, and this is to be expected: people are good at chess and computers are very good at chess, but for different reasons. Computers have a serious tactical advantage, since they can evaluate many thousands of positions every second; people have experience, cunning, intuition and a feel for the opponent, all of which are difficult for a computer.
These strengths are complementary, and a human/computer team is able to defeat both teams of the strongest players and groups of the strongest supercomputers.
The second point is a little more subtle. You might conclude that, since teams are stronger, you can take the strongest player and the strongest supercomputer, put them together, and they will dominate the chess world. You would be mistaken.
In addition to the player's own strength, which is skill, and the computer's own strength, which is hardware and software, there is a third factor: the quality of the interaction between them. How easy is it for the player to formulate a request? Can he concentrate on what he is really good at and leave the rest to the computer? These questions are close to the view we hold at Palantir about helping organizations interact effectively with Big Data.
All our efforts, therefore, are focused on reducing friction, that is, the unnecessary effort an operator spends getting at the data.
The next story is about PayPal; maybe you have heard of it.

When PayPal started, several other companies were already in the online payment business, including financial heavyweights such as Citibank and Western Union and the Internet giant eBay. Each had its own payment system. In the end they dropped out one by one, and the startup PayPal succeeded.
How did that happen?
There is one important point about the payment systems of that time. Everyone coped more or less with the main problem, moving money from one account to another. The other problem, no less important, was making sure the transaction completed without the Russians stealing the money.
A huge number of transactions, a huge stream of incoming data and very little time to verify it all: there really is plenty of room for fraud. Buyers are not going to wait for weeks while you investigate each payment, so what you need is the ability to decide quickly whether a transaction is clean or suspicious. This is essentially the problem that PayPal and the others approached differently.
Many decided that, since there are a lot of decisions to make and not enough time, a person cannot cope, so as many of these decisions as possible should be turned into operations: formalized, made repeatable and fast, in other words algorithmic. This was not enough.
PayPal started from the same premise (many decisions, a lot of data to process and little time) but came to the opposite conclusion: use the machines to maximize the effectiveness of the person. The focus thus shifted to helping people make decisions faster, speeding up processing and making it easier to find information. eBay bought PayPal for one and a half billion dollars; that is how it solved the problem for itself.
The people who created PayPal became famous in Silicon Valley, and some of them went on to found Palantir, as you may have heard.
Palantir's task is to enable fast analysis and decision-making in the face of an ever-growing flow of incoming data. This is relevant to payment systems, including the fight against fraud, as well as to law enforcement, medicine, intelligence and the military. The amount of data is growing constantly, as is the need to make decisions based on it.

What is Palantir, you may ask? In one sentence: an analytical infrastructure.
I use the word "analytical" very, very deliberately. Palantir is definitely not a visualization tool (apparently he has to explain this too often), although at first many people think so. An interface is just an interface; there are many more interesting things inside, and we will see them a little later.
Palantir is also not a closed environment; it was conceived from the start to be as open as possible. In practice this means that Palantir supports open data formats, and any data, in whatever form it exists, can be imported and exported. It also means compatibility with whatever third-party applications you use. We provide an open, public application programming interface, so third-party companies can build new applications and extend the functionality of our platform, as on smartphones. Finally, Palantir is not one database to rule them all (an allusion to The Lord of the Rings): the idea is not to replace everything you have built with Palantir, but to complement it and simplify your work.
If we talk about what exactly Palantir does, then four main layers can be distinguished, starting with the base one:
1. Data integration.
2. Search and research.
3. Knowledge management.
4. Collaboration.
Now in more detail:

1. Data integration is where Palantir started. We take all the data you have, in any form, and integrate it into a single database, a single accessible environment. It is fast: it takes days or weeks, not months. It is flexible: you can integrate not only traditional data sources but also specialized ones, for example GPS data, maps or video. And it is high-capacity, able to handle billions of records.

2. Search and research is the second major layer. Palantir lets you search and access all your data through one single search box, and this is not only about finding what you know exists but also about tools that surface what you did not know. It is conceptual search, based on the relationships between data, on networks of such relationships, on what you might call the essence of things. It is persistent search: once I formulate some basic requirements for the information I want to see, Palantir will alert me to any new information that matches the pattern of the request (translator's note: "pattern" here is most likely the essence of the request, which the system picks up). It is search by time and place, so that we can understand what was happening at a given location at a given time. Search was designed with the aim of reducing the friction between the operator and the data. It covers not only the familiar search by names, file types and databases but also, for example, search by people and events. I can ask directly: "Show me all the taxis that have stayed here for three weeks," or "Show me a map marking all the crimes that have occurred in my area over the past six months, and how that differs from the previous six months." It is simple and does not require special programming or development effort for each request.

3. Generally speaking, search is important but not sufficient. You may learn that Sod is a Palantir employee, but to use that fact you need additional information: where it came from, when it was entered into the system, and who has access to it. This is the third layer, knowledge management. The idea is that every piece of knowledge was created by someone, entered into the system at some point, changed in some way over time and carries a certain access level, and all of this is tracked. Both data and metadata are important.
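The idea of keeping provenance and access metadata next to each fact can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Palantir's data model: the `Fact` class, its field names, the role names and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    """A piece of knowledge plus its metadata (all field names are hypothetical)."""
    statement: str            # the data itself
    source: str               # where the information came from
    entered_at: datetime      # when it was entered into the system
    allowed_roles: frozenset  # who has access to it


def visible_facts(facts, role):
    """Return only the facts that the given role is allowed to see."""
    return [f for f in facts if role in f.allowed_roles]


# Made-up example records; sources and dates are invented for illustration.
facts = [
    Fact("Sod is a Palantir employee", "HR database",
         datetime(2011, 1, 10), frozenset({"analyst", "admin"})),
    Fact("Sod spoke at GovCon7", "field report",
         datetime(2011, 6, 1), frozenset({"admin"})),
]

analyst_view = visible_facts(facts, "analyst")
```

Here an analyst sees only the first fact, while both records keep their source and entry time regardless of who is looking.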

4. Something is still missing. Data and metadata are abundant and cheap to collect and store. The most valuable resource is analysis: what your analysts produce, the human interpretation of raw information. We designed Palantir not only to simplify analysis but also to make it possible to share the results. This is the fourth layer, collaboration. We make it possible to share results and to build a complete picture of a phenomenon through the efforts of many analysts. The idea turned out to be deeper than we originally expected. The goal is a shared big picture of the world. If we see different data, different sides of a phenomenon, we work separately; then, having developed different pictures of the world, we look for ways to test them, compare them and synthesize a common one. Compare a large software project, where hundreds of specialists make changes every day. Taking the same approach, we get version control for the analytical product and the ability to manage the process. We get a space in which we can start from the big picture, make a small change, test hypotheses and finally assemble the big picture again. It is, moreover, a secure way of working: everyone sees only the part he is allowed to see.

All in all, Palantir is:
- Scalable: it allows many people to work with petabytes of data, and the data can keep growing while they work.
- Secure: every single piece of information is tracked, and Palantir already operates in some of the most sensitive environments in the world (most likely Sod means environments with the highest security requirements).
- Low-risk: Palantir is not the kind of project where you plant a seed and wait six months or a year for it to come up; it can be integrated in weeks.
- Proven: it is already used in health care and in law enforcement, and banks use it to counter fraud.
The technology is ready; the technology works. Now let's see how.
We will now demonstrate Palantir's capabilities with a day in the life of a counter-terrorism analyst investigating terrorist financing in North Africa. You will see how I find something in Egypt, and you will see the whole process from the moment new information arrives: I will investigate and then summarize.
I have logged in; this is what the program looks like.

On the left is the feed of new incoming documents, in our case new intelligence from the field. Let's look at my new tips.
Agent CT-Blue, in Cairo, reports that he attended an Al-Mooja charity event to which several organizations were invited. He attaches the cards of three guests, non-Americans, who discussed an attack on a significant target in an American city. I will take this data, enter it into Palantir and see whether we can find anything.

As you can see, there are blue links here. This means that one of my colleagues has already worked on the document and made it more usable by assigning tags, and that some of this data matches data already in the system. This blue link leads to the dossier we have on this guy, Mike Fikri. I will add a phone number by tagging it for Palantir like this. This is a way to give meaning and structure to an unstructured report.

I drag these guys onto a graph (Wiktionary: a graph is a collection of objects with connections), the main tool for link analysis in Palantir, to find out how they are connected to each other and whether they have connections to anyone else. Mike has a photo, so we definitely have information about him.

We now see information collected from various sources: raw information, reports, databases and external sources; for example, there is information about payments and telephone calls. It is a kind of overview of the person. We see different spellings of the name, an address, and the name written in two languages, which shows that we can work with information in many languages.

We see several phone numbers and various attachments; you can attach video, audio or images here. Finally, there are links to others: Mike is associated with twenty-three calls and two payments entered into the system. Let's go back to the graph, where it is convenient to analyze relationships.
I would rather not go into details at this level or read the dossier, but ask a more general, more direct question: "How are these three connected, not only to each other but more broadly, based on all the information Palantir now has?" For this we have a wonderful tool called Search Around. I create a new search. Palantir asks what kind of match or connection I want to see. I can see who these guys are connected to, and who those people are connected to in turn. I can see exactly where information matches: I can ask whether this address, email or name has turned up anywhere else. We can also see who is connected through events (calls and payments), who was on the other end, and who they in turn are connected to. In other words, Palantir lets you ask quite precise questions.

I don't need anything custom in this query, so I will just use the predefined searches.
Several questions are asked here:
- Are these guys part of any groups, and are there other members in those groups?
- What events (a phone call counts as an event too) did these guys take part in, and who else is connected to those events?

We go out to four degrees of separation, that is, the questions are asked not once but repeated over and over on the results to get a more complete picture. As you can see, the diagram has turned out to be quite large, so I will finish the search and begin exploring.
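Repeating the same "who is linked to this?" question out to four degrees of separation is, in graph terms, a breadth-first traversal with a depth limit. The sketch below illustrates that idea only; it is not how Palantir implements its search, and the function name, the `links` structure and the entity names are all made up for the example.

```python
from collections import deque


def search_around(links, seeds, max_degrees=4):
    """Breadth-first expansion from the seed entities.

    Returns {entity: degree} for every entity reachable within
    max_degrees hops. `links` maps an entity to its linked entities.
    """
    degree = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if degree[node] == max_degrees:
            continue  # do not expand beyond the requested degree
        for neighbor in links.get(node, ()):
            if neighbor not in degree:
                degree[neighbor] = degree[node] + 1
                queue.append(neighbor)
    return degree


# Hypothetical link data: people connected through events (calls, payments, flights).
links = {
    "suspect_a": ["call_1"],
    "call_1": ["suspect_a", "person_b"],
    "person_b": ["call_1", "payment_1"],
    "payment_1": ["person_b", "person_c"],
    "person_c": ["payment_1", "flight_1"],
    "flight_1": ["person_c"],
}

found = search_around(links, ["suspect_a"])
```

With this data `person_c` is found at the fourth degree, while `flight_1`, five hops away, stays outside the result.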

The first thing I want to do with such a large result is to ask Palantir what I am looking at. For this I use the Histogram tool, which gives a quick summary of everything on the screen.

I am looking at 14 different people, and I can also see what recurs often. For example, four guys live in one place, in Brooklyn, California; they are now highlighted. Three live together in Vancouver and three in Toronto; these groups are easy to see when they are selected. I can also see how many of them live in San Francisco, in Berkeley, in Daly City.
Here is a popular mail domain, hotmail.com; you can see who has mail at Hotmail and who at AOL (America Online). You can see matches by last name, nationality and so on.
It is important to remember that Palantir not only lets you work with all these types of information; the types themselves are customizable. In counter-terrorism, these are the things you pay attention to: people and their nationalities, biographical details, and events such as calls and payments. In another area, such as cybersecurity, computers and servers may be of interest instead of people, and the events will be the traffic between them. In healthcare, the events will be outbreaks of disease.
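A histogram of this kind amounts to counting how often each value of a property occurs among the objects on screen, whatever the property is. A minimal sketch, assuming entities are plain dictionaries; the `histogram` function and the sample records are hypothetical and only echo the cities and mail domains mentioned above.

```python
from collections import Counter


def histogram(entities, prop):
    """Count how often each value of `prop` occurs; entities without it are skipped."""
    return Counter(e[prop] for e in entities if prop in e)


# Made-up sample of people with customizable properties.
people = [
    {"name": "person_1", "city": "Vancouver", "email_domain": "hotmail.com"},
    {"name": "person_2", "city": "Vancouver", "email_domain": "aol.com"},
    {"name": "person_3", "city": "Toronto", "email_domain": "hotmail.com"},
    {"name": "person_4", "city": "Toronto"},
]

by_city = histogram(people, "city")
by_domain = histogram(people, "email_domain")
```

Because the property name is just a parameter, the same function would work for servers and traffic in a cybersecurity setting, or for any other object type.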
Let's take a closer look.
The first thing that catches the eye is that only one of the three guys we suspect is connected to anything. Let's remove the others; I have not deleted them, just hidden them for a while.

The remaining one, Mike Fikri, is connected to this interesting group of people. Looking closer, one of my colleagues is tracking them as a group of terror suspects from a cell operating in the Bay Area. Mike is connected to them through another guy, which immediately increases my interest in him: our suspect from Cairo may be associated with an attack somewhere in the United States. We also see that he is connected to a large and interesting group of actors here. How exactly is he connected?

Our subject, Mike Fikri, is connected to those guys through an unknown entity called MF. That looks suspicious to me, since these look like Mike's initials. Let's see.

I open the links between these two to see what they have in common: both live in California, both are Iranian, and they share one phone number. This is certainly not conclusive, but I would like to test the hypothesis that Mike and MF are one person.

In Palantir this is quite simple. I select the Resolve command for these two, and the program combines all the original information about them. The information about where each piece of data came from and when it appeared in the system is preserved. Now we have a combined view that includes all of this information, with all the addresses and phone numbers from the independent records.
At any time I may receive information that invalidates my hypothesis. That is not a problem: I, like any of my colleagues, can easily undo the merge and restore the original records. New information may equally support the hypothesis. I want to share the hypothesis with my colleagues, so I select this information and publish it. Until now I have been working in my private space; now everyone who looks at Mike will see my assumption that he and MF are the same person.
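One way to picture a merge that can always be undone is to build the combined view on top of the original records instead of overwriting them, and to remember the source of every value. The sketch below is an illustration under that assumption, not Palantir's Resolve implementation; the `Record` class, the `resolve` and `unresolve` functions, the source names and the phone number are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """An original, immutable record from one source (hypothetical structure)."""
    record_id: str
    source: str
    properties: tuple  # pairs of (key, value)


def resolve(*records):
    """Build a combined view; every value remembers which sources reported it."""
    merged = {}
    for rec in records:
        for key, value in rec.properties:
            merged.setdefault(key, {}).setdefault(value, []).append(rec.source)
    # The originals are kept untouched so that the merge can be undone.
    return {"properties": merged, "originals": records}


def unresolve(entity):
    """Undo the merge by handing back the original records."""
    return list(entity["originals"])


# Made-up records for the two entities being compared.
mike = Record("rec_mike", "field report",
              (("name", "Mike"), ("phone", "555-0100"), ("state", "California")))
mf = Record("rec_mf", "call database",
            (("name", "MF"), ("phone", "555-0100"), ("state", "California")))

entity = resolve(mike, mf)
restored = unresolve(entity)
```

In the combined view the shared phone number is listed once with both sources attached, and `unresolve` returns the two records exactly as they were.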
Now we see that Mike Fikri is directly connected to this large and interesting group of people. There are many different kinds of activity here: payments, calls, shared addresses and flights, meaning these people traveled together. It is quite difficult to understand what is going on: who pays whom, and when these events took place. There are two ways to make this picture clearer, which I want to show you.

First, I want to show you direction. When there is a payment, one person pays another, and I want to visualize that. Palantir has a great tool for this, Flows. Now we see where the money comes from and where it goes. The red dots vary in size to show the size of the cash flows, so you see little activity there and really serious traffic here. Large amounts move from this person to these three groups of people. That is interesting.

Two things about Flows.
First, Flows is fully extensible: it is a tool for visualizing not only payments but any other movement. In this investigation we can also look at calls. This is what the network of calls between them looks like.

Second, and this shows our openness, the Flows tool was developed as a third-party application, although it looks and works like our own development. This demonstrates how far the platform can be customized for different tasks.
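The data behind a flow view can be thought of as directed totals: for each pair (from, to), sum the amounts of the payments or count the calls. The following sketch shows that aggregation only; it says nothing about how the actual Flows tool is built, and the `flows` function, the event fields and the sample amounts are assumptions made for the example.

```python
from collections import defaultdict


def flows(events, kind, weight="amount"):
    """Sum directed flows between entities for one kind of event.

    Events without the weight field (for example calls) count as 1 each.
    """
    totals = defaultdict(float)
    for e in events:
        if e["kind"] == kind:
            totals[(e["src"], e["dst"])] += e.get(weight, 1)
    return dict(totals)


# Made-up events; names and amounts are invented for illustration.
events = [
    {"kind": "payment", "src": "financier", "dst": "group_a", "amount": 5000.0},
    {"kind": "payment", "src": "financier", "dst": "group_a", "amount": 2500.0},
    {"kind": "payment", "src": "financier", "dst": "group_b", "amount": 9000.0},
    {"kind": "call", "src": "group_a", "dst": "financier"},
    {"kind": "call", "src": "group_a", "dst": "financier"},
]

money = flows(events, "payment")
calls = flows(events, "call")
```

The totals could then drive the size of the dots on the graph, and the same function covers calls simply by changing the `kind` argument.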
So, we have seen the cash flows. This guy is connected to the rest by sending them money. Now let's look at time: we know that payments are made and that the guys are traveling. What is the order? Do the payments happen all at once or in some sequence? Before or after the flights?
We have another tool, called Timeline, which does exactly what you would expect: it shows events in time. It will be easier if I color-code the events.

Flights are blue, payments green, calls red. Now we see a somewhat clearer picture of what is happening. The first payment was made on 10/20/2007, with a burst of calls before it, and a phone call follows the payment shortly afterwards. The next day the same thing is repeated with the second payment. The third payment was made two days later, again followed by a call. If we look a little further, we see many flights. These three groups, living in Vancouver, Toronto and Mexico City, received large sums of money from a guy associated with our suspect, made phone calls after receiving the money, and boarded planes within the next few days. Where did they go?
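The pattern read off the timeline, a payment followed shortly by a call, can also be checked mechanically once the events are sorted by time. This is a rough sketch under stated assumptions: the function name, the 24-hour window, the event identifiers and all times of day are hypothetical; only the date of the first payment comes from the demo.

```python
from datetime import datetime, timedelta


def calls_following_payments(events, window=timedelta(hours=24)):
    """Pair each payment with the first call that follows it within `window`."""
    timeline = sorted(events, key=lambda e: e["time"])
    pairs = []
    for i, ev in enumerate(timeline):
        if ev["kind"] != "payment":
            continue
        for later in timeline[i + 1:]:
            if later["time"] - ev["time"] > window:
                break  # everything after this is too late to count
            if later["kind"] == "call":
                pairs.append((ev["id"], later["id"]))
                break
    return pairs


# Made-up events, deliberately given out of order.
events = [
    {"id": "f1", "kind": "flight", "time": datetime(2007, 10, 25, 8, 0)},
    {"id": "c1", "kind": "call", "time": datetime(2007, 10, 20, 11, 0)},
    {"id": "p1", "kind": "payment", "time": datetime(2007, 10, 20, 10, 0)},
    {"id": "p2", "kind": "payment", "time": datetime(2007, 10, 21, 9, 0)},
    {"id": "c2", "kind": "call", "time": datetime(2007, 10, 21, 9, 30)},
]

pairs = calls_following_payments(events)
```

Each payment in this sample is matched with the call that came soon after it, which is the sequence the color-coded timeline makes visible at a glance.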
Let's go back to the histogram and see what we can find out about airports. We see flights from three airports, Vancouver, Mexico City and Toronto, and a fourth airport that they all share: Chicago. I have found something interesting. Intelligence came in that a group of people may be associated with an attack on a city in the United States. We found that one of them was involved in transferring large sums of money to three different groups outside the United States, and that all three groups traveled to the same American city. To me this looks like the way an activated cell might behave, as if an operation were being prepared.
Now I can share my results and my analysis with colleagues and with law enforcement agencies. I will take screenshots of the most important evidence, the timeline and the graph, and export them in a convenient (customizable) format, as a PowerPoint presentation.
Here Palantir replays all the steps of my investigation: first I looked at this guy, then I built this network, then I found out a number of details. Now I will add an explanation of what this particular timeline means. The material is almost finished, and I am almost ready to share it once I tidy it up a little.

We have now gone through the full life cycle, a kind of day in the life of a Palantir analyst. We started with the arrival of new information from the field, carried out a small investigation and analysis, and shared the results with others. Did you notice what took most of the time? Most of the time was spent talking about the analytical process itself: about the questions I formulated, about who these guys are and what we know about them. We did not spend time searching databases by name. You did not see me wrestling with different data types or with arcane queries. Nor did you see me spend a lot of time reworking my results into a convenient (customizable) format to produce a presentation or report. You saw me spend time on what I, as an analyst, am good at: applying my specialist knowledge, using my intuition, following the leads that interest me. I left to the computer what it is good at: searching for information and converting it into different formats to make it convenient. All of this serves the idea of reducing the friction between me, as an analyst, and the information: give me the ability to answer questions quickly, investigate quickly and share the results quickly with others.
I hope this demonstration was useful; believe me, it was a very superficial look at what Palantir is. There is also a fantastic geospatial capability that we did not even touch on, and many capabilities for working with large-scale data.

I am also pleased to mention our growing mobile capabilities, which give access to Palantir's capabilities on a smartphone. Thank you; I hope you will talk with us and with our customers.
All the best to you, thank you for your time.