Mapping Noise Using KSQL, Raspberry Pi, and Radio
- Transfer
At first glance, this story has everything to earn the status of a romantic fast on the eve of March 8: airplanes, love, a little spying and, finally, a cat (more precisely, a cat). It is hard to imagine that all this is directly related to Kafka, KSQL and the experiment “how to find the noisiest aircraft using home information technology”. It’s difficult, but it’s necessary: it was such an experiment that Simon Obury conducted, and we translated an article of his authorship with a description of all the details of the process.
Our new cat named Snowflake wakes up early. The sounds of airplanes flying over our house awaken her. But what if I, using Apache Kafka, KSQL and Raspberry Pi, could determine which plane keeps my cat awake? It would be nice to create an entertaining tracking panel, on which the cat could switch its attention - and give me some more sleep.
In outline
We transfer planes from the sky to graphics using Kafka and KSQL
Aircraft determine their location using GPS receivers. The on-board transmitter periodically reports the location, identification number, altitude and speed of the ship using short radio transmissions. These broadcasts of broadcast automatic dependent surveillance ( AZN-V ) are essentially data packets that are open for access from ground stations.
One microcomputer, such as the Raspberry Pi, and several auxiliary components are all that is required to receive messages from the aircraft's on-board transmitters scurrying over my house.
The airborne signals of the aircraft look like a tangled ball of messages and require systematization. Recognizing these chaotic data streams is like listening to a conversation at a noisy party. Therefore, to find a plane that worries my cat, I decided to use a combination of Kafka and KSQL.
Awakened Cat and Raspberry Pi
Collection of AZN-B readings using the Raspberry Pi
To collect on-board transmissions, I used the Raspberry Pi and RTL2832U - a USB modem that was originally sold as a device for watching digital TV on a computer. On Raspberry Pi I installed dump1090 - a program that receives data from AZN-V via RTL2832U using a small antenna.
My software radio from Raspberry Pi and RTL2832U
Convert AZN-B signals to Kafka themes
Now that I have received a stream of raw AZN-B signals, we should pay attention to traffic. Raspberry Pi does not have enough power for serious computing, so I had to transfer data processing to my local cluster on Kafka.
Received messages are either divided into location messages , or messages about the identification of the board . The location looks like a message of the form: "the 7c6db8 board flies at an altitude of 6,250 feet at the coordinate -33.8,151.0 . " Information about the identification of the board will look like: "the board 7c451c flies along the route QJE1726 . "
Small python scriptFor my Raspberry Pi, it shares all of the incoming AZN-B messages. I used the Confluent Rest Proxy proxy to distribute data from the Raspberry Pi to the location-topic and ident-topic themes on Kafka. The proxy server provides a RESTful interface for the Kafka cluster, which makes it easy to create messages using a simple REST call on Pi.
I wanted to understand which planes fly over my roof and which routes. The OpenFlights database allows you to compare the airport code, for example 7C6DB8 assigned by the International Civil Aviation Organization (ICAO), with the type of aircraft - in our case, the Boeing 737. I uploaded my mapping data to the icao-to-aircraft theme .
KSQLprovides an "SQL engine", which makes it possible to process data in real time on Apache Kafka topics. For example, to find the on-board code 7C6DB8, we can write the following query:
CREATE TABLE icao_to_aircraft WITH (KAFKA_TOPIC='ICAO_TO_AIRCRAFT_REKEY', VALUE_FORMAT='AVRO', KEY='ICAO');
ksql> SELECT manufacturer, aircraft, registration \
FROM icao_to_aircraft \
WHERE icao = '7C6DB8';
Boeing | B738 | VH-VYI
Similarly, in the callsign-details topic, I uploaded the callsigns (i.e. QFA563, this is Qantas’s flight from Brisbane to Sydney).
CREATE TABLE callsign_details WITH (KAFKA_TOPIC='CALLSIGN_DETAILS_REKEY', VALUE_FORMAT='AVRO', KEY='CALLSIGN');
ksql> SELECT operatorname, fromairport, toairport \
FROM callsign_details \
WHERE callsign = 'QFA563';
Qantas | Brisbane | Sydney
Now let's take a look at the location-topic data stream . Here we can observe a constant stream of incoming messages about the location of a flying airplane.
kafka-avro-console-consumer --bootstrap-server localhost:9092 --property --topic location-topic
{"ico":"7C6DB8","height":"6250","location":"-33.807724,151.091495"}
The KSQL query will look like this:
ksql> SELECT TIMESTAMPTOSTRING(rowtime, 'yyyy-MM-dd HH:mm:ss'), \
ico, height, location \
FROM location_stream \
WHERE ico = '7C6DB8';
2018-09-19 07:13:33 | 7C6DB8 | 6250.0 | -33.807724,151.091495
KSQL: stream harmonization ...
The real value of KSQL lies in the ability to combine incoming location data streams with topic source data (see 03_ksql.sql ) —that is, adding useful information to the raw data stream. This is very similar to a “left join” in a traditional database. The result is another Kafka theme created without a single line of Java code!
source> CREATE STREAM location_and_details_stream AS \
SELECT l.ico, l.height, l.location, t.aircraft \
FROM location_stream l \
LEFT JOIN icao_to_aircraft t ON l.ico = t.icao;
In addition, you receive a KSQL query. The data stream will look like this:
ksql> SELECT TIMESTAMPTOSTRING(rowtime, 'yy-MM-dd HH:mm:ss') \
, manufacturer \
, aircraft \
, registration \
, height \
, location \
FROM location_and_details_stream;
18-09-27 09:53:28 | Boeing | B738 | VH-YIA | 7225 | -33.821,151.052
18-09-27 09:53:31 | Boeing | B738 | VH-YIA | 7375 | -33.819,151.049
18-09-27 09:53:32 | Boeing | B738 | VH-YIA | 7425 | -33.818,151.048
In addition, we can combine the callsign input stream with the fixed callsign_details theme :
CREATE STREAM ident_callsign_stream AS \
SELECT i.ico \
, c.operatorname \
, c.callsign \
, c.fromairport \
, c.toairport \
FROM ident_stream i \
LEFT JOIN callsign_details c ON i.indentification = c.callsign;
ksql> SELECT TIMESTAMPTOSTRING(rowtime, 'yy-MM-dd HH:mm:ss') \
, operatorname \
, callsign \
, fromairport \
, toairport \
FROM ident_callsign_stream ;
18-09-27 13:33:19 | Qantas | QFA926 | Sydney | Cairns
18-09-27 13:44:11 | China Eastern | CES777 | Kunming | Sydney
18-09-27 14:00:54 | Air New Zealand | ANZ110 | Sydney | Auckland
Now we have two informative topics:
- location_and_details_stream , which provides a stream of updated information about the location and speed of the aircraft;
- ident_callsign_stream , which describes the details of the flight, including the airline and destination.
With these constantly updated themes, we can create some great overview panels. I used Kafka Connect to upload Kafka themes populated by KSQL to Elasticsearch (full scripts here ).
Kibana Dashboard
Here is an example of an overview panel showing the location of an airplane on a map. In addition, you can see a chart by airline, a graph of flight altitude, and word clouds by main destination. The heat map shows the areas where aircraft are concentrated, that is, the areas with the highest noise level.
Back to the cat
Today the cat woke me up at around 6 a.m. Can KSQL help me find the plane that flew over this house at an altitude of less than 3,500 feet at that time?
select timestamptostring(rowtime, 'yyyy-MM-dd HH:mm:ss')
, manufacturer
, aircraft
, registration
, height
from location_and_details_stream
where height < 3500 and rowtime > stringtotimestamp('18-09-27 06:10', 'yy-MM-dd HH:mm') and rowtime < stringtotimestamp('18-09-27 06:20', 'yy-MM-dd HH:mm');
2018-09-27 06:15:39 | Airbus | A388 | A6-EOD | 2100.0
2018-09-27 06:15:58 | Airbus | A388 | A6-EOD | 3050.0
Awesome! I can identify a plane over my roof at 6:15 in the morning. It turns out that Snowflake was woken up by the Airbus A380 (a huge liner, by the way), which flew to Dubai.
Just a couple of days off, and you have a streaming processing system with KSQL. Which, moreover, allows you to quickly find interesting data events. Although Snowflake may be skeptical about them.