STM32: connecting ISO 7816 smart cards
Much has been said about smart cards, but until recently the way you talk to a card at the physical level remained a mystery to me. In this article I want to cover working with smart cards over the interface described in part 3 of the ISO 7816 standard. I'll admit that digging up the information took me a lot of time, but in the end everything turned out to be quite simple. If you are interested, read on.
I should say right away that we are talking about a microcontroller with hardware support for ISO 7816 (for example, the STM32F4xx). Writing a software emulation is a perversion that is justified only if you are really cornered or have far too much free time.
PINS AND WIRING DIAGRAM
So what do we have to start with? A 3-volt microcontroller and a card with ISO 7816-2 contacts, like this:
- VCC - power supply
- RST - reset input
- I/O - bidirectional data line
- CLK - clock input
- GND - ground
- VPP - programming voltage input
There are three options for the VCC supply: 5 V, 3 V and 1.8 V (card classes A, B and C, respectively). RST resets the card's state machine (active level is low). I/O is the data line, which is an ordinary UART. CLK clocks the card's processor (so while the card is inactive, no clock should be applied). The VPP pin is used for programming the card.
(Photo: this is how real hackers connect cards.)
INTERFACE
The interface is a synchronous mode of the USART peripheral: the transmission of every bit is tied to the clock on the CLK pin. There is one important difference from other synchronous interfaces (such as SPI): one data bit takes not one CLK pulse but 372 of them. This magic number is given in part 3 of ISO 7816, and the duration of one bit is called the ETU (Elementary Time Unit). In other words, a data bit is clocked on every 372nd edge (in the ideal case). The clock frequency itself must lie between 1 and 5 MHz.
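The relationship between the card clock and the bit timing can be sketched in a few lines of C. This is only an illustration of the arithmetic described above; the function names are made up for this example and are not part of any library.

```c
#include <stdint.h>

/* Number of CLK cycles per bit (one ETU) right after reset. */
#define ISO7816_DEFAULT_ETU_CYCLES 372u

/* Bit rate on the I/O line for a given card clock, rounded to nearest. */
uint32_t iso7816_initial_baud(uint32_t clk_hz)
{
    return (clk_hz + ISO7816_DEFAULT_ETU_CYCLES / 2u) / ISO7816_DEFAULT_ETU_CYCLES;
}

/* Duration of one ETU in nanoseconds (rounded down); clk_hz must be nonzero. */
uint32_t iso7816_etu_ns(uint32_t clk_hz)
{
    return (uint32_t)((ISO7816_DEFAULT_ETU_CYCLES * 1000000000ull) / clk_hz);
}

/* Returns 1 if the clock lies in the 1..5 MHz range mentioned above. */
int iso7816_clk_in_range(uint32_t clk_hz)
{
    return clk_hz >= 1000000u && clk_hz <= 5000000u;
}
```

For a clock of 3.5712 MHz this gives exactly 9600 baud, i.e. an ETU of roughly 104 microseconds.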
Now for the data line (I/O). As I said, it is an ordinary UART with the following parameters:
- Data bits: 8
- Stop bits: 1.5
- Parity: even
- Bit rate (at start-up): 9600 baud
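The USART computes the parity bit in hardware, but it can help to see what ends up on the wire. Below is a small sketch, assuming direct convention (LSB first, even parity); the helper names are invented for this illustration.

```c
#include <stdint.h>

/* Even parity bit for one data byte: 1 if the byte contains an odd
 * number of ones, so that the total number of ones becomes even. */
uint8_t iso7816_even_parity(uint8_t byte)
{
    byte ^= (uint8_t)(byte >> 4);
    byte ^= (uint8_t)(byte >> 2);
    byte ^= (uint8_t)(byte >> 1);
    return (uint8_t)(byte & 1u);
}

/* Packs one character as it is shifted out, bit 0 first:
 * bit 0 = start bit (always 0), bits 1..8 = data (LSB first),
 * bit 9 = parity. Stop/guard time is not represented here. */
uint16_t iso7816_build_frame(uint8_t byte)
{
    uint16_t frame = 0;
    frame |= (uint16_t)((uint16_t)byte << 1);
    frame |= (uint16_t)((uint16_t)iso7816_even_parity(byte) << 9);
    return frame;
}
```

For example, 3Bh contains five ones, so its parity bit is 1.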
In principle, that is all we need to know about the hardware side of this interface, so let's move on to setting up the peripheral.
DRIVER SETUP
Here is the initialization code, written with the Standard Peripheral Library:
/// CLK_FREQ (card clock in Hz, 3500000 here) and ETU (372) are defined elsewhere in the project
RCC_ClocksTypeDef RCC_Clocks;
USART_InitTypeDef USART_InitStructure;
USART_ClockInitTypeDef USART_ClockInitStructure;
NVIC_InitTypeDef NVIC_InitStructure;
/// Query the bus frequencies
RCC_GetClocksFreq(&RCC_Clocks);
/// Enable the peripheral clock (project macro, presumably wrapping RCC_APB2PeriphClockCmd)
SC_USART_APB_PERIPH_CLOCK(RCC_APB2Periph_USART1, ENABLE);
/// Set the prescaler
USART_SetPrescaler(USART1, (RCC_Clocks.PCLK2_Frequency / CLK_FREQ) / 2);
/// Set the Guard Time
USART_SetGuardTime(USART1, 16);
/// Configure the synchronous part (CLK pin)
USART_ClockInitStructure.USART_Clock = USART_Clock_Enable;
USART_ClockInitStructure.USART_CPOL = USART_CPOL_Low;
USART_ClockInitStructure.USART_CPHA = USART_CPHA_1Edge;
USART_ClockInitStructure.USART_LastBit = USART_LastBit_Enable;
USART_ClockInit(USART1, &USART_ClockInitStructure);
/// Configure the asynchronous part (I/O pin); 9-bit word = 8 data bits + parity
USART_InitStructure.USART_BaudRate = CLK_FREQ / ETU;
USART_InitStructure.USART_WordLength = USART_WordLength_9b;
USART_InitStructure.USART_StopBits = USART_StopBits_1_5;
USART_InitStructure.USART_Parity = USART_Parity_Even;
USART_InitStructure.USART_Mode = USART_Mode_Rx | USART_Mode_Tx;
USART_InitStructure.USART_HardwareFlowControl = USART_HardwareFlowControl_None;
USART_Init(USART1, &USART_InitStructure);
/// Allow NACK transmission
USART_SmartCardNACKCmd(USART1, ENABLE);
/// Enable smart card mode
USART_SmartCardCmd(USART1, ENABLE);
/// Enable the USART
USART_Cmd(USART1, ENABLE);
/// Enable two interrupts (receive and parity error)
USART_ITConfig(USART1, USART_IT_RXNE, ENABLE);
USART_ITConfig(USART1, USART_IT_PE, ENABLE);
/// Enable the corresponding interrupt channel
NVIC_InitStructure.NVIC_IRQChannel = USART1_IRQn;
NVIC_InitStructure.NVIC_IRQChannelPreemptionPriority = 0;
NVIC_InitStructure.NVIC_IRQChannelSubPriority = 0;
NVIC_InitStructure.NVIC_IRQChannelCmd = ENABLE;
NVIC_Init(&NVIC_InitStructure);
I left out the pin configuration so as not to clutter the code, but there is one important point: the I/O pin must be configured as open-drain, because the standard allows the line to be in the high-impedance state while the card decides which way to pull it.
Two points deserve attention here: the prescaler and the bit rate. What is the catch? On the one hand, the bit rate has to be set to 9600; on the other hand, the CLK frequency has to be obtained by dividing the bus clock by an integer.
In most cases, unless ultra-low power consumption is required, the system clock is set to its maximum (168 MHz in my case). The USART I use is clocked from the APB2 bus, whose maximum frequency is 84 MHz. So the CLK frequency must fall within 1 to 5 MHz and be an integer fraction of 84 MHz, yet a bit rate of 9600 requires 9600 * 372 = 3.5712 MHz. What do we do? The authors of the standard foresaw this and allowed a deviation of up to 20% from the nominal values, so we can safely round the frequency to, say, 3.5 MHz and use a bit rate of 3500000 / 372 = 9409. The deviation from 9600 is just under 2%, which is perfectly acceptable. The divider value must then be halved, because the prescaler register counts in steps of 2 (a value of 1 means divide by 2, 2 means divide by 4, 3 means divide by 6, and so on). We get (84 / 3.5) / 2 = 12:
- Frequency (CLK): 3.5 MHz
- Bit rate (I/O): 9409 baud
- Prescaler: 12
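The same arithmetic can be written as a small host-side helper. This is a sketch under the assumptions stated in the text (the prescaler divides the bus clock by twice its value); the struct and function names are hypothetical, and the 1..31 limit reflects the 5-bit prescaler field of the STM32 USART in smart card mode.

```c
#include <stdint.h>
#include <stddef.h>

typedef struct {
    uint32_t prescaler; /* value to pass to USART_SetPrescaler() */
    uint32_t clk_hz;    /* resulting CLK frequency: pclk / (2 * prescaler) */
    uint32_t baud;      /* resulting bit rate, rounded to nearest */
} sc_clock_cfg;

/* Derives prescaler, CLK and bit rate from the bus clock and the
 * desired card clock. Returns 0 on success, -1 on invalid input. */
int sc_clock_config(uint32_t pclk_hz, uint32_t target_clk_hz,
                    uint32_t etu_cycles, sc_clock_cfg *out)
{
    uint32_t psc;

    if (out == NULL || target_clk_hz == 0u || etu_cycles == 0u) {
        return -1;
    }
    psc = (pclk_hz / target_clk_hz) / 2u;
    if (psc == 0u || psc > 31u) {
        return -1; /* does not fit the 5-bit prescaler field */
    }
    out->prescaler = psc;
    out->clk_hz = pclk_hz / (2u * psc);
    out->baud = (out->clk_hz + etu_cycles / 2u) / etu_cycles;
    return 0;
}
```

With pclk_hz = 84000000, target_clk_hz = 3500000 and etu_cycles = 372 the helper reproduces the three values listed above.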
The next thing I want to dwell on is the handling of parity errors. A dedicated time interval called Guard Time exists for this (in our case it is 16 bit periods). What is Guard Time? It is the interval during which the receiver must pull the I/O line low if it detects a parity error (NACK); in response, the transmitter must send the same frame again. I won't argue much about the usefulness of this feature, although in my personal opinion, if such errors occur at all, the channel can be considered unreliable and measures like this are unlikely to help.
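On the transmitter side, the repeat-on-NACK rule boils down to a bounded retry loop. The sketch below is not tied to the STM32 driver: sc_send_byte_fn is a hypothetical hook that sends one byte and reports whether the receiver pulled the line low during the guard time, and sc_fake_line is a stand-in used only to demonstrate the behaviour on a PC.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical transport hook: returns 0 if the byte was accepted,
 * nonzero if the receiver signalled a parity error (NACK). */
typedef int (*sc_send_byte_fn)(uint8_t byte, void *ctx);

/* Sends one byte, repeating it while the receiver NACKs.
 * Returns the number of attempts used, or -1 if all attempts failed. */
int sc_send_with_retry(sc_send_byte_fn send, void *ctx,
                       uint8_t byte, unsigned max_attempts)
{
    if (send == NULL) {
        return -1;
    }
    for (unsigned attempt = 1; attempt <= max_attempts; ++attempt) {
        if (send(byte, ctx) == 0) {
            return (int)attempt;
        }
    }
    return -1;
}

/* Stand-in line for illustration: NACKs the first nacks_left attempts. */
typedef struct {
    unsigned nacks_left;
    unsigned sent;
} sc_fake_line;

int sc_fake_send(uint8_t byte, void *ctx)
{
    sc_fake_line *line = (sc_fake_line *)ctx;
    (void)byte;
    line->sent++;
    if (line->nacks_left > 0u) {
        line->nacks_left--;
        return 1;
    }
    return 0;
}
```

The retry limit is a design choice of this sketch; the text above does not prescribe one.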
I think the driver settings are clear now, so let's move on to starting the exchange with the card.
START
To start the card, you need to perform a “cold” reset. It is the following sequence:
- Drive RST low
- Apply power to VCC
- Apply the clock to CLK
- Wait for 40,000 CLK cycles
- Drive RST high
- Wait for the response for up to 40,000 cycles
It's all simple: perform the reset and wait for the response. If the first bit of the response has not arrived within 40,000 cycles (t3), you must drive RST low and deactivate I/O and CLK.
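The sequence above can be sketched as a function over a small set of hardware hooks. Everything here is an assumption made for illustration: the sc_hw structure and its callbacks are hypothetical (on a real board they would toggle GPIOs, start the USART clock output and poll for the start bit), and sc_sim is a simulated board that lets the logic run on a PC. Releasing the I/O line is left to the USART driver and is not modelled.

```c
#include <stdint.h>
#include <stddef.h>

#define SC_RESET_WAIT_CYCLES 40000u /* interval used in the sequence above */

/* Hypothetical hardware hooks. */
typedef struct {
    void (*set_rst)(void *ctx, int high);
    void (*set_vcc)(void *ctx, int on);
    void (*set_clk)(void *ctx, int on);
    void (*wait_cycles)(void *ctx, uint32_t cycles);
    int  (*wait_first_bit)(void *ctx, uint32_t timeout_cycles); /* 1 = seen */
    void *ctx;
} sc_hw;

/* Returns 0 when the card starts answering, -1 when it stays silent. */
int sc_cold_reset(const sc_hw *hw)
{
    hw->set_rst(hw->ctx, 0);
    hw->set_vcc(hw->ctx, 1);
    hw->set_clk(hw->ctx, 1);
    hw->wait_cycles(hw->ctx, SC_RESET_WAIT_CYCLES);
    hw->set_rst(hw->ctx, 1);
    if (hw->wait_first_bit(hw->ctx, SC_RESET_WAIT_CYCLES)) {
        return 0;
    }
    /* No answer within t3: pull RST low again and stop the clock. */
    hw->set_rst(hw->ctx, 0);
    hw->set_clk(hw->ctx, 0);
    return -1;
}

/* Simulated board, for illustration and host-side testing only. */
typedef struct {
    int rst, vcc, clk, card_answers;
    uint32_t waited;
} sc_sim;

static void sim_rst(void *c, int high)    { ((sc_sim *)c)->rst = high; }
static void sim_vcc(void *c, int on)      { ((sc_sim *)c)->vcc = on; }
static void sim_clk(void *c, int on)      { ((sc_sim *)c)->clk = on; }
static void sim_wait(void *c, uint32_t n) { ((sc_sim *)c)->waited += n; }
static int  sim_first_bit(void *c, uint32_t t)
{
    (void)t;
    return ((sc_sim *)c)->card_answers;
}

sc_hw sc_sim_bind(sc_sim *sim)
{
    sc_hw hw = { sim_rst, sim_vcc, sim_clk, sim_wait, sim_first_bit, sim };
    return hw;
}
```

Keeping the hooks behind function pointers means the same sequence can be exercised without a card, which is handy while the real pin code is still being written.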
ATR
What is this response? The ATR (Answer-to-Reset) has the following structure (each field is 1 byte long):
- TS: Initial character
- T0: Format character
- TAi: Interface character [codes FI, DI]
- TBi: Interface character [codes II, PI1]
- TCi: Interface character [codes N]
- TDi: Interface character [codes Yi+1, T]
- T1, ..., TK: Historical characters (max. 15)
- TCK: Check character
1. TS - the initial byte. It can take one of two values, 3Fh or 3Bh:
- 3Fh - inverse convention: inverse polarity, i.e. 0 is transmitted as a high level and 1 as a low level (an important point: odd parity must be used for the parity check here)
- 3Bh - direct convention: direct polarity, i.e. the other way round, 1 is a high level and 0 is a low level (parity is even)
2. T0 - the format byte. It consists of two nibbles:
- Y1 (high nibble) - a bit mask showing which fields follow:
- b5 - TA1 is transmitted
- b6 - TB1 is transmitted
- b7 - TC1 is transmitted
- b8 - TD1 is transmitted
- K (low nibble) - the number of "historical" bytes
3. TA1. Contains the parameters for adjusting the bit rate:
- FI (high nibble) - index of the dividend F
- DI (low nibble) - index of the divisor D
4. TB1. Contains the characteristics of the VPP pin:
- II (bits b7 - b6) - maximum programming current
- PI (bits b5 - b1) - programming voltage
5. TC1. Contains the parameter N, an extra increment of the Guard Time (in ETU units). It can take a value from 0 to 254; the value 255 means that the interval between the leading edges of two adjacent frames is reduced to 11 ETU.
6. TD1. There is a little confusion here, since ISO 7816 does not spell out the structure of this byte, but it is explained quite clearly in source [1]. It consists of two nibbles:
- Y2 (high nibble) - a bit mask showing which fields follow:
- b5 - TA2 is transmitted
- b6 - TB2 is transmitted
- b7 - TC2 is transmitted
- b8 - TD2 is transmitted
- T (low nibble) - protocol used (0 - T=0, 1 - T=1, other values are reserved)
7. TA2. The bit of interest here is the most significant one (b8): it indicates whether switching to the other mode is possible (0 - switching is possible, 1 - switching is not possible). If the byte is not transmitted, it is treated as 0.
8. T1, ..., TK - the historical bytes. They carry information about the card: who issued it, when, and so on. The format of this field is not regulated by the standard.
9. TCK - the checksum byte. It is calculated as the modulo-2 sum (XOR) of all preceding bytes starting from T0 (present only when the T=1 protocol is used).
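The field layout above translates into a fairly short parser. The following is a minimal sketch, not production code: the sc_atr structure and the SC_HAS_* flags are names invented for this example, only TA1..TD1 and TA2 are stored (further interface bytes are skipped), and TCK is checked as the XOR of all bytes from T0 onward whenever a TD byte announces a protocol other than T=0.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

enum { SC_HAS_TA1 = 1, SC_HAS_TB1 = 2, SC_HAS_TC1 = 4, SC_HAS_TD1 = 8, SC_HAS_TA2 = 16 };

typedef struct {
    uint8_t ts, t0;
    uint8_t ta1, tb1, tc1, td1, ta2;
    uint8_t present;   /* combination of SC_HAS_* flags */
    uint8_t hist[15];  /* historical bytes T1..TK */
    uint8_t hist_len;  /* K */
} sc_atr;

/* Parses a raw ATR. Returns the number of bytes consumed, or -1 if the
 * buffer is truncated, TS is unknown or the checksum does not match. */
int sc_atr_parse(const uint8_t *buf, size_t len, sc_atr *atr)
{
    size_t pos = 0;
    int need_tck = 0;

    if (buf == NULL || atr == NULL || len < 2u) return -1;
    memset(atr, 0, sizeof *atr);

    atr->ts = buf[pos++];
    if (atr->ts != 0x3Bu && atr->ts != 0x3Fu) return -1;
    atr->t0 = buf[pos++];
    atr->hist_len = (uint8_t)(atr->t0 & 0x0Fu);

    uint8_t y = (uint8_t)(atr->t0 >> 4);           /* Y1 bit mask */
    for (unsigned i = 1; y != 0u; ++i) {
        uint8_t group[4] = {0, 0, 0, 0};           /* TAi, TBi, TCi, TDi */
        for (unsigned bit = 0; bit < 4u; ++bit) {
            if ((y & (1u << bit)) == 0u) continue;
            if (pos >= len) return -1;
            group[bit] = buf[pos++];
        }
        if (i == 1u) {
            atr->ta1 = group[0]; atr->tb1 = group[1];
            atr->tc1 = group[2]; atr->td1 = group[3];
            atr->present |= y;                     /* Y1 maps onto SC_HAS_TA1..TD1 */
        } else if (i == 2u && (y & 0x01u) != 0u) {
            atr->ta2 = group[0];
            atr->present |= SC_HAS_TA2;
        }
        if ((y & 0x08u) != 0u) {                   /* TDi present: read Yi+1 and T */
            if ((group[3] & 0x0Fu) != 0u) need_tck = 1;
            y = (uint8_t)(group[3] >> 4);
        } else {
            y = 0;
        }
    }

    if (pos + atr->hist_len > len) return -1;
    memcpy(atr->hist, buf + pos, atr->hist_len);
    pos += atr->hist_len;

    if (need_tck) {
        uint8_t x = 0;
        if (pos >= len) return -1;
        for (size_t k = 1; k <= pos; ++k) x ^= buf[k]; /* T0..TCK must XOR to 0 */
        if (x != 0u) return -1;
        pos++;
    }
    return (int)pos;
}
```

Returning the consumed length lets the caller detect trailing garbage after the ATR.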
Now let's work out what we actually need from all this. We are most interested in the TA1 and TA2 fields: they tell us what to do next, namely which of two modes to choose:
- Negotiable mode
- Specific mode
If the most significant bit of TA2 is 0, we use the negotiable mode; otherwise, the specific mode.
PTS
The negotiation itself is a procedure called PTS (Protocol Type Selection). The interface device sends the card a sequence announcing that it is ready to apply new settings. The card must answer with the same sequence, after which both the card and the interface device can start working with the new settings. Which settings to apply is given by the TA1 byte of the ATR. The FI and DI parameters are not the values themselves but indices into tables. From the tables we find the corresponding values of F (clock rate conversion factor) and D (bit rate adjustment factor):
Table 1. FI to F.
FI | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 |
F | internal clk | 372 | 558 | 744 | 1116 | 1488 | 1860 | RFU |
FI | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |
F | RFU | 512 | 768 | 1024 | 1536 | 2048 | RFU | RFU |
Table 2. DI to D.
DI | 0000 | 0001 | 0010 | 0011 | 0100 | 0101 | 0110 | 0111 |
D | RFU | 1 | 2 | 4 | 8 | 16 | RFU | RFU |
DI | 1000 | 1001 | 1010 | 1011 | 1100 | 1101 | 1110 | 1111 |
D | RFU | RFU | 1/2 | 1/4 | 1/8 | 1/16 | 1/32 | 1/64 |
* RFU - reserved for future use
The quotient F / D is the new ETU value (in CLK cycles). In other words, we can choose any frequency and bit rate, as long as the ratio between them equals F / D.
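The two tables map naturally onto lookup arrays. The sketch below takes a TA1 byte and returns the ETU length in CLK cycles; the array and function names are made up for this example, D is stored as a numerator and denominator to cover the fractional entries, and 0 is used to mark both RFU entries and the "internal clk" entry, which this sketch does not handle.

```c
#include <stdint.h>

/* F by FI index; 0 marks RFU and the "internal clk" entry. */
static const uint16_t sc_f_table[16] = {
    0, 372, 558, 744, 1116, 1488, 1860, 0,
    0, 512, 768, 1024, 1536, 2048, 0, 0
};

/* D by DI index as a fraction num/den; num == 0 marks RFU. */
static const uint8_t sc_d_num[16] = {
    0, 1, 2, 4, 8, 16, 0, 0,
    0, 0, 1, 1, 1, 1, 1, 1
};
static const uint8_t sc_d_den[16] = {
    0, 1, 1, 1, 1, 1, 0, 0,
    0, 0, 2, 4, 8, 16, 32, 64
};

/* Returns the ETU length in CLK cycles (F / D, rounded down) for the
 * given TA1 byte, or 0 if FI or DI selects an unusable entry. */
uint32_t sc_etu_cycles(uint8_t ta1)
{
    uint8_t fi = (uint8_t)(ta1 >> 4);
    uint8_t di = (uint8_t)(ta1 & 0x0Fu);
    uint32_t f = sc_f_table[fi];

    if (f == 0u || sc_d_num[di] == 0u) {
        return 0;
    }
    return (f * sc_d_den[di]) / sc_d_num[di];
}
```

Note that some combinations do not divide evenly (for example F = 372 with D = 8), so a real driver would have to decide how to round.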
Now more about the PTS frame itself:
- PTSS: Initial character (Mandatory)
- PTS0: Format character (Mandatory)
- PTS1 (Optional)
- PTS2 (Optional)
- PTS3 (Optional)
- PCK: Check character (Mandatory)
1. PTSS - the initial byte (always FFh)
2. PTS0 - the format byte. It determines which fields are present in the frame; the high nibble is a bit mask:
- b5 - PTS1 is transmitted
- b6 - PTS2 is transmitted
- b7 - PTS3 is transmitted
- b8 - always 0, reserved
- T (low nibble) - protocol used (0 - T=0, 1 - T=1, other values are reserved)
3. PTS1. Contains the requested FI and DI values taken from the TA1 byte of the ATR; if the byte is not transmitted, FI and DI are assumed to be 1.
4. PTS2. Indicates whether the parameter N given in TC1 of the ATR will be applied.
5. PTS3. Reserved.
6. PCK - the checksum byte. It is calculated as the modulo-2 sum (XOR) of all preceding bytes.
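Building such a frame takes only a few lines. The sketch below covers the common case of PTSS, PTS0, an optional PTS1 and PCK; PTS2 and PTS3 are left out for brevity, and the function name and signature are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

/* Builds PTSS, PTS0, [PTS1], PCK into out.
 * protocol is the T value (0 for T=0, 1 for T=1).
 * Returns the frame length, or 0 if the arguments are invalid. */
size_t sc_pts_build(uint8_t protocol, int has_pts1, uint8_t pts1,
                    uint8_t *out, size_t cap)
{
    size_t n = 0;
    size_t need = has_pts1 ? 4u : 3u;
    uint8_t pck = 0;

    if (out == NULL || cap < need || protocol > 0x0Fu) {
        return 0;
    }
    out[n++] = 0xFF;                                         /* PTSS */
    out[n++] = (uint8_t)((has_pts1 ? 0x10u : 0x00u) | protocol); /* PTS0 */
    if (has_pts1) {
        out[n++] = pts1;                                     /* PTS1 = FI/DI */
    }
    for (size_t i = 0; i < n; ++i) {
        pck ^= out[i];                                       /* XOR of all previous bytes */
    }
    out[n++] = pck;                                          /* PCK */
    return n;
}
```

Calling it with protocol 0 and PTS1 = 94h produces the four bytes FF 10 94 7B.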
It's simple: we build the sequence, send it, wait for the answer and compare. If the answer matches, we switch the bit rate to Fclk / (F / D).
If the card does not support the negotiable mode, we simply carry on with the default settings.
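The last two steps, comparing the echo and computing the new bit rate, can be sketched as follows. The function names are hypothetical, and D is passed as a fraction (d_num / d_den) so that the fractional table entries can be represented.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if the card echoed the PTS request byte for byte, else 0. */
int sc_pts_accepted(const uint8_t *req, size_t req_len,
                    const uint8_t *resp, size_t resp_len)
{
    return req != NULL && resp != NULL && req_len == resp_len &&
           memcmp(req, resp, req_len) == 0;
}

/* New bit rate: Fclk / (F / D) = Fclk * D / F, rounded down.
 * Returns 0 if F or the denominator of D is zero. */
uint32_t sc_new_baud(uint32_t clk_hz, uint32_t f, uint32_t d_num, uint32_t d_den)
{
    if (f == 0u || d_den == 0u) {
        return 0;
    }
    return (uint32_t)(((uint64_t)clk_hz * d_num) / ((uint64_t)f * d_den));
}
```

With a 3.5 MHz clock, F = 512 and D = 8, the new rate comes out at about 54.7 kbaud.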
EXAMPLE
To reinforce the material, let's go through a simple example: an ordinary Beeline SIM card. Here is the ATR it returns:
3B 3B 94 00 9B 44 20 10 4D AD 40 00 33 90 00
- 3Bh (TS) - direct convention
- 3Bh (T0) (0011 1011) - TA1 and TB1 follow, the number of "historical" bytes is 11
- 94h (TA1) - FI = 9, DI = 4; from tables 1 and 2 we find F = 512 and D = 8, so the new ETU = 512 / 8 = 64
- 00h (TB1) - VPP is not supported
The PTS frame in this case looks like this:
FF 10 94 7B
- FFh (PTSS) - initial byte
- 10h (PTS0) (0001 0000) - PTS1 is transmitted, protocol T=0
- 94h (PTS1) = TA1
- 7Bh (PCK) = xor (FF 10 94)
CONCLUSION
In this article I left out some details, for example the programming of smart cards, and I did not cover the link-layer and application-layer protocols. There are a couple of reasons for that. First, each of these topics deserves a separate article, if not more. Second, in my opinion there is plenty of information about the APDU protocol on the Internet.
I really hope that my work will not go unnoticed, or at least that it will satisfy the curiosity of those who needed it. In any case, thanks to everyone who made it to the end. I will be glad to answer questions and to take some criticism for my mistakes. Finally, I strongly recommend the wonderful series of articles on cryptographic Java Cards. All the best!
LINKS
- http://www.cardwerk.com/smartcards/smartcard_standard_ISO7816.aspx - The standard itself
- http://www.hackersrussia.ru/Cards/ASyncro/ISO7816-3.php - ISO7816 in Russian
- http://www.st.com/web/en/resource/technical/document/application_note/CD00166510.pdf - Application Note for STM32F10x
- http://www.smartcard.co.uk/tutorials/sct-itsc.pdf - Good smart card tutorial