The mythical PCM Capture Extraction Tool: extracting audio without involving TAC


The mythical resource pcet.cisco.com, inaccessible to mere mortals

A reader who needs to capture PCM data on Cisco ISR routers will easily find a comprehensive step-by-step guide on the topic, but will surely stumble over the step “the resulting file should be sent to TAC for further audio extraction”. Is it possible to do without this?

Background


The customer’s PBX needed to be connected both to a well-known SIP operator (public IP address) and to the SIP server of the provider that supplied local telephony (private IP address). Unfortunately, PBXs of this model cannot connect simultaneously to SIP servers behind NAT and to those in a private network, because NAT traversal by address substitution is enabled for all SIP trunks without exception. The customer used a Cisco 2811 as a router. Since the PBX had a free ISDN PRI card, it was decided to purchase an E1 card and a voice processor module for the router. The appropriate settings were made (the G.711 A-law codec was used everywhere), and it worked.

Problem


The situation is a familiar one: everything worked fine, but one day it stopped. The customer’s employees began to complain about a tone during conversations, "as if someone had pressed a button on the phone." The tone was quickly identified as DTMF. It could sound at the beginning, in the middle or at the end of a conversation, or even several times during one. It remained to find out who (or what) was to blame and how to fix it.

It should be said that the equipment involved provides an impressive set of tools for testing and debugging. In addition to the full range of Cisco debugging capabilities, the PBX management console includes a very useful real-time ISDN tracing tool. Unfortunately, none of this helped here: the tone appeared so irregularly that it could not be "caught", that is, matched against the debugging output. A review of everything available on the subject led to a series of documents on PCM capture on Cisco equipment.

It is reported that 2900, 3900 and 3900e series routers and VG350 gateways make this quick and easy, and the result is a file containing voice data suitable for direct import into audio editors.

Unfortunately, the customer’s 2800 series router did not have the honor of belonging to these noble families and had no such capabilities. In IOS versions below 15.2(2)T1 (and no such version will ever be released for the customer’s router, since it has reached End of Support), capturing PCM requires a number of commands. If the event is expected, you can wait for it and record:

voice hpi capture buffer 51200000
voice hpi capture destination flash:pcm.dat
test voice port x/x/x pcm-dump caplog fff duration xxx

Or you can record everything until the event of interest occurs (careful: the flash memory may overflow!):

dial-peer voice x voip/pots
 pcm-dump caplog fff duration xxx

This creates a file with voice and debugging information, which can be extracted from flash memory and copied to a TFTP server.

Another problem, in a way worse than the first


Here the fun began. Numerous users, puzzled by PCM capture, poured out their indignation on technical support forums: I captured the dump, imported it into an audio editor, tried this and that, and I still get garbage! What am I doing wrong? Other users replied that this was completely normal, and that only the mighty TAC specialists can extract the desired sound from this file, using the mythical PCM Capture Extraction Tool available at pcet.cisco.com. I think I am not the only one who, after watching a wonderful tutorial video on YouTube, tried to follow this link and was brutally humiliated, because it is closed to mere mortals.

Allegedly, the need to involve TAC stems not from technical but from legal considerations (I wonder how the ISR G2 differs from the ISR in legal terms?): such decoding could be regarded as illegal eavesdropping from some point of view, and besides, the decoder changes too often, the version of the DSP in use is hard to determine, and so on. We also found an opinion attributed to a non-Russian TAC engineer: we do not think about the mechanism, we simply upload the file to the system through a form on the page. There are some restrictions on the file name and size, but the result is a set of .WAV files plus some debugging information. The tool does not exist as a redistributable package, and in general the main difficulty is getting hold of it yourself.

However, it was not possible to involve TAC engineers in this case: remember that the equipment is in End of Support status? Accordingly, there are no service contracts and no support. True, the TAC engineers cheered me up by saying that my wish to solve the problem on my own, that is, to understand the file structure and extract the information stored in it, violates nothing and, if successful, would not lead to any lifelong sanctions against me. Thanks for that, at least.

I started by loading one of the captured files into a hex editor. What I saw was not very encouraging, although there were familiar strings: “C2800NM-ADVENTERPRISEK9-M, 15.1(4)M10, IP|SLA|IPv6|IS-IS|FIREWALL|VOICE|PLUS|QoS|HA|NAT|MPLS|VPN|LE” and “28.3.14” (the latter is the DSPWARE VERSION value from the output of the show voice dsp detail command). In addition, a certain structure and periodicity were noticeable. At this point, I decided to return to the test voice port command:

C2811#test voice port 0/3/0:15.1 pcm-dump  ?
  caplog   Print to caplog, please enable banjo logger
  console  Print to console, possible flood console
  disable  Disable the message dump

All the descriptions I found pay attention exclusively to the caplog parameter. Why is the perfectly good console option so undeservedly ignored? Let's fix that:

C2811#test voice port 0/3/0:15.1 pcm-dump console ?
  <1-7>  PCM stream index. Bit0:R_in=0x01 Bit1:S_in=0x02 Bit2:S_out=0x04

Everything is more or less clear here. The next parameter is a bit mask of the PCM streams to capture. The R_in stream contains audio transmitted from VoIP to the PSTN. The S_in stream contains audio transmitted from the PSTN to VoIP BEFORE DSP processing. The S_out stream contains audio transmitted from the PSTN to VoIP AFTER DSP processing. The stream values can be added together to get any desired combination: you can capture any one, any two, or all three streams at once. However, I see no reason to capture anything less than all three streams at once, except to understand the structure of the information arriving on the console. In this case, you can start an unlimited capture (do not forget that there should be enough free space on the flash card) and stop it manually afterwards:
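The bit mask from the command help can be illustrated with a few lines of Python. This is only a sketch: the names STREAM_BITS and decode_stream_index are made up for the example, and only the three bit values come from the router's help text.

```python
# Bit values as printed by the IOS help: Bit0:R_in=0x01 Bit1:S_in=0x02 Bit2:S_out=0x04
STREAM_BITS = {0x01: "R_in", 0x02: "S_in", 0x04: "S_out"}


def decode_stream_index(index):
    """Return the names of the PCM streams selected by a <1-7> stream index."""
    if not 1 <= index <= 7:
        raise ValueError("stream index must be in the range 1..7")
    return [name for bit, name in STREAM_BITS.items() if index & bit]


if __name__ == "__main__":
    for index in range(1, 8):
        print(index, decode_stream_index(index))
```

For example, an index of 5 selects R_in and S_out, and 7 selects all three streams.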

test voice port 0/3/0:15.1 pcm-dump console 1

[conversation]

test voice port 0/3/0:15.1 pcm-dump disable

You can immediately specify the required duration in seconds:

C2811#test voice port 0/3/0:15.1 pcm-dump console 1 duration ?
  <0-255>  capturing time in sec

Capturing three streams on the console looks something like this:

047581: Jan 19 11:52:15.491:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
047582: Jan 19 11:52:15.495:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 00 00 07 00 09 00 0E 00 0E 00 0D 00 0D 00 03 FF FC 00 00 00 04 00 02 00 00 FF F9 00 03 00 0B 00 04 00 0B 00 0C 00 01 00 00 00 01 FF FE 00 06 00 02 00 03 00 07 00 06 00 00 00 05 FF FD 00 02 00 07 00 03 00 04 00 08 FF FF 00 02 00 04 FF FF FF F8 FF F5 FF F3 FF FA FF F8 FF F1 FF F1 FF F2 FF F4 FF EF FF EF FF EE FF F0 FF F8 FF FB FF F4 FF F3 FF FA FF F4 FF F2 FF F8 FF FF 00 04 FF FB FF F5 FF F1 FF FC FF FD FF FE FF F6 FF F3 FF F2 FF ED FF EA FF EB FF F7 FF F5 FF F2 FF F2 FF F6 FF FC
047583: Jan 19 11:52:15.499:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 01 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08

[and many more similar messages]

Capturing one stream:

027761: Jan 15 00:59:53.549:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00

[and many more similar messages]

Let's execute the following commands:

test voice port 0/3/0:15.1 pcm-dump con 1 duration 10
test voice port 0/3/0:15.1 pcm-dump dis
test voice port 0/3/0:15.1 pcm-dump con 2 duration 10
test voice port 0/3/0:15.1 pcm-dump dis
test voice port 0/3/0:15.1 pcm-dump con 4 duration 10
test voice port 0/3/0:15.1 pcm-dump dis
test voice port 0/3/0:15.1 pcm-dump con 7 duration 10
test voice port 0/3/0:15.1 pcm-dump dis

Let us return to the structure of the received text messages. File for R_in:

047648: Jan 19 12:10:48.525:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 00 00 18 00 28 00 28 00 18 00 28 00 28 00 18 00 28 00 18 00 18 00 18 00 18 00 08 00 18 00 18 00 08 00 18 00 08 FF F8 00 08 FF F8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF E8 FF D8 FF D8 FF E8 FF E8 FF E8 FF E8 FF D8 FF E8 FF E8 FF E8 FF E8 FF E8 FF F8 00 08 00 08 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 FF F8 00 08 00 08 FF F8 00 08 00 08 00 08 00 08 00 18 00 08 00 18 00 18 00 18 00 18 00 18 00 18 00 08 00 18 00 18 00 18 00 18 00 08 00 08 00 08 00 08 00 18 00 18 FF F8 00 08

[...]

048647: Jan 19 12:10:58.517:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 00 00 08 00 08 FF E8 FF F8 FF E8 FF F8 FF F8 FF E8 FF E8 FF E8 FF E8 FF F8 FF E8 00 08 FF F8 FF F8 FF F8 FF F8 00 18 FF F8 00 08 00 08 FF F8 FF F8 FF E8 FF F8 FF E8 FF F8 FF E8 FF F8 FF F8 FF F8 FF F8 FF E8 FF D8 FF F8 FF E8 FF E8 FF E8 FF E8 FF F8 FF F8 FF D8 FF E8 FF E8 FF F8 FF E8 FF F8 FF F8 FF E8 FF F8 FF F8 FF E8 FF E8 FF F8 FF E8 FF E8 00 08 FF F8 FF F8 FF F8 FF F8 FF F8 FF E8 FF E8 FF F8 FF E8 00 08 FF F8 00 18 00 08 00 08 FF E8 00 08 00 08 00 08 00 08 00 18 00 08 00 08

The six-digit number before the date is a unique message identifier; in total we have exactly one thousand messages, numbered 047648 to 048647 inclusive.

File for S_in:

050234: Jan 19 12:12:58.636:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 01 FF E8 FD 90 FE F8 FF B8 02 B0 01 C8 FF 48 03 D0 FE 98 08 40 FD F0 FD 70 FF D8 FD D0 00 C8 FC D0 04 E0 FA E0 02 10 04 A0 F9 E0 02 90 FD F0 FE A8 00 B8 02 50 02 70 FE E8 02 90 FD 90 00 F8 04 60 FA 60 00 A8 FC D0 02 30 01 E8 FD D0 03 50 FA 20 FF D8 01 78 FF 28 03 90 01 08 FB E0 01 78 FF 98 FE C8 FF A8 FF 38 06 A0 FA A0 00 A8 04 A0 FD 10 00 E8 FE 08 00 B8 FE D8 FD B0 01 98 01 78 01 D8 00 38 FF C8 FF 08 00 28 01 58 FE B8 01 D8 FE E8 FE 08 01 C8 FF 68 FF A8 00 08 00 38 FF 28 FE 48

[...]

051233: Jan 19 12:13:08.624:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 01 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 08 00 18 00 08 00 08 00 08 00 18 00 18 00 18 00 18 00 18 00 18 00 08 00 08 FF F8 FF F8 FF F8 FF E8 FF E8 FF E8 FF D8 FF E8 FF D8 FF D8 FF D8 FF E8 FF E8 FF E8 FF F8 FF F8 00 08 00 18 00 08 00 08 00 18 00 18 00 18 00 18 00 18 00 08 00 08 00 08 00 08 00 08 FF F8 00 08 FF F8 00 08 00 08 00 08 FF F8 FF F8 FF F8 FF F8 00 08 00 08 00 08 00 08 FF F8 00 08 00 08 00 08 00 18 00 08 00 18 00 18 FF F8 FF F8 FF F8 FF F8 FF F8 FF E8 FF E8

Again, exactly one thousand messages.

File for S_out:

051234: Jan 19 12:14:00.396:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 02 00 06 FF F6 00 76 00 F6 00 36 FF C6 00 68 FF F8 FF 88 00 78 FF A8 FE B8 00 18 00 98 FF 98 FF C8 01 38 FF 98 FF 18 01 58 00 58 FE F6 FF A8 01 A8 00 E8 00 38 00 18 FF 08 FE 56 FF 06 00 A6 00 86 00 E6 FF 86 FF 58 FF D8 00 18 00 86 00 26 00 66 FF F6 FF E6 FF 56 00 56 00 76 FE F6 FF E6 00 16 FF C6 FF B6 FF C6 00 86 FF B6 FF 76 FF F6 00 A6 00 36 FF E6 00 88 00 08 00 38 FF E8 FF F6 00 18 00 3A 00 8A 00 2A FF BA FF 5A FF D8 00 18 00 58 00 5A FF 7A FF AA FF E8 00 18 00 18 FF C8 FF F8

[...]

052233: Jan 19 12:14:10.388:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 02 00 C0 01 8C FD 52 01 00 01 C2 FE E8 FE BE 01 3E 01 60 FF 4E FE 56 FE 44 02 0E 02 14 FF E8 FF 46 FF B4 FE F6 01 64 00 C2 FC D2 03 84 00 84 FC BC FD E8 01 5E 04 66 FD B6 FE 62 00 60 FF C6 01 6E 00 50 FE AA FE AC 01 3C 02 2C FC E8 00 16 01 3E 00 38 00 EC FF 26 FF 40 00 26 01 30 00 22 00 20 FF 74 FE 9C 01 A8 01 16 FF A8 FD EE FF 84 01 E0 FE 9C 00 64 FF A4 00 7A 01 96 FE BC FF 44 00 86 FF E0 FE F8 01 8A FF 22 FF 00 02 12 00 5A FE 70 00 18 00 B0 FE D6 01 B6 FE 1A FF E6 02 6E FE 22

One thousand messages exactly.

File for all streams:

052234: Jan 19 12:15:02.576:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 00 00 08 00 38 00 78 00 B8 00 18 00 18 FF F8 FF C8 FF F8 FF E8 FF B8 FF 58 00 08 FF 98 FF A8 00 58 00 48 00 08 FF 88 FF C8 00 28 00 48 00 68 00 98 00 68 00 A8 00 B8 00 18 FF 98 FF 68 FF C8 FF A8 FF 88 FF 78 FF 58 00 38 FF C8 FF 78 00 48 00 48 00 38 00 88 01 78 00 78 FF D8 00 38 FF A8 FF 78 FF E8 00 28 FF A8 FF 98 00 88 00 A8 00 18 FF 88 FF E8 00 78 FF E8 FF B8 00 58 00 58 FF 48 FF C8 00 48 00 08 FF C8 00 38 00 78 FF D8 00 28 00 68 00 18 FF B8 FF 78 FF C8 00 68 00 38 FF D8 FF A8

[...]

055211: Jan 19 12:15:12.516:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 01 FF A8 01 28 FE 28 00 18 FF 78 FE E8 01 A8 00 08 02 10 00 48 FE C8 00 28 00 98 00 38 FF 68 FF 18 FE C8 00 A8 FF 68 01 98 00 08 FF C8 01 18 00 B8 FF E8 FF 08 FF F8 FE 18 00 28 FF 68 FE F8 00 E8 00 78 00 F8 00 78 FE E8 00 78 FE D8 00 48 01 A8 00 E8 00 B8 FF F8 FF C8 FF 58 FF 08 00 C8 FF B8 FF 58 01 D8 FE 78 00 08 FF 38 FE 28 FF E8 00 C8 FF 88 01 48 00 D8 00 68 00 08 FF 68 FF D8 00 68 FF B8 FF 58 00 78 00 08 00 68 FF A8 01 18 FE 58 00 18 00 28 FF E8 00 F8 00 28 01 08 00 B8 FE E8
055212: Jan 19 12:15:12.520:  len=172, ch_id=1, pak_id=143, proc_id=0, <==  Payload:  00 07 00 02 FF AC 01
055234: Jan 19 12:15:32.851: %ISDN-6-DISCONNECT: Interface Serial0/3/0:0  disconnected from 2606699 , call lasted 372 seconds

Here the picture is somewhat different. One would expect three times as many messages for three streams as for one. However, the full log taken as an example contains only 2978 complete messages, plus one cut off at the very end. Some sources indicate that when diagnostic messages arrive in the buffer faster than the router can write them to the log, some messages may be lost in whole or in part.

Note that in every message the data begins with the same sequence of bytes (R_in: 00 07 00 00, S_in: 00 07 00 01, S_out: 00 07 00 02; the file for all streams contains messages with all three sequences).

Evidently, these four bytes are a service header that can be safely skipped during processing, since it only indicates which stream a message belongs to. This assumption agrees well with the fact that a ten-second interval produces a thousand log messages with 160 bytes of payload each, that is, 160,000 bytes of payload. At a sampling rate of 8000 Hz and 16 bits per sample, one audio channel (mono) yields exactly this amount of data in 10 seconds: 8000 × 2 × 10 = 160,000 bytes.
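The observation about the four-byte header can be turned into a small log parser. The following is a minimal sketch, not the author's tool: the function names parse_log_line and split_streams are hypothetical, and it assumes every complete message prints exactly 164 payload bytes (the 00 07 00 0X header plus 160 bytes of audio), as in the samples above. Truncated messages and unrelated log lines are skipped.

```python
from collections import defaultdict

STREAM_NAMES = {0: "R_in", 1: "S_in", 2: "S_out"}
HEADER_LEN = 4     # 00 07 00 0X
PCM_LEN = 160      # audio bytes per message, as observed in the log


def parse_log_line(line):
    """Return (stream_id, pcm_bytes), or None for anything that is not a
    complete PCM message (other log lines, truncated payloads)."""
    _, sep, hex_part = line.partition("Payload:")
    if not sep:
        return None
    try:
        data = bytes.fromhex(hex_part.strip())
    except ValueError:          # odd number of hex digits, stray text, ...
        return None
    if len(data) != HEADER_LEN + PCM_LEN or data[:2] != b"\x00\x07":
        return None
    stream_id = int.from_bytes(data[2:4], "big")
    if stream_id not in STREAM_NAMES:
        return None
    return stream_id, data[HEADER_LEN:]


def split_streams(lines):
    """Concatenate the audio of each stream found in an iterable of log lines."""
    streams = defaultdict(bytearray)
    for line in lines:
        parsed = parse_log_line(line)
        if parsed is not None:
            stream_id, pcm = parsed
            streams[stream_id] += pcm
    return {sid: bytes(buf) for sid, buf in streams.items()}


# Sanity check of the arithmetic in the text:
# 8000 samples/s * 2 bytes * 10 s equals 1000 messages * 160 bytes.
assert 8000 * 2 * 10 == 1000 * PCM_LEN
```

Called on the lines of a saved console log, split_streams returns one byte string per stream; for a complete ten-second single-stream capture it should be 160,000 bytes long.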

First rough processing


Using a text editor that supports regular expressions and the expression (......:).*(<== Payload: 00 07 00 01), I delete the line prefixes (050234: Jan 19 12:12:58.636: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload: 00 07 00 01) from the corresponding file. Exactly 1000 replacements occur. After converting the ASCII data to binary in a hex editor (pasting the clipboard into the file as ASCII hex), we get a file of exactly 160,000 bytes.

One detail should be noted. The data goes into the router’s log in the same order in which it is stored in its memory (big-endian order). That is, for example, the value 0x1234 is stored in two consecutive cells as 12 34. We copy the text to the clipboard in this form, and the bytes end up in the binary file in the same order.

Let me remind the reader that personal computers with x86 processors store values in memory from the least significant byte to the most significant (little-endian, literally "from the little end"), which is why this order is sometimes called the Intel order (after the company that created the x86 architecture).

At the same time, the order from the most significant byte to the least significant (big-endian, literally "from the big end") is standard for the TCP/IP protocols. It is used in packet headers and in many higher-level protocols designed to run over TCP/IP, so it is often called the network byte order. This byte order is used not only by the IBM 360/370/390, Motorola 68000 and SPARC processors (hence its third name, the Motorola byte order), but also by the processors of some Cisco routers.

Next, the resulting RAW file is imported into the audio editor. As a result, it is possible to get the correct sound file when specifying the following parameters: Signed 16 bit, Big-endian, 1 channel (mono), frequency 8000 Hz.
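Instead of importing RAW data by hand, the same parameters can be written into a WAV container. The sketch below uses only Python's standard wave module; the function name pcm_be_to_wav_bytes is hypothetical. It assumes the input is the 16-bit signed big-endian mono data described above. Since WAV files store samples in little-endian order, each pair of bytes is swapped.

```python
import io
import wave


def pcm_be_to_wav_bytes(pcm_be, sample_rate=8000):
    """Wrap signed 16-bit big-endian mono PCM into a WAV file image."""
    if len(pcm_be) % 2:
        raise ValueError("16-bit PCM must contain an even number of bytes")
    # Swap every byte pair: 12 34 (big-endian) becomes 34 12 (little-endian).
    pcm_le = bytearray(len(pcm_be))
    pcm_le[0::2] = pcm_be[1::2]
    pcm_le[1::2] = pcm_be[0::2]

    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(pcm_le))
    return buffer.getvalue()


if __name__ == "__main__":
    # Two sample values taken from the log above: 0x0008 and 0xFFF8 (-8).
    with open("demo.wav", "wb") as out:
        out.write(pcm_be_to_wav_bytes(b"\x00\x08\xff\xf8"))
```

The resulting file should open in any player or audio editor without the need to specify the import parameters manually.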

Everything is audible!

This is definitely a first small success. It should be built upon: let's try to understand the structure of the .DAT file, after which anyone can write a program in their favorite language that extracts the sound from it.

A closer look at the file showed that the sequence 00 07 00 0X also recurs in it, at intervals of 0xF4 bytes. A two-byte word with the value 00 F4 (big-endian, do not forget) is present in every case at offset -0x0E relative to the sequence 00 07 00 0X. We can assume that it belongs to the packet header and gives the size of the current packet (the offset of the next packet header relative to the current one).

Looking at the last voice packet in the file, you can see that 0xF4 bytes before the end of the file there is a sequence 00 00 00 00. Suppose that it marks the beginning of a packet. The word with the value 00 F4 is then located at offset 0x42 from the beginning of the voice packet. Furthermore, for each voice packet in the file, exactly 0xA0 (that is, 160) bytes lie between the sequence 00 07 00 0X and the next 00 00 00 00, which agrees well with what was established from the text log. We can assume that these 160 bytes are the payload, the voice data. In the illustrations, I highlight these bytes in magenta to indicate their importance.

Let's analyze the headers:



The assumed positions of the len, ch_id, pak_id and proc_id fields are highlighted in blue (recall the text log: 792308: Jan 22 16:30:42.458: len=172, ch_id=1, pak_id=143, proc_id=0, <== Payload:). The value highlighted in orange matches ch_id in the analyzed packets.

The value 172 = 0xAC, located at offset 0x48, corresponds to the size of the data from this offset to the end of the packet.

However, there is still an unidentified array of bytes with service information at the very beginning of the file. Patterns need to be found there too, otherwise the solution cannot be considered acceptable.



Two null-terminated strings with text information about the router are found (highlighted in yellow-green in the figure):

1) the text 28.3.14 (which matches DSPWARE VERSION in the output of the show voice dsp detail command) is located at offset 0x5C from the beginning of the file;

2) the text C2800NM-ADVENTERPRISEK9-M, 15.1(4)M10, IP|SLA|IPv6|IS-IS|FIREWALL|VOICE|PLUS|QoS|HA|NAT|MPLS|VPN|LE is located at offset 0xC0 from the beginning of the file.

You could also look in this file for values specific to a particular router (such as MAC addresses and serial numbers) or to a packet (CRC). For now we will not go deeper, since these data are of no particular interest to us.

At offset 0x42, the file contains the word 01 24, and the first voice packet begins at offset 0x0124. This might be a coincidence, but checking a number of files shows that the pattern holds for all files and for all packets in them. It also holds for packets of unknown purpose and content that are clearly not voice packets.

So, the structure of the voice packet:

Field        Offset  Size        Value
packet_sign  [0x00]  0x04 bytes  always 00 00 00 00
UNKNOWN      [0x04]  0x3E bytes
packet_size  [0x42]  0x02 bytes
UNKNOWN      [0x44]  0x04 bytes  always 00 00 00 00
len          [0x48]  0x02 bytes
ch_id        [0x4A]  0x02 bytes
pak_id       [0x4C]  0x02 bytes
proc_id      [0x4E]  0x02 bytes
UNKNOWN      [0x50]  0x02 bytes  always 00 07
stream_id    [0x52]  0x02 bytes  R_in   00 00
                                 S_in   00 01
                                 S_out  00 02
raw_data     [0x54]  0xA0 bytes

Armed with the above, anyone can write a program in their favorite language that produces audio files from a DAT file: either RAW for subsequent import into an audio editor, or WAV right away.
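As an illustration only, and not the program used for this article, here is a minimal Python sketch of such an extractor. It rests entirely on the assumptions made above: every packet (including the file header) carries its own size as a big-endian word at offset 0x42, voice packets are 0xF4 bytes long, and the fields sit at the offsets listed in the table. All names in the code are hypothetical, and the layout has only been observed for G.711 captures on one IOS and DSPware version.

```python
import struct
from collections import defaultdict

SIZE_OFF = 0x42      # packet_size: big-endian 16-bit word
FIELDS_OFF = 0x48    # len, ch_id, pak_id, proc_id, 00 07 marker, stream_id
DATA_OFF = 0x54      # raw_data
DATA_LEN = 0xA0      # 160 bytes of audio per voice packet
VOICE_SIZE = DATA_OFF + DATA_LEN   # 0xF4


def iter_voice_packets(blob):
    """Walk the packet chain and yield (ch_id, stream_id, pcm) per voice packet."""
    offset = 0
    while offset + SIZE_OFF + 2 <= len(blob):
        (size,) = struct.unpack_from(">H", blob, offset + SIZE_OFF)
        if size == 0 or offset + size > len(blob):
            break                      # corrupt or truncated tail: stop here
        if size == VOICE_SIZE:
            _len, ch_id, _pak_id, _proc_id, marker, stream_id = struct.unpack_from(
                ">6H", blob, offset + FIELDS_OFF)
            if marker == 0x0007 and stream_id in (0, 1, 2):
                start = offset + DATA_OFF
                yield ch_id, stream_id, blob[start:start + DATA_LEN]
        offset += size                 # header and non-voice packets are skipped


def extract_streams(blob):
    """Return {(ch_id, stream_id): big-endian 16-bit PCM bytes}."""
    streams = defaultdict(bytearray)
    for ch_id, stream_id, pcm in iter_voice_packets(blob):
        streams[(ch_id, stream_id)] += pcm
    return {key: bytes(buf) for key, buf in streams.items()}


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        for (ch_id, stream_id), pcm in extract_streams(f.read()).items():
            name = "ch%d_stream%d.raw" % (ch_id, stream_id)
            with open(name, "wb") as out:
                out.write(pcm)
```

Each output file can then be imported as signed 16-bit big-endian mono at 8000 Hz. Keying the result by ch_id as well as stream_id keeps simultaneous conversations apart.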

Summary


Of course, this is only a partial solution and does not claim to be complete. For example, with codecs other than G.711, identifying and unpacking the audio data will be much harder. Other side effects will surely appear. For these reasons, I am not publishing the code I used to automate the processing of the input data.

These assumptions made it easy to split by channel a PCM data file that was captured on a dial peer and contained several simultaneous conversations with different ch_id values. Thanks to this, it was possible to track down and identify the problem DTMF tone, which was the point of the whole exercise.

In addition, it turned out that a Chinese GSM gateway was connected via SIP, and PBX subscribers heard the tones only on calls made through this gateway. It was established that the gateway, which has RFC 2833 enabled, transmits strange packets (either generated by the gateway itself or arriving from the operator’s network, but certainly not from the remote subscriber). The router, which also has RFC 2833 enabled, interprets them as RTP NTE and sends a full tone into the stream. This is the tone that the PBX subscribers heard, and detecting it was the purpose of all the manipulations described in this article.

Since the gateway, being extremely cheap, does not allow debugging information to be collected, it was not possible to determine to what extent it was at fault. Just in case, the gateway firmware was updated to the current version. This did not help: the signal continued to appear. The gateway was then switched to SIP INFO mode (on Cisco ISR routers this mode is always on). No complaints so far.
