Zverik October 9, 2014 at 19:22

Makeshift talk radio

At some point, tired of being too lazy to update the OpenStreetMap news blog, I decided to do a weekly broadcast about the project. Instead of spending three hours squeezing out a text, I would spend an hour discussing news and current issues with OSM contributors I know. Great idea, we cheered, and went our separate ways. Six months later, with the words “well, damn it, enough stalling, let's have a Skype call the day after tomorrow”, I started figuring out how to record sound from a microphone and Skype on this Linux of mine while streaming it to the Internet. This story is about setting up PulseAudio, about Skype and Mumble, and about the amazing JACK. It turned out that launching your own radio with on-air guests is easier than drawing a logo for it.

Labyrinths of PulseAudio

We decided quickly, so I put the broadcast together in a hurry. The hosts are joined in a Skype group call, so you need to record both the input and the output of the sound card (since Skype does not copy the microphone sound into the headphones). Around here I learned that PulseAudio is great: it has modules and a powerful sound-routing application, which for some reason is called the volume control.

What does the PulseAudio “construction kit” look like to the user? Each sound card has an input device (a source, usually a microphone) and an output device (a sink, usually speakers). Applications take sound from a source and output it to a sink. But how do you record sound from the speakers, that is, feed an application the sound from a sink? For this, every sink has a monitor, which is itself a source. The result is a directed graph. And how do you route sound from a source to a sink without an application in the middle? The module-loopback module is a virtual application that simply copies sound. It is often loaded to hear the microphone in the headphones (and woe to you if you run the loopback from the microphone to the speakers: I once managed to put that configuration into autostart and got a real fright). Plug the headphones into the audio output and type, without any sudo:

pactl load-module module-loopback
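Without arguments the loopback attaches to the default source and sink. As a minimal sketch, the same module can be loaded with an explicit source, sink and requested latency; the function names, the device names in the usage line and the 50 ms default are illustrative assumptions, and a lower requested latency is not guaranteed to be honoured.

```shell
# monitor_mic SOURCE SINK [LATENCY_MS]
# Copy SOURCE to SINK through module-loopback.
# pactl prints the index of the loaded module; keep it to unload the module later.
monitor_mic() {
    pactl load-module module-loopback source="$1" sink="$2" latency_msec="${3:-50}"
}

# stop_monitor INDEX
# Unload the loopback using the index printed by monitor_mic.
stop_monitor() {
    pactl unload-module "$1"
}
```

A call might look like `idx=$(monitor_mic my_mic_source my_headphones_sink)` followed later by `stop_monitor "$idx"`, with the real device names taken from `pactl list short sources` and `pactl list short sinks`.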

This module is enough for the simplest scheme that records your own voice and your co-hosts at the same time.

To assign inputs and outputs for Skype, gstreamer (see below) and the loopback, start the corresponding sound consumers (for example, call Skype's “Sound Test Service”) and find each application in the “volume control” (pavucontrol). In the “Playback” tab choose the sink (headphones in our case), in “Recording” the source (microphone). To see the loopback module in the window, select “All Streams” at the bottom.

In the picture I am choosing the source for Skype. The two buttons on the right mute the input and let you adjust the left and right channel volumes separately (by default they are linked). Be sure to check that the signal level in all tabs is 100%, and lower it only if necessary.

The hardest thing in this scheme was speaking. Yes, simply uttering coherent sentences, because the loopback arrives with a delay of 0.2 seconds: it is as if you were constantly interrupting yourself. The reflex is to shut up and let the person in the headphones talk. But he falls silent too! I reduced the effect somewhat by muting one channel in the headphones.

Icecast Repeater

For live broadcasting on the Internet you need a shoutcast-like server; the most popular open-source one is icecast2. It is packaged in all popular Linux distributions. After installation, just edit /etc/icecast2/icecast.xml: set your passwords in the “authentication” section and adjust “hostname” just below it. Also check that the ENABLE parameter in /etc/default/icecast2 is set to true. Then start the icecast2 daemon and enable it at boot. You may need to open TCP ports (8000 and 8001 by default).

You can see the server status and statistics by opening port 8000 in a browser. I was missing a breakdown of listeners by country and city, so I wrote a simple PHP script that uses the PECL GeoIP module.

Slow down, gstreamer is recording

To send sound to the icecast server, strange people on the Internet install murky programs like Darkice or LiquidSoap, but in fact plain gstreamer is enough. With it, recording sound from any source to an mp3 file (or ogg, to taste) is very easy:

SOURCE=alsa_output.pci-0000_00_1b.0.analog-stereo.monitor
DATE=$(date +%y%m%d-%H%M)
gst-launch-0.10 pulsesrc device=$SOURCE ! audioconvert ! audio/x-raw-int,channels=1 ! lamemp3enc bitrate=64 cbr=true ! filesink location=radio-$DATE.mp3

The sources available for the SOURCE variable are listed by this command:

pactl list short sources

The source can also simply be selected in the volume control, though. Other instructions on the net will tell you how to send the sound to a server:

gst-launch-0.10 pulsesrc device=$SOURCE ! audioconvert ! audio/x-raw-int,channels=1 ! lamemp3enc bitrate=64 cbr=true ! shout2send ip=$IP port=8000 password=$PASSWORD mount=radio 'streamname=Beta Radio'

Finally, gstreamer's tee element lets you split the stream and do both with a single command:

gst-launch-0.10 pulsesrc device=$SOURCE ! audioconvert ! audio/x-raw-int,channels=1 ! lamemp3enc bitrate=64 cbr=true ! tee name=t ! queue ! shout2send ip=$IP port=8000 password=$PASSWORD mount=radio 'streamname=Beta Radio' t. ! queue ! filesink location=radio-$DATE.mp3

Voilà! Note that when I forgot to convert the stream to mono, the computer could not keep up with the encoding and sent some kind of gibberish to the Internet. So be sure to check both the stream and the recorded file. You can run several gstreamer instances to record different devices to files: for example, the microphone and Skype separately, to clean them up individually and then assemble the podcast in an editor.
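The gst-launch-0.10 tool used above has since been superseded by GStreamer 1.0, where raw audio caps are written audio/x-raw instead of audio/x-raw-int. A hedged sketch of the same tee pipeline for gst-launch-1.0 follows. The function name and its argument order are illustrative, and target=bitrate is my addition: as far as I know, lamemp3enc only applies bitrate and cbr in that mode, so it is worth checking with gst-inspect-1.0 lamemp3enc on your system.

```shell
# Launcher binary; override GST_LAUNCH to try another version or to dry-run.
GST_LAUNCH=${GST_LAUNCH:-gst-launch-1.0}

# stream_and_record SOURCE IP PASSWORD FILE
# Encode SOURCE to mono 64 kbit/s MP3, then split the stream with tee:
# one branch goes to the icecast mount "radio", the other to FILE.
stream_and_record() {
    "$GST_LAUNCH" pulsesrc device="$1" \
        ! audioconvert \
        ! audio/x-raw,channels=1 \
        ! lamemp3enc target=bitrate bitrate=64 cbr=true \
        ! tee name=t \
        ! queue ! shout2send ip="$2" port=8000 password="$3" mount=radio \
        t. ! queue ! filesink location="$4"
}
```

Running it with GST_LAUNCH=echo prints the assembled pipeline instead of starting it, which is a cheap way to check the arguments before going on the air.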

This is not an archiver, this is Audacity

The show is over, and now the podcast needs publishing. You could publish right away, but that is not comme il faut: the track is noisy, the voices are quiet, and ten-second pauses are annoying. So there is no escaping the podcast grunt work: launch Audacity and open the recording. First remove the silence at the beginning and end, then the especially gaping pauses and the guests' occasional NDA violations. I noticed that the right background music survives such cutting well: the jumps in rhythm are almost unnoticeable. So the separate per-channel recordings have not come in handy yet.

Then remove the clicks (Effect → Click Removal), find a quiet spot without music or breathing (for this it helps to keep recording after the show ends, when the music is over) and take a noise profile from those 0.5-1 seconds (Effect → Noise Removal → Get Noise Profile). Then clear the selection and remove the noise with the same effect. I'm not a professional, so I did not change the default settings. I just check that the procedure has not removed useful frequencies, that it has not made the voice duller or louder, and am content with that.

Judging by the dozens of guides in Russian alone, the hardest part is audio compression (Effect → Compressor). It has nothing to do with archiving; it is plain compression of the waveform's amplitude. As I understand it, the algorithm takes the parts of the wave whose amplitude exceeds the “Threshold” and squeezes those peaks (not the entire wave) by the factor given in “Ratio”. I am impatient, so I set 8:1, but they say several passes at 3-4 are better.
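The threshold-and-ratio idea can be put into numbers. This is only a simplified static curve under my own assumptions (levels in dB, no attack or release smoothing, no make-up gain), not Audacity's actual algorithm; the function name is made up for the illustration.

```shell
# compress_db LEVEL THRESHOLD RATIO
# Static compression curve, all levels in dB.
compress_db() {
    awk -v x="$1" -v t="$2" -v r="$3" 'BEGIN {
        # Below the threshold the level passes through unchanged;
        # above it, the excess over the threshold is divided by the ratio.
        y = (x > t) ? t + (x - t) / r : x
        printf "%.1f\n", y
    }'
}
```

With a threshold of -12 dB and a ratio of 8:1, a peak at -4 dB sits 8 dB over the threshold and comes out 1 dB over it: `compress_db -4 -12 8` prints -11.0, while a quiet passage at -20 dB is left alone.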

When exporting to MP3 (File → Export → MP3 Files), choose the “Preset / Standard” mode (it is very good quality, believe me), and then be sure to fill in the tags. I set the genre to “Vocal”, although there are probably more suitable options.

VLC, the terror of DJs

This configuration got me through a clumsy episode zero, but it was not enough for the first one: at a “live maps evening” a week earlier I had recorded several interviews on a voice recorder and wanted to play them on the show. Many also recommend playing quiet music under the hosts' voices, which not only masks long pauses and makes the show more dynamic, but also shows that the connection has not dropped.

The music, of course, must be under an open license, so-called podsafe. Many sites recommend musicalley, but it is dreadful, so I went digging on promodj. The requirements are simple: no vocals, no drums, preferably in the style of old game consoles (I was impressed by the music in the otaku.ru video diaries). Of course, I found no good tracks matching these criteria, but I picked up a couple of acceptable ones. One plays in the background; the rest I turn on half an hour before the broadcast so that early listeners do not get bored.

Since this is a makeshift radio, the source of the music and the interviews is the VLC player. It lets you choose the sink to send sound to right from the menu. But here is the problem: the people on Skype must also hear the interviews in order to know what to discuss.

Back to the graph

The second useful PulseAudio module is module-null-sink, a virtual device with an input and an output. It can serve as an intermediate node: in my example the microphone and VLC are mixed into it. Skype gets the mixed result as its input, so the co-hosts hear the background music throughout the show, and occasionally the interviews. To recap:

Applications (green in the diagram) take sound from a source and/or output it to a sink;
You cannot pass sound directly from one application to another: you need an intermediate module-null-sink;
You cannot pass sound directly from a source to a sink: you need an intermediate module-loopback.
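The two rules combine into a small mixing bus. The sketch below assumes a sink called mix (an arbitrary name) and a function name of my own; it is not the preparation script mentioned in this article.

```shell
# make_mix_bus MIC_SOURCE
# Create a virtual sink named "mix" and copy the microphone into it.
# Each pactl call prints the index of the module it loaded.
make_mix_bus() {
    pactl load-module module-null-sink sink_name=mix
    pactl load-module module-loopback source="$1" sink=mix
}
```

After that, VLC can be pointed at the mix sink and Skype's recording stream at mix.monitor in the volume control, since PulseAudio names a sink's monitor source by appending .monitor to the sink name.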

The result was the scheme I used to broadcast and record several episodes.

When an interview starts, the sound from Skype can be redirected to the headphones, and the microphone can either be muted (the button in the volume control) or sent directly to Skype. Finally, by complicating the scheme slightly, you can stream the whole “sandwich” to the Internet while recording only the hosts' speech to a file, and lay the music over it after processing. Delighted with PulseAudio's routing capabilities, I wrote and debugged a script that prepares everything for the broadcast. True, I never used it in production: right after writing it, I found the strength to sit down and configure JACK.

A makeshift blog

The main result of the weekly broadcasts is not the fifteen satisfied listeners but a podcast that hundreds of people listen to. You need a blog to publish and discuss it. Out of habit I chose Birman's e2: simple and lightweight, and it even supports audio files out of the box.

Comments never appeared, and the blog did not solve the main problem, one I had not known about. Several listeners complained that their podcast apps do not accept the blog's RSS feed. It turns out that RSS items must carry links to the audio files in enclosure elements, plus a dozen parameters in special tags. Aegea (e2) cannot do this, and I did not want to install anything heavy, so I made a copy of the podcast on rpod.ru. About 10% of the audience listen to us through that site, but it is easy to subscribe there. And the player has a volume control.
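For reference, this is roughly what podcast apps look for. The sketch prints a single RSS 2.0 item whose enclosure carries the three required attributes (url, length in bytes, MIME type); the function name and the example values are hypothetical, the extra podcast-specific tags are left out, and no XML escaping is done.

```shell
# rss_item TITLE URL SIZE_BYTES
# Print one RSS 2.0 <item> with an MP3 enclosure.
# Arguments are inserted as-is: they must not contain &, < or quotes.
rss_item() {
    printf '<item>\n'
    printf '  <title>%s</title>\n' "$1"
    printf '  <enclosure url="%s" length="%s" type="audio/mpeg"/>\n' "$2" "$3"
    printf '</item>\n'
}
```

For example, `rss_item "Episode 5" "http://example.org/radio-05.mp3" 41943040` emits an item for a 40-megabyte file.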

The right thing would have been not to show off and to follow the German OSM podcast, which installed the Podlove plugin on an ordinary WordPress. The result is somewhat monstrous technically, but looks nice and allows, for example, quick navigation within an episode by chapter marks. This week I found another option: the We Build podcast blog runs on GitHub Pages, i.e. on static Jekyll templates.

Publishing only on rpod is risky: for example, I could not upload the fifth episode because of an idiotic error about exceeding a size limit, which is explained nowhere (the files are only 40 megabytes); all I found were a couple of frustrated remarks from podcasters who had run into it.

JACK TIME

The audio experts in our chat urged me to switch to the advanced audio subsystem JACK, and of course I could not resist, driven by curiosity, the wish to get rid of the 200 ms loopback delay, and hope in the charms of IDJC (see below). It turned out that 1) to work with JACK you no longer need to remove or disable PulseAudio, and JACK can be turned on and off at will, for example only for recording a show; and 2) for some reason all the necessary packages are already installed in Fedora out of the box. All that remains is to install qjackctl and run it.

Of course, this panel immediately complained about a misconfigured system: let's enable realtime scheduling in your Linux, it says. Again, it turned out that you do not need to change the kernel for this, just edit /etc/security/limits.conf. And again, someone at Fedora had already done it for me: judging by /etc/security/limits.d/95-jack.conf, it is enough to add yourself to the jackuser group. And for that to take effect you need to log out and back in. I had never logged out of KDE, so I had to google it.

Start QjackCtl again. In the settings I selected the right microphone and output and, most importantly, on the “Misc” tab turned on “Enable D-Bus interface” and the checkbox next to it, about stopping JACK on exit. That seems to be all the audio subsystem setup: you can click Start and launch the coveted Internet DJ Console.
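Whether the group membership is already in place can be checked from a terminal. A small sketch, with a helper name of my own:

```shell
# in_group GROUP [USER]
# Succeed if USER (default: the current user) is a member of GROUP.
in_group() {
    id -nG "${2:-$(id -un)}" | tr ' ' '\n' | grep -qx "$1"
}
```

It could be used as `in_group jackuser || sudo usermod -a -G jackuser "$USER"`. After logging out and back in, `ulimit -r` in bash should report a non-zero realtime priority limit if the limits file was applied.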

And finally, IDJC

Many people recommend this console for running a simple radio station. It is built for broadcasting music and mixes, but it also knows how to handle the DJ's voice properly. As befits an advanced console, it intimidates with its number of buttons and sliders.

In fact, of course, everything is simple. There are two music panels, with a crossfader below for switching between them (the “Pass” button switches smoothly). There is a DJ channel and a broadcast channel (“Stream”), and they are separate: for example, you can turn the music off for the DJ or, conversely, preview the second playlist while the first one is on the air. On the right is a bunch of indicators; the most interesting is the number under the headphones icon, the listener count. The buttons below switch on the host's voice and streams from other channels (the picture shows the final version, after Mumble, which is covered further on). Some descriptions and a link to the tutorial are on the project website.

All settings are hidden under the button in the lower left. In “Preferences” I turned on separate volume controls for the playlists: the first holds a quiet background melody on repeat, the second the pre-show songs and the interviews. The “Channels” tab can be confusing: it configures both the buttons at the bottom of the player that open the channels and the channels themselves. There you can see that the microphone sound is run through a bunch of filters, including a high-pass filter, a limiter, a noise gate, etc. If you spend time on setup (not only for yourself but also for your co-hosts), post-processing may become unnecessary. I did not. To begin with, the first channel, the one with the microphone, is enough.

Broadcasting is configured in “Output”. There can be up to six streams (in different qualities and formats). At the bottom there is a button for recording to a file. For some reason it works only once per session for me; I have to restart the program. What is not obvious in this window: to start the broadcast, you open “Individual Controls” and click the button with the server address. Perhaps it is easier to tick the checkboxes above and click “Connect”; I don't know.

This tangle of noodles is the QjackCtl window behind the “Connect” button. The system section is the microphone and speakers; only Skype plays through PulseAudio, so I did not make a separate sink for it. JACK sets up all this routing by default; I only added connections from PulseAudio to the voip_in ports of the IDJC section. This makes the phone buttons work: the red one is useless, but the green one mixes the sound from Skype into the broadcast. When an interview needs to start, I drag additional connections from str_out to PulseAudio (plain drag and drop). To remove a connection, select the source on the left and the receiver on the right, right-click the latter and choose “Disconnect”.

True, the new configuration did not solve the loopback delay problem: still the same 0.2 seconds. But it is easier to bring up, and feels more grown-up somehow.
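The same connections can be made from a terminal with the jack_connect tool that ships with JACK, which is one way to script the patching. A sketch with a helper name of my own:

```shell
# patch_stereo SRC_L SRC_R DST_L DST_R
# Connect a stereo pair of JACK output ports to a stereo pair of input ports.
patch_stereo() {
    jack_connect "$1" "$3"
    jack_connect "$2" "$4"
}
```

A call for the Skype link might look like `patch_stereo "PulseAudio JACK Sink:front-left" "PulseAudio JACK Sink:front-right" "idjc_default:voip_in_l" "idjc_default:voip_in_r"`. These port names are assumptions that depend on the setup: `jack_lsp` lists the real ones, and `jack_disconnect` takes the same two arguments to remove a link.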

Do you have a Go Mic for sale?

In less than a month the lapel microphone had worn me out completely with its noise and low volume, and I decided to replace it. All the podcasting guides for beginners recommended a USB microphone, and between a two-hundred-dollar Rode and the two-thousand-ruble Samson Go Mic I chose the latter. Finding it in stores is not hard; keeping a straight face while dictating the name to the sales clerk is harder. It is about the size of a good lighter.

And, of course, compared to my previous microphones it is night and day. The sensitivity is such that you can hear a cat purring in its sleep in the next room and a toilet flushing three floors below. But you do not need to shout or hold the microphone close to your mouth, and thanks to the USB connector the cheap external sound card has gone onto the shelf (my laptop has no microphone input). A pop filter does not seem critical for this microphone, but I do need to find a stand.

Unexpectedly, the Go Mic solved the problem of the loopback delay in the headphones: it has its own sound card inside, which mixes the signal from the computer with the voice from the microphone. Since the signal does not make a round trip through the computer, the delay is zero. At last my own echo does not bother me. Turning the loopback off in IDJC was not easy: I found the “In The DJ's Mix” checkboxes, but if you press the VoIP button the echo returns, and it is unclear what to do about it. I decided to abandon the phone buttons, hooking Skype up to the fourth channel and making a button for it next to “DJ”, the same as for Mumble.

Mumble instead of Skype

The problem with Skype voice conferences is unpredictability. The host doing the broadcast will almost certainly not be the conference administrator, and getting the co-hosts' voices through will be slow and painful. Someone accidentally disconnects and takes everyone else down with them. Finally, Skype for Linux sometimes simply crashes for no reason at all. There are several alternatives; I chose the most popular open-source one: Mumble. What attracted me was the availability of clients not only for all desktop OSes but also for iOS and Android.

The server, Murmur, is set up in no time: put a server password in /etc/murmur/murmur.ini (the default settings are adequate), start the daemon and enable it at boot, and open port 64738 for TCP and UDP.

It is harder with the client. After installation you need to go through the setup wizard, of course. This is mandatory. Do not forget to plug in headphones, because otherwise there will be an echo. After a couple of restarts I somehow managed to connect to the server, but connecting Mumble to JACK did not work. It turned out that this subsystem is simply not supported. The corresponding patch was submitted almost four years ago but never accepted. So you need to download the sources, apply the patch yourself and build the application.
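A quick sanity check before starting the daemon is to confirm that the password line is actually set. The sketch assumes the key is spelled serverpassword, as in the stock murmur.ini; the helper name is made up.

```shell
# has_server_password FILE
# Succeed if FILE contains an uncommented, non-empty serverpassword= line.
has_server_password() {
    grep -Eq '^serverpassword=.+' "$1"
}
```

For instance, `has_server_password /etc/murmur/murmur.ini || echo "set serverpassword first"`.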

It sounds complicated, but from my Arch Linux past I remember the AUR. It is an amazing repository, the best I have seen in Linux, and the only thing I missed when I left Arch. AUR entries are instructions for preparing an installable package from ~~dirt and~~ sources ~~sticks~~ and patches, while for the user it is no harder than the usual pacman -S. For Mumble there is a build from a snapshot, a build from git, a build from git with the JACK patch, and a build from the stable release sources with this patch. Open the PKGBUILD of the last one, download the patch and follow the instructions in prepare() and build(). Build instructions (without the patch) are also in the Mumble wiki. The main thing is not to forget the jack-...-devel package. The built application did not install into /usr, so I run it from the build directory. In the settings you will need to enable JACK and turn off all the frills, such as echo cancellation and positional audio.

I cannot get rid of Skype entirely: I want to keep it for taking calls from listeners and for guests who have no time to set up Mumble. The signal routing graph has turned into an incomprehensible tangle of noodles.

Let me try to explain. The sound from the microphone goes to IDJC channels 1 and 2 (as usual), to Mumble and to Skype. Only dj_out from IDJC goes to the headphones. Mumble's output is wired to channel 3 and Skype's to channel 4 (stereo to mono), with no voip wizardry. Finally, Skype and Mumble are connected in both directions so that the guest and the hosts hear each other.

What about the interview recordings, which should be heard in both Skype and Mumble? Alas, you have to drag a few more connections from str_out by hand, back and forth. And turn off the broadcast of all four channels in IDJC, of course. Despite the complexity of the resulting system, I do not want to roll back to PulseAudio: editing connections in QjackCtl is faster and clearer than in the “volume control”, and working in IDJC alone is easier than juggling VLC, gstreamer and the “volume control”.

Headless radio

Even setting the technical side aside (I'm sure podcasting professionals will criticize me), the system has room to grow. Right now I dislike that the broadcast depends on my laptop. What if I am on the road, or forgot the laptop? For that I plan to attach a bot to the Murmur server which, at the appointed hour, will start recording all the speakers to a file (as the Mumble desktop client can) and, at the press of a password-protected button somewhere on the web, will start and later stop broadcasting to the icecast server. The main thing is to keep the weekly schedule on the air; the podcast can be assembled later. In this configuration I will be able to host the show even from a smartphone.

I need a stand and a pop filter for the microphone, yes. I am also too lazy to keep plugging connections in JACK by hand, so a script is needed. The microphone filters need setting up so that I do not have to clean the noise out of the recording in Audacity every time. Finally, the RSS feed on the podcast blog needs fixing. These are all trifles; the main thing is that we went and did it, and now our community has a predictable, regular broadcast in which you can take part both through the IRC channel and by voice. After all, hearing the participants' voices is not at all the same as reading them on the forum: the community seems to have gained a new dimension, and people have become visible behind the words.
