Android NDK: working with OpenSL ES

Good day, Habr readers.

I previously wrote about OpenAL. Later, comrade zagayevskiy wrote a good article on OpenSL ES. In one of our games, so as not to rewrite all the sound code during the Android port, we did not move to OpenSL ES. That game used few sounds, so OpenAL caused no problems. But our latest game uses a lot of sounds (the specifics of the game oblige), and here we ran into big problems, playback delays being the least of them. So it was decided to rewrite everything with OpenSL ES. To do this, I wrote a couple of wrappers, which this article is about. I decided to share this on Habr; maybe it will come in handy for someone.

  1. A short description of OpenSL ES.
  2. Audio content.
  3. A bit about my wrappers.
  4. The principle of working with objects.
  5. Initializing the library (context).
  6. Working with sounds.
  7. PCM playback.
  8. Playing compressed formats.
  9. Conclusion.
  10. Additional information.

A short description of OpenSL ES

This API is available starting with Android API 9 (Android 2.3); some features require Android API 14 (Android 4.0) or higher. OpenSL ES provides a C interface, which can also be called from C++, exposing much the same capabilities as the audio parts of the Android Java API:

  • android.media.MediaPlayer
  • android.media.MediaRecorder

Note: although it is based on OpenSL ES, this API is not a complete implementation of any OpenSL ES 1.0.1 profile.

The library, as you might have guessed, is written in pure C, so there is no full-blown OOP there. Instead, special structures are used (let's call them pseudo-object-oriented structures (:), which are ordinary C structures containing pointers to functions that take a pointer to the structure itself as the first argument. Much like what C++ does under the hood, but explicit. There are two kinds of such structures in OpenSL ES:

  • An object (SLObjectItf) is an abstraction of a set of resources designed to perform a certain range of tasks and to store information about those resources. When an object is created, its type is determined, and the type defines the range of tasks that can be solved with it.
  • An interface (SLEngineItf, SLSeekItf, etc.) is an abstraction of a set of interrelated functionality provided by a particular object. An interface includes the methods used to perform actions on the object, and its type defines the exact list of methods it supports. An interface is identified by an ID, which is used in code to refer to the interface type (for example, SL_IID_VOLUME or SL_IID_SEEK). All the constants and interface names are fairly self-explanatory, so there should not be any particular problems.

To summarize: objects are used to allocate resources and obtain interfaces, and only then do we work with the object through those interfaces. One object can have several interfaces (for changing the volume, for seeking, etc.). Depending on the device (or the object type), some interfaces may be unavailable. I will say in advance that you can stream audio from the assets directory using SLDataLocator_AndroidFD, which gives a player that supports the interface for seeking within a track. Alternatively, you can load the entire file into a buffer (using SLDataLocator_AndroidSimpleBufferQueue) and play from there, but such an object does not support the SL_IID_SEEK interface, so seeking within the track will not work = /

Audio content

There are many ways to pack audio content into an application:
  • Resources. By placing audio files in the res/raw/ directory, you can easily access them with the Resources API. However, there is no direct native access to resources, so you have to copy them out from Java code.
  • Assets. By placing audio files in the assets/ directory, you can access them from C++ through the native asset manager. See the android/asset_manager.h and android/asset_manager_jni.h headers for more information.
  • Network. You can use the URI data locator to play audio directly from the network. Do not forget the necessary permissions for this (:
  • Local file system. The URI data locator supports the file: scheme for local files, provided the files are accessible to the application (that is, reading files from another application's internal storage will fail). Note that in Android, file access is restricted via the Linux user ID and group ID mechanisms.
  • Recording. Your application can record audio from the microphone, save the content, and play it back later.
  • Compiled and linked inline. You can embed audio content directly into the binary and then play it with a buffer queue data locator. This is well suited for short PCM clips. The PCM data can be converted to a hex string with the bin2c tool.
  • Real-time generation. An application can synthesize PCM data on the fly and play it back with a buffer queue data locator.

A bit about my wrappers

I'm generally a fan of OOP, so I try to group related C functions and wrap them in classes that are convenient to work with later. By analogy with what I did for OpenAL, the following classes appeared:

  1. OSLContext. Responsible for initializing the library and creating the players.
  2. OSLSound. The base class for working with sounds.
  3. OSLWav. A class for working with WAV. Inherits from OSLSound to keep a common interface. For ogg you could later create an OSLOgg class, as I did for OpenAL. I made this distinction because the loading process is fundamentally different for these formats: WAV is a raw format, it is enough to read the bytes, while ogg still has to be decoded with Ogg Vorbis, and I'd rather not even mention mp3 (:
  4. OSLMp3. A class for working with mp3. Inherits from OSLSound to keep a common interface. The class hardly implements anything at all, because I stream mp3. But if you want to decode mp3 with something like LAME, you can implement decoding in the load(char* filename) method and use OSLBufferPlayer.
  5. OSLPlayer. The main class for working with sound. The point is that the mechanism in OpenSL ES is not the same as in OpenAL: OpenAL has a dedicated structure for the buffer and for the sound source (to which the buffer is attached), while in OpenSL ES everything revolves around players, which come in different kinds.
  6. OSLBufferPlayer. Used when we want to load the whole file into memory. Typically for short sound effects (a shot, an explosion, etc.). As already mentioned, it does not support the SL_IID_SEEK interface, so seeking within the track will not work.
  7. OSLAssetPlayer. Allows streaming from the assets directory (that is, without loading the whole file into memory). Use it for long tracks (background music, for example).

The principle of working with objects

The whole lifecycle of an object looks something like this:
  1. Create the object, specifying the desired interfaces.
  2. Realize it by calling (*obj)->Realize(obj, async).
  3. Get the required interfaces by calling (*obj)->GetInterface(obj, ID, &itf).
  4. Work with the object through its interfaces.
  5. Destroy the object and free the used resources by calling (*obj)->Destroy(obj).

Initializing the library (context)

First we need to add the -lOpenSLES flag to the LOCAL_LDLIBS section of the Android.mk file in the jni directory (LOCAL_LDLIBS += -lOpenSLES) and include two headers.
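With a standard ndk-build setup, the additions look like this (the header names are the standard NDK ones; the Android.mk path is assumed):

```
# jni/Android.mk: link against the OpenSL ES library
LOCAL_LDLIBS += -lOpenSLES
```

```c
// In the C/C++ sources: the core API plus the Android-specific extensions
#include <SLES/OpenSLES.h>
#include <SLES/OpenSLES_Android.h>
```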

Now we need to create an object through which we will work with the library (something similar to the context in OpenAL) using the slCreateEngine function. The resulting object becomes the central entry point to the OpenSL ES API. After creation, initialize the object with the Realize method.
result = slCreateEngine(&engineObj, // pointer to the object
		0,                  // number of elements in the array of additional options
		NULL,               // array of additional options
		lEngineMixIIDCount, // number of interfaces
		lEngineMixIIDs,     // array of interface ids
		lEngineMixReqs);    // array of flags marking each interface required or optional
if (result != SL_RESULT_SUCCESS) {
	LOGE("Error after slCreateEngine");
}
result = (*engineObj)->Realize(engineObj, SL_BOOLEAN_FALSE);
if (result != SL_RESULT_SUCCESS) {
	LOGE("Error after Realize");
}

Now you need to get the SL_IID_ENGINE interface, through which you will be able to access the speakers, play sounds, and so on.
result = (*engineObj)->GetInterface(engineObj, SL_IID_ENGINE, &engine);
if (result != SL_RESULT_SUCCESS) {
	LOGE("Error after GetInterface");
}

It remains to create and initialize the OutputMix object for working with the speakers, using the CreateOutputMix method:
result = (*engine)->CreateOutputMix(engine, &outputMixObj, lOutputMixIIDCount, lOutputMixIIDs, lOutputMixReqs);
if (result != SL_RESULT_SUCCESS) {
	LOGE("Error after CreateOutputMix");
}
result = (*outputMixObj)->Realize(outputMixObj, SL_BOOLEAN_FALSE);
if (result != SL_RESULT_SUCCESS) {
	LOGE("Error after Realize");
}

In addition to initializing the main objects, the constructor of my OSLContext wrapper initializes all the necessary players. The maximum possible number of players is limited; I recommend creating no more than 20.
void OSLContext::initPlayers() {
	for (int i = 0; i < MAX_ASSET_PLAYERS_COUNT; ++i)
		assetPlayers[i] = new OSLAssetPlayer(this);
	for (int i = 0; i < MAX_BUF_PLAYERS_COUNT; ++i)
		bufPlayers[i] = new OSLBufferPlayer(this);
}

Working with sounds

In fact, sounds can be divided into two categories: raw (uncompressed) PCM data, which is what WAV contains, and compressed formats (mp3, ogg, etc.). Mp3 and ogg can be decoded to obtain the same uncompressed PCM audio data. For PCM we use OSLBufferPlayer; for compressed formats, OSLAssetPlayer, since decoding the files would be quite expensive. Take mp3: on old phones you cannot decode it in hardware, and with third-party software solutions decoding takes tens of seconds, which, you will agree, is unacceptable. Besides, the resulting PCM data would take up far too much memory.

When the play() method is called, a free player is requested from the context (OSLContext). If the sound must loop, we get an OSLAssetPlayer; otherwise, an OSLBufferPlayer.

PCM playback

I will not describe reading the WAV format itself again; see the OpenAL article for that. Here I will show how to create a BufferPlayer for the received PCM data.

Initializing BufferPlayer for PCM
SLDataLocator_AndroidSimpleBufferQueue locatorBufferQueue;
locatorBufferQueue.locatorType = SL_DATALOCATOR_ANDROIDSIMPLEBUFFERQUEUE;
locatorBufferQueue.numBuffers = 16;
// description of the audio format, more on this a bit below
SLDataFormat_PCM formatPCM;
formatPCM.formatType = SL_DATAFORMAT_PCM;
formatPCM.numChannels = 2;
formatPCM.samplesPerSec = SL_SAMPLINGRATE_44_1; // header.samplesPerSec * 1000;
formatPCM.bitsPerSample = SL_PCMSAMPLEFORMAT_FIXED_16; // header.bitsPerSample;
formatPCM.containerSize = SL_PCMSAMPLEFORMAT_FIXED_16; // header.fmtSize;
formatPCM.channelMask = SL_SPEAKER_FRONT_LEFT | SL_SPEAKER_FRONT_RIGHT;
formatPCM.endianness = SL_BYTEORDER_LITTLEENDIAN;
audioSrc.pLocator = &locatorBufferQueue;
audioSrc.pFormat = &formatPCM;
locatorOutMix.locatorType = SL_DATALOCATOR_OUTPUTMIX;
locatorOutMix.outputMix = context->getOutputMixObject();
audioSnk.pLocator = &locatorOutMix;
audioSnk.pFormat = NULL;
// create the player
const SLInterfaceID ids[2] = {SL_IID_ANDROIDSIMPLEBUFFERQUEUE, SL_IID_VOLUME};
const SLboolean req[2] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
result = (*context->getEngine())->CreateAudioPlayer(context->getEngine(),
		&playerObj, &audioSrc, &audioSnk, 2, ids, req);
if (result != SL_RESULT_SUCCESS) {
	LOGE("Can not CreateAudioPlayer %d", result);
	playerObj = NULL;
}
assert(SL_RESULT_SUCCESS == result);
result = (*playerObj)->Realize(playerObj, SL_BOOLEAN_FALSE);
assert(SL_RESULT_SUCCESS == result);
// get the playback interface
result = (*playerObj)->GetInterface(playerObj, SL_IID_PLAY, &player);
assert(SL_RESULT_SUCCESS == result);
// get the interface for working with volume
result = (*playerObj)->GetInterface(playerObj, SL_IID_VOLUME, &fdPlayerVolume);
assert(SL_RESULT_SUCCESS == result);
// get the buffer queue interface
result = (*playerObj)->GetInterface(playerObj, SL_IID_ANDROIDSIMPLEBUFFERQUEUE, &bufferQueue);
assert(SL_RESULT_SUCCESS == result);

In general, there is nothing complicated here, except for one HUGE problem. Pay attention to the SLDataFormat_PCM structure. Why did I fill in the parameters explicitly instead of reading them from the WAV file headers? Because all my WAV files are in a single format, i.e. the same number of channels, sample rate, bit depth, etc. The thing is, if you create a player specifying 2 channels in the parameters and then try to play a track with 1 channel, the application will crash. The only way out is to reinitialize the player whenever a file has a different format. But the whole charm is precisely that we initialize the player once and then just swap the buffer on it. So there are two options: either create several players with different parameters, or convert all your .wav files to the same format. Or, well, reinitialize the player every time -_-

In addition to the volume interface, two more interfaces are available:

  • SL_IID_MUTESOLO for channel control (multi-channel audio only, as determined by the numChannels field of the SLDataFormat_PCM structure).
  • SL_IID_EFFECTSEND for applying effects (according to the specification, only the reverb effect).

Adding a sound to the queue once a player has been chosen and the sound is set on it:
void OSLBufferPlayer::setSound(OSLSound *sound) {
	if (bufferQueue == NULL)
		LOGD("bufferQueue is null");
	this->sound = sound;
	(*bufferQueue)->Enqueue(bufferQueue, sound->getBuffer(), sound->getSize());
}

Playing compressed formats

Storing all sounds as WAV is not an option. Not just because the files themselves take up a lot of space (although there is that too): when you load them into memory, there simply will not be enough RAM (:

I create a class for each format so that, if necessary, decoding can be added later. For mp3 there is the OSLMp3 class, which, in fact, only stores the file name so it can be set on a player later. The same can be done for ogg and other supported formats.

Below is the complete initialization method; explanations are in the comments.

Initializing AssetPlayer for working with compressed formats
void OSLAssetPlayer::init(char *filename) {
	SLresult result;
	AAsset *asset = AAssetManager_open(mgr, filename, AASSET_MODE_UNKNOWN);
	if (NULL == asset) {
		return;
	}
	// open the file descriptor
	off_t start, length;
	int fd = AAsset_openFileDescriptor(asset, &start, &length);
	assert(0 <= fd);
	// configure the data source for the file
	SLDataLocator_AndroidFD loc_fd = {SL_DATALOCATOR_ANDROIDFD, fd, start, length};
	SLDataFormat_MIME format_mime = {SL_DATAFORMAT_MIME, NULL, SL_CONTAINERTYPE_UNSPECIFIED};
	SLDataSource audioSrc = {&loc_fd, &format_mime};
	SLDataLocator_OutputMix loc_outmix = {SL_DATALOCATOR_OUTPUTMIX, context->getOutputMixObject()};
	SLDataSink audioSnk = {&loc_outmix, NULL};
	// create the player
	const SLInterfaceID ids[3] = {SL_IID_SEEK, SL_IID_MUTESOLO, SL_IID_VOLUME};
	const SLboolean req[3] = {SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE, SL_BOOLEAN_TRUE};
	result = (*context->getEngine())->CreateAudioPlayer(context->getEngine(), &playerObj, &audioSrc, &audioSnk,
			3, ids, req);
	assert(SL_RESULT_SUCCESS == result);
	// realize the player
	result = (*playerObj)->Realize(playerObj, SL_BOOLEAN_FALSE);
	assert(SL_RESULT_SUCCESS == result);
	// get the playback interface
	result = (*playerObj)->GetInterface(playerObj, SL_IID_PLAY, &player);
	assert(SL_RESULT_SUCCESS == result);
	// get the interface for seeking within the file
	result = (*playerObj)->GetInterface(playerObj, SL_IID_SEEK, &fdPlayerSeek);
	assert(SL_RESULT_SUCCESS == result);
	// get the interface for channel control
	result = (*playerObj)->GetInterface(playerObj, SL_IID_MUTESOLO, &fdPlayerMuteSolo);
	assert(SL_RESULT_SUCCESS == result);
	// get the interface for volume control
	result = (*playerObj)->GetInterface(playerObj, SL_IID_VOLUME, &fdPlayerVolume);
	assert(SL_RESULT_SUCCESS == result);
	// set whether the file should loop
	result = (*fdPlayerSeek)->SetLoop(fdPlayerSeek, sound->isLooping() ? SL_BOOLEAN_TRUE : SL_BOOLEAN_FALSE, 0, SL_TIME_UNKNOWN);
	assert(SL_RESULT_SUCCESS == result);
}


Conclusion

OpenSL ES is easy enough to learn, and it offers plenty of capabilities (for example, audio recording). It is just a pity that there are cross-platform problems: OpenAL is cross-platform, but it does not behave well on Android, while OpenSL ES has a couple of downsides of its own, such as odd callback behavior and incomplete support for the specification. But on the whole, ease of implementation and stable operation outweigh these disadvantages.

Sources can be taken on

Additional information

Interesting reading on the topic:
  1. OpenSL ES: The Standard for Embedded Audio Acceleration, on the Khronos developer site.
  2. The Khronos Group Inc. OpenSL ES Specification.
  3. Android NDK: Application development for Android in C/C++.
  4. Ogg Vorbis.
