Raspberry Pi: Encode H.264 Live Video

    In one of Itseez's computer vision projects we use a Raspberry Pi to process the video stream from a webcam, and we recently ran into the problem of recording that video to a flash card. The difficulty was that the CPU was busy with other, more important tasks, yet the video still had to be saved. We had no preference for a particular codec or container, as long as recording did not hurt the fps (frames per second). Having tried a number of software codecs, from RAW to H.264 (through the OpenCV wrapper over FFmpeg), we concluded that nothing good would come of it: under high load fps sagged from 20 to 5 frames per second, and that with a black-and-white picture at 320x240. A little googling revealed that the Raspberry Pi processor has a hardware encoder that supports the H.264 standard (as far as I know, a license was purchased only for that codec), and that the encoder is exposed through the OpenMAX standard. So we decided to write some code against OpenMAX and see what happens. It turned out, by the way, quite well!

    Below is a sample video before applying hardware acceleration:


    OpenMAX (Open Media Acceleration) is a cross-platform API that provides a set of tools for hardware-accelerated video and audio processing and for working with various multimedia systems, designed to be used independently of the OS or hardware platform. I should note right away that the Raspberry Pi does not implement the "clean" OpenMAX IL (Integration Layer) API, but a version adapted for the Broadcom chip, so an attempt to reuse this code on another board may fail. We also decided to use the OpenMAX wrapper provided by the Raspberry Pi developers, ilclient. The Raspbian wheezy distribution already ships ready-made libraries and examples of using OpenMAX, located in the /opt/vc/ directory. The /opt/vc/src/hello_pi/libs/ilclient/ subdirectory contains the source code of the wrapper over OpenMAX: the files ilclient.c, ilclient.h and ilcore.c.
    Back to the task. We have a camera image, single-channel (that is, black and white), at 320x240 resolution; in our case it arrives as an IplImage structure from OpenCV, and we need to run it through the H.264 codec and save it in an AVI container. This breaks down into the following subtasks, each solved as described below:

    • Before encoding, the image must be converted to a suitable color model, for example YUV420p; we do this with the swscale module from the FFmpeg 0.7.1 set of libraries.
    • We encode the resulting buffer with OpenMAX, configured so that the input buffer receives the image in YUV420p and the output buffer receives the image compressed by the H.264 codec.
    • We save the compressed image in an AVI container using the same FFmpeg.

    Now, point by point:

    Conversion


    Everything is simple here: we create a conversion context and two AVPicture structures. The first is for a single-channel image, the second is for YUV420p:
    #define WIDTH  320
    #define HEIGHT 240
    // Source picture: single-channel grayscale frame from the camera.
    AVFrame *input_frame = avcodec_alloc_frame();
    r = avpicture_alloc((AVPicture *) input_frame,
                        PIX_FMT_GRAY8,
                        WIDTH,
                        HEIGHT);
    // Destination picture: YUV420p, as expected by the encoder.
    AVFrame *omx_input_frame = avcodec_alloc_frame();
    r = avpicture_alloc((AVPicture *) omx_input_frame,
                        PIX_FMT_YUV420P,
                        WIDTH,
                        HEIGHT);
    // Conversion context: GRAY8 -> YUV420p at the same resolution.
    struct SwsContext *img_convert_ctx = sws_getContext(WIDTH,
                                                        HEIGHT,
                                                        PIX_FMT_GRAY8,
                                                        WIDTH,
                                                        HEIGHT,
                                                        PIX_FMT_YUV420P,
                                                        SWS_BICUBIC, NULL, NULL, NULL);
    

    The conversion itself then looks like this:
    // Wrap the raw camera bytes, then convert straight into the codec's buffer.
    avpicture_fill((AVPicture *) input_frame,
                   (uint8_t *) frame->imageData,
                   PIX_FMT_GRAY8,
                   WIDTH,
                   HEIGHT);
    buf->nFilledLen = avpicture_fill((AVPicture *) omx_input_frame,
                                     buf->pBuffer,
                                     PIX_FMT_YUV420P,
                                     WIDTH,
                                     HEIGHT);
    sws_scale(img_convert_ctx,
              (const uint8_t* const*) input_frame->data,
              input_frame->linesize,
              0,
              HEIGHT,
              omx_input_frame->data,
              omx_input_frame->linesize);
    

    Here buf is the codec's input buffer, and frame is the IplImage* from the camera.

    Encoding


    This part is trickier; it is especially important to initialize the encoder correctly and in the right order:
    OMX_VIDEO_PARAM_PORTFORMATTYPE format;
    OMX_PARAM_PORTDEFINITIONTYPE def;
    COMPONENT_T *video_encode;
    ILCLIENT_T *client;
    OMX_BUFFERHEADERTYPE *buf; // input buffer
    OMX_BUFFERHEADERTYPE *out; // output buffer
    int r = 0;

    #define VIDEO_ENCODE_PORT_IN  200
    #define VIDEO_ENCODE_PORT_OUT 201
    #define BITRATE 400000
    #define FPS 25

    bcm_host_init();
    client = ilclient_init();
    OMX_Init();
    ilclient_create_component(client, &video_encode, "video_encode",
                              (ILCLIENT_CREATE_FLAGS_T)(ILCLIENT_DISABLE_ALL_PORTS |
                              ILCLIENT_ENABLE_INPUT_BUFFERS |
                              ILCLIENT_ENABLE_OUTPUT_BUFFERS));

    // Read the current input port definition, patch it, and write it back.
    memset(&def, 0, sizeof(OMX_PARAM_PORTDEFINITIONTYPE));
    def.nSize = sizeof(OMX_PARAM_PORTDEFINITIONTYPE);
    def.nVersion.nVersion = OMX_VERSION;
    def.nPortIndex = VIDEO_ENCODE_PORT_IN;
    OMX_GetParameter(ILC_GET_HANDLE(video_encode), OMX_IndexParamPortDefinition, &def);
    def.format.video.nFrameWidth = WIDTH;
    def.format.video.nFrameHeight = HEIGHT;
    def.format.video.xFramerate = FPS << 16;   // Q16 fixed-point frame rate
    def.format.video.nSliceHeight = def.format.video.nFrameHeight;
    def.format.video.nStride = def.format.video.nFrameWidth;
    def.format.video.eColorFormat = OMX_COLOR_FormatYUV420PackedPlanar;
    r = OMX_SetParameter(ILC_GET_HANDLE(video_encode),
                         OMX_IndexParamPortDefinition,
                         &def);
    

    Here the client is created and the parameters of the input buffer are set: image width and height, fps and color format. Port 200 is the input port the developers assigned to the video_encode component; 201 is its output port. Other operations (video decoding, audio encoding and decoding, and so on) use other port numbers.
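    The snippets here stash return codes in r without checking them. In a real program it is worth verifying every OpenMAX call; a minimal helper of our own (not part of ilclient; it assumes stdio.h and stdlib.h are included) might look like this:
    // Hypothetical convenience macro: abort on any OpenMAX error code.
    #define OMX_CHECK(expr)                                           \
        do {                                                          \
            OMX_ERRORTYPE omx_err_ = (expr);                          \
            if (omx_err_ != OMX_ErrorNone) {                          \
                fprintf(stderr, "%s failed: 0x%08x\n", #expr,         \
                        (unsigned) omx_err_);                         \
                exit(EXIT_FAILURE);                                   \
            }                                                         \
        } while (0)

    // Usage:
    // OMX_CHECK(OMX_SetParameter(ILC_GET_HANDLE(video_encode),
    //                            OMX_IndexParamPortDefinition, &def));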

    // Ask for H.264 (AVC) on the output port.
    memset(&format, 0, sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE));
    format.nSize = sizeof(OMX_VIDEO_PARAM_PORTFORMATTYPE);
    format.nVersion.nVersion = OMX_VERSION;
    format.nPortIndex = VIDEO_ENCODE_PORT_OUT;
    format.eCompressionFormat = OMX_VIDEO_CodingAVC;
    r = OMX_SetParameter(ILC_GET_HANDLE(video_encode),
                         OMX_IndexParamVideoPortFormat,
                         &format);

    // Variable bitrate, capped at BITRATE.
    OMX_VIDEO_PARAM_BITRATETYPE bitrateType;
    memset(&bitrateType, 0, sizeof(OMX_VIDEO_PARAM_BITRATETYPE));
    bitrateType.nSize = sizeof(OMX_VIDEO_PARAM_BITRATETYPE);
    bitrateType.nVersion.nVersion = OMX_VERSION;
    bitrateType.eControlRate = OMX_Video_ControlRateVariable;
    bitrateType.nTargetBitrate = BITRATE;
    bitrateType.nPortIndex = VIDEO_ENCODE_PORT_OUT;
    r = OMX_SetParameter(ILC_GET_HANDLE(video_encode),
                         OMX_IndexParamVideoBitrate, &bitrateType);
    ilclient_change_component_state(video_encode, OMX_StateIdle);
    

    Above, the output format and bitrate are set. The parameter format.eCompressionFormat = OMX_VIDEO_CodingAVC is what selects H.264 encoding. The bitrate was chosen by hand, following the guidance at www.ezs3.com/public/What_bitrate_should_I_use_when_encoding_my_video_How_do_I_optimize_my_video_for_the_web.cfm .
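    If memory serves, that page describes the "Kush gauge" heuristic: bitrate = width x height x fps x motion factor x 0.07, with a motion factor of 1, 2 or 4 for low, medium or high movement. A quick sketch of the arithmetic for our 320x240 @ 25 fps stream (our own illustration, not part of the original code):
    // "Kush gauge" bitrate estimate in bits per second.
    // motion_factor: 1 = low, 2 = medium, 4 = high amount of movement.
    static unsigned kush_gauge(unsigned width, unsigned height,
                               unsigned fps, unsigned motion_factor)
    {
        return (unsigned)(width * height * fps * motion_factor * 0.07);
    }

    // kush_gauge(320, 240, 25, 2) ~= 268800 bps and
    // kush_gauge(320, 240, 25, 4) ~= 537600 bps,
    // so BITRATE 400000 falls between the medium- and high-motion estimates.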
    ilclient_enable_port_buffers(video_encode, VIDEO_ENCODE_PORT_IN, NULL, NULL, NULL);
    ilclient_enable_port_buffers(video_encode, VIDEO_ENCODE_PORT_OUT, NULL, NULL, NULL);
    ilclient_change_component_state(video_encode, OMX_StateExecuting);
    

    Next, enable the port buffers and put the component into the executing state.
    The encoding itself then boils down to four calls:
    // Take an input buffer (blocking), hand the frame to the encoder,
    // then take an output buffer and ask the encoder to fill it.
    buf = ilclient_get_input_buffer(video_encode, VIDEO_ENCODE_PORT_IN, 1);
    OMX_EmptyThisBuffer(ILC_GET_HANDLE(video_encode), buf);
    out = ilclient_get_output_buffer(video_encode, VIDEO_ENCODE_PORT_OUT, 1);
    OMX_FillThisBuffer(ILC_GET_HANDLE(video_encode), out);
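
    For context, here is how one iteration of the whole per-frame pipeline could fit together. This is only a sketch: grab_frame_from_camera() and write_encoded_frame() are hypothetical helpers standing in for the camera capture and for the muxing code in the next section, and all error handling is omitted:
    // One iteration of a hypothetical capture loop (sketch only).
    IplImage *frame = grab_frame_from_camera();   // assumed helper

    // 1. Convert GRAY8 -> YUV420p directly into the encoder's input buffer.
    buf = ilclient_get_input_buffer(video_encode, VIDEO_ENCODE_PORT_IN, 1);
    avpicture_fill((AVPicture *) input_frame,
                   (uint8_t *) frame->imageData,
                   PIX_FMT_GRAY8, WIDTH, HEIGHT);
    buf->nFilledLen = avpicture_fill((AVPicture *) omx_input_frame,
                                     buf->pBuffer,
                                     PIX_FMT_YUV420P, WIDTH, HEIGHT);
    sws_scale(img_convert_ctx,
              (const uint8_t* const*) input_frame->data, input_frame->linesize,
              0, HEIGHT,
              omx_input_frame->data, omx_input_frame->linesize);

    // 2. Encode the frame in hardware.
    OMX_EmptyThisBuffer(ILC_GET_HANDLE(video_encode), buf);
    out = ilclient_get_output_buffer(video_encode, VIDEO_ENCODE_PORT_OUT, 1);
    OMX_FillThisBuffer(ILC_GET_HANDLE(video_encode), out);

    // 3. Mux the compressed data into the AVI (see the next section).
    if (out->nFilledLen > 0)
        write_encoded_frame(out);                 // assumed wrapper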
    

    Saving the video


    There is nothing complicated here for anyone who has used FFmpeg. Initializing the output format context:
    AVCodecContext *cc;
    char *out_file_name; // output file name with the .avi extension
    AVOutputFormat *fmt;
    AVFormatContext *oc;
    AVStream *video_st;

    av_register_all();
    // Guess the container (AVI) from the file extension.
    fmt = av_guess_format(NULL, out_file_name, NULL);
    oc = avformat_alloc_context();
    oc->debug = 1;
    oc->start_time_realtime = AV_NOPTS_VALUE;
    oc->start_time = AV_NOPTS_VALUE;
    oc->duration = 0;
    oc->bit_rate = 0;
    oc->oformat = fmt;
    snprintf(oc->filename, sizeof(oc->filename), "%s", out_file_name);

    // One H.264 video stream; the codec context only describes the
    // already-encoded data, no software encoder is opened.
    video_st = avformat_new_stream(oc, NULL);
    cc = video_st->codec;
    cc->width = WIDTH;
    cc->height = HEIGHT;
    cc->codec_id = CODEC_ID_H264;
    cc->codec_type = AVMEDIA_TYPE_VIDEO;
    cc->bit_rate = BITRATE;
    cc->profile = FF_PROFILE_H264_HIGH;
    cc->level = 41;
    cc->time_base.den = FPS;
    cc->time_base.num = 1;
    video_st->time_base.den = FPS;
    video_st->time_base.num = 1;
    video_st->r_frame_rate.num = FPS;
    video_st->r_frame_rate.den = 1;
    video_st->start_time = AV_NOPTS_VALUE;
    cc->sample_aspect_ratio.num = video_st->sample_aspect_ratio.num;
    cc->sample_aspect_ratio.den = video_st->sample_aspect_ratio.den;
    

    Next, open the file for writing and emit the header and format information. Note that the global-header flag must be set before avformat_write_header for the muxer to honor it:
    if (oc->oformat->flags & AVFMT_GLOBALHEADER)
        oc->streams[0]->codec->flags |= CODEC_FLAG_GLOBAL_HEADER;
    avio_open(&oc->pb, out_file_name, URL_WRONLY);
    avformat_write_header(oc, NULL);
    av_dump_format(oc, 0, out_file_name, 1);
    

    The process of saving the encoded image:
    AVPacket pkt;
    AVRational omxtimebase = { 1, FPS};
    OMX_TICKS tick = out->nTimeStamp;
    av_init_packet(&pkt);
    pkt.stream_index = video_st->index;
    pkt.data= out->pBuffer;
    pkt.size= out->nFilledLen;
    if (out->nFlags & OMX_BUFFERFLAG_SYNCFRAME)
        pkt.flags |= AV_PKT_FLAG_KEY;
    pkt.pts = av_rescale_q(((((uint64_t)tick.nHighPart)<<32) | tick.nLowPart), 
                           omxtimebase,
                           oc->streams[video_st->index]->time_base);
    pkt.dts = AV_NOPTS_VALUE;
    av_write_frame(oc, &pkt);
    out->nFilledLen = 0;
    

    The av_rescale_q function converts the codec timestamp into the corresponding frame timestamp in the container: av_rescale_q(a, b, c) computes a*b/c with rounding, so, for example, frame 50 in a 1/25 time base maps to 180000 in a 1/90000 one.
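    One detail the excerpts above leave out is shutdown. The AVI index is only written with the trailer, so the file will not play back properly unless recording is closed explicitly. Below is a minimal teardown sketch using the same names as above, assembled from the ilclient examples rather than taken from our original code:
    // Finalize the container: without the trailer the AVI index is missing.
    av_write_trailer(oc);
    avio_close(oc->pb);
    avformat_free_context(oc);

    // Free the conversion structures from the first section.
    sws_freeContext(img_convert_ctx);
    avpicture_free((AVPicture *) input_frame);
    avpicture_free((AVPicture *) omx_input_frame);
    av_free(input_frame);
    av_free(omx_input_frame);

    // Shut down the encoder and release the OpenMAX client.
    ilclient_disable_port_buffers(video_encode, VIDEO_ENCODE_PORT_IN, NULL, NULL, NULL);
    ilclient_disable_port_buffers(video_encode, VIDEO_ENCODE_PORT_OUT, NULL, NULL, NULL);
    ilclient_change_component_state(video_encode, OMX_StateIdle);
    ilclient_change_component_state(video_encode, OMX_StateLoaded);
    COMPONENT_T *list[] = { video_encode, NULL };
    ilclient_cleanup_components(list);
    OMX_Deinit();
    ilclient_destroy(client);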
    To build, you will need to include the following header files:
    #include"opencv2/core/core_c.h"#include"opencv2/imgproc/imgproc_c.h"#include"libavcodec/avcodec.h"#include"libavformat/avformat.h"#include"libswscale/swscale.h"#include"libavutil/opt.h"#include"libavutil/avutil.h"#include"libavutil/mathematics.h"#include"libavformat/avio.h"#include"bcm_host.h"#include"ilclient.h"

    Accordingly, you will also have to build or install FFmpeg and OpenCV, although nothing stops you from using other libraries to write the video to a file. The files bcm_host.h and ilclient.h can be found under /opt/vc/. The files ilclient.c and ilcore.c, which contain the OpenMAX client code, are compiled together with the project.
    The following libraries are required for linking:
    -L/opt/vc/lib -lbcm_host -lopenmaxil -lvcos -lvchiq_arm -lpthread
    

    Well, plus you will need to specify the FFmpeg and OpenCV libraries, for example, as shown below:
    -L/usr/local/lib -lavcodec -lavformat -lavutil -lswscale \
    -L/usr/local/lib -lopencv_imgproc -lopencv_core
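
    For completeness, a gcc invocation along these lines should work (main.c stands in for your own source file). The include paths and defines below mirror the hello_pi Makefile from /opt/vc/src/hello_pi; check yours, as they may differ. OMX_SKIP64BIT in particular is what gives OMX_TICKS the nLowPart/nHighPart layout used in the timestamp code above:
    gcc -o pi_encode main.c ilclient.c ilcore.c \
        -I/opt/vc/include \
        -I/opt/vc/include/interface/vcos/pthreads \
        -I/opt/vc/include/interface/vmcs_host/linux \
        -I/opt/vc/src/hello_pi/libs/ilclient \
        -DSTANDALONE -DTARGET_POSIX -D_LINUX -DOMX_SKIP64BIT \
        -DHAVE_LIBBCM_HOST -DUSE_EXTERNAL_LIBBCM_HOST -DUSE_VCHIQ_ARM \
        -L/opt/vc/lib -lbcm_host -lopenmaxil -lvcos -lvchiq_arm -lpthread \
        -L/usr/local/lib -lavcodec -lavformat -lavutil -lswscale \
        -lopencv_imgproc -lopencv_core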
    

    That, in fact, is all. I will only add that with the built-in hardware encoder our system's fps is practically the same whether video recording is enabled or not, whereas with software codecs fps dropped by 40-60%. See for yourself:
