NeoNN April 1, 2013 at 01:10

Video recording using DirectShow.NET

Tutorial

Good afternoon, dear Habr users. Some time ago I had to work on a simple Windows application that needed to capture audio and video from various devices. In particular, it had to capture audio from six channels of an MAudio card and HD video from two AverMedia capture cards, which received their signal from video cameras through the component input. It also had to take screenshots from a document camera connected via USB. We decided to write the application in C# and to capture the video using the DirectShow.NET library.

Solving this problem gave me the idea of writing an article to share my experience with video capture. Perhaps this information will be useful to someone. If you are interested, please read on.

Instead of a preface.

Although Media Foundation is increasingly used for such tasks, in my opinion the platform is not yet widespread, even though Microsoft will gradually phase out the use and support of DirectShow in new versions of Windows, starting with Windows 8. There are also computer vision libraries that support video recording, such as OpenCV and AForge, but for plain video capture their powerful processing functionality is not really needed, and internally such libraries often use DirectShow anyway.

There are quite a few articles and other materials on the Internet about what DirectShow is and how it works, and the topic has come up on Habr as well, so I will try not to show off terms that I do not fully understand myself. Instead, I will look at everything from a practical point of view: how someone who has never worked with DirectShow can write a video recording application in C#, where to start and where to go next. I will also describe the problems I ran into.

As the example for this article (see the code on GitHub) I will use a simple EasyCap USB capture card.

1. Where to start. Requirements, Tools, and Information

The tools you will need are:

1) K-Lite Codec Pack Mega and the GraphStudio tool - for rapid prototyping of a video capture graph.
2) GraphEditPlus - a commercial analog of GraphStudio that can generate C++ and C# code for the designed graph. A 30-day trial version is available; its limitation is that the generated code cannot be copied to the clipboard.
3) A C# development environment - in my case, Visual Studio 12.
4) The DirectShow.NET library.
5) The Windows MediaLib library.

Unfortunately, I could not find a complete, structured guide on the Internet to writing a video recording application, but a few pages provided truly invaluable help, first of all:

1) A small page whose information became the catalyst for the whole process. It also has clear descriptions of the DirectShow.NET classes and interfaces. A very useful page, and many thanks to its author.
2) Open source code like this, which helped me deal with crossbars and other issues.
3) MSDN, which has a whole section devoted to DirectShow programming.

2. Filters, video graph creation and visual editor

DirectShow graphs are built from filters that are connected to each other through their input and output pins. More details are available here.

For the simplest case, such as an integrated video camera, a graph can be built in GraphStudio as follows:

Our task, however, requires several more filters, and the graph (taking into account recording in WMV format) should look like this:

The filters in this graph:

SMI Grabber Device (filter group: WDM Streaming Capture Devices) is the capture device filter; it is the source of the video (and audio) streams. In this case, however, the recorded audio is not the stream from the capture device but the stream from the microphone (the "Microphone ..." filter from the Audio Capture Sources group).

SM BDA Crossbar Filter is the crossbar filter of the capture device. Its settings determine how the input signal is switched, that is, whether it comes from the S-Video input or from the composite input.

Smart Tee is a stream splitter with two outputs. The stream from the Capture output is written to a file, and the stream from the Preview output goes to the preview window through the AVI Decompressor filter. Note that the AVI Decompressor -> Video Renderer chain is created automatically when you select Render Pin on the Preview pin. (As an aside, there are different kinds of renderer filters, and one of the most advanced is the Enhanced Video Renderer, but this example uses the ordinary one.)

WM ASF Writer is the filter that provides the easiest way to record a video file of the required quality in WMV format. The recording quality can be changed, including with a custom profile.

By running this graph, you can verify that the video source is recorded correctly.

3. DirectShow.net library and translation of the graph into code

3.1. Code Generation in GraphEditPlus

The next step is to translate the resulting graph into code. The GraphEditPlus tool is a great help here. In general, this graph editor is more convenient than GraphStudio from the K-Lite suite, but its most important feature is the ability to generate code for the constructed graph:

Unfortunately, this tool cannot generate the configuration code for certain filters, such as the Crossbar or WM ASF Writer, but as a first step it is invaluable.

3.2. Video application

Again, the code of the simple application written specifically for this article can be viewed and downloaded here. I apologize in advance for its lack of optimization and its violations of SOLID; it is only a test case.

In this application, the main graph operations (creating, destroying, searching for pins, crossbar routing, starting, stopping and so on) are defined in the abstract class VideoCapturebase. Derived classes such as VideoCapturePreview, VideoCaptureAsfRecord and VideoCaptureScreenshots implement the abstract method BuildGraph(), which constructs the graph from filters by adding new filters to the chain. The ControlVideoCommon class contains the operations for creating a window and binding a graph to it, for stopping and destroying a graph, and several other utility operations.
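To give an idea of what a BuildGraph() implementation has to do, here is a minimal hand-written sketch of a preview-only graph. It is not the code from the article's repository: the class name PreviewGraphSketch and the method BuildAndRun are made up for illustration, and the sketch assumes that the DirectShowLib assembly is referenced and that at least one video capture device is present.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
static class PreviewGraphSketch
{
    // Builds a preview graph for the first video input device and starts it.
    public static IMediaControl BuildAndRun()
    {
        DsDevice[] devices = DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice);
        if (devices.Length == 0)
            throw new InvalidOperationException("No video capture devices found");

        // The filter graph holds the filters; the capture graph builder helps connect them.
        var graph = (IFilterGraph2)new FilterGraph();
        var builder = (ICaptureGraphBuilder2)new CaptureGraphBuilder2();
        int hr = builder.SetFiltergraph(graph);
        DsError.ThrowExceptionForHR(hr);

        // Create the capture filter from the device moniker and add it to the graph.
        IBaseFilter captureFilter;
        hr = graph.AddSourceFilterForMoniker(devices[0].Mon, null, devices[0].Name, out captureFilter);
        DsError.ThrowExceptionForHR(hr);

        // With no compressor and no renderer given, the builder inserts whatever
        // intermediate filters are needed and ends the chain with a default video renderer.
        hr = builder.RenderStream(PinCategory.Preview, MediaType.Video, captureFilter, null, null);
        DsError.ThrowExceptionForHR(hr);

        var control = (IMediaControl)graph;
        hr = control.Run();
        DsError.ThrowExceptionForHR(hr);
        return control;
    }
}
```

Calling Stop() on the returned IMediaControl stops the preview. A real application would also bind the video window to a control and release the graph when done.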

3.3. Not always obvious points

3.3.1. Adding Devices

If there are several devices of the same type (several identical capture cards, for example), they will have the same GUID but different DisplayName parameters. In this case, you need to find all the devices using the following code:

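//requires: using System.Collections.Generic; using DirectShowLib;
private readonly List<DsDevice> _captures = new List<DsDevice>();
//...
//search for devices whose name contains the given fragment (case-insensitive)
foreach (var device in DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice))
{
	if (device.Name.ToLower().Contains(deviceNamePart.ToLower()))
	{
		_captures.Add(device);
	}
}
//full device paths, which differ even for identical cards
//(assumes that at least two matching devices were found)
var devicePath1 = _captures[0].DevicePath;
var devicePath2 = _captures[1].DevicePath;
//...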

The devicePath1 and devicePath2 paths obtained this way are then used when creating the graphs.
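How a device path can be turned back into a capture filter is sketched below. The helper name AddCaptureFilterByPath is an assumption made for this illustration and does not come from the article's code; the sketch assumes a referenced DirectShowLib assembly and an already created IFilterGraph2.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
static class DeviceLookupSketch
{
    // Finds the video input device with the given device path
    // and adds its capture filter to the graph.
    public static IBaseFilter AddCaptureFilterByPath(IFilterGraph2 graph, string devicePath)
    {
        foreach (DsDevice device in DsDevice.GetDevicesOfCat(FilterCategory.VideoInputDevice))
        {
            if (device.DevicePath != devicePath)
                continue;

            IBaseFilter captureFilter;
            int hr = graph.AddSourceFilterForMoniker(device.Mon, null, device.Name, out captureFilter);
            DsError.ThrowExceptionForHR(hr);
            return captureFilter;
        }
        throw new ArgumentException("No video input device with this path: " + devicePath);
    }
}
```

Because identical cards share a name, matching on DevicePath rather than Name is what lets each graph be bound to one specific card.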

3.3.2. Crossbar routing

A video capture device may or may not have a crossbar for switching between different types of video inputs (for example, the AverMedia cards and the EasyCap from this example have one, while a built-in webcam or a BlackMagic capture card does not). Binding to the crossbar therefore has to happen automatically.

To do this, the base class defines the method FixCrossbarRouting(ref IBaseFilter captureFilter, PhysicalConnectorType? physicalConnectorType), which looks for the crossbar, connects it (if there is one) and switches it to the required input type:

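/// <summary>
/// Configure crossbar inputs and connect crossbar to input device
/// </summary>
/// <param name="captureFilter">Filter to find and connect crossbar for</param>
/// <param name="physicalConnectorType">crossbar connector type to be used</param>
/// <returns>HRESULT of the last operation performed</returns>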
protected int FixCrossbarRouting(ref IBaseFilter captureFilter, PhysicalConnectorType? physicalConnectorType)
{
    object obj = null;
    //fixing crossbar routing
    int hr = CaptureGraphBuilder.FindInterface(FindDirection.UpstreamOnly, null, captureFilter,
                                        typeof(DirectShowLib.IAMCrossbar).GUID, out obj);
    if (hr == 0 && obj != null)
    {
        //found something, check if it is a crossbar
        var crossbar = obj as IAMCrossbar;
        if (crossbar == null)
            throw new Exception("Crossbar object has not been created");
        int numOutPin;
        int numInPin;
        crossbar.get_PinCounts(out numOutPin, out numInPin);
        //for all output pins
        for (int iOut = 0; iOut < numOutPin; iOut++)
        {
            int pinIndexRelatedOut;
            PhysicalConnectorType physicalConnectorTypeOut;
            crossbar.get_CrossbarPinInfo(false, iOut, out pinIndexRelatedOut, out physicalConnectorTypeOut);
            //for all input pins
            for (int iIn = 0; iIn < numInPin; iIn++)
            {
                // check if we can make a connection between the input pin -> output pin
                hr = crossbar.CanRoute(iOut, iIn);
                if (hr == 0)
                {
                    //it is possible, get input pin info
                    int pinIndexRelatedIn;
                    PhysicalConnectorType physicalConnectorTypeIn;
                    crossbar.get_CrossbarPinInfo(true, iIn, out pinIndexRelatedIn, out physicalConnectorTypeIn);
                    //true if the current input pin has the requested connector type
                    bool canRoute = physicalConnectorTypeIn == physicalConnectorType;
                    //get video from composite channel (CVBS)
                    //should output pin be connected to current input pin
                    if (canRoute)
                    {
                        //connect input pin to output pin
                        hr = crossbar.Route(iOut, iIn);
                        if (hr != 0) throw new Exception("Output and input pins cannot be connected");
                    }
                } //if(hr==0)
            } //for(iIn...)
        } //for(iOut...)
    } //if(hr==0 && obj!=null)
    return hr;
}
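Before choosing a connector type, it can help to see which inputs the crossbar actually offers. The sketch below lists them using the same interface calls as the method above. The names CrossbarSketch and DumpCrossbarInputs are hypothetical, and the sketch assumes a referenced DirectShowLib assembly, an ICaptureGraphBuilder2 that is already attached to the graph, and a capture filter that has been added to it.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
static class CrossbarSketch
{
    // Prints the physical connector type of every crossbar input pin, if a crossbar exists.
    public static void DumpCrossbarInputs(ICaptureGraphBuilder2 builder, IBaseFilter captureFilter)
    {
        object obj;
        int hr = builder.FindInterface(FindDirection.UpstreamOnly, null, captureFilter,
                                       typeof(IAMCrossbar).GUID, out obj);
        var crossbar = obj as IAMCrossbar;
        if (hr != 0 || crossbar == null)
        {
            Console.WriteLine("This device has no crossbar");
            return;
        }

        int numOutPin;
        int numInPin;
        crossbar.get_PinCounts(out numOutPin, out numInPin);
        for (int iIn = 0; iIn < numInPin; iIn++)
        {
            int pinIndexRelated;
            PhysicalConnectorType connectorType;
            crossbar.get_CrossbarPinInfo(true, iIn, out pinIndexRelated, out connectorType);
            Console.WriteLine("Input pin {0}: {1}", iIn, connectorType);
        }
    }
}
```

With the list in hand, a call such as FixCrossbarRouting(ref captureFilter, PhysicalConnectorType.Video_Composite) selects the composite input, provided the device reports that connector type.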

3.3.3. Release of resources

If you do not release the resources of a graph when it is destroyed, creating another graph instance that uses the same filters as the first one will fail. You therefore have to call the DisposeFilters() method, which removes the filters from the graph being destroyed. After some experimenting, the following code worked fine:

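if (Graph == null) return;
IEnumFilters ef;
var f = new IBaseFilter[1];
int hr = Graph.EnumFilters(out ef);
if (hr == 0)
{
    //fetch one filter at a time until the graph is empty
    while (0 == ef.Next(1, f, IntPtr.Zero))
    {
        Graph.RemoveFilter(f[0]);
        //removing a filter invalidates the enumerator, so start over
        ef.Reset();
    }
}
Graph = null;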

3.3.4. Stream configuration (frame rate, resolution, etc.)

Video capture devices can offer several video stream configurations that you can switch between. For example, an HD camera can deliver both a 640 by 480 picture at 60 frames per second and an HD picture at 30 frames per second. Frame rates can even be fractional, such as 29.97 frames per second. To configure these parameters, obtain a streamConfigObject object using the FindInterface method of the ICaptureGraphBuilder2 interface, cast it to the IAMStreamConfig interface, call the GetFormat method to get an object of type AMMediaType, and read the header:

var infoHeader = (VideoInfoHeader)Marshal.PtrToStructure(mediaType.formatPtr, typeof(VideoInfoHeader));

and then work with its fields AvgTimePerFrame, BmiHeader.Width, BmiHeader.Height and others.
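AvgTimePerFrame is not a frame rate but a frame duration, expressed in units of 100 nanoseconds, so fractional rates such as 29.97 have to be converted. The helper below is a small sketch of that arithmetic; the class name FrameTiming and its methods are made up for this illustration and do not come from the article's code.

```csharp
using System;

// Hypothetical helper, not part of the article's repository.
public static class FrameTiming
{
    // Number of 100-nanosecond units in one second.
    public const long UnitsPerSecond = 10000000;

    // Frame duration for a rate given as a fraction,
    // for example 30000/1001 for 29.97 frames per second.
    // The result is rounded to the nearest unit.
    public static long FromFrameRate(long numerator, long denominator)
    {
        if (numerator <= 0 || denominator <= 0)
            throw new ArgumentException("Frame rate must be positive");
        return (UnitsPerSecond * denominator + numerator / 2) / numerator;
    }

    // Frames per second for a given frame duration.
    public static double ToFrameRate(long avgTimePerFrame)
    {
        if (avgTimePerFrame <= 0)
            throw new ArgumentException("Frame duration must be positive");
        return (double)UnitsPerSecond / avgTimePerFrame;
    }
}
```

For example, FrameTiming.FromFrameRate(30, 1) gives 333333 and FrameTiming.FromFrameRate(30000, 1001) gives 333667.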

In the code, this can be seen in the ConfigureResolution and ConfigureFramerate methods of the VideoCaptureAsfRecord class.
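The whole sequence can be sketched in one method as follows. This is not the article's ConfigureResolution or ConfigureFramerate code: the names StreamConfigSketch and SetVideoFormat are assumptions, the sketch assumes a referenced DirectShowLib assembly, and it will only succeed if the device actually supports the requested combination. It should be called before the capture pin is connected.

```csharp
using System;
using System.Runtime.InteropServices;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
static class StreamConfigSketch
{
    // Requests the given resolution and (integer) frame rate on the capture pin.
    public static void SetVideoFormat(ICaptureGraphBuilder2 builder, IBaseFilter captureFilter,
                                      int width, int height, int framesPerSecond)
    {
        object streamConfigObject;
        int hr = builder.FindInterface(PinCategory.Capture, MediaType.Video, captureFilter,
                                       typeof(IAMStreamConfig).GUID, out streamConfigObject);
        DsError.ThrowExceptionForHR(hr);

        var streamConfig = streamConfigObject as IAMStreamConfig;
        if (streamConfig == null)
            throw new InvalidOperationException("The capture pin does not expose IAMStreamConfig");

        AMMediaType mediaType;
        hr = streamConfig.GetFormat(out mediaType);
        DsError.ThrowExceptionForHR(hr);
        try
        {
            if (mediaType.formatType != FormatType.VideoInfo)
                throw new NotSupportedException("Only VideoInfoHeader formats are handled in this sketch");

            // Copy the native header into a managed object, edit it, and write it back.
            var infoHeader = (VideoInfoHeader)Marshal.PtrToStructure(mediaType.formatPtr, typeof(VideoInfoHeader));
            infoHeader.AvgTimePerFrame = 10000000 / framesPerSecond; // 100 ns units
            infoHeader.BmiHeader.Width = width;
            infoHeader.BmiHeader.Height = height;
            Marshal.StructureToPtr(infoHeader, mediaType.formatPtr, false);

            hr = streamConfig.SetFormat(mediaType);
            DsError.ThrowExceptionForHR(hr);
        }
        finally
        {
            // The media type owns unmanaged memory and must be freed explicitly.
            DsUtils.FreeAMMediaType(mediaType);
        }
    }
}
```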

3.3.5. Screenshots

To take screenshots from the video stream, the class in which the graph is built (VideoCaptureScreenshots) must implement the ISampleGrabberCB interface and define its two methods, BufferCB and SampleCB. SampleCB may be empty, while BufferCB copies the received buffer into an array:

if ((pBuffer != IntPtr.Zero) && (bufferLen > 1000) && (bufferLen <= _savedArray.Length))
{
    Marshal.Copy(pBuffer, _savedArray, 0, bufferLen);
}

and then invokes the handler:

_invokerControl.BeginInvoke(new CaptureDone(OnCaptureDone));

in which the SetCallback method of the SampleGrabber should be called:

_iSampleGrabber.SetCallback(null, 0);

In the BuildGraph method, when the SampleGrabber filter is included in the chain, it has to be configured, and the final configuration must be done after the other filters have been added (magic, but it does not work otherwise). In the test application, the ConfigureSampleGrabberInitial() and ConfigureSampleGrabberFinal() methods are responsible for this. The initial setup defines the AMMediaType, and the final setup sets up the VideoInfoHeader and calls two ISampleGrabber methods: SetBufferSamples(false) and SetOneShot(false).
The first disables buffering of the samples passing through the filter, and the second allows the screenshot callback to be invoked more than once.
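The pieces described in this section can be put together roughly as in the sketch below. It is a simplified illustration rather than the VideoCaptureScreenshots class itself: the class FrameGrabber and its members are hypothetical, a referenced DirectShowLib assembly is assumed, and choosing RGB24 as the requested format is my assumption, not something the article states.

```csharp
using System;
using System.Runtime.InteropServices;
using DirectShowLib;

// Hypothetical class, not part of the article's repository.
sealed class FrameGrabber : ISampleGrabberCB
{
    private readonly byte[] _frame;
    private readonly object _sync = new object();
    private int _frameLength;

    public FrameGrabber(int maxFrameBytes)
    {
        _frame = new byte[maxFrameBytes];
    }

    // Initial setup: tell the grabber which media type to accept
    // (call before the filter is connected).
    public static void RequestRgb24(ISampleGrabber grabber)
    {
        var media = new AMMediaType();
        media.majorType = MediaType.Video;
        media.subType = MediaSubType.RGB24; // assumption: uncompressed 24-bit frames
        media.formatType = FormatType.VideoInfo;
        int hr = grabber.SetMediaType(media);
        DsUtils.FreeAMMediaType(media);
        DsError.ThrowExceptionForHR(hr);
    }

    // Final setup: no internal buffering, keep running after each sample,
    // and deliver data through BufferCB (callback type 1).
    public void Attach(ISampleGrabber grabber)
    {
        DsError.ThrowExceptionForHR(grabber.SetBufferSamples(false));
        DsError.ThrowExceptionForHR(grabber.SetOneShot(false));
        DsError.ThrowExceptionForHR(grabber.SetCallback(this, 1));
    }

    // Not used in this sketch; the sample is released immediately.
    public int SampleCB(double sampleTime, IMediaSample sample)
    {
        Marshal.ReleaseComObject(sample);
        return 0;
    }

    // Called on a streaming thread for every frame; copies the frame into the managed array.
    public int BufferCB(double sampleTime, IntPtr buffer, int bufferLen)
    {
        if (buffer != IntPtr.Zero && bufferLen > 0 && bufferLen <= _frame.Length)
        {
            lock (_sync)
            {
                Marshal.Copy(buffer, _frame, 0, bufferLen);
                _frameLength = bufferLen;
            }
        }
        return 0;
    }

    // Returns a copy of the most recent frame (empty if none has arrived yet).
    public byte[] CopyLatestFrame()
    {
        lock (_sync)
        {
            var copy = new byte[_frameLength];
            Array.Copy(_frame, copy, _frameLength);
            return copy;
        }
    }
}
```

Unlike the approach described above, this sketch leaves the callback installed and simply keeps the latest frame. Calling SetCallback(null, 0) after a capture, as the article's handler does, stops further callbacks.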

3.3.6. WMV format, .prx files and WindowsMediaLib

To get acceptable recording quality, the recording settings of the WMV file have to be overridden.
The easiest way to do this is to create a custom profile file with the .prx extension and override in it the parameters responsible for the quality of the stream. An example of such a file in the code is good.prx.

To read profile files and create a profile from them, the ConfigProfileFromFile(IBaseFilter asfWriter, string filename) method uses the WMLib class from the Team MediaPortal project, distributed under the GPL license. Once created, the profile is applied to the ASF Writer through the ConfigureFilterUsingProfile(wmProfile) method of the IConfigAsfWriter interface.
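For context, the writer filter itself can be added to the graph roughly as in the sketch below. Loading the .prx profile is deliberately left out, because it depends on the WMLib helper mentioned above. The names AsfRecordingSketch and AddAsfWriter are hypothetical, and the sketch assumes a referenced DirectShowLib assembly plus video and audio capture filters that are already in the graph.

```csharp
using System;
using DirectShowLib;

// Hypothetical helper, not part of the article's repository.
static class AsfRecordingSketch
{
    // Adds a WM ASF Writer for the given file and connects video and audio to it.
    public static IBaseFilter AddAsfWriter(ICaptureGraphBuilder2 builder,
                                           IBaseFilter videoCaptureFilter,
                                           IBaseFilter audioCaptureFilter,
                                           string fileName)
    {
        IBaseFilter asfWriter;
        IFileSinkFilter sink;

        // Creates the WM ASF Writer, adds it to the graph and sets the output file name.
        int hr = builder.SetOutputFileName(MediaSubType.Asf, fileName, out asfWriter, out sink);
        DsError.ThrowExceptionForHR(hr);

        // A custom profile would be applied here through IConfigAsfWriter,
        // before any pins are connected (not shown in this sketch).

        // The writer has one input pin per stream in its profile, so both streams are connected.
        hr = builder.RenderStream(PinCategory.Capture, MediaType.Video, videoCaptureFilter, null, asfWriter);
        DsError.ThrowExceptionForHR(hr);
        hr = builder.RenderStream(PinCategory.Capture, MediaType.Audio, audioCaptureFilter, null, asfWriter);
        DsError.ThrowExceptionForHR(hr);

        return asfWriter;
    }
}
```

Without a custom profile the writer uses its default one, which is the quality problem that the .prx file is meant to solve.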

In place of an afterword, or the Big Problem I had to face

Mpeg4Layer3, Codecs, AVIMux and audio and video synchronization

At the very beginning of development, the idea was to record video in Mpeg4 format and sound in Layer3 format, combining them into a single file with AVI MUX, as in the following graph:

Here any Mpeg-4 video compressor filter could take the place of the XVid Codec filter. I tried xvid, ffdshow and some other filters, but after several attempts to make the graph record video it became clear that things were not as simple as they seemed at first glance. The recording would break off some time after it started. The reason, apparently, is that when video and audio are mixed in the AVI MUX container, the video and audio tracks are not synchronized automatically. Even with the correct frame rate set, the graph could stop at a random moment and interrupt the recording, and during playback it was noticeable that audio and video were out of sync.

Unfortunately, I cannot describe a solution to this problem, because I had to deal with it in a radical way: by switching to recording in WMV format with ASF Writer.

If this article is read by someone who has run into this problem and knows it well, I would be glad of any advice.

Thank you very much for your attention and interest. I hope this article was not deadly boring, and that this material will be of practical use to someone.
