Sunday, January 13, 2019

week 5 post - Task for the week - explain my project to a newcomer to my community.

Hi,
My project is about video decoders and encoders in kernel space.
A codec is a compression format for video: just as zip is used to compress text/pdf/doc/whatever files, a codec is used to compress a video/audio file.
There are hardware chips dedicated to encoding and decoding video, and the code that interacts with such hardware should be a driver in kernel space.
There is an intent to specify a uniform API for how userspace applications should interact with such drivers. In order to test the userspace code, there is a driver called vicodec. This driver is implemented in software only, so it does not need special hardware.
Having such a proof-of-concept (POC) driver is a good way both to decide on the right API to publish and to let userspace test its code without needing specific hardware.
The vicodec driver should behave just like a "real" driver. For example, most (all?) hardware needs the dimensions of the video buffers to be a multiple of some power of 2, so vicodec should support that as well: whenever the client sets the video dimensions "width X height", vicodec rounds them to the closest multiple of (in vicodec's case) 8.
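To make the rounding concrete, here is a minimal sketch in Python. It is not the driver's code (the driver is written in C), and the names `round_up` and `coded_size` are made up for illustration. It assumes the rounding goes up to the next multiple of 8, so that the whole picture still fits in the buffer and can be cropped back later.

```python
from typing import Tuple

ALIGNMENT = 8  # the multiple vicodec uses, per the text above


def round_up(value: int, multiple: int = ALIGNMENT) -> int:
    """Round value up to the next multiple (multiple must be a power of 2)."""
    return (value + multiple - 1) & ~(multiple - 1)


def coded_size(width: int, height: int) -> Tuple[int, int]:
    """Hypothetical helper: padded buffer dimensions for width x height."""
    return round_up(width), round_up(height)


print(coded_size(1920, 1080))  # (1920, 1080): already multiples of 8
print(coded_size(638, 477))    # (640, 480): padded up
```

The bit trick works only because 8 is a power of 2, which is one reason hardware tends to ask for such alignments.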
The vicodec driver exports two pseudo files to userspace: /dev/video0 - the encoder, and /dev/video1 - the decoder.
There are two APIs: one for encoding (compressing a raw video) and one for decoding (decompressing a compressed video).
The APIs are described as a state machine. Basically, the application and the driver first agree on the video type and dimensions, then the application asks the driver to allocate buffers.
The userspace application and the driver then exchange the buffers. The idea is that they both share the buffers: the buffers are allocated by the driver, and the application maps them into its own address space with mmap. They can then exchange buffers by queueing and dequeueing them into/from a queue.

There are two queues of buffers:
The Output queue - The buffers in this queue are filled with data by the userspace application, which then queues them. The driver can dequeue buffers from this queue and process them. (The term "Output queue" was a bit confusing for me, since I expected "output" to mean buffers that the driver sends to userspace, i.e. the output of the driver, but actually it's the opposite.)

The Capture queue - This is the queue into which the driver queues the buffers it has generated, so that userspace can dequeue them and read them.

One thing to note is that both userspace and the driver queue and dequeue to/from both queues.
In the Capture queue, for example, after userspace has dequeued a buffer and read it, it should queue the buffer back so the driver can reuse it.
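The buffer round trip can be sketched as a toy simulation in Python. This is only a model of the idea, not the real interface: the class `BufferQueue`, the function `driver_process` and the "codec" (which just upper-cases bytes) are all made up for illustration, and no real device is opened.

```python
from collections import deque


class BufferQueue:
    """Toy model of one queue; buffers move between userspace and driver."""

    def __init__(self, name, count):
        self.name = name
        self.buffers = [bytearray() for _ in range(count)]
        self.queued = deque()  # buffer indices handed to the driver
        self.done = deque()    # buffer indices the driver has finished with

    def qbuf(self, index):
        """Userspace hands buffer `index` to the driver."""
        self.queued.append(index)

    def dqbuf(self):
        """Userspace takes back a buffer the driver is done with."""
        return self.done.popleft()


def driver_process(output, capture, transform):
    """Toy driver step: consume one output buffer, fill one capture buffer."""
    src = output.queued.popleft()
    dst = capture.queued.popleft()
    capture.buffers[dst][:] = transform(bytes(output.buffers[src]))
    output.done.append(src)   # userspace may refill this buffer
    capture.done.append(dst)  # userspace may read the result


output = BufferQueue("output", 2)
capture = BufferQueue("capture", 2)
for i in range(2):
    capture.qbuf(i)  # give the driver empty buffers to write into

free_output = deque(range(2))  # output buffers userspace currently owns
results = []
for frame in (b"aaa", b"bbb", b"ccc"):
    src = free_output.popleft()
    output.buffers[src][:] = frame      # userspace fills the buffer
    output.qbuf(src)                    # ... and queues it
    driver_process(output, capture, bytes.upper)
    free_output.append(output.dqbuf())  # take the consumed buffer back
    dst = capture.dqbuf()               # dequeue the result
    results.append(bytes(capture.buffers[dst]))
    capture.qbuf(dst)                   # requeue so the driver can reuse it

print(results)  # [b'AAA', b'BBB', b'CCC']
```

Note that three frames pass through only two buffers per queue, which works only because userspace keeps handing the buffers back.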

Another notion related to codec drivers is streaming. The exchange of buffers, where userspace feeds the driver with output buffers and receives capture buffers back, can happen only when the driver is "streaming". Userspace can command the driver to start and stop streaming on each queue separately, and buffers are exchanged only when both queues are "streaming". This is like pressing the "Play" button.
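A tiny Python sketch of that rule, again as a toy model and not the real interface. In V4L2 the start and stop commands are the VIDIOC_STREAMON and VIDIOC_STREAMOFF ioctls; the classes `ToyQueue` and `ToyCodec` and the method `can_process` are made up for illustration.

```python
class ToyQueue:
    """One queue with its own streaming flag."""

    def __init__(self):
        self.streaming = False

    def streamon(self):
        self.streaming = True

    def streamoff(self):
        self.streaming = False


class ToyCodec:
    """Toy codec: it processes buffers only while both queues stream."""

    def __init__(self):
        self.output = ToyQueue()
        self.capture = ToyQueue()

    def can_process(self):
        return self.output.streaming and self.capture.streaming


codec = ToyCodec()
codec.output.streamon()
print(codec.can_process())  # False: the capture queue is still stopped
codec.capture.streamon()
print(codec.can_process())  # True: both queues are streaming
codec.capture.streamoff()
print(codec.can_process())  # False again
```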

When I started the project, vicodec was already alive and kicking, but it lacked some features that were needed.

My first task was to add support for video dimensions that are not a multiple of 8 by rounding the dimensions. This included adding support in the API that allows userspace to crop the dimensions back to the original.

The second task is to add support for the decoder to read the video dimensions from the header of the compressed video.
In the compressed video, each compressed frame starts with a header. The header has some information such as the dimensions of the frame and the colorspace. So the requirement is that userspace doesn't have to know the video capture format/dimensions or negotiate them with the driver. The format and dimensions are decided later, when the driver starts to receive the compressed data with the header. Then there is a sequence called "source change event", where the driver informs userspace about the dimensions and userspace should restart the streaming.
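The sequence can be sketched as a toy model in Python. Everything here is an assumption made for illustration: the header layout (just width and height as two 32-bit integers) is hypothetical and is not the real vicodec header, and `ToyDecoder` with its methods is not the driver's interface. In V4L2 the real event is called V4L2_EVENT_SOURCE_CHANGE.

```python
import struct
from collections import deque

HEADER = struct.Struct(">II")  # hypothetical header: width, height


class ToyDecoder:
    """Toy decoder that learns the dimensions from the frame header."""

    def __init__(self):
        self.size = None           # capture dimensions, unknown at first
        self.needs_restart = True  # capture side not set up yet
        self.events = deque()
        self.pending = deque()     # compressed frames not decoded yet

    def queue_frame(self, frame):
        """Driver side: read the header of a newly queued frame."""
        size = HEADER.unpack_from(frame)
        if size != self.size:
            self.size = size
            self.needs_restart = True
            self.events.append("source_change")
        self.pending.append(frame)

    def restart_capture(self):
        """Userspace side: capture was stopped, reallocated and restarted."""
        self.needs_restart = False

    def decode(self):
        """Return the next 'decoded' payload, or None while waiting."""
        if self.needs_restart or not self.pending:
            return None
        return self.pending.popleft()[HEADER.size:]


decoder = ToyDecoder()
decoder.queue_frame(HEADER.pack(640, 480) + b"payload")
print(decoder.decode())  # None: the driver waits for userspace to react

if decoder.events.popleft() == "source_change":
    width, height = decoder.size  # userspace reads the new dimensions
    decoder.restart_capture()     # ... and restarts the capture side

print(width, height)     # 640 480
print(decoder.decode())  # b'payload'
```

The point of the model is that userspace never told the decoder the dimensions up front; it learned them from the event.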

This is quite complicated to implement, since in addition to handling the correct behavior that the API expects from userspace, the driver should be prepared for any incorrect behavior and avoid crashes, memory leaks, etc. There are all kinds of scenarios that need to be considered.

Links to the APIs:
decoder API

encoder API