Embedded RTP Network Media Streamer with Lip-Sync
A standards-compliant RTP media server was needed to stream H.264 video and G.711 audio from a surveillance camera based on a TI Davinci DSP and running Linux. The existing infrastructure consisted of an encoder that compressed video using Baseline Profile H.264. Ginngi was asked to implement an embedded server capable of streaming audio and video over RTP with full lip synchronization.
After a thorough search, we found no existing streaming implementation for the Davinci. Since RTP is a mature standard, we opted to use the open-source LIVE555 Streaming Media package from Live Networks as the base. First, the software was ported to the ARM processor of the Davinci using an ARM cross-compilation toolchain hosted on Fedora 8. We then developed C++ classes that implemented H.264 framing within LIVE555 and interfaced to a custom Linux device driver. The device driver was written in C and interfaced with an FPGA H.264 encoder to bring the video into the system. Smart buffer management was implemented to ensure real-time performance from the moment the video entered the system on the interrupt to the point where the LIVE555 framer class accessed it to build the frame for streaming.
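To illustrate what the H.264 framing step involves, here is a minimal sketch that splits an encoder buffer into NAL units. It assumes the driver delivers an Annex B byte stream (NAL units separated by 00 00 01 or 00 00 00 01 start codes), which the description above does not state. The names NalUnit, splitAnnexB and nalType are hypothetical and are not part of LIVE555 or of the original project code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One NAL unit located inside a larger buffer (a view, no copy is made).
struct NalUnit {
    const std::uint8_t* data;  // first byte after the start code (the NAL header)
    std::size_t size;          // length in bytes, start code excluded
};

// Index of the next 00 00 01 start code at or after `from`, or `len` if none.
static std::size_t findStartCode(const std::uint8_t* buf, std::size_t len,
                                 std::size_t from) {
    for (std::size_t i = from; i + 3 <= len; ++i) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) return i;
    }
    return len;
}

// Splits an Annex B byte stream into NAL units.
std::vector<NalUnit> splitAnnexB(const std::uint8_t* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    std::vector<NalUnit> units;
    std::size_t sc = findStartCode(buf, len, 0);
    while (sc < len) {
        const std::size_t begin = sc + 3;  // skip 00 00 01
        const std::size_t next = findStartCode(buf, len, begin);
        std::size_t end = next;
        // A NAL unit never ends in a zero byte, so trailing zeros are either
        // padding or the leading zero of a 4-byte start code that follows.
        while (end > begin && buf[end - 1] == 0) --end;
        if (end > begin) units.push_back({buf + begin, end - begin});
        sc = next;
    }
    return units;
}

// nal_unit_type is the low 5 bits of the first NAL byte
// (for example 5 = IDR slice, 7 = SPS, 8 = PPS).
inline unsigned nalType(const NalUnit& nal) { return nal.data[0] & 0x1Fu; }
```

A framer built this way can hand each NAL unit to the RTP packetizer without copying, which fits the real-time buffer handling described above.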
The lip-synchronization algorithm proved non-trivial. It was achieved by carefully setting the RTP timestamps of the video and audio streams, as well as the timestamps in the RTCP packets that define the exact timing relationship between the wall clock and each source media clock. In this way, precise lip-sync was achieved using the standard RTCP mechanism instead of the dedicated NTP clocks that the customer had been using previously.
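The timing relationship can be sketched as follows. This is an illustration under stated assumptions, not the project's code: it assumes both streams are stamped from one shared capture clock in microseconds since the Unix epoch, uses the standard RTP clock rates (90 kHz for H.264, 8 kHz for G.711), and the names StreamClock, SenderReport, rtpTimestamp, makeSenderReport and wallClockUs are hypothetical. Each RTCP sender report pairs a 64-bit NTP wall-clock time with the RTP timestamp that corresponds to the same instant, so a receiver can map either stream back onto the shared wall clock.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kVideoClockHz = 90000;  // RTP clock rate for H.264
constexpr std::uint32_t kAudioClockHz = 8000;   // RTP clock rate for G.711
constexpr std::uint64_t kNtpUnixOffset = 2208988800ULL;  // seconds, 1900 to 1970

// Per-stream mapping between the shared wall clock and the RTP media clock.
struct StreamClock {
    std::uint32_t clockHz;  // media clock rate
    std::uint32_t rtpBase;  // initial RTP timestamp (random in a real sender)
    std::uint64_t epochUs;  // wall-clock time that corresponds to rtpBase
};

// Timing fields carried in an RTCP sender report.
struct SenderReport {
    std::uint32_t ntpSec;   // NTP seconds since 1900
    std::uint32_t ntpFrac;  // NTP fraction, in units of 1/2^32 second
    std::uint32_t rtpTs;    // RTP timestamp for the same instant
};

// Sender side: RTP timestamp for a frame captured at captureUs.
// The multiplication is safe for several years of elapsed time.
std::uint32_t rtpTimestamp(const StreamClock& c, std::uint64_t captureUs) {
    assert(captureUs >= c.epochUs);
    const std::uint64_t ticks = (captureUs - c.epochUs) * c.clockHz / 1000000ULL;
    return static_cast<std::uint32_t>(c.rtpBase + ticks);  // wraps modulo 2^32
}

// Sender side: NTP/RTP timestamp pair for a report generated at nowUs.
SenderReport makeSenderReport(const StreamClock& c, std::uint64_t nowUs) {
    SenderReport sr;
    sr.ntpSec = static_cast<std::uint32_t>(nowUs / 1000000ULL + kNtpUnixOffset);
    sr.ntpFrac = static_cast<std::uint32_t>(((nowUs % 1000000ULL) << 32) / 1000000ULL);
    sr.rtpTs = rtpTimestamp(c, nowUs);
    return sr;
}

// Receiver side: wall-clock time (microseconds since the Unix epoch) of a
// packet, derived from the latest sender report of the same stream.
std::int64_t wallClockUs(const SenderReport& sr, std::uint32_t clockHz,
                         std::uint32_t rtpTs) {
    const std::int64_t srSec = static_cast<std::int64_t>(sr.ntpSec) -
                               static_cast<std::int64_t>(kNtpUnixOffset);
    const std::int64_t srFracUs = static_cast<std::int64_t>(
        (static_cast<std::uint64_t>(sr.ntpFrac) * 1000000ULL + (1ULL << 31)) >> 32);
    // The signed difference handles RTP timestamp wrap-around.
    const std::int32_t deltaTicks = static_cast<std::int32_t>(rtpTs - sr.rtpTs);
    return srSec * 1000000LL + srFracUs +
           static_cast<std::int64_t>(deltaTicks) * 1000000LL /
               static_cast<std::int64_t>(clockHz);
}
```

A receiver that computes wallClockUs for a video frame and for an audio packet can present them together when the two results match, even though the two RTP timestamp sequences use different clock rates and different starting values.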
ginngi engineering