Video Big Data High-efficiency Processing

Release time: 2017-10-05

With the rapid increase in imaging devices such as surveillance cameras and mobile phones, massive numbers of images and videos are being captured. This data now accounts for more than half of total Internet traffic and of the digital universe. How to compress and analyze image and video data efficiently has therefore become a major challenge in information technology. Without major technical breakthroughs, it is impossible to aggregate images and videos in real time from tens of millions of cameras distributed nationwide, and impossible to perform round-the-clock video analysis and recognition to meet emergency requirements. Consequently, "massive data" cannot become "big data".

Supported by grants from the National Basic Research Program of China and the National Natural Science Foundation of China, a team in our school proposed and implemented an efficient and effective framework for multimedia big data processing. The framework is based on feature analysis and compression of images and videos, designed for machine analysis and recognition. The main contributions are summarized as follows:

1) In visual feature descriptor compression, a highly efficient local visual feature extraction method was proposed, and several technologies for selective aggregation, scalable compression and extensible indexing of visual features were developed. These technologies can compress the visual features of an image to 4 KB or less while maintaining matching precision, and enable visual search over a database of ten million images in less than one second on a single-core CPU. The team also led the development of the MPEG standard on Compact Descriptors for Visual Search (CDVS), standard number ISO/IEC 15938-13. In addition, a large-scale visual search system was developed and successfully applied in search engines covering 10 billion images at Internet giants such as Baidu and Tencent.

2) A systematic framework for scene video modeling and coding was invented for the first time. Within this framework, several novel methods were proposed, including adaptive background modeling and updating, background-model-based prediction and reference modes, and cloud-based image and video coding. When plugged into existing video coding standards such as H.264/AVC and HEVC/H.265, these technologies can double the coding efficiency for surveillance videos. The team also led the development of the national standard GB/T 20090.2-2013 and its corresponding international standard IEEE 1857-2013, which is recognized as the first international video coding standard led by Chinese scientists. These technologies were also transferred to industry, where they were used to develop several high-efficiency surveillance video capture devices and storage systems.

3) In precise object analysis and recognition, several novel algorithms were proposed to address the challenges of object detection, segmentation and recognition in complex scenes, including selective eigenbackground modeling, complementary saliency prior learning for salient object segmentation, and group-sensitive multiple kernel learning for visual object recognition. These algorithms ranked among the best in the TRECVid surveillance event detection tasks from 2009 to 2012. The team also won the best paper award of the EURASIP Journal on Image and Video Processing in 2015.
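To make the idea behind contribution 2) more concrete, the following is a minimal Python sketch of adaptive background modeling and background-based prediction. It is an illustration under stated assumptions, not the method specified in GB/T 20090.2-2013 or IEEE 1857-2013: the background model here is a plain exponential running average, the scene is synthetic, and the names `update_background`, `mean_abs_residual`, `capture`, `alpha` and all numeric values are hypothetical choices made for this example.

```python
import numpy as np

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the model (exponential running average)."""
    return (1.0 - alpha) * background + alpha * frame

def mean_abs_residual(frame, reference):
    """Average absolute prediction error when `reference` predicts `frame`."""
    return float(np.mean(np.abs(frame - reference)))

# Synthetic surveillance-like scene; every value below is made up.
rng = np.random.default_rng(0)
scene = rng.uniform(0.0, 200.0, size=(64, 64))  # static background content

def capture(object_x=None, noise_sigma=5.0):
    """Return one noisy frame, optionally containing a bright 8x8 object."""
    frame = scene + rng.normal(0.0, noise_sigma, size=scene.shape)
    if object_x is not None:
        frame[28:36, object_x:object_x + 8] = 255.0
    return frame

# Initialize the model from one frame, then refine it with
# 100 more object-free frames.
background = capture()
for _ in range(100):
    background = update_background(background, capture())

# Two consecutive frames in which the object has moved.
previous_frame = capture(object_x=10)
current_frame = capture(object_x=30)

# Compare two candidate references for predicting the current frame.
prev_cost = mean_abs_residual(current_frame, previous_frame)
bg_cost = mean_abs_residual(current_frame, background)
print(f"previous-frame reference: {prev_cost:.2f}")
print(f"background reference:     {bg_cost:.2f}")
```

In this synthetic setup the background reference should leave the smaller residual, because averaging many frames suppresses sensor noise and the object's old position does not appear in the reference. A smaller residual generally costs fewer bits to encode, which is the intuition behind using a background model as an additional reference. The sketch says nothing about the actual coding gain reported above.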

In total, the team obtained more than 50 Chinese and US patents and published one academic book and 120 papers in refereed journals and conferences, including 45 papers in top journals such as IEEE TPAMI, TIP, TCSVT, TMM, and IJCV. According to Google Scholar, these publications have received more than 3,000 citations by other researchers, including citations by over 100 academicians and IEEE/ACM Fellows.

The technologies were also transferred to and applied in several products of Internet giants such as Baidu and Tencent, and are used by hundreds of millions of users. Moreover, the surveillance video coding and analysis technologies were integrated into video surveillance systems in several provinces and municipalities in China, such as Shandong, Guizhou and Beijing, leading to more than 1 billion RMB in total sales (675 million RMB in the most recent three years).

For these achievements, the team was awarded the first prize for technology invention by the Chinese Electronics Society in 2015 and the first prize for technology invention by the Ministry of Education in 2016.