Title:
NN-VVC: A Hybrid Learned-Conventional Video Codec Targeting Humans and Machines.
Source:
International Journal of Semantic Computing; Dec2024, Vol. 18 Issue 4, p689-712, 24p
Database:
Complementary Index

*Further Information*

*Advancements in artificial intelligence have significantly increased the use of images and videos in machine analysis algorithms, predominantly neural networks. However, the traditional methods of compressing, storing and transmitting media have been optimized for human viewers rather than machines. Current research in coding images and videos for machine analysis has evolved in two distinct paths. The first is characterized by End-to-End (E2E) learned codecs, which show promising results in image coding but have yet to match the performance of leading Conventional Video Codecs (CVC) and suffer from a lack of interoperability. The second path optimizes CVC, such as the Versatile Video Coding (VVC) standard, for machine-oriented reconstruction. Although CVC-based approaches enjoy widespread hardware and software compatibility and interoperability, they often fall short in machine task performance, especially at lower bitrates. This paper proposes a novel hybrid codec for machines named NN-VVC, which combines the advantages of an E2E-learned image codec and a CVC to achieve high performance in both image and video coding for machines. Our experiments show that the proposed system achieved up to −43.20% and −26.8% Bjøntegaard Delta rate reduction over VVC for image and video data, respectively, when evaluated on multiple different datasets and machine vision tasks according to the common test conditions designed by the VCM study group in MPEG standardization activities. Furthermore, to improve reconstruction quality, we introduce a human-focused branch into our codec, enhancing the visual appeal of reconstructions intended for human supervision of the machine-oriented main branch. [ABSTRACT FROM AUTHOR]

Copyright of International Journal of Semantic Computing is the property of World Scientific Publishing Company and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)*