# PnP-VCVE

Plug-and-Play Versatile Compressed Video Enhancement

Huimin Zeng, Jiacheng Li, Zhiwei Xiong

## 🔥 Update

🔥 2025/6/25: We added a Dockerfile and requirements files for preparing the running environment.

2025/6/5: We released the code and checkpoint for this project.

2025/4/10: Our Plug-and-Play Versatile Compressed Video Enhancement was accepted to CVPR 2025.

## Introduction

Video compression effectively reduces file sizes, making real-time cloud computing possible, but this comes at the cost of visual quality and poses challenges to the robustness of downstream vision models. In this work, we present a versatile codec-aware enhancement framework that reuses codec information to adaptively enhance videos under different compression settings, assisting various downstream vision tasks without introducing a computation bottleneck. Extensive experimental results demonstrate the superior quality enhancement performance of our framework over existing enhancement methods, as well as its versatility in assisting multiple downstream tasks on compressed videos as a plug-and-play module.

## Overview

The proposed codec-aware framework consists of a compression-aware adaptation (CAA) network that employs a hierarchical adaptation mechanism to estimate the parameters of the frame-wise enhancement network, namely the bitstream-aware enhancement (BAE) network. The BAE network further leverages temporal and spatial priors embedded in the bitstream to effectively improve the quality of compressed input frames.

## Dataset

| Task                      | Dataset |
| ------------------------- | ------- |
| Quality Enhancement       | REDS    |
| Video Super-Resolution    | REDS    |
| Optical Flow Estimation   | KITTI   |
| Video Object Segmentation | DAVIS   |
| Video Inpainting          | DAVIS   |

```
dataset
├── davis_all
├── KITTI
├── REDS_test_HR
├── REDS_test_LR
│   ├── crf15
│   │   ├── mv
│   │   ├── png
│   │   └── video_qp
│   ├── crf25
│   ├── crf35
│   ├── REDS_test_LR.json
│   └── X4
```
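
The `png` folders hold the decoded frames, while `mv` and `video_qp` presumably store the codec-side information (motion vectors and quantization parameters) reused by the framework. Below is a small sanity-check sketch, assuming the `crf25` and `crf35` folders mirror the `crf15` layout shown above.

```shell
# Sanity check (illustrative): verify the expected sub-folders exist for each CRF setting
# before running the test scripts. Assumes crf25/crf35 mirror the crf15 layout above.
for crf in crf15 crf25 crf35; do
    for sub in mv png video_qp; do
        [ -d "dataset/REDS_test_LR/${crf}/${sub}" ] || echo "missing: dataset/REDS_test_LR/${crf}/${sub}"
    done
done
```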

## Environment

We provide a Docker image to prepare the running environment; it can be obtained with the following command:

```shell
docker pull registry.cn-hangzhou.aliyuncs.com/zenghuimin/zhm_docker:py37-torch18
```
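
After pulling the image, you can start an interactive container with GPU access and mount the code and data into it, for example as sketched below. The mount paths and working directory are placeholders to adapt to your setup (GPU passthrough via `--gpus all` requires the NVIDIA Container Toolkit).

```shell
# Sketch: run the container interactively with GPUs and the repo/dataset mounted.
# /path/to/PnP-VCVE and /path/to/dataset are placeholders for your local paths.
docker run --gpus all -it \
    -v /path/to/PnP-VCVE:/workspace/PnP-VCVE \
    -v /path/to/dataset:/workspace/PnP-VCVE/dataset \
    -w /workspace/PnP-VCVE \
    registry.cn-hangzhou.aliyuncs.com/zenghuimin/zhm_docker:py37-torch18 /bin/bash
```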

## Test

* Quality enhancement (REDS4 Dataset)

```shell
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/REDS_test_HR/crf15/png --testdir_gt dataset/REDS_test_HR/sharp/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_HR/crf15
```
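
To evaluate all compression settings in one go, you can loop over the CRF folders. This is only a convenience wrapper around the command above; it assumes `dataset/REDS_test_HR` provides the same `crf15`/`crf25`/`crf35` sub-folders as shown for `REDS_test_LR`, and that the ground-truth frames are shared across CRF levels.

```shell
# Convenience sweep over compression settings (sketch): reuses the same checkpoint
# and assumes the GT frames in dataset/REDS_test_HR/sharp/png apply to every CRF level.
for crf in crf15 crf25 crf35; do
    ./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
        --testdir_lr dataset/REDS_test_HR/${crf}/png \
        --testdir_gt dataset/REDS_test_HR/sharp/png \
        --save-path ./HR_davis_LR_128x128_IPB/REDS_test_HR/${crf}
done
```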

### Downstream tasks 
* Video super-resolution (LR REDS4 Dataset)

```shell
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/REDS_test_LR/crf15/png --testdir_gt dataset/REDS_test_LR/X4/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_LR/crf15
```

Downstream video super-resolution models can be found at [BasicVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr/README.md), [IconVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/iconvsr/README.md) and [BasicVSR++](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr_pp/README.md).

* Optical flow estimation (KITTI Dataset)

```shell
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/KITTI/crf15/png --testdir_gt dataset/KITTI/X4/png --save-path ./HR_davis_LR_128x128_IPB/KITTI/crf15
```

Downstream optical flow estimation models can be found at [RAFT](https://github.com/princeton-vl/RAFT), [DEQ](https://github.com/locuslab/deq-flow) and [KPAFlow](https://github.com/megvii-research/KPAFlow).

* Video object segmentation & video inpainting (DAVIS Dataset)

```shell
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/davis_all/crf15/png --testdir_gt dataset/davis_all/X4/png --save-path ./HR_davis_LR_128x128_IPB/davis_all/crf15
```

Downstream video object segmentation models can be found at [STCN](https://github.com/hkchengrex/STCN), [DeAOT](https://github.com/z-x-yang/AOT?tab=readme-ov-file) and [QDMN](https://github.com/yongliu20/QDMN). A downstream video inpainting model can be found at [E2FGVI](https://github.com/MCG-NKU/E2FGVI).

## TODO
* Config files for the KITTI and DAVIS datasets

## Acknowledgement
This repository is partly built on [MMEditing](https://github.com/open-mmlab/mmagic). We thank the authors for creating this brilliant work and sharing the code with the community.

## Citation
If you find PnP-VCVE useful, please star ⭐ this repository and consider citing:
```bibtex
@InProceedings{Zeng_2025_CVPR,
    author    = {Zeng, Huimin and Li, Jiacheng and Xiong, Zhiwei},
    title     = {Plug-and-Play Versatile Compressed Video Enhancement},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {17767-17777}
}
```