# PnP-VCVE

Plug-and-Play Versatile Compressed Video Enhancement

## 🔥 Update

* 🔥 2025/6/25: We provided a Dockerfile and requirements files for preparing the running environment.
* 2025/6/5: We released the code and checkpoints for this project.
* 2025/4/10: Our paper, Plug-and-Play Versatile Compressed Video Enhancement, was accepted to CVPR 2025.

## Introduction

Video compression effectively reduces file sizes and makes real-time cloud computing feasible, but it does so at the cost of visual quality, which challenges the robustness of downstream vision models. In this work, we present a versatile codec-aware enhancement framework that reuses codec information to adaptively enhance videos under different compression settings, assisting various downstream vision tasks without introducing a computational bottleneck. Extensive experiments demonstrate that our framework outperforms existing enhancement methods in quality enhancement and that, as a plug-and-play module, it assists multiple downstream tasks on compressed videos.

## Overview

The proposed codec-aware framework consists of a compression-aware adaptation (CAA) network that employs a hierarchical adaptation mechanism to estimate parameters of the frame-wise enhancement network, namely the bitstream-aware enhancement (BAE) network. The BAE network further leverages temporal and spatial priors embedded in the bitstream to effectively improve the quality of compressed input frames.
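
To make the division of labour between the two networks concrete, the sketch below shows one way compression-dependent parameters could condition a frame-wise enhancement step. It is an illustration only and not the released architecture: the per-channel scale-and-shift modulation, the function names `estimate_params` and `enhance_frame`, and the CRF normalization constant are all assumptions introduced here, and the bitstream priors used by the BAE network are left out.

```python
from typing import List, Tuple

MAX_CRF = 51.0  # assumption: CRF values are normalized by an upper bound of 51


def estimate_params(crf: int, channels: int) -> Tuple[List[float], List[float]]:
    """Hypothetical stand-in for the CAA network.

    Maps a compression level to one (scale, shift) pair per feature channel.
    """
    strength = crf / MAX_CRF  # heavier compression -> stronger correction
    scales = [1.0 + strength * (c + 1) / channels for c in range(channels)]
    shifts = [-0.1 * strength for _ in range(channels)]
    return scales, shifts


def enhance_frame(
    features: List[List[float]], scales: List[float], shifts: List[float]
) -> List[List[float]]:
    """Hypothetical stand-in for the BAE network.

    Applies the estimated parameters channel by channel to one frame's features.
    """
    return [
        [value * scale + shift for value in channel]
        for channel, scale, shift in zip(features, scales, shifts)
    ]
```

Calling `enhance_frame(features, *estimate_params(35, channels=2))` applies a stronger correction than the same call with CRF 15, which is the adaptive behaviour the framework aims for across compression settings.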

## Dataset

| Task | Dataset |
| --- | --- |
| Quality Enhancement | REDS |
| Video Super-Resolution | REDS |
| Optical Flow Estimation | KITTI |
| Video Object Segmentation | DAVIS |
| Video Inpainting | DAVIS |

The official datasets we use are listed in the table above. We provide the compressed datasets on Google Drive.
Download all the datasets and organize them as follows:

```
dataset
├── davis_all
├── KITTI
├── REDS_test_HR
├── REDS_test_LR
│   ├── crf15
│   │   ├── mv
│   │   ├── png
│   │   └── video_qp
│   ├── crf25
│   ├── crf35
│   ├── REDS_test_LR.json
│   └── X4
```
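
Before running anything, it can help to confirm that the downloaded data matches this layout. The helper below is a minimal sketch, not part of the repository: the name `missing_paths` is hypothetical, and it assumes that `crf25` and `crf35` contain the same `mv`, `png`, and `video_qp` sub-directories shown for `crf15`.

```python
from pathlib import Path
from typing import List

CRF_LEVELS = ("crf15", "crf25", "crf35")
# Shown for crf15 in the layout; assumed identical for the other CRF levels.
PER_CRF_DIRS = ("mv", "png", "video_qp")


def missing_paths(root: str, split: str = "REDS_test_LR") -> List[str]:
    """Return the expected directories that do not exist under root/split."""
    base = Path(root) / split
    expected = [base / crf / sub for crf in CRF_LEVELS for sub in PER_CRF_DIRS]
    return [str(path) for path in expected if not path.is_dir()]
```

Calling `missing_paths("dataset")` from the repository root returns an empty list when every expected directory is present, and the list of absent paths otherwise.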

## Environment

We provide a Docker image to prepare the running environment, which can be obtained using the following command:

```
docker pull registry.cn-hangzhou.aliyuncs.com/zenghuimin/zhm_docker:py37-torch18
```

## Train

* CRF-based adaptation

```
./tools/dist_train.sh configs/HR_davis_LR_128x128.py 1 --exp_name HR_davis_LR_128x128
```

* Replacing CRF with slice type (adopted when assisting downstream tasks)

```
./tools/dist_train.sh configs/HR_davis_LR_128x128_IPB.py 1 --exp_name HR_davis_LR_128x128_IPB
```

## Test

We provide the checkpoints on Google Drive. Please download them and place them under `./checkpoint`, structured as follows:
```
checkpoint
├── HR_davis_LR_128x128_IPB.pth
└── HR_davis_LR_128x128.pth
```
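
Each checkpoint shares its name with a config file, so the test invocation for a given CRF level can be assembled programmatically. The sketch below is a convenience helper and not part of the repository: `build_test_command` and `format_command` are hypothetical names, the flags are assumed to be the double-hyphen forms of those used in this README's test commands, and the positional `1` is assumed to be the GPU count.

```python
import shlex
from pathlib import Path
from typing import List


def build_test_command(
    name: str, crf: str, lr_root: str, gt_dir: str, gpus: int = 1
) -> List[str]:
    """Assemble the dist_test.sh argument list for one experiment and CRF level.

    Assumes configs/<name>.py and checkpoint/<name>.pth share the same name.
    """
    return [
        "./tools/dist_test.sh",
        f"configs/{name}.py",
        f"checkpoint/{name}.pth",
        str(gpus),
        "--testdir_lr", f"{lr_root}/{crf}/png",
        "--testdir_gt", gt_dir,
        "--save-path", f"./{name}/{Path(lr_root).name}/{crf}",
    ]


def format_command(cmd: List[str]) -> str:
    """Render the argument list as a shell-safe, copy-pastable string."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

For example, `format_command(build_test_command("HR_davis_LR_128x128", "crf25", "dataset/REDS_test_HR", "dataset/REDS_test_HR/sharp/png"))` produces the quality-enhancement invocation for `crf25`. The downstream-task commands pair a different config with the `_IPB` checkpoint, so they would need the config and checkpoint passed separately.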
### Quality enhancement (REDS4 Dataset)

```
./tools/dist_test.sh configs/HR_davis_LR_128x128.py checkpoint/HR_davis_LR_128x128.pth 1 \
    --testdir_lr dataset/REDS_test_HR/crf15/png --testdir_gt dataset/REDS_test_HR/sharp/png --save-path ./HR_davis_LR_128x128/REDS_test_HR/crf15

./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/REDS_test_HR/crf15/png --testdir_gt dataset/REDS_test_HR/sharp/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_HR/crf15
```

### Downstream tasks 
* Video super-resolution (LR REDS4 Dataset)

```
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/REDS_test_LR/crf15/png --testdir_gt dataset/REDS_test_LR/X4/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_LR/crf15
```

Downstream video super-resolution models can be found at [BasicVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr/README.md), [IconVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/iconvsr/README.md) and [BasicVSR++](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr_pp/README.md).

* Optical flow estimation (KITTI Dataset)

```
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/KITTI/crf15/png --testdir_gt dataset/KITTI/X4/png --save-path ./HR_davis_LR_128x128_IPB/KITTI/crf15
```

Downstream optical flow estimation models can be found at [RAFT](https://github.com/princeton-vl/RAFT), [DEQ](https://github.com/locuslab/deq-flow) and [KPAFlow](https://github.com/megvii-research/KPAFlow).

* Video object segmentation & video inpainting (DAVIS Dataset)

```
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 \
    --testdir_lr dataset/davis_all/crf15/png --testdir_gt dataset/davis_all/X4/png --save-path ./HR_davis_LR_128x128_IPB/davis_all/crf15
```

Downstream video object segmentation models can be found at [STCN](https://github.com/hkchengrex/STCN), [DeAoT](https://github.com/z-x-yang/AOT?tab=readme-ov-file) and [QDMN](https://github.com/yongliu20/QDMN). A video inpainting model can be found at [E2FGVI](https://github.com/MCG-NKU/E2FGVI).

## TODO
* Config files for the KITTI and DAVIS datasets

## Acknowledgement
This repository is partly built on [MMEditing](https://github.com/open-mmlab/mmagic). We thank the authors for creating this excellent work and sharing their code with the community.

## Citation
If you find our work useful, please star ⭐ this repository and consider citing:
```bibtex
@InProceedings{Zeng_2025_CVPR,
    author    = {Zeng, Huimin and Li, Jiacheng and Xiong, Zhiwei},
    title     = {Plug-and-Play Versatile Compressed Video Enhancement},
    booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
    month     = {June},
    year      = {2025},
    pages     = {17767-17777}
}
```