Huimin Zeng, Jiacheng Li, Zhiwei Xiong
* 🔥 2025/6/25: We provided a Dockerfile and requirements files for preparing the running environment.
* 2025/6/5: We released the code and checkpoints for this project.
* 2025/4/10: Our Plug-and-Play was accepted to CVPR 2025.
Video compression effectively reduces file sizes, making real-time cloud computing possible, but it comes at the cost of visual quality, which challenges the robustness of downstream vision models. In this work, we present a versatile codec-aware enhancement framework that reuses codec information to adaptively enhance videos under different compression settings, assisting various downstream vision tasks without introducing a computational bottleneck. Extensive experimental results demonstrate the superior quality enhancement performance of our framework over existing enhancement methods, as well as its versatility in assisting multiple downstream tasks on compressed videos as a plug-and-play module.
The proposed codec-aware framework consists of a compression-aware adaptation (CAA) network that employs a hierarchical adaptation mechanism to estimate parameters of the frame-wise enhancement network, namely the bitstream-aware enhancement (BAE) network. The BAE network further leverages temporal and spatial priors embedded in the bitstream to effectively improve the quality of compressed input frames.
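The division of labour between the two networks can be pictured with a toy sketch. This is not the authors' implementation: the function names, the single `blend` parameter, the linear rule, and the use of one CRF value as the adaptation input are all assumptions made for illustration. It only shows the pattern in which one component produces parameters from the compression setting and a second component applies them frame by frame together with a prior taken from the bitstream.

```python
def adapt_parameters(crf):
    """Toy stand-in for the CAA network (hypothetical).

    Maps a compression level to the parameters of the enhancer. The linear
    rule is invented for illustration; the real network is learned and
    hierarchical.
    """
    # 51 is the largest CRF value in x264's 8-bit range; clamp to [0, 1].
    strength = min(max(crf / 51.0, 0.0), 1.0)
    return {"blend": strength}


def enhance_frame(frame, prior, params):
    """Toy stand-in for the BAE network (hypothetical).

    Blends each pixel with a prior signal. `prior` plays the role of
    information recovered from the bitstream; here it is just a list of
    numbers with the same length as `frame`.
    """
    w = params["blend"]
    return [(1.0 - w) * x + w * p for x, p in zip(frame, prior)]


if __name__ == "__main__":
    frame = [0.2, 0.4, 0.6]
    prior = [0.3, 0.5, 0.7]
    for crf in (15, 25, 35):
        print(crf, enhance_frame(frame, prior, adapt_parameters(crf)))
```

In this toy, heavier compression shifts more weight onto the prior, which is one simple way to make a frame-wise enhancer respond to different compression settings.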
| Task | Dataset |
|---|---|
| Quality Enhancement | REDS |
| Video Super-Resolution | REDS |
| Optical Flow Estimation | KITTI |
| Video Object Segmentation | DAVIS |
| Video Inpainting | DAVIS |
dataset
├── davis_all
├── KITTI
├── REDS_test_HR
├── REDS_test_LR
│ ├── crf15
│ │ ├── mv
│ │ ├── png
│ │ └── video_qp
│ ├── crf25
│ ├── crf35
│ ├── REDS_test_LR.json
│ └── X4
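Before running anything, it can help to confirm that the layout matches the tree above. The following is a minimal sketch using only the standard library. The helper name `missing_reds_lr_entries` is hypothetical, and it assumes that `crf25` and `crf35` contain the same `mv`, `png` and `video_qp` sub-folders that the tree shows for `crf15`.

```python
from pathlib import Path

# Sub-folders shown under REDS_test_LR/crf15 in the tree above. That every
# CRF folder contains the same three is an assumption of this sketch.
CRF_LEVELS = ("crf15", "crf25", "crf35")
CRF_SUBDIRS = ("mv", "png", "video_qp")


def missing_reds_lr_entries(dataset_root):
    """Return the expected REDS_test_LR paths that do not exist."""
    lr_root = Path(dataset_root) / "REDS_test_LR"
    expected = [lr_root / "REDS_test_LR.json", lr_root / "X4"]
    for crf in CRF_LEVELS:
        for sub in CRF_SUBDIRS:
            expected.append(lr_root / crf / sub)
    return [str(p) for p in expected if not p.exists()]


if __name__ == "__main__":
    # Lists every expected entry that is absent under ./dataset.
    for path in missing_reds_lr_entries("dataset"):
        print("missing:", path)
```

An empty result means every path the sketch looks for is present; it does not validate the contents of the folders.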
We provide a Docker image to prepare the running environment, which can be obtained using the following command:
docker pull registry.cn-hangzhou.aliyuncs.com/zenghuimin/zhm_docker:py37-torch18
./tools/dist_train.sh configs/HR_davis_LR_128x128.py 1 --exp_name HR_davis_LR_128x128
./tools/dist_train.sh configs/HR_davis_LR_128x128_IPB.py 1 --exp_name HR_davis_LR_128x128_IPB
We provide checkpoints on Google Drive. Please download them and place them in `./checkpoint`, structured as follows:
checkpoint
├── HR_davis_LR_128x128_IPB.pth
└── HR_davis_LR_128x128.pth
```
./tools/dist_test.sh configs/HR_davis_LR_128x128.py checkpoint/HR_davis_LR_128x128.pth 1 --testdir_lr dataset/REDS_test_HR/crf15/png --testdir_gt dataset/REDS_test_HR/sharp/png --save-path ./HR_davis_LR_128x128/REDS_test_HR/crf15
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 --testdir_lr dataset/REDS_test_HR/crf15/png --testdir_gt dataset/REDS_test_HR/sharp/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_HR/crf15
```
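The test commands in this section use the `crf15` folder. To repeat the test for other compression levels, the command can be generated per CRF folder, as in the minimal sketch below. It makes several assumptions: `REDS_test_HR` is assumed to contain `crf25` and `crf35` folders like `REDS_test_LR` does, the third positional argument `1` is assumed to be the GPU count, and `build_test_command` is a hypothetical helper that is not part of the repository. The script only prints the commands.

```python
import shlex

CONFIG = "configs/HR_davis_LR_128x128.py"
CHECKPOINT = "checkpoint/HR_davis_LR_128x128.pth"


def build_test_command(crf, num_gpus=1):
    """Build the dist_test.sh argument list for one CRF folder, e.g. "crf25"."""
    return [
        "./tools/dist_test.sh", CONFIG, CHECKPOINT, str(num_gpus),
        "--testdir_lr", f"dataset/REDS_test_HR/{crf}/png",
        "--testdir_gt", "dataset/REDS_test_HR/sharp/png",
        "--save-path", f"./HR_davis_LR_128x128/REDS_test_HR/{crf}",
    ]


if __name__ == "__main__":
    for crf in ("crf15", "crf25", "crf35"):
        # Print only. To execute, pass the list to subprocess.run(cmd, check=True).
        cmd = build_test_command(crf)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as a list rather than a single string avoids quoting problems if a dataset path ever contains spaces.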
### Downstream tasks
* Video super-resolution (LR REDS4 Dataset)
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 --testdir_lr dataset/REDS_test_LR/crf15/png --testdir_gt dataset/REDS_test_LR/X4/png --save-path ./HR_davis_LR_128x128_IPB/REDS_test_LR/crf15
Downstream video super-resolution models can be found at [BasicVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr/README.md), [IconVSR](https://github.com/open-mmlab/mmagic/blob/main/configs/iconvsr/README.md) and [BasicVSR++](https://github.com/open-mmlab/mmagic/blob/main/configs/basicvsr_pp/README.md).
* Optical flow estimation (KITTI Dataset)
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 --testdir_lr dataset/KITTI/crf15/png --testdir_gt dataset/KITTI/X4/png --save-path ./HR_davis_LR_128x128_IPB/KITTI/crf15
Downstream optical flow estimation models can be found at [RAFT](https://github.com/princeton-vl/RAFT), [DEQ](https://github.com/locuslab/deq-flow) and [KPAFlow](https://github.com/megvii-research/KPAFlow).
* Video object segmentation & video inpainting (DAVIS Dataset)
./tools/dist_test.sh configs/HR_davis_LR_128x128_IPB_LR_test.py checkpoint/HR_davis_LR_128x128_IPB.pth 1 --testdir_lr dataset/davis_all/crf15/png --testdir_gt dataset/davis_all/X4/png --save-path ./HR_davis_LR_128x128_IPB/davis_all/crf15
Downstream video object segmentation models can be found at [STCN](https://github.com/hkchengrex/STCN), [DeAoT](https://github.com/z-x-yang/AOT?tab=readme-ov-file) and [QDMN](https://github.com/yongliu20/QDMN). A video inpainting model can be found at [E2FGVI](https://github.com/MCG-NKU/E2FGVI).
## TODO
* Config files for the KITTI and DAVIS datasets
## Acknowledgement
This repository is partly built on [MMEditing](https://github.com/open-mmlab/mmagic). We thank the authors for creating this brilliant work and sharing the code with the community.
## Citation
If you find our Plug-and-Play useful, please star ⭐ this repository and consider citing:
```bibtex
@InProceedings{Zeng_2025_CVPR,
author = {Zeng, Huimin and Li, Jiacheng and Xiong, Zhiwei},
title = {Plug-and-Play Versatile Compressed Video Enhancement},
booktitle = {Proceedings of the Computer Vision and Pattern Recognition Conference (CVPR)},
month = {June},
year = {2025},
pages = {17767-17777}
}