Seeing Dark Videos via Self-Learned Bottleneck Neural Representation

Haofeng Huang1     Wenhan Yang2     Ling-Yu Duan1     Jiaying Liu1

1 Wangxuan Institute of Computer Technology, Peking University     2 Peng Cheng Laboratory

Accepted to AAAI 2024.

Abstract

Enhancing low-light videos in a supervised manner presents a set of challenges, including limited data diversity, misalignment, and the domain gap introduced by the dataset construction pipeline. Our paper tackles these challenges by constructing a self-learned enhancement approach that removes the reliance on any external training data. The challenge of self-supervised learning lies in fitting high-quality signal representations solely from the input signals. Our work designs a bottleneck neural representation mechanism that extracts those signals. In more detail, we encode the frame-wise representation with a compact deep embedding and utilize a neural network to parameterize the video-level manifold consistently. Then, an entropy constraint is applied to the enhanced results based on the adjacent spatial-temporal context to filter out degraded visual signals, e.g., noise and frame inconsistency. Finally, a novel Chromatic Retinex decomposition is proposed to effectively align the reflectance distribution temporally. It benefits entropy control over the different components of each frame and facilitates noise-to-noise training, successfully suppressing temporal flicker. Extensive experiments demonstrate the robustness and superior effectiveness of our proposed method.
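As a rough illustration of the frame-wise embedding and the shared video-level network described above, here is a minimal PyTorch sketch; the class name, layer sizes, and toy output resolution are our own assumptions and do not reflect the paper's actual architecture.

import torch
import torch.nn as nn

class BottleneckVideoRepresentation(nn.Module):
    """Toy sketch: each frame owns a compact learnable embedding (the content
    bottleneck), and a single shared decoder parameterizes the whole video,
    so all frames are tied to one consistent manifold."""

    def __init__(self, num_frames, embed_dim=64, out_channels=3, size=16):
        super().__init__()
        self.out_channels, self.size = out_channels, size
        # compact per-frame code
        self.frame_embed = nn.Embedding(num_frames, embed_dim)
        # shared decoder: one set of weights spans the video-level manifold
        self.decoder = nn.Sequential(
            nn.Linear(embed_dim, 256),
            nn.ReLU(inplace=True),
            nn.Linear(256, size * size * out_channels),
        )

    def forward(self, frame_idx):
        z = self.frame_embed(frame_idx)    # (B, embed_dim)
        out = self.decoder(z)              # (B, size*size*C)
        return out.view(-1, self.out_channels, self.size, self.size).sigmoid()

For example, BottleneckVideoRepresentation(num_frames=30)(torch.tensor([0])) predicts frame 0 of a 30-frame clip; fitting such a model to the observed frames relies on the network's inductive bias to favor clean structure over noise.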

Method

Key Idea:
(1) Objective bottleneck. We adopt an entropy control mechanism as the objective bottleneck. Without any explicit alignment, it implicitly fuses spatial-temporal information to provide reliable guidance for the center pixel.
(2) Content bottleneck. We adopt neural representation as the content bottleneck, which utilizes the inductive bias of the neural network to predict the noise-free signal, removing any assumption about the noise model.
(3) Chromatic Retinex. We reformulate the Retinex model with a chromatic illumination, which facilitates suppressing color-bias artifacts (see the sketch after this list).
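As an illustrative-only sketch of the chromatic-illumination reformulation (the classic Retinex model I = R ⊙ L, but with a three-channel illumination that can absorb color bias), the function below uses hypothetical names (chromatic_retinex_decompose, illum_net) and reflects our reading, not the paper's exact estimator.

import torch

def chromatic_retinex_decompose(frame, illum_net, eps=1e-3):
    # frame: (B, 3, H, W) low-light input I with values in [0, 1]
    # illum_net: any network predicting a smooth, 3-channel illumination map
    illum = illum_net(frame).clamp(min=eps)   # chromatic illumination L (one map per color channel)
    reflect = frame / illum                   # reflectance R, so that I = R * L element-wise
    return reflect, illum

Because L is chromatic, per-channel color casts are explained by the illumination layer, which keeps the reflectance distribution better aligned across frames.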



Figure 1. The framework of the proposed bottleneck neural representation. A constrained deep embedding is first extracted and then transformed into enhanced Retinex-based layer-wise representations. The hybrid neural representation provides richer intrinsic information while still acting as a bottleneck on the content side. Entropy minimization applies the bottleneck constraint on the objective side to suppress noise and correct illumination. The chromatic Retinex representation helps align layer-wise frames, which facilitates self-supervised learning.
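One plausible realization of the entropy-minimization objective is a differentiable soft histogram of enhanced intensities inside a spatio-temporal window; the sketch below (function name, bin count, and kernel width are assumptions) illustrates the bottleneck objective rather than the paper's exact estimator.

import torch

def soft_histogram_entropy(x, num_bins=32, sigma=0.02):
    # x: enhanced intensities from a spatio-temporal neighborhood, values in [0, 1]
    centers = torch.linspace(0.0, 1.0, num_bins, device=x.device)
    diff = x.reshape(-1, 1) - centers.reshape(1, -1)      # pixel-to-bin distances
    weights = torch.exp(-0.5 * (diff / sigma) ** 2)       # soft bin assignment
    probs = weights.sum(dim=0)
    probs = probs / probs.sum().clamp(min=1e-8)           # normalize to a distribution
    return -(probs * torch.log(probs.clamp(min=1e-8))).sum()   # Shannon entropy

Minimizing such an entropy over local spatio-temporal neighborhoods of the enhanced output penalizes high-entropy content such as noise while leaving structured signal largely untouched.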

Results

Table 1. Subjective results on the DRV-dynamic dataset.



Table 2. Objective results on the DRV-dynamic dataset.


Citation

@inproceedings{huang2024seeing,
    title={Seeing Dark Videos via Self-Learned Bottleneck Neural Representation},
    author={Huang, Haofeng and Yang, Wenhan and Duan, Ling-Yu and Liu, Jiaying},
    booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
    year={2024},
}