Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition

Binhao Wang1,2,*,  Shihao Zhao1,2,*,  Bo Cheng2,*,†,  Qiuyu Ji1,2,  Yuhang Ma2,
Liebucha Wu2,  Shanyuan Liu2,  Dawei Leng2,‡,  Yuhui Yin2

1Wenzhou University    2360 AI Research

* Equal Contribution.   † Project Lead.   ‡ Corresponding Author.

Abstract

Recent diffusion-based approaches have made substantial progress in image layer decomposition. However, accurately decomposing complex natural images remains challenging due to difficulties in occlusion completion, robust layer disentanglement, and precise foreground boundaries. Moreover, the scarcity of high-quality multi-layer natural image datasets limits progress. To address these challenges, we propose RevealLayer, a diffusion-based framework that decomposes an RGB image into multiple RGBA layers, enabling precise layer separation and reliable recovery of occluded content in natural images. RevealLayer incorporates three key components: (1) a Region-Aware Attention module to disentangle hidden and visible layers; (2) an Occlusion-Guided Adapter that leverages contextual information to enhance overlapping regions; and (3) a composite loss that enforces sharp alpha boundaries and suppresses residual artifacts. To support training and evaluation, we introduce RevealLayer-100K, a high-quality multi-layer natural image dataset constructed by combining automated pipelines with human annotation, and further establish RevealLayerBench for benchmarking layer decomposition in general natural scenes. Extensive experiments demonstrate that RevealLayer consistently outperforms existing approaches in layer decomposition.
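The layer model above implies that the original RGB image should be recoverable from the decomposed RGBA layers by standard back-to-front alpha-over compositing. The sketch below illustrates this recomposition step only; it is not the RevealLayer method itself, and the function name `composite_layers` is ours, not from the paper.

```python
import numpy as np

def composite_layers(layers):
    """Recompose RGBA layers into a single RGB image via alpha-over
    blending. `layers` is a list of float arrays in [0, 1] with shape
    (H, W, 4), ordered back to front: layers[0] is the background,
    later entries are stacked on top of it."""
    h, w, _ = layers[0].shape
    out = np.zeros((h, w, 3), dtype=np.float64)
    for layer in layers:
        rgb, alpha = layer[..., :3], layer[..., 3:4]
        # Alpha-over: result = foreground * a + background * (1 - a)
        out = rgb * alpha + out * (1.0 - alpha)
    return out

# Toy example: an opaque red background with a half-transparent
# blue foreground covering one pixel.
bg = np.zeros((2, 2, 4)); bg[..., 0] = 1.0; bg[..., 3] = 1.0
fg = np.zeros((2, 2, 4)); fg[0, 0] = [0.0, 0.0, 1.0, 0.5]
img = composite_layers([bg, fg])
```

Under this model, a faithful decomposition must both complete the occluded background (the red under the blue pixel) and produce an accurate alpha matte, since both terms enter the blend.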

Overview

Overview of the RevealLayer framework for decomposing an RGB image into multiple RGBA layers

Dataset Construction Pipeline

Construction pipeline of RevealLayer-100K with automated generation and human annotation

Qualitative Comparison on Layer Decomposition

Qualitative comparison on layer decomposition, including text-consistency preservation and matting capability for transparent images and complex boundaries

Qwen-Image-Layered introduces variable-layer decomposition, yet the number, order, and semantic meaning of the generated layers remain ambiguous. CLD uses bounding-box conditioning for controllable decomposition, but it is largely restricted to stylized poster images and tends to produce residual artifacts and blurred object edges. RevealLayer achieves stronger bounding-box-based controllability while suppressing target-related artifacts, completing occluded regions, and preserving visible-region consistency.

More Cases of RevealLayer

Layer decomposition in natural scenes, including cases with multiple layers, complex occlusions, and small foreground objects

Controllability

Region-guided controllable layer decomposition through user-specified bounding boxes

Object Removal

Bounding-box-guided object removal on OBER, removing target objects and their induced effects without requiring object masks or effect masks
Bounding-box-guided object removal on RevealLayerBench, removing large objects without residual artifacts or geometric distortions using only bounding boxes

Image Matting

Bounding-box-guided image matting on AIM-500, extracting foregrounds with complex boundaries without requiring foreground masks
Bounding-box-guided multi-object matting on RW100, separating multiple foreground objects using only bounding boxes

BibTeX

@inproceedings{wang2026reveallayer,
  title={RevealLayer: Disentangling Hidden and Visible Layers via Occlusion-Aware Image Decomposition},
  author={Wang, Binhao and Zhao, Shihao and Cheng, Bo and Ji, Qiuyu and Ma, Yuhang and Wu, Liebucha and Liu, Shanyuan and Leng, Dawei and Yin, Yuhui},
  booktitle={International Conference on Machine Learning},
  year={2026}
}