Medical Technology

EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation

Publication type

Conference paper (peer reviewed)

Research project

KIEBITZ

Published in

Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15093. Springer, Cham. 18th European Conference, Milan, Italy, September 29 – October 4, 2024, Proceedings, Part XXXV

Publication date

2024-09-30

Volume

15093

Pages

202-220

Publisher

Springer Nature Switzerland

ISBN

978-3-031-72761-0

DOI

https://doi.org/10.1007/978-3-031-72761-0_12

Citation

Körber, Nikolai; Kromer, Eduard; Siebert, Andreas; Hauke, Sascha; Mueller-Gritschneder, Daniel; Schuller, Björn (2024): EGIC: Enhanced Low-Bit-Rate Generative Image Compression Guided by Semantic Segmentation. Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15093. Springer, Cham. 18th European Conference, Milan, Italy, September 29 – October 4, 2024, Proceedings, Part XXXV 15093, pp. 202-220. DOI: 10.1007/978-3-031-72761-0_12

Peer Reviewed

Yes

Authors

Nikolai Körber
Prof. Dr. Eduard Kromer
Prof. Ph.D. Andreas Siebert
Prof. Dr. rer. nat. Sascha Hauke
Daniel Mueller-Gritschneder
Björn Schuller

Abstract

We introduce EGIC, an enhanced generative image compression method that allows traversing the distortion-perception curve efficiently from a single model. EGIC is based on two novel building blocks: i) OASIS-C, a conditional pre-trained semantic segmentation-guided discriminator, which provides both spatially and semantically-aware gradient feedback to the generator, conditioned on the latent image distribution, and ii) Output Residual Prediction (ORP), a retrofit solution for multi-realism image compression that allows control over the synthesis process by adjusting the impact of the residual between an MSE-optimized and GAN-optimized decoder output on the GAN-based reconstruction. Together, EGIC forms a powerful codec, outperforming state-of-the-art diffusion and GAN-based methods (e.g., HiFiC, MS-ILLM, and DIRAC-100), while performing almost on par with VTM-20.0 on the distortion end. EGIC is simple to implement, very lightweight, and provides excellent interpolation characteristics, which makes it a promising candidate for practical applications targeting the low bit range.
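
The residual-based control that the abstract attributes to ORP can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `blend_reconstruction`, the parameter `alpha`, and the blending rule `x_gan + alpha * (x_mse - x_gan)` are assumptions made for illustration only. The sketch also takes both decoder outputs as ready-made arrays, whereas the abstract does not say how the residual is obtained.

```python
import numpy as np


def blend_reconstruction(x_gan, x_mse, alpha):
    """Hypothetical helper: move a GAN-based reconstruction toward an
    MSE-optimized one by adding a scaled residual.

    Assumed convention (not taken from the paper):
      alpha = 0.0 -> pure GAN output (perception end)
      alpha = 1.0 -> pure MSE output (distortion end)
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    x_gan = np.asarray(x_gan, dtype=np.float64)
    x_mse = np.asarray(x_mse, dtype=np.float64)
    if x_gan.shape != x_mse.shape:
        raise ValueError("both reconstructions must have the same shape")
    # Residual between the MSE-optimized and the GAN-optimized output.
    residual = x_mse - x_gan
    # Scale the residual's impact on the GAN-based reconstruction and
    # keep pixel values in the assumed [0, 1] range.
    return np.clip(x_gan + alpha * residual, 0.0, 1.0)


# Toy 2x2 single-channel "images" with values in [0, 1].
x_gan = np.array([[0.2, 0.8], [0.4, 0.6]])
x_mse = np.array([[0.4, 0.6], [0.4, 0.2]])
halfway = blend_reconstruction(x_gan, x_mse, 0.5)
```

Sweeping `alpha` from 0 to 1 in this sketch yields a family of reconstructions from one pair of decoder outputs, which is the sense in which a single model could traverse the distortion-perception curve.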