Collaborating Foundation models for Domain Generalized Semantic Segmentation

1 LTCI, Télécom-Paris, Institut Polytechnique de Paris
2 LIX, Ecole Polytechnique, CNRS, Institut Polytechnique de Paris

TL;DR

  • Focus: Domain Generalized Semantic Segmentation (DGSS) trains a model on a labeled source domain so that it generalizes to unseen domains at inference time.
  • Limitation of Existing Methods: Traditional Domain Randomization (DR) methods are limited to style diversification and lack content variability.
  • Our Approach: We introduce CLOUDS, a framework that assembles CoLlaborative FOUndation models for Domain Generalized Semantic Segmentation.
  • Components of CLOUDS:
    • CLIP Backbone - For robust feature representation.
    • Large Language Model (LLM) - Provides rich and diverse text prompts to enhance both content and style diversity.
    • Diffusion Model - Generates images conditioned on the text prompts produced by the LLM.
    • Segment Anything Model (SAM) - Iteratively refines the pseudo-labels of the segmentation model on the generated images.
  • Performance: CLOUDS significantly outperforms previous methods, improving averaged mIoU by 5.6% and 6.7% when adapting from synthetic to real DGSS benchmarks under varying weather conditions.
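
The performance figures above are reported in mean intersection-over-union (mIoU). As a reminder of what that metric measures, here is a minimal pure-Python sketch for a single prediction/label pair. The function name `mean_iou`, the `ignore_index` value of 255, and the choice to average only over classes that appear in the prediction or the label are assumptions made for illustration; they are not taken from the CLOUDS evaluation code.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes present in `pred` or `target`.

    `pred` and `target` are 2D lists of integer class ids with the same shape.
    Pixels whose target equals `ignore_index` are skipped (assumed convention).
    """
    inter = [0] * num_classes  # per-class intersection counts
    union = [0] * num_classes  # per-class union counts
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if t == ignore_index:
                continue
            if p == t:
                inter[t] += 1
                union[t] += 1
            else:
                union[t] += 1
                if 0 <= p < num_classes:
                    union[p] += 1
    # Average only over classes that actually occur, to avoid dividing by zero.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


pred = [[0, 0], [1, 1]]
target = [[0, 1], [1, 1]]
score = mean_iou(pred, target, num_classes=2)
```

In this toy case class 0 has IoU 1/2 and class 1 has IoU 2/3. Benchmark implementations usually accumulate the intersection and union counts over the whole dataset before dividing, rather than averaging per-image scores.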

Method

Image description
  • CLIP provides robust feature representations for unseen domains.
  • Keeping the backbone frozen preserves this generalizability.
Image description
  • Training images are generated with a diffusion model.
  • Conditioning the generation on LLM prompts increases content and style diversity.
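
The data flow of this step can be sketched as a loop in which a prompt writer feeds an image generator. This is only an illustration of the flow: `generate_dataset`, `toy_write_prompt`, `toy_render_image`, and the `SCENES` and `CONDITIONS` lists are hypothetical stand-ins, not the LLM, the diffusion model, or the prompts used in CLOUDS.

```python
from typing import Any, Callable, Dict, List


def generate_dataset(
    write_prompt: Callable[[int], str],
    render_image: Callable[[str], Any],
    num_images: int,
) -> List[Dict[str, Any]]:
    """Pair each generated prompt with the image rendered from it."""
    dataset = []
    for i in range(num_images):
        prompt = write_prompt(i)      # stand-in for the LLM
        image = render_image(prompt)  # stand-in for the diffusion model
        dataset.append({"prompt": prompt, "image": image})
    return dataset


# Hypothetical vocabularies: scenes vary the content, conditions vary the style.
SCENES = ["a busy intersection", "a narrow street", "a highway"]
CONDITIONS = ["at night", "in heavy rain"]


def toy_write_prompt(i: int) -> str:
    """Toy prompt writer that cycles through scene/condition combinations."""
    scene = SCENES[i % len(SCENES)]
    condition = CONDITIONS[(i // len(SCENES)) % len(CONDITIONS)]
    return f"a photo of {scene} {condition}"


def toy_render_image(prompt: str) -> str:
    """Toy renderer that returns a placeholder string instead of pixels."""
    return f"<image for: {prompt}>"


dataset = generate_dataset(toy_write_prompt, toy_render_image, num_images=4)
```

Passing the two models in as callables keeps the loop independent of any particular LLM or diffusion library. The generated images carry no ground-truth labels, which is why the pseudo-labels predicted on them need refinement.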
Image description
  • We use SAM to improve noisy pseudo-labels (PLs).
  • Class-wise masks and point prompts are extracted from each noisy PL and fed to SAM, which returns refined masks.
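
The extraction of class-wise masks and point prompts from a noisy PL can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the function name `class_masks_and_points`, the `ignore_index` value of 255, and uniform random sampling of points inside each class region are hypothetical choices, since this page does not say how CLOUDS selects its points. The call to SAM itself is left out.

```python
import random


def class_masks_and_points(pseudo_label, ignore_index=255, points_per_class=2, seed=0):
    """Split a pseudo-label map into per-class binary masks and point prompts.

    `pseudo_label` is a 2D list of integer class ids. Returns a dict mapping
    each class id to {"mask": 2D list of 0/1, "points": list of (x, y)}.
    """
    rng = random.Random(seed)
    height, width = len(pseudo_label), len(pseudo_label[0])

    # Collect the pixel coordinates belonging to each class.
    coords = {}
    for y, row in enumerate(pseudo_label):
        for x, class_id in enumerate(row):
            if class_id != ignore_index:
                coords.setdefault(class_id, []).append((x, y))

    prompts = {}
    for class_id, pixels in coords.items():
        mask = [
            [1 if pseudo_label[y][x] == class_id else 0 for x in range(width)]
            for y in range(height)
        ]
        # Assumption: sample points uniformly at random inside the class region.
        k = min(points_per_class, len(pixels))
        prompts[class_id] = {"mask": mask, "points": rng.sample(pixels, k)}
    return prompts


noisy_pl = [
    [0, 0, 1],
    [0, 255, 1],
]
prompts = class_masks_and_points(noisy_pl)
```

Points are stored as (x, y) pairs, the coordinate order that SAM's point prompts use. Each class's mask and points would then be passed to SAM, and the returned masks merged back into a refined label map.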

Generated images using Diffusion Model

Image description

Pseudo-Label refinement using SAM

Image description

Qualitative Results of CLOUDS

Image description

Acknowledgements


This paper has been supported by the French National Research Agency (ANR) in the framework of its JCJC (ANR-20-CE23-0027). This work was granted access to the HPC resources of IDRIS under the allocation AD011013071 made by GENCI. We would like to thank I. E. Marouf and T. Delatolas for proofreading.

BibTeX

@article{benigmim2023collaborating,
      title={Collaborating Foundation models for Domain Generalized Semantic Segmentation},
      author={Benigmim, Yasser and Roy, Subhankar and Essid, Slim and Kalogeiton, Vicky and Lathuilière, Stéphane},
      journal={arXiv preprint arXiv:2312.09788},
      year={2023}
    }