Recent Open-Vocabulary Semantic Segmentation (OVSS) models extend the CLIP model to segmentation while maintaining the use of multiple templates for constructing class-wise averaged text embeddings. Our method, FLOSS, challenges this approach by:
This research was partially funded by the French Agence Nationale de la Recherche (ANR) with the project SIGHT (ANR-20-CE23-0016). We sincerely thank Telecom Paris for providing the resources necessary to run our experiments and Nacereddine Laddaoui for his invaluable help with infrastructure. We are also grateful to Ivan Lopes for proofreading.
@misc{benigmim2025flossfreelunchopenvocabulary,
  title         = {FLOSS: Free Lunch in Open-vocabulary Semantic Segmentation},
  author        = {Yasser Benigmim and Mohammad Fahes and Tuan-Hung Vu and Andrei Bursuc and Raoul de Charette},
  year          = {2025},
  eprint        = {2504.10487},
  archivePrefix = {arXiv},
  primaryClass  = {cs.CV},
  url           = {https://arxiv.org/abs/2504.10487},
}