[TIP'26] COMBINER: Composed Image Retrieval Guided by Attribute-based Neighbor Relations

1School of Software, Shandong University,
2School of Computer Science and Technology, Harbin Institute of Technology (Shenzhen),
3School of Data Science, City University of Hong Kong,
4Department of Computer Science and Engineering, Southern University of Science and Technology

*Corresponding author.

Abstract


Illustration of Relations Modeling


Example of (a) Pairwise Relations, (b) Neighbor Relations, and (c) Visually Similar Images in Relations Modeling. In this figure, Q denotes the multimodal query, T denotes the target image, and C denotes the candidate image. Fig. 1(c) shows that the traditional neighbor relations modeling methodology brings both candidate images C1 and C2 close to Q1. However, C2 is visually similar to Q1 but attribute-unrelated to it (its "carpet" does not match the query's "bedding"), so C2 should not be brought close to Q1.


Attribute Prototype-Based Similarity Measure


Schematic of our proposed similarity measure method based on attribute prototypes.


Framework: COMposed image retrieval network guided By attrIbute-based NEighbor Relations (COMBINER)


Overall framework of COMBINER, which consists of (a) Adaptive Semantic Disentanglement, (b) Unified Prototype-based Composition, and (c) Dual Relations Modeling.


Experiment


Performance comparison on the FashionIQ dataset with respect to R@k (%). The top-performing results across all methods are highlighted in blue, and the best results among baseline methods are indicated with underlining.
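
For readers unfamiliar with the metric, R@k (Recall at k) is the percentage of queries whose target image appears among the top-k retrieved candidates. The following is a minimal sketch in plain Python, not the evaluation code used for these results; the function name `recall_at_k` and the toy similarity scores are hypothetical and chosen only for illustration.

```python
def recall_at_k(similarities, target_indices, k):
    """Percentage of queries whose target is ranked within the top-k candidates.

    similarities: one list of candidate scores per query (higher = more similar).
    target_indices: index of the ground-truth target candidate for each query.
    """
    hits = 0
    for scores, target in zip(similarities, target_indices):
        # Rank candidate indices by descending similarity score.
        ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
        if target in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(target_indices)


# Toy data (made up for illustration): 3 queries, 4 candidates each.
toy_similarities = [
    [0.9, 0.1, 0.3, 0.2],  # target (index 0) is ranked 1st
    [0.2, 0.8, 0.5, 0.1],  # target (index 2) is ranked 2nd
    [0.4, 0.3, 0.2, 0.1],  # target (index 3) is ranked 4th
]
toy_targets = [0, 2, 3]

for k in (1, 2, 4):
    print(f"R@{k} = {recall_at_k(toy_similarities, toy_targets, k):.2f}%")
```

On this toy data, one of three queries is a hit at k=1, two at k=2, and all three at k=4, so R@k can only stay the same or grow as k increases.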


Performance comparison on CIRR with respect to R@k (%) and Rsubset@k (%). The overall best results are colored in blue, while the best results over baselines are underlined.


Performance comparison on Shoes with respect to R@k (%). The overall best results are colored in blue, while the best results over baselines are underlined.


Ablation Studies of COMBINER with different components and various settings on FashionIQ, Shoes, and CIRR.


Influence of hyper-parameters ρ, κ, and μ on FashionIQ and CIRR datasets.


Influence of (a) Attribute Prototype Number U and (b) Semantic Cluster Number H on FashionIQ and CIRR datasets.


Case study on (a) FashionIQ, (b) Shoes, (c) CIRR, and failure cases on (d) CIRR and (e) FashionIQ.


Attention Visualizations on (a) Dresses, (b) Shirts, (c) Tops&Tees, and (d) CIRR datasets.


Visualization of Semantic Cluster Neighbors on (a) FashionIQ and (b) CIRR datasets.

BibTeX


Paper BibTeX coming soon.