https://www.ecva.net/papers/eccv_2022/papers_ECCV/papers/136740450.pdf
Abstract
Visible-Infrared Re-Identification (VI-ReID) is a challenging image-retrieval task: the discrepancy between the visible and infrared modalities easily introduces large intra-class variations. Most existing methods either bridge the two modalities through modality-invariant features or generate an intermediate modality for better performance. In contrast, this paper proposes a novel framework, the Modality Synergy Complement Learning Network (MSCLNet) with Cascaded Aggregation. Its basic idea is to synergize the two modalities to construct diverse representations with identity-discriminative semantics and less noise, and then to complement the synergistic representations by exploiting the advantages of each modality. Furthermore, we propose a Cascaded Aggregation strategy for fine-grained optimization of the feature distribution, which progressively aggregates feature embeddings at the subclass, intra-class, and inter-class levels. Extensive experiments on the SYSU-MM01 and RegDB datasets show that MSCLNet outperforms the state of the art by a large margin. On the large-scale SYSU-MM01 dataset, our model achieves 76.99% Rank-1 accuracy and 71.64% mAP.

- Task: Visible-Infrared Re-Identification
- Problem Definition: the modality discrepancy between visible and infrared images, which causes large intra-class variations
- Approach: synergize the two modalities to construct diverse representations with identity-discriminative semantics and less noise, then complement them using each modality's strengths
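The Cascaded Aggregation idea (progressively aggregating embeddings at the subclass, intra-class, and inter-class levels) can be illustrated with a minimal sketch. This is an assumption-laden toy version, not the paper's actual loss: it treats each (identity, modality) group as a subclass, pulls samples toward subclass centers, pulls subclass centers toward identity centers, and pushes identity centers apart by a hypothetical margin.

```python
import numpy as np

def cascaded_aggregation_sketch(feats, ids, modalities, margin=0.3):
    """Illustrative sketch (assumed formulation, not the paper's exact loss):
    aggregate feature embeddings at three cascaded levels."""
    losses = []
    id_centers = []
    for pid in np.unique(ids):
        mask = ids == pid
        sub_centers = []
        for m in np.unique(modalities[mask]):
            sub = feats[mask][modalities[mask] == m]
            c = sub.mean(axis=0)
            # subclass level: pull samples toward their (identity, modality) center
            losses.append(((sub - c) ** 2).sum(axis=1).mean())
            sub_centers.append(c)
        sub_centers = np.stack(sub_centers)
        idc = sub_centers.mean(axis=0)
        # intra-class level: pull modality sub-centers toward the identity center
        losses.append(((sub_centers - idc) ** 2).sum(axis=1).mean())
        id_centers.append(idc)
    id_centers = np.stack(id_centers)
    # inter-class level: push identity centers at least `margin` apart
    diff = id_centers[:, None, :] - id_centers[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    off_diag = ~np.eye(len(id_centers), dtype=bool)
    losses.append(np.clip(margin - dist[off_diag], 0, None).mean())
    return float(np.mean(losses))

# Toy usage: 8 embeddings, 2 identities, 2 modalities (visible=0, infrared=1)
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])
mods = np.array([0, 0, 1, 1, 0, 0, 1, 1])
loss = cascaded_aggregation_sketch(feats, ids, mods)
```

The three terms mirror the cascade described in the abstract: aggregation tightens progressively from subclass to intra-class, while the inter-class term keeps identities separated.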