| Full name | MatchAttention for High-Resolution Cross-View Matching |
| Description | Cross-view matching is fundamentally achieved through cross-attention mechanisms. However, matching of high-resolution images remains challenging due to the quadratic complexity and lack of explicit matching constraints in the existing cross-attention. This paper proposes an attention mechanism, MatchAttention, that dynamically matches relative positions. The relative position determines the attention sampling center of the key-value pairs given a query. Continuous and differentiable sliding-window attention sampling is achieved by the proposed BilinearSoftmax. The relative positions are iteratively updated through residual connections across layers by embedding them into the feature channels. Since the relative position is exactly the learning target for cross-view matching, an efficient hierarchical cross-view decoder, MatchDecoder, is designed with MatchAttention as its core component. To handle cross-view occlusions, gated cross-MatchAttention and a consistency-constrained loss are |
| Parameters | MatchStereo-B, 76M params |
| Publication title | MatchAttention: Matching the Relative Positions for High-Resolution Cross-View Matching |
| Publication authors | Tingman Yan, Tao Liu, Xilian Yang, Qunfei Zhao, Zeyang Xia |
| Publication venue | arXiv, 2025 |
| Publication URL | https://arxiv.org/abs/2510.14260 |
| Programming language(s) | PyTorch, CUDA |
| Hardware | RTX 4090 |
| Source code or download URL | https://github.com/TingmanYan/MatchAttention |
| Submission creation date | 15 Aug, 2025 |
| Last edited | 17 Oct, 2025 |
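
The description states that a relative position determines the attention sampling center for a query, and that sampling around that center is continuous and differentiable. As a rough illustration of why a non-integer center can still be sampled smoothly, the sketch below computes ordinary bilinear interpolation weights over the four pixels surrounding the center. This is standard bilinear interpolation only, not the paper's BilinearSoftmax; the function name `bilinear_weights` and its arguments are hypothetical and do not come from the MatchAttention code base.

```python
import math


def bilinear_weights(qx, qy, rx, ry):
    """Bilinear weights of the four pixels around a continuous sampling center.

    (qx, qy) is the query pixel and (rx, ry) a continuous relative position,
    so the sampling center is (qx + rx, qy + ry). All names are illustrative.
    """
    cx, cy = qx + rx, qy + ry                 # continuous sampling center
    x0, y0 = math.floor(cx), math.floor(cy)   # top-left integer neighbor
    fx, fy = cx - x0, cy - y0                 # fractional offsets in [0, 1)
    return {
        (x0, y0): (1 - fx) * (1 - fy),
        (x0 + 1, y0): fx * (1 - fy),
        (x0, y0 + 1): (1 - fx) * fy,
        (x0 + 1, y0 + 1): fx * fy,
    }


# Query at (10, 5) with relative position (-2.25, 0.5): center is (7.75, 5.5).
weights = bilinear_weights(qx=10, qy=5, rx=-2.25, ry=0.5)
print(weights)
```

The weights always sum to 1 and vary linearly with the fractional part of the relative position, which is what lets gradients flow back to the position itself.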