With the rapid growth of multimodal data on the web, cross-modal retrieval has become increasingly important for applications such as multimedia search and content recommendation. It aims to align ...