
arxiv.org/abs/2501.02669
Preview meta tags from the arxiv.org website.
Linked Hostnames (25)
- 30 links to arxiv.org
- 12 links to info.arxiv.org
- 2 links to huggingface.co
- 2 links to subscribe.sorryapp.com
- 1 link to alphaxiv.org
- 1 link to api.semanticscholar.org
- 1 link to core.ac.uk
- 1 link to dagshub.com
Search Engine Appearance
https://arxiv.org/abs/2501.02669
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
Abstract page for arXiv paper 2501.02669: Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
Bing
Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
https://arxiv.org/abs/2501.02669
Abstract page for arXiv paper 2501.02669: Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
DuckDuckGo

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
Abstract page for arXiv paper 2501.02669: Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
General Meta Tags (21)
- title: [2501.02669] Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
- title: open search
- title: open navigation menu
- title: contact arXiv
- title: subscribe to arXiv mailings
Open Graph Meta Tags (10)
- og:type: website
- og:site_name: arXiv.org
- og:title: Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
- og:url: https://arxiv.org/abs/2501.02669v2
- og:image: /static/browse/0.3.4/images/arxiv-logo-fb.png
Twitter Meta Tags (6)
- twitter:site: @arxiv
- twitter:card: summary
- twitter:title: Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate...
- twitter:description: Vision Language Models (VLMs) are impressive at visual question answering and image captioning. But they underperform on multi-step visual reasoning -- even compared to LLMs on the same tasks...
- twitter:image: https://static.arxiv.org/icons/twitter/arxiv-logo-twitter-square.png
Link Tags (12)
- apple-touch-icon: /static/browse/0.3.4/images/icons/apple-touch-icon.png
- canonical: https://arxiv.org/abs/2501.02669
- icon: /static/browse/0.3.4/images/icons/favicon-32x32.png
- icon: /static/browse/0.3.4/images/icons/favicon-16x16.png
- manifest: /static/browse/0.3.4/images/icons/site.webmanifest
Links (67)
- http://arxiv.org/licenses/nonexclusive-distrib/1.0
- http://gotit.pub/faq
- http://www.bibsonomy.org/BibtexHandler?requTask=upload&url=https://arxiv.org/abs/2501.02669&description=Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?
- https://alphaxiv.org
- https://api.semanticscholar.org/arXiv:2501.02669