A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models Paper • 2307.12980 • Published Jul 24, 2023
Visual Question Decomposition on Multimodal Large Language Models Paper • 2409.19339 • Published Sep 28, 2024
Benchmarking Robustness of Adaptation Methods on Pre-trained Vision-Language Models Paper • 2306.02080 • Published Jun 3, 2023
Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? Paper • 2404.03411 • Published Apr 4, 2024
Stop Reasoning! When Multimodal LLMs with Chain-of-Thought Reasoning Meets Adversarial Images Paper • 2402.14899 • Published Feb 22, 2024