LLaVA-UHD v4: The Definitive Guide to Efficient Visual Encoding in Multimodal Large Language Models