On August 25, Alibaba Cloud launched an open-source Large Vision Language Model (LVLM) named Qwen-VL. The LVLM is based on Alibaba Cloud’s 7-billion-parameter foundational language model, Qwen-7B. In addition to capabilities such as image-text recognition, description, and question answering, Qwen-VL introduces new features including visual location recognition and image-text comprehension, the company said in a statement. These functions enable the model to identify locations in pictures and to provide users with guidance based on information extracted from images, the firm added. The model can be applied in scenarios including image- and document-based question answering, image caption generation, and fine-grained visual recognition. Both Qwen-VL and its visual AI assistant, Qwen-VL-Chat, are currently available free of charge, including for commercial use, on ModelScope, Alibaba’s “Model as a Service” platform. [Alibaba Cloud statement, in Chinese]