Last Monday, Alibaba's Qwen team released a new image-generation artificial intelligence (AI) model. Qwen VLo is the successor to the Qwen 2.5 vision language model and brings significant improvements over previous versions. The new AI image model can generate images from both text and image inputs. It also accepts text prompts in multiple languages, including English and Chinese. Beyond image generation, the AI model can make inline edits to both generated and uploaded images.
Qwen VLo accepts prompts in several languages
The new model was introduced in a post on X (formerly Twitter) by the Qwen team's official account. The model's technical name is Qwen3-235B-A22B, and it can be accessed for free via the company's chat interface. Users can access the model without logging in.
Meet Qwen-VLo, your AI creative engine:
• Concept-to-Polish: Turn rough sketches or text prompts into high-res visuals
• On-the-Fly Edits: Refine product shots, adjust layouts or styles with simple commands
• Global-Ready: Generate image in multiple languages
• Progressive… pic.twitter.com/qFjCQvbAS3
— Qwen (@Alibaba_Qwen) June 27, 2025
Gadgets 360 staff members tested the AI model and found its image generation capabilities comparable to Google's Imagen 2. Its instruction adherence and output quality are marginally inferior to those of Imagen 3 and OpenAI's GPT-4o-powered image generation feature. However, it generates images faster than both and has a higher rate limit.
Aside from image generation and editing, Qwen VLo can perform image annotation tasks including edge detection, segmentation, prediction mapping, and more. The company said that future versions of the model will be able to accept multiple input images and combine them based on user requirements.
Text rendering has also improved in the latest AI image generator. While testing the model, we were able to generate accurate text in several fonts. Finally, Qwen VLo can handle input images with dynamic aspect ratios, including extreme ratios such as 4:1 and 1:3. The company hopes to add the ability to generate images in different aspect ratios soon.