UIGEN-T2 Training

We've recently completed training UIGEN-T2 (just the 7B variant for now), the latest iteration of our UI generation model, built on the Qwen2.5-Coder-7B foundation. Where the previous version was trained on 400 samples, this one was trained on 50,000 carefully curated examples, a 125x increase. The larger dataset covers a far more diverse range of patterns, styles, and interactions than the landing pages the earlier data focused on, and it has significantly improved UIGEN-T2's ability to handle a wider variety of UI components and design nuances. The model can now also generate things like timers and other dynamic, active elements.

One key innovation in UIGEN-T2 is a new method for UI-based reasoning, designed explicitly to scale with the base model's capabilities. We achieved this through a two-step process. First, we trained a smaller, specialized model to generate detailed, clear reasoning traces. Second, we used those traces as instructional examples for UIGEN-T2. For instance, when tasked with creating a stopwatch, the smaller model might produce a reasoning trace like this ("..." marks text we omitted to keep the example short):

<|begin_of_thought|>
When approaching the challenge of crafting an elegant stopwatch UI, my first instinct is to dissect what truly makes such an interface delightful yet functional—hence, I consider both aesthetic appeal and usability grounded in established heuristics like Nielsen’s “aesthetic and minimalist design” alongside Gestalt principles... Starting from layout decomposition, the core elements naturally segment into: display area... control buttons... placing the large digital clock prominently aligns with Fitts’ Law... The glassmorphism effect here enhances visual separation... typography choices—the use of a monospace font family ("Fira Code" via Google Fonts) supports readability... On interaction flow... iconography paired with labels inside buttons provides dual coding... Tailwind CSS v4 enables utility-driven consistency... layering custom styles (glass background blur, gradients, shadows) enriches depth perception... critical reflection concerns responsiveness: flexbox layouts combined with relative sizing guarantee graceful adaptation... In sum, this solution carefully blends modern aesthetics with fundamental usability heuristics... resulting in an intuitive stopwatch experience.
<|end_of_thought|>

This approach builds on our observation that smaller models can effectively teach larger ones on reasoning tasks, which avoids the confusion that often arises from overly complex instructions generated by much larger models.
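As a rough illustration of how a teacher-written trace could be attached to a training sample, here is a minimal sketch. The thought tags are the ones shown in the trace above; the function name `build_training_example`, the prompt/completion layout, and the sample strings are assumptions made for illustration, not the actual UIGEN-T2 data format.

```python
# Tags taken from the reasoning trace shown above.
BEGIN_THOUGHT = "<|begin_of_thought|>"
END_THOUGHT = "<|end_of_thought|>"


def build_training_example(prompt: str, reasoning: str, ui_code: str) -> dict:
    """Combine a reasoning trace from the smaller model with the target UI code.

    Hypothetical helper: the prompt/completion layout is an assumption,
    not the documented UIGEN-T2 training format.
    """
    completion = (
        f"{BEGIN_THOUGHT}\n{reasoning.strip()}\n{END_THOUGHT}\n\n{ui_code.strip()}"
    )
    return {"prompt": prompt.strip(), "completion": completion}


example = build_training_example(
    prompt="Create a stopwatch UI",
    reasoning="Start from layout decomposition: display area, control buttons.",
    ui_code='<div class="flex flex-col items-center">...</div>',
)
print(example["completion"])
```

Keeping the reasoning inside explicit tags means the same delimiters can later be used to separate the model's thinking from the generated markup.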

To make the training process more transparent and easier for the community to build on, we have also published the LoRA (Low-Rank Adaptation) checkpoints from every significant step of UIGEN-T2's training. Each checkpoint was trained with Parameter-Efficient Fine-Tuning (PEFT) at a high LoRA rank of 128, which gives the adapter substantial representational capacity while keeping it efficient to adapt and to use in further research.
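To give a sense of what a rank of 128 means in practice, the following sketch counts the trainable parameters a LoRA adapter adds to a single weight matrix. The rank comes from the text; the hidden size of 3584 and the choice of a square projection are assumptions for illustration only and should be checked against the base model's configuration.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in a LoRA update B @ A for one d_out x d_in weight.

    A has shape (rank, d_in) and B has shape (d_out, rank).
    """
    return rank * (d_in + d_out)


RANK = 128     # LoRA rank stated in the text
HIDDEN = 3584  # assumed hidden size, for illustration only

adapter = lora_trainable_params(HIDDEN, HIDDEN, RANK)
full = HIDDEN * HIDDEN

print(f"adapter parameters: {adapter:,}")
print(f"full matrix parameters: {full:,}")
print(f"adapter / full: {adapter / full:.1%}")
```

Under these assumptions the adapter for one square projection holds roughly 7% of that matrix's parameters, which is large for LoRA but still far smaller than updating the full weights.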

We also improved the user experience directly by refining our chat interface. Previously, users sometimes had to explicitly prompt the model to "think" using specific syntax, which disrupted the conversational flow. The interaction is now more natural: the double-prompt issue is gone, and the model's outputs, guided by its internal reasoning, fit seamlessly into regular conversational usage.

Looking forward, we plan to further optimize UIGEN-T2 using reinforcement learning (RL). We aim to align the model even more closely with practical design principles, using verifiable rewards based on adherence to established heuristics (like Nielsen's heuristics mentioned in the reasoning trace above), precise use of Tailwind CSS classes, and accessibility guidelines (like WCAG considerations). Comprehensive support for Bootstrap-based UIs is also on our roadmap, which should make the model more versatile for developers.
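To make the idea of a verifiable reward concrete, here is a toy sketch of one possible accessibility check: the fraction of `<img>` tags in the generated HTML that carry an `alt` attribute. This is purely hypothetical and is not the planned reward design; the function name `alt_text_reward` and the scoring rule are assumptions, and a real reward would cover many more heuristics.

```python
from html.parser import HTMLParser


class _ImgAltChecker(HTMLParser):
    """Counts <img> tags and how many of them declare an alt attribute."""

    def __init__(self):
        super().__init__()
        self.total = 0
        self.with_alt = 0

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            self.total += 1
            # Presence is enough here: an empty alt is valid for decorative images.
            if any(name == "alt" for name, _ in attrs):
                self.with_alt += 1


def alt_text_reward(html: str) -> float:
    """Hypothetical reward in [0, 1]: share of images with an alt attribute."""
    checker = _ImgAltChecker()
    checker.feed(html)
    if checker.total == 0:
        return 1.0  # nothing to penalize when the page has no images
    return checker.with_alt / checker.total


sample = '<img src="a.png" alt="Logo"><img src="b.png">'
print(alt_text_reward(sample))
```

A check like this is attractive as a reward signal because it can be computed deterministically from the output, without a human or a second model in the loop.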

UIGEN-T2 embodies our commitment to continually pushing the boundaries of UI generation, ensuring the outputs are not only visually compelling but also intuitively designed and user-friendly.