“By separating the input tokens from output tokens, Mu’s one-time encoding greatly reduces computation and memory overhead,” Pradeep said.
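To illustrate the one-time encoding Pradeep describes, here is a minimal toy sketch, not Mu's actual implementation. The functions `encode`, `decode_step`, and `generate` are hypothetical stand-ins with no real model behind them. The sketch only counts how often each stage runs: the input passes through the encoder once, and every generated token then reuses that cached result.

```python
def encode(input_tokens):
    """Stand-in encoder (hypothetical): processes the whole input exactly once."""
    encode.calls += 1
    return tuple(input_tokens)  # placeholder for the encoder's fixed representation

encode.calls = 0


def decode_step(latent, generated):
    """Stand-in decoder step (hypothetical): reads the cached encoding
    plus the tokens generated so far, and returns the next token."""
    decode_step.calls += 1
    i = len(generated)
    # Toy rule: emit the input tokens in reverse order, then stop.
    return latent[-1 - i] if i < len(latent) else "<eos>"

decode_step.calls = 0


def generate(input_tokens, max_steps=16):
    latent = encode(input_tokens)  # one-time encoding of the input
    output = []
    for _ in range(max_steps):
        token = decode_step(latent, output)  # only the decoder runs per token
        if token == "<eos>":
            break
        output.append(token)
    return output
```

Calling `generate(["turn", "on", "dark", "mode"])` runs `encode` once and `decode_step` five times (four tokens plus the stop token), which is the separation of input and output processing the quote refers to.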
The encoder–decoder approach was significantly faster than decoder-only LLMs such as Microsoft’s Phi-3.5. “When comparing Mu to a similarly fine-tuned Phi-3.5-mini, we found that Mu is nearly comparable in performance despite being one-tenth of the size,” Pradeep said.
Those gains are crucial for on-device and real-time applications.
“Managing the extensive array of Windows settings posed its own challenges, particularly with overlapping functionalities,” Pradeep said.