Microsoft Releases Largest 1-Bit LLM, Letting Powerful AI Run on Some Older Hardware
Microsoft researchers claim to have developed the first 1-bit large language model with 2 billion parameters. The model, BitNet b1.58 2B4T, can run on commercial CPUs such as Apple’s M2. “Trained on a corpus of 4 trillion tokens, this model demonstrates how native 1-bit LLMs can achieve performance comparable to leading open-weight, full-precision models of similar size, while offering substantial…
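The key idea behind BitNet b1.58 is constraining weights to the ternary set {-1, 0, +1} (about 1.58 bits per weight), so the matrix multiplies reduce to additions and sign flips. A minimal sketch of the absmean-style ternary quantization, with NumPy standing in for the actual training-time machinery (the function name is illustrative):

```python
import numpy as np

def absmean_ternary_quantize(w: np.ndarray):
    """Quantize a weight matrix to the ternary set {-1, 0, +1}.

    Sketch of the absmean scheme described for BitNet b1.58:
    scale by the mean absolute value, then round and clip.
    """
    gamma = np.abs(w).mean() + 1e-8            # absmean scale factor
    w_q = np.clip(np.round(w / gamma), -1, 1)  # ternary weights
    return w_q.astype(np.int8), gamma

# A forward pass multiplies by the ternary matrix and rescales by gamma,
# which is why such models run cheaply on commodity CPUs.
w = np.random.randn(4, 4)
w_q, gamma = absmean_ternary_quantize(w)
```

This is a sketch of the quantization step only; the paper's contribution is training natively with such weights rather than quantizing after the fact.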
AMD rolls out open-source OLMo LLM, to compete with AI giants – Computerworld
Competitive performance and benchmark success
In internal benchmarks, AMD’s OLMo models performed well against similarly sized open-source models such as TinyLlama-1.1B and OpenELM-1_1B in multi-task and general reasoning tests, the company claimed. Specifically, performance increased by over 15% on GSM8k tasks, a substantial gain attributed to AMD’s multi-phase supervised fine-tuning and Direct Preference Optimization (DPO). In multi-turn…
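DPO, mentioned above, fine-tunes a model directly on preference pairs without a separate reward model: it raises the log-probability margin of the preferred response over the rejected one, relative to a frozen reference model. A minimal sketch of the per-pair loss (log-probabilities here are placeholder inputs, not tied to any specific model):

```python
import math

def dpo_loss(logp_chosen: float, logp_rejected: float,
             ref_logp_chosen: float, ref_logp_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    beta scales the implicit KL-style penalty against drifting
    from the reference policy.
    """
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): small when the policy prefers the
    # chosen response more strongly than the reference does.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

The loss shrinks as the policy's preference margin for the chosen response grows, which is the mechanism behind the fine-tuning gains claimed above.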
Researchers tackle AI fact-checking failures with new LLM training technique – Computerworld
“They could give the model a genetics dataset and ask the model to generate a report on the gene variants and mutations it contains,” IBM explained. “With a small number of these seeds planted, the model begins generating new instructions and responses, calling on the latent expertise in its training data and using RAG to pull facts from external databases…
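The retrieval step in the RAG loop described above can be reduced to a toy sketch: rank documents in an external store by similarity to the query, then prepend the top hits to the model's prompt. The fact strings and function names below are illustrative, and bag-of-words cosine similarity stands in for a real embedding index:

```python
import math
from collections import Counter

# Toy external "database" of facts; contents are illustrative only.
FACTS = [
    "BRCA1 variants are associated with hereditary breast cancer risk.",
    "The TP53 gene encodes the tumor suppressor protein p53.",
    "CFTR mutations cause cystic fibrosis.",
]

def _bow(text: str) -> Counter:
    """Bag-of-words term counts for a lowercase-tokenized string."""
    return Counter(text.lower().split())

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Return the k docs most cosine-similar to the query."""
    q = _bow(query)
    def score(doc: str) -> float:
        d = _bow(doc)
        dot = sum(q[t] * d[t] for t in q)
        norm = (math.sqrt(sum(v * v for v in q.values()))
                * math.sqrt(sum(v * v for v in d.values())))
        return dot / norm if norm else 0.0
    return sorted(docs, key=score, reverse=True)[:k]

# Retrieved facts are prepended to the prompt so the generated report
# is grounded in the external store, not only in parametric memory.
top = retrieve("TP53 tumor suppressor", FACTS)
```

Production systems swap the bag-of-words scorer for a vector index, but the grounding mechanism the article describes is the same.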