
Your PC Can’t Handle Meta’s New Llama AI Model (Probably)


Meta has released Llama 3.3 70B, a modified version of the company’s most powerful AI model that can be downloaded to run on your own hardware. Your PC probably isn’t ready for it, though.


Like many other large language models (LLMs), Meta’s Llama generative AI model is available in several parameter sizes for different use cases. For example, the smallest Llama 3.2 1B model can handle basic tasks with fast performance on the average smartphone, while the larger 11B and 90B versions are more powerful and need higher-end PCs and servers. The Llama models are primarily intended for text and chat functionality, but some versions can understand images too.


Meta’s new Llama 3.3 70B model is supposed to offer the same performance as the company’s largest model, the 405B version, but with the ability to run on more PCs and servers. Meta’s VP of generative AI said in a social media post, “By leveraging the latest advancements in post-training techniques including online preference optimization, this model improves core performance at a significantly lower cost.”


Even though this new 70B model is significantly shrunk down from the original 405B version, you’ll still need a beefy PC or server to run it locally with acceptable performance. The file size is 37.14 GB, and LLMs generally need to fit in RAM to run well, so you’d probably need a machine with 64 GB RAM. You would also need a powerful GPU (or several paired together) for running the model.
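The back-of-the-envelope math behind those numbers is simple: weights take parameters × bits-per-weight ÷ 8 bytes. A quick sketch (the `weight_size_gb` helper and the ~4.25 bits-per-weight figure are illustrative assumptions, not Meta's published specs, though they land close to the 37.14 GB file size):

```python
# Rough sketch: estimate the disk/RAM footprint of an LLM's weights
# at a given quantization level. The function name and the 4.25
# bits-per-weight guess are assumptions for illustration only.

def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Size of the raw weights in gigabytes (1 GB = 1e9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# Llama 3.3 70B at full 16-bit precision: far beyond consumer RAM.
full_precision = weight_size_gb(70, 16)    # ~140 GB

# At roughly 4-bit quantization the estimate lands near the 37.14 GB
# file size mentioned above.
quantized = weight_size_gb(70, 4.25)       # ~37 GB

print(f"16-bit: {full_precision:.0f} GB, ~4-bit: {quantized:.2f} GB")
```

This is also why the article suggests 64 GB of RAM rather than 40 GB: on top of the weights themselves, you need headroom for the operating system, the inference runtime, and the model's context cache.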

The model’s description explains, “Llama 3.3 is intended for commercial and research use in multiple languages. Instruction tuned text only models are intended for assistant-like chat, whereas pretrained models can be adapted for a variety of natural language generation tasks. The Llama 3.3 model also supports the ability to leverage the outputs of its models to improve other models including synthetic data generation and distillation.”


Even though Llama 3.3 70B won’t run on most computing hardware, you can run the smaller 1B, 3B, and 8B models on many desktops and laptops with apps like LM Studio or Nvidia’s Chat With RTX. My 16GB M1 Mac Mini runs Llama 3.1 8B at speeds similar to cloud-based AI chatbots, but I use smaller 3B models with my 8GB MacBook Air, since I have less RAM available.
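Once a model is loaded in LM Studio, its built-in local server speaks an OpenAI-compatible HTTP API, so you can script against it with nothing but the Python standard library. A minimal sketch, assuming the server is running on its default port 1234 and that a model with the ID used below is loaded (both the model name and port are assumptions to check in LM Studio's server tab):

```python
# Minimal sketch of querying a model served locally by LM Studio via its
# OpenAI-compatible chat-completions endpoint. The model ID, port, and
# default temperature are illustrative assumptions.
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def chat(prompt: str,
         model: str = "llama-3.1-8b-instruct",
         url: str = "http://localhost:1234/v1/chat/completions") -> str:
    """Send one chat turn to the local server and return the reply text."""
    payload = json.dumps(build_chat_request(model, prompt)).encode()
    req = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# Example (requires the LM Studio local server to be running):
# print(chat("In one sentence, what is a large language model?"))
```

Because the API shape matches OpenAI's, the same script works against any other local runner that exposes a compatible endpoint.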

You can download Llama 3.3 70B and the other Llama models from Meta’s website, Hugging Face, the built-in search in LM Studio, and other repositories.

Source: TechCrunch
