
Learn the Most Important AI Models
Meta Llama: A Key Player in Open Source AI
Reading time: approx. 12 min
We have explored a range of commercial AI models, as well as DeepSeek as a representative of open source models. Now it is time to focus on one of the most significant and influential players in the open source AI landscape: the Meta Llama family. Meta, the company behind Facebook and Instagram, has taken a leading role in making powerful AI models available to researchers and developers globally, driving innovation and transparency in the industry.
What you will learn
- What distinguishes Meta's Llama models and the unique licensing model.
- The evolution of the Llama family, including the latest Llama 3.3 and Llama 4 families.
- Llama models' capabilities in text generation, reasoning, coding and multimodality.
- Llama's performance in Swedish.
- Important considerations and license terms for using Llama in a school context.
The Basics: What is Meta Llama?
Meta Llama (originally short for Large Language Model Meta AI) is a family of large language models developed by Meta AI. Unlike many commercial models, which are closed and only available via APIs or specific interfaces, Meta has chosen to release its Llama models with open weights. This means that the model's trained parameters (its "brain") can be freely downloaded and inspected.
Llama's Licensing: "Open Weights" but Not Always "Open Source"
However, it is important to clarify the distinction: while the Llama models' weights are open and freely available, the license (the Meta Llama Community License, now in version 4) is proprietary and does not meet the Open Source Initiative's (OSI) strict definition of open source. This is due to specific restrictions, including a threshold for commercial use and certain geographic restrictions that can affect users within the EU.
Meta's strategic move with open weights has still had a huge impact on AI research and development, as it has enabled thousands of innovators to build on, adapt and improve these models.
The Evolution of the Llama Family: From Llama 2 to Llama 4
Meta has continuously developed and refined the Llama family at a rapid pace. Here is an overview of the key versions up to July 2025:
| Version | Released | Sizes / Type | Key Features | Multimodal | Context Window |
|---|---|---|---|---|---|
| Llama 2 | 2023-07 | 7B · 13B · 70B | Base + Chat variants | No | 4k |
| Llama 3.0 | 2024-04 | 8B · 70B | Better reasoning and code | No | 8k |
| Llama 3.1 | 2024-07 | 8B · 70B · 405B | 128k tokens | No | 128k |
| Llama 3.2 (Vision) | 2024-09 | 1B · 3B (text), 11B · 90B (vision) | First image input support, Llama Guard 3 | Yes | 128k |
| Llama 3.3 | 2024-12 | 70B text-only | Comparable quality to 3.1-405B at lower cost | No | 128k |
| Llama 4 Scout | 2025-04 | 17B active (109B tot) MoE | Natively multimodal, runs on one H100, 10 million tokens | Yes | 10M |
| Llama 4 Maverick | 2025-04 | 17B active (400B tot) MoE | GPT-4o class in code & reasoning, 1 million tokens | Yes | 1M |
| Llama 4 Behemoth* | 2025-04† | 288B active (2T tot) MoE | Under training, intended as State-of-the-Art | Yes | TBA |
*Previewed, not publicly downloadable. †Announced April 5, 2025, model to be completed later in 2025.
Strengths: What is Meta Llama Good At?
General language understanding and generation: Llama models, especially from Llama 3 onwards, are highly capable across a broad set of language-based tasks. They can generate creative text, summarize information, answer questions and follow complex instructions. According to Meta, Llama 3 outperforms, for example, Mixtral 8x22B on the MMLU benchmark.
- Practical example: "Write a short speech about the importance of source criticism for students in grade 9" or "Summarize the main arguments for and against nuclear power."
Reasoning and coding: The Llama 3 family has proven strong in logical reasoning, problem-solving and code generation, making the models useful for STEM subjects and computer science.
- Practical example: "Suggest an algorithm to sort a list of numbers" or "Explain how photosynthesis works in detail, including the chemical formulas."
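Prompts like those above can be sent to a locally hosted Llama model. The sketch below builds a request payload for Ollama's `/api/chat` endpoint; the model tag and prompt are illustrative, and actually sending the request requires a running Ollama installation (the commented-out part at the end):

```python
import json

def build_chat_request(model: str, prompt: str, temperature: float = 0.7) -> dict:
    """Build a JSON payload for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete reply instead of a token stream
        "options": {"temperature": temperature},
    }

payload = build_chat_request(
    "llama3.3:70b",  # illustrative model tag; use whatever tag you have pulled
    "Write a short speech about the importance of source criticism "
    "for students in grade 9.",
)
print(json.dumps(payload, indent=2))

# To actually send the request (requires a running Ollama server on the
# default port 11434):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:11434/api/chat",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["message"]["content"])
```

Separating payload construction from the network call makes the prompt logic easy to test and reuse across different local runtimes.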
Native Multimodality (Llama 4 family): Llama 4 Scout and Maverick are Meta's first natively multimodal models, meaning they are designed from the ground up to understand and process information from images and other modalities beyond text. Llama 3.2's vision models were an early step in this direction.
- Practical example (with Llama 4): "Describe what is happening in this image of a historical event and suggest discussion questions for students."
Extremely long context windows (Llama 4 Scout): Llama 4 Scout stands out with an enormous context window of approximately 10 million tokens, enabling processing of extremely large amounts of information. Llama 3.1 and 3.2 also have a significant context window of 128,000 tokens.
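To make these context sizes concrete, here is a back-of-the-envelope calculation. It assumes roughly 4 characters per token and roughly 2,000 characters per A4 page; both are common rules of thumb, not exact figures:

```python
# Rough estimate of how much text fits in a given context window.
CHARS_PER_TOKEN = 4    # rule-of-thumb average for English-like text
CHARS_PER_PAGE = 2000  # rule-of-thumb characters on an A4 page

def approx_pages(context_tokens: int) -> int:
    """Approximate number of A4 pages that fit in a context window."""
    return context_tokens * CHARS_PER_TOKEN // CHARS_PER_PAGE

for name, tokens in [("Llama 3.1 (128k)", 128_000),
                     ("Llama 4 Maverick (1M)", 1_000_000),
                     ("Llama 4 Scout (10M)", 10_000_000)]:
    print(f"{name}: ~{approx_pages(tokens):,} pages")
```

By this rough measure, 128k tokens already holds a couple of hundred pages, while Scout's 10 million tokens corresponds to tens of thousands of pages, enough for an entire course library in a single prompt.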
Swedish Language Handling
Llama models, especially from Llama 3 onwards, are trained on massive datasets with broad representation of languages beyond English. According to Meta's own benchmarks, Llama 3.3 70B has improved multilingual capability and beats Mixtral 8x22B on XS-QuAD Swedish. An external pilot study reports good accuracy in sentiment analysis for Llama 2 in Swedish (97% precision). This indicates that Llama 3 and its successors can understand and generate Swedish text with high quality for most common tasks, though they may struggle with very specific cultural nuances or dialects.
Important Considerations for School Environments (License, Security and Responsibility)
Using Llama models in schools involves specific considerations, especially around licensing and implementation:
1. License Terms and Availability for Schools
Llama models are released under the Meta Llama Community License, now in version 4 (dated 2025-04-05). For schools considering using Llama commercially (even if it is for non-profit educational purposes), it is important to note the following:
- MAU threshold (Monthly Active Users): The license has a threshold of 700 million MAU. Organizations that exceed this need a separate, custom-written license from Meta. Although few schools will reach this level, it is an important limitation to be aware of for larger educational networks.
- Geographic restrictions (EU): The Llama 4 license may still contain clauses that exclude EU users from certain types of commercial use or require specific approvals. Schools within the EU should therefore seek separate clearance or legal advice before implementation.
- Attribution: If Llama models are used in public applications or services, the license requires that you display attribution ("Built with Meta Llama 3" or "Built with Meta Llama 4").
2. Technical Competence and Infrastructure
- Implementing and managing Llama models locally requires significant technical competence and powerful infrastructure.
- Llama 3 8B can run on an RTX 4090 (24 GB) in 4-bit quantization.
- Llama 3 70B requires at least 2x80 GB GPUs or a cloud service with more than 120 GB of GPU memory.
- Llama 3.1-405B and Llama 3.2 90B Vision models need H100 clusters or equivalent. Local operation is often unrealistic for school IT without specialized resources.
- Llama 4 Scout can run on one H100 in int4 quantization. Llama 4 Maverick requires 4xH100 or equivalent.
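The memory figures above follow from simple arithmetic: parameter count times bytes per weight, plus overhead for activations and the KV cache. Below is a rough estimator sketch; the 20% overhead factor is an assumption, and real requirements vary with context length and inference runtime:

```python
def approx_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Rough VRAM estimate in GB: parameters x bytes per weight,
    times an assumed ~20% overhead for activations and KV cache."""
    bytes_per_param = bits / 8
    return round(params_billion * bytes_per_param * overhead, 1)

for model, params, bits in [
    ("Llama 3 8B, 4-bit quantized", 8, 4),       # fits a 24 GB RTX 4090
    ("Llama 3 70B, 16-bit", 70, 16),             # needs multi-GPU
    ("Llama 4 Scout, 109B total, int4", 109, 4), # fits one 80 GB H100
]:
    print(f"{model}: ~{approx_vram_gb(params, bits)} GB")
```

Note that for MoE models like Llama 4 Scout, all 109B parameters must be held in memory even though only 17B are active per token, which is why the total (not active) count drives the estimate.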
3. Data Protection and Control (GDPR)
- If the model runs completely locally on the school's own servers, it can provide a higher degree of control over data protection, as no information leaves the school's network. This can be a strong argument from a GDPR perspective, but requires that the school's IT department has full control over implementation, security and GDPR compliance, including regular updates and security reviews.
- A thorough DPIA (Data Protection Impact Assessment) should be performed before pilot operation with Llama.
4. Security and Moderation: Llama Guard 4
- Meta has developed Llama Guard 4 (a 12B, multimodal model) as a safety classifier; it has replaced Llama Guard 3 as the standard moderator. Schools or developers building applications on top of Llama can integrate it to filter out generation of unsafe or unwanted content.
- This is an important tool for adding "guardrails" in AI applications, but requires active implementation by the school/developer. Open source models generally do not have the same built-in, strict safety filters and moderation as the major commercial cloud-based services "out-of-the-box".
Practical Examples in the Classroom
- Create interactive learning resources: Teachers with technical knowledge can use Llama models to build their own local AI assistants that can answer questions about a specific subject (through fine-tuning with course material) or generate practice tasks.
- Programming projects: Students at higher levels can experiment with downloading and running Llama models as part of a project in computer science or machine learning.
- Language and writing development: Use Llama to generate text drafts for different purposes, summarize articles or get alternative formulations, keeping in mind the data protection aspects that apply with local operation.
Next Steps
Now that you have a complete overview of both commercial and open source models, it is time for the next module: Choosing the Right AI Model. We will summarize the most important guidelines for strategically choosing the right AI for different pedagogical and administrative needs.
