What is a Large Language Model (LLM)?

A Large Language Model is a type of artificial intelligence that has been trained on a massive amount of text data. This training allows it to understand the patterns, grammar, and context of human language. Its core function is to predict the next most likely word in a sequence, which enables it to generate human-like text, answer questions, translate languages, and much more.
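The next-word idea can be sketched with a toy bigram model. This is not how a real LLM works internally (those use neural networks with billions of parameters), but the underlying principle of predicting the most likely next word from observed patterns is the same. The tiny corpus here is invented for illustration.

```python
from collections import Counter, defaultdict

# Toy illustration: learn next-word frequencies from a tiny corpus,
# then predict the most likely next word.
corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each other word (bigram counts).
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the word seen most often after `word` in the corpus."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" follows "the" most often here
```

An LLM does something conceptually similar at vastly greater scale, using context from the whole prompt rather than just the previous word.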

Te Ao Māori Perspective

In the Māori worldview, language (te reo) is considered a taonga (treasure). As we explore AI language models, we must consider how they handle indigenous languages and cultural knowledge responsibly.

Key Concepts

Training Data

LLMs are trained on vast datasets from the internet, books, and other sources. This is how they "learn" language patterns and knowledge. However, this is also the source of potential bias, as the training data may not represent all perspectives equally.

Critical Question: What voices might be missing from these datasets?

Parameters

These are the internal values the model learns during training and then uses to predict what word comes next. Models are often measured by the number of parameters they have, ranging from millions to hundreds of billions.

Think About: More parameters don't always mean better performance!
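To see where parameter counts come from, here is a rough sketch using a made-up miniature configuration (real LLM architectures also include attention layers, normalisation, and many stacked blocks; all the numbers below are illustrative assumptions, not from any actual model).

```python
# Hypothetical sizes for a tiny model, chosen only for illustration.
vocab_size = 50_000   # how many distinct tokens the model knows
embed_dim = 512       # size of the vector representing each token
hidden_dim = 2048     # size of one feed-forward layer

embedding_params = vocab_size * embed_dim         # token lookup table
ffn_params = embed_dim * hidden_dim + hidden_dim  # weights + biases

total = embedding_params + ffn_params
print(f"{total:,} parameters")  # 26,650,624 for this toy setup
```

Even this toy setup reaches tens of millions of parameters; scaling the same idea up across many layers is how models reach the billions.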

Prompting

This is the input you give to the model: your question or instruction. The quality and clarity of the prompt significantly affect the quality of the output you receive.

Practice: Clear, specific prompts yield better results.
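One way to practise specificity is to build prompts from explicit parts: topic, audience, and format. The helper below is a hypothetical sketch, not part of any real API; it simply shows how much more a model has to work with when those details are spelled out.

```python
def build_prompt(topic, audience, sentences):
    """Assemble a specific prompt from its parts (illustrative helper)."""
    return (f"Explain {topic} in {sentences} sentences "
            f"for {audience}, and include one everyday example.")

vague = "Tell me about LLMs."
specific = build_prompt("large language models", "a Year 10 student", 3)
print(specific)
```

Compare `vague` with `specific`: the second tells the model who the answer is for, how long it should be, and what kind of evidence to include.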

Understanding How LLMs Work

  1. Text Input: You type a question or prompt.
  2. Tokenization: The model breaks your text into smaller pieces called tokens.
  3. Processing: The model uses its parameters to understand context and meaning.
  4. Generation: It predicts and generates the most likely response.
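The tokenization step above can be sketched with a much-simplified tokenizer. Real LLMs use subword methods such as byte-pair encoding, which can split a single word (especially one under-represented in training data, such as a te reo Māori word) into several tokens; this version just separates words from punctuation.

```python
import re

def simple_tokenize(text):
    """Split text into word and punctuation tokens (simplified sketch)."""
    return re.findall(r"\w+|[^\w\s]", text)

print(simple_tokenize("Kia ora! What is an LLM?"))
# ['Kia', 'ora', '!', 'What', 'is', 'an', 'LLM', '?']
```

Notice that even tokenization raises an equity question: languages with less training data tend to be split into more, smaller tokens, which can affect how well the model handles them.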

Critical Considerations

🤔 Bias and Fairness

LLMs can perpetuate biases present in their training data. This is particularly important when considering representation of indigenous knowledge and perspectives.

🌍 Environmental Impact

Training and running large models requires significant computational resources and energy consumption.

🎭 Cultural Sensitivity

How well do these models understand and respect different cultural contexts, especially indigenous knowledge systems?

🔒 Privacy and Data

What happens to the data you share with these models? Understanding data sovereignty is crucial.

Reflection Questions

  1. How might the training data of an LLM affect its understanding of Māori perspectives and knowledge?
  2. What are the potential benefits and risks of using LLMs in education?
  3. How could we ensure that indigenous voices and knowledge are properly represented in AI systems?
  4. What questions would you ask before trusting information from an AI model?

Curriculum Connections

Digital Technologies

  • Understanding digital systems and their impact
  • Exploring artificial intelligence concepts
  • Critical evaluation of digital tools

Social Sciences

  • Understanding societal impact of technology
  • Exploring issues of equity and representation
  • Critical thinking about information sources