Article 5 in the series — AI Governance for Communities

Village AI as a Situated Language Layer


April 2026 · My Digital Sovereignty

This is Article 5 in the AI Governance for Communities series. Start with Article 1: What Is AI, Really? for foundational context, or read Article 4: A Smaller Room Than You Think for the preceding article.

1. Introduction: From Generic AI to Situated Language Layers

There is a growing consensus among community and nonprofit leaders that artificial intelligence is too important to ignore and too consequential to adopt uncritically. The question is no longer whether to engage with AI, but how — and on whose terms.

The dominant model is straightforward: subscribe to a cloud service, feed it your documents, and receive answers. It is fast, often impressive, and increasingly cheap. It is also a model in which the community surrenders control over its knowledge, its values, and its data to organisations whose interests are structurally misaligned with community flourishing.

This article introduces an alternative: the Situated Language Layer (SLL) — a small, locally-trained language model that runs on community-controlled infrastructure. The term "situated" is deliberate. In philosophy, situated knowledge refers to understanding that emerges from a particular context, shaped by specific relationships, histories, and values. A Situated Language Layer is AI that knows where it is, whom it serves, and what it should not do — because the community that trained it made those decisions explicitly.

The word "small" in this context is not an apology. It is a design choice with governance implications. A model small enough to run on modest hardware is a model the community can actually control. A model trained on community content, with community consent, under community governance, is a model whose behaviour the community can inspect, adjust, and hold accountable. These are not limitations to be overcome on the way to something bigger. They are the features that make community-sovereign AI possible.

This is Article 5 in a series on AI governance for communities. The previous article — A Smaller Room Than You Think — asked who is deciding what AI becomes and argued that the sovereignty and plural-governance traditions deserve to be in the room alongside the longtermist and technocratic traditions that currently dominate. This article is the worked example: what it looks like when a community actually builds its own AI infrastructure, on its own terms, and what it has to withstand in a world where the threat landscape is shifting faster than most of us can follow.

What follows is not a technical manual. It is a governance document — written for the people who will decide whether, how, and under what conditions their community engages with artificial intelligence.

2. Design Principles of Village AI

Before examining architecture or implementation, it is necessary to establish the principles that govern every design decision. These are not aspirational statements to be revisited later. They are constraints that shape what the system can and cannot become.

Community Primacy

The community's needs and values come first, always. Commercial AI products are designed to serve the vendor's business model, with community benefit as a welcome side effect. The alignment target is user engagement, subscription retention, or data collection. Community flourishing is at best a proxy for these objectives.

Village AI inverts this hierarchy. The alignment target is the community's own definition of flourishing, expressed through its constitutional governance processes. If the community values unhurried reflection over rapid response, the AI is configured accordingly. There is no hidden objective function pulling the system toward someone else's definition of success.

Taonga-Centred

In te reo Māori, taonga means a treasure — something precious that carries obligations of guardianship. Te Mana Raraunga, the Māori Data Sovereignty Network, articulates a principle that resonates far beyond its indigenous context: community data is taonga, to be guarded and stewarded, not a resource to be mined.

Village AI treats every piece of community content — every story shared, every photograph uploaded, every meeting minute recorded — as taonga, with all the obligations of guardianship that status carries.

Transparency and Contestability

Every AI decision can be questioned and explained. This principle has two dimensions.

Transparency means the community can inspect what the model knows, how it was trained, what data influenced its responses, and why it behaves as it does. There are no proprietary black boxes. The training data is auditable. The governance boundaries are documented. The model's knowledge can be reviewed by anyone with appropriate access.

Contestability means that when the AI produces a response that seems wrong, culturally inappropriate, or misaligned with community values, there is a clear process for challenging it and a mechanism for correction. The AI does not have the last word. Human judgment always supersedes algorithmic output.

Resilience Over Scale

Given a choice between a system that is more capable but requires a constant internet connection, and a system that is less capable but works independently on community-controlled hardware, Village AI chooses the latter.

Communities that rely on cloud services are one terms-of-service change, one price increase, or one geopolitical event away from losing access to their AI capabilities. Communities that run their own infrastructure face different challenges — maintenance, updates, capacity — but these are challenges they can address on their own terms.

Resilience also means the system degrades gracefully. If the AI is temporarily unavailable, the community platform continues to function. No critical community function depends on the AI being operational.

3. Architecture Overview

This section describes the architecture of Village AI at a governance level. The goal is not to explain how to build such a system but to explain what governance decisions are embedded in each architectural layer and why.

Base Models: Starting from Open Foundations

Village AI begins with open-weight foundation models — large language models whose internal structure is publicly available for inspection. The current production foundation is Alibaba's Qwen2 model family (14 billion parameters), chosen after extensive evaluation for three governance-relevant reasons: inspectability (the model weights and architecture can be examined by auditors and governance bodies), independence (it runs on any compatible hardware, not locked to one vendor), and modifiability (it can be fine-tuned to reflect community values without the developer's permission or the developer's knowledge).

The choice of foundation model is itself a governance decision, and one that has already been revised in practice. Village AI initially used Meta's Llama family before migrating to Qwen2 on the basis of superior multilingual performance — particularly for te reo Māori and for the European languages the platform supports. Not all open-weight models carry the same licence terms, safety properties, or biases, and the choice must be revisited as the landscape evolves.

The Value and Policy Layer

On top of the open-weight foundation, Village AI adds a fine-tuned layer that embeds community-specific knowledge, the community's values, and the behavioural boundaries the community has set.

This layer is produced through parameter-efficient fine-tuning, a technique that adjusts a small fraction of the base model's parameters rather than retraining the entire model. The result is a lightweight adapter — a thin layer of community-specific knowledge and values that sits on top of the unchanged foundation. Because the adapter is small and separate, it can be audited in full, retrained quickly when consent is withdrawn, and removed or replaced without touching the base model. These governance advantages follow directly from the design.
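The adapter idea can be illustrated with a minimal low-rank update, the mechanism behind techniques such as LoRA. Everything below is a toy sketch, not Village's training code: the matrices, their sizes, and the scale parameter are invented for illustration.

```python
# Toy sketch of a low-rank adapter: the frozen base weight is never
# modified; only the small matrices A and B are trained.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def adapted_weight(base, A, B, scale=1.0):
    """Return base + scale * (B @ A). `base` stays untouched (frozen)."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(base, delta)]

# A 2x2 base weight and a rank-1 adapter (B is 2x1, A is 1x2).
base = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5], [0.25]]
A = [[0.2, 0.4]]

W = adapted_weight(base, A, B)
# Removing the adapter is simply using `base` again: the foundation
# model is unchanged, which is what makes the layer swappable.
```

At real scale the governance point is the ratio: an adapter trains and stores a few million parameters against a 14-billion-parameter base, which is what makes full audits and rapid retraining after consent withdrawal feasible.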

Village-Specific Layers

Each Village — each tenant community on the platform — can have its own adapter. A parish community's AI assistant knows different things and observes different cultural protocols than a neighbourhood association's, even though both run on the same infrastructure. This per-community customisation reflects a foundational commitment: there is no single correct way for an AI to behave.

Data Sovereignty by Design

Community content stays on community infrastructure. This is not a policy that could be changed through a settings menu. It is an architectural constraint.

Interfaces: AI Integrated into Community Life

Village AI is not a standalone chatbot. It is integrated into the community platform — available where stories are shared, photographs organised, events planned, and documents managed. The AI understands content not as abstract text but as living knowledge with relationships, history, and significance.

Multi-Model Routing: Per-Community Specialisation

Village AI does not run a single model. It deploys per-product-type specialised models — each fine-tuned on the specific vocabulary, governance structures, and cultural context of its community type. In production today, Village operates nine specialised models: separate SLL adapters for whānau, episcopal, community, family, and business product types, with four further specialisations (conservation, diaspora, clubs, alumni) triggered only when the first real tenant of that type exists. Each runs on a shared 14B Qwen2 base.

A 14B community model serves as the fallback for any product type without its own specialisation. Routing is handled by an InferenceRouter that selects the correct model based on the requesting tenant's product type — a governance design, not just a performance optimisation. Each community gets the model trained on content most like its own, not a one-size-fits-all general assistant.

The routing infrastructure also supports graceful degradation: if the primary GPU endpoint is unreachable, the system falls back to a CPU-served model at reduced quality rather than failing silently. The community is always told when this happens — transparency about capability limitations is itself a governance commitment.
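The routing and fallback behaviour described above can be sketched in a few lines. The adapter names, product-type keys, and endpoint interface here are invented for illustration; this is a sketch of the pattern, not Village's actual implementation.

```python
SPECIALISED = {  # product type -> specialised adapter (hypothetical names)
    "whanau": "sll-whanau-14b",
    "episcopal": "sll-episcopal-14b",
    "community": "sll-community-14b",
    "family": "sll-family-14b",
    "business": "sll-business-14b",
}
FALLBACK_MODEL = "sll-community-14b"  # the 14B community model as default

class InferenceRouter:
    """Route each tenant's requests to the model trained on content
    most like its own, with honest degradation when the GPU is down."""

    def __init__(self, gpu_endpoint, cpu_endpoint):
        self.gpu = gpu_endpoint  # callables: (model, prompt) -> text
        self.cpu = cpu_endpoint

    def select_model(self, product_type):
        # Product types without their own specialisation fall back
        # to the general community model.
        return SPECIALISED.get(product_type, FALLBACK_MODEL)

    def generate(self, product_type, prompt):
        model = self.select_model(product_type)
        try:
            return {"text": self.gpu(model, prompt), "degraded": False}
        except ConnectionError:
            # Graceful degradation: CPU-served model at reduced quality.
            # The `degraded` flag lets the interface tell the community,
            # because transparency about capability limits is itself a
            # governance commitment.
            return {"text": self.cpu(model, prompt), "degraded": True}
```

In this sketch the two endpoints are plain callables; a real system would wrap network clients. The governance point is carried by the structure: deterministic per-tenant selection, plus an honest degraded flag rather than silent failure.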

4. Governance Framework

Technology without governance is a loaded weapon. Village AI embeds governance at every level.

Governing Body

Each Village designates a governing body responsible for AI governance — typically the existing governance structure (a committee, board, or council of elders) rather than a new body. AI governance should be integrated into existing processes, not siloed in a technical committee.

Responsibilities include: approving the scope of AI training, setting behavioural boundaries, reviewing AI performance, and authorising updates.

Charter or Covenant

The relationship between the community and its AI assistant is formalised in a charter or covenant — a governance document, written in plain language, accessible to all members, and subject to the community's normal processes for amendment. It covers: the purposes for which the AI may be used; the data sources from which it may learn; the boundaries it must observe; the consent mechanisms for members whose content contributes to training; the process for reporting concerns and seeking redress; and the conditions under which the AI may be suspended or decommissioned.

Decision-Making: Community Governance of AI Evolution

AI systems change over time. In commercial products, these changes are made by the vendor, often without notice. In Village AI, changes to the AI's knowledge, behaviour, or capabilities go through the community's governance processes. AI evolution is democratic, not technocratic.

Consent and Participation

Members have meaningful rights regarding AI training on their content. Consent is opt-in (the default is exclusion), granular (members can consent to some uses but not others), revocable (withdrawal triggers retraining without that content), and informed (clear, non-technical explanations of what training means).
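These four consent properties can be made concrete in a small sketch. All names here are hypothetical, and a real registry would be persistent and tied to identity; the point is only how opt-in defaults, granularity, and revocation-triggered retraining fit together.

```python
from collections import defaultdict

class ConsentRegistry:
    """Opt-in, granular, revocable consent for AI training uses."""

    def __init__(self):
        self._grants = defaultdict(set)  # member -> set of consented uses
        self.retraining_queue = []       # withdrawals that require retraining

    def grant(self, member, use):
        # Granular: consent is given one use at a time, never in bulk.
        self._grants[member].add(use)

    def revoke(self, member, use):
        self._grants[member].discard(use)
        # Revocable: withdrawal triggers retraining without that content.
        self.retraining_queue.append((member, use))

    def may_train_on(self, member, use):
        # Opt-in: a member who never granted anything is excluded by default.
        return use in self._grants[member]
```

The "informed" property lives outside the code: it is the plain-language explanation shown to a member before grant() is ever called.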

Accountability and Redress

When AI causes harm — and eventually it will — the governance framework provides clear pathways: incident reporting (any member can flag harmful, misleading, or culturally inappropriate outputs), investigation (determining root cause), remediation (retraining, boundary adjustment, or capability suspension), and transparency (incident reports documented and available to members). Trust is built through accountability, not opacity.

5. Alignment and Safety in a Village Context

The AI safety conversation is dominated by large-scale concerns: existential risk, autonomous weapons, mass unemployment. These are legitimate concerns, but they are not the concerns most relevant to community organisations. Village AI addresses alignment and safety at a different scale — the scale of a parish, a neighbourhood association, a cultural preservation group.

Alignment Targets: What Should the AI Optimise For?

Commercial AI products are typically aligned to "helpfulness" — in practice, producing responses that users rate positively, which biases them toward quick answers over careful reflection and engagement over understanding. Village AI recognises different alignment targets, defined by the community itself rather than inferred from user ratings.

These are not metrics to be optimised. They are orientations that guide design decisions, training choices, and governance boundaries.

Hard Red Lines

Some things AI must never do, regardless of context, community preferences, or individual requests. These are embedded as architectural constraints, not as guidelines that might be overridden.

Resisting Drift Toward Global-Internet Norms

One of the subtlest risks of AI systems is value drift — the gradual adoption of assumptions embedded in training data that may conflict with community values. Base models carry implicit assumptions reflecting the demographics of the internet's most prolific contributors, not the values of any particular community.

Village AI addresses this through explicit counter-training. When the base model assumes efficiency is always desirable, fine-tuning can teach it that in this community, thoroughness matters more. When it assumes direct communication is always best, fine-tuning can teach it that indirect approaches reflect respect rather than evasion. This is not censorship. It is ensuring the AI reflects the community's values rather than the internet's values, which are often the values of nobody in particular.
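One plausible form counter-training data could take is a preference pair, the format used by preference-optimisation methods such as DPO. The example below is invented for illustration; the field names follow the common prompt/chosen/rejected convention, not a documented Village format.

```python
# Hypothetical counter-training example: teach the model that, in this
# community, thoroughness is preferred over speed.
counter_examples = [
    {
        "prompt": "A member asks us to shorten the consultation period. "
                  "Draft a reply.",
        "rejected": "Agree and compress the timeline so a decision is "
                    "reached quickly.",
        "chosen": "Acknowledge the request, but explain that this community "
                  "values unhurried reflection and propose keeping the full "
                  "consultation period.",
    },
]

def is_valid_pair(example):
    """A usable pair needs all three fields and two distinct responses."""
    return ({"prompt", "chosen", "rejected"} <= set(example)
            and example["chosen"] != example["rejected"])
```

Each pair is itself a governance artefact: it records, in reviewable form, a specific internet-default assumption the community has chosen to override.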

Value Pluralism in Practice

Following Isaiah Berlin's insight that genuine human goods are fundamentally plural and often incommensurable, Village AI does not resolve value conflicts algorithmically. When privacy and transparency pull in different directions, when individual autonomy and collective responsibility cannot be fully reconciled, the AI surfaces the conflict rather than pretending it does not exist.

The AI might say: "This situation involves a tension between two things your community values. Here is what each value would suggest. This is a decision for your community to make." Harder and slower than an AI that picks the answer most likely to satisfy the user — but more respectful of the complexity of human life and more protective of human agency.

6. Comparison with General LLMs

The following table summarises the structural differences between a Situated Language Layer and a general-purpose cloud LLM. These are differences of kind, not degree.

Data sovereignty
Village AI / SLL: Community content stays on community infrastructure. No data flows to external systems without collective consent.
General cloud LLM: User inputs are processed on vendor infrastructure. Data retention and use policies are set by the vendor and may change.

Alignment source
Village AI / SLL: Community values, expressed through governance processes and encoded in training.
General cloud LLM: The vendor's alignment team, informed by user feedback but ultimately reflecting corporate values and risk tolerance.

Accountability
Village AI / SLL: A governing body within the community, with clear incident reporting and remediation processes.
General cloud LLM: The vendor's trust and safety team. Users can report issues but have no decision-making role in resolution.

Flexibility to community needs
Village AI / SLL: Per-community adapters allow each Village to have distinct knowledge, values, and behaviour.
General cloud LLM: One model serves all users. Customisation is limited to system prompts and usage policies.

Risk profile
Village AI / SLL: Lower capability ceiling, but risks are local and manageable; the community can inspect and correct.
General cloud LLM: Higher capability, but risks scale globally; individual communities have no ability to inspect or correct.

Cultural specificity
Village AI / SLL: Trained on community content under community cultural protocols.
General cloud LLM: Trained on internet-scale data reflecting dominant cultural perspectives.

Cost model
Village AI / SLL: Infrastructure cost borne by the community or a consortium, with no per-query fees to external vendors. Predictable.
General cloud LLM: Subscription or per-query pricing set by the vendor, subject to change. Free tiers may subsidise data collection.

Exit strategy
Village AI / SLL: The community owns the model, the training data, and the infrastructure, and can switch providers or self-host at any time.
General cloud LLM: Switching vendors means starting over. Custom configurations, conversation history, and institutional knowledge may not be portable.

7. "Not All LLMs Are the Same" — Differentiating Models

At a recent governance webinar, a presenter made a claim that is becoming disturbingly common: "Perplexity, Claude, GPT, Copilot, Gemini, DeepSeek — they're all about the same."

This claim is not merely wrong. For governance purposes, it is dangerous.

The consumer-facing products look similar: a text box, a conversational interface, plausible-sounding responses. But from a governance perspective — the perspective of someone responsible for their community's data and their members' privacy — the differences are profound.

What follows is an eight-dimension framework for evaluating AI language models from a governance perspective.

Dimension 1: Core Architecture and Capabilities

Model size affects depth of reasoning and quality of output. Context window (the amount of text processed at once) determines whether the model can work with long documents. Some models handle images, audio, or video. Some engage in extended reasoning; others work more quickly but less deeply.

A model with a small context window cannot summarise a year's worth of committee minutes. Capability matters — but it is only the first dimension, not the only one.

Dimension 2: Pre-training Data and Data Alignment

Every model is shaped by its training data. Models trained primarily on English-language internet content will reflect Western assumptions about family structure, social organisation, and values. They may handle te reo Māori poorly. They may misunderstand Pacific cultural contexts.

For community organisations: is your community's context represented in the training data? If not, responses will default to whatever the data suggests is "normal" — which may be foreign to your community's experience.

Dimension 3: Alignment, Safety, and Behaviour

How a model is aligned determines how it handles sensitive topics and value-laden questions. Some vendors use Reinforcement Learning from Human Feedback (RLHF), where human raters evaluate outputs. Some use constitutional approaches, training against explicit principles. Some combine methods.

The alignment approach determines whose values the model reflects. A model aligned by raters in one cultural context handles topics differently than one aligned in another. For governance purposes: do you know how this model was aligned, by whom, and to what values? If not, you are trusting behaviour to people you cannot identify, operating under values you cannot inspect.

Dimension 4: Task Performance and Evaluation Results

Not all models are equally good at all tasks. Standardised benchmarks provide some guidance, but they measure what benchmark designers thought was important. The most relevant evaluation is task-specific: can this model help with what your community actually needs? This requires testing with your own content, against your own standards.

Dimension 5: System and Product Layer

The model itself is only one layer. The product built around it — the interface, the data collection, the guardrails — adds governance-relevant decisions. Some products collect extensive metadata about interactions. Some add guardrails that may conflict with community needs — blocking topics your community legitimately needs to discuss, or permitting outputs your community would find harmful.

The same underlying model can appear in products with very different governance characteristics. Understanding which layer is responsible for which behaviour is essential for governance decision-making.

Dimension 6: Governance, Security, and Sovereignty Posture

This is the most important dimension for community organisations.

Where does the model run? If it runs on the vendor's cloud infrastructure, your data leaves your control every time a member asks a question.

Who has access to your data? The vendor's employees, their subcontractors, their cloud providers, and potentially their government under national security provisions.

What jurisdiction governs it? If the vendor is based in the United States, your data is subject to the CLOUD Act, which allows US authorities to compel disclosure of data held by US companies regardless of where the data is physically stored. If the vendor is based in China, your data is subject to Chinese national security laws with similar implications.

What does the vendor do with your interactions? Many vendors use customer interactions to improve their models. This means your community's questions, your members' concerns, your governance discussions may become training data for a model that serves millions of other users — including your competitors, your critics, and people whose values differ fundamentally from your community's.

For community organisations that handle sensitive information — personal stories, governance deliberations, member concerns, cultural knowledge — the sovereignty posture of an AI provider is not a secondary consideration. It is the primary one.

Dimension 7: Economics and Operational Characteristics

Free tiers often subsidise costs through data collection — if you are not paying for the product, your data may be the product. Subscription models provide predictability but may change terms unilaterally. Per-query pricing creates budgetary uncertainty.

Vendor lock-in is a particular concern. If your community builds workflows and member expectations around a specific service, switching costs escalate over time. The best governance position is one where switching remains feasible.

Dimension 8: Ecosystem and Specialisation

Can the model be fine-tuned with your own data? Can it be integrated into existing tools? Is there a community of practitioners sharing governance-relevant experience? Does the vendor's long-term strategy align with your community's interests?

Why Dimension 6 Matters Most

A model can be technically superior — faster, more capable, more knowledgeable — and still be unacceptable on governance grounds for your community. If your data is subject to foreign government access, if your interactions train a model you do not control, if your vendor can change terms unilaterally, then no amount of technical capability compensates for the governance deficit.

The recommendation for community organisations is to treat governance and sovereignty posture as the primary filter. Eliminate models that fail the governance test. Then evaluate the remaining options on capability, cost, and fit.
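The two-stage recommendation above can be sketched as a filter followed by a ranking. The field names and example values below are illustrative stand-ins, not a real scoring scheme for any named provider.

```python
def shortlist(candidates):
    """Stage 1: eliminate models that fail the governance test.
    Stage 2: rank the survivors by capability, then by cost."""
    survivors = [
        m for m in candidates
        if m["data_stays_local"]                # sovereignty posture first
        and not m["foreign_compelled_access"]   # e.g. CLOUD Act exposure
        and not m["trains_on_customer_data"]
    ]
    # Higher capability first; cheaper breaks ties.
    return sorted(survivors, key=lambda m: (-m["capability"], m["cost"]))

candidates = [
    {"name": "cloud-giant", "data_stays_local": False,
     "foreign_compelled_access": True, "trains_on_customer_data": True,
     "capability": 9, "cost": 5},
    {"name": "open-a", "data_stays_local": True,
     "foreign_compelled_access": False, "trains_on_customer_data": False,
     "capability": 6, "cost": 3},
    {"name": "open-b", "data_stays_local": True,
     "foreign_compelled_access": False, "trains_on_customer_data": False,
     "capability": 7, "cost": 4},
]
```

Note the structural point: the most capable candidate never reaches stage 2, because capability is only compared among models that have already passed the governance filter.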

Governance and Sovereignty Ranking of Major Providers

The following ranking assesses major AI providers from a governance and sovereignty perspective for community and nonprofit organisations. It is not a ranking of technical capability — several of the providers discussed below are technically excellent. It is an assessment of governance risk.

Microsoft (Copilot / Azure OpenAI): Microsoft's deep integration with Office 365 creates an extraordinarily large data surface. If your organisation uses Microsoft 365, Copilot has potential access to emails, documents, calendars, and communications. The CLOUD Act applies. Enterprise agreements are complex and structurally favour Microsoft. The combination of existing data access and AI capability represents a concentration of information power that governance bodies should examine carefully.

OpenAI (ChatGPT / GPT-4): United States jurisdiction and CLOUD Act exposure. The partnership with Microsoft adds governance complexity — data may flow between entities in ways that are difficult for customers to audit. OpenAI's for-profit (capped-profit) corporate structure creates inherent tension between mission and revenue. Training data practices have historically been opaque, with several ongoing legal disputes regarding data sourcing.

Anthropic (Claude): United States jurisdiction and CLOUD Act applies. Anthropic's constitutional AI approach — training the model against explicit principles — is methodologically interesting but still opaque to external governance bodies. Google holds a minority investment, adding potential data-flow complexity. Anthropic has demonstrated more transparent safety practices than some competitors, but transparency is not sovereignty.

Google (Gemini): Google's AI products exist within the world's most comprehensive advertising and data infrastructure. Google already holds immense data about individuals through search, email, maps, and mobile operating systems. Adding AI capabilities to this foundation amplifies existing surveillance capacity. Data is used to improve products across Google's ecosystem. For organisations committed to data sovereignty, the structural incentives are deeply concerning.

Perplexity: A relatively new entrant with a less established governance track record. United States jurisdiction. The business model is still evolving, creating uncertainty about long-term data practices. Enterprise governance features are limited compared to more established providers.

DeepSeek: Chinese jurisdiction. Subject to Chinese national security laws that can compel data disclosure and cooperation with intelligence services. Governance transparency is the lowest of any major provider. Technical capability is significant and improving rapidly, but for organisations in Aotearoa New Zealand and similar jurisdictions, the governance concerns are severe. Community data processed through DeepSeek may be accessible to a foreign government with no accountability to your community.

A Note on Bias in Rankings

It is important to acknowledge that any ranking of AI providers carries potential bias. An article assisted by an AI made by Anthropic might present Anthropic more favourably. An article written using Google's tools might do the same for Google. The ranking above is based on governance structure, legal jurisdiction, and data sovereignty posture rather than technical capability or marketing claims. Readers should apply their own critical judgment and consult multiple sources.

Open-Weight Models and Sovereignty

For communities that can run models locally — or through trusted hosting partners — open-weight models like Qwen, Mistral, and Llama may score highest on sovereignty. No data leaves the community's systems. No vendor can change the terms. No government can compel disclosure through the model provider.

The trade-off is operational responsibility: technical capacity, ongoing maintenance, and hardware investment. But for communities where data sovereignty is a primary concern, this trade-off may be essential. This is the foundation on which Village AI is built — and as the next section explains, recent developments in the wider AI landscape have made this choice not merely principled but urgent.

8. Security in a Post-Mythos World

In April 2026, Anthropic disclosed the capabilities of Mythos — a model it chose not to release publicly because it can discover software vulnerabilities at scale across every major operating system and web browser, and write working exploits for them. Anthropic's controlled-release programme, Project Glasswing, offered the model to approximately forty large technology companies so they could find and fix their own weaknesses before equivalent capabilities proliferate. The first $100 million of compute costs were covered by Anthropic.

The implications for community organisations are not abstract. What was previously the preserve of nation-state cyber programmes — the ability to find and exploit hidden weaknesses in software — will, within a year or two, be available to anyone with access to a sufficiently capable model. The barrier to entry drops from millions of dollars and years of expertise to a model prompt.

This changes the calculus for every organisation that stores data on infrastructure it does not control. Every major cloud platform — every US-based SaaS provider, every vendor whose infrastructure depends on the same operating systems and web browsers Mythos targets — is now a confirmed attack surface. Self-hosting, which was already the principled choice for data-sovereign communities, becomes the security choice too.

What Village has done

Village runs on two privately managed servers — one in France (OVH), one in New Zealand (Catalyst Cloud) — with no US cloud dependencies. A small number of EU- and New Zealand-based third-party services are used where self-hosting is not practical: Airwallex (New Zealand) for payment processing, DeepL (Germany) for on-demand translation, Bunny CDN (Slovenia) for media edge caching on the EU server, and ProtonMail (Switzerland) for email transit. All are European or New Zealand entities outside US jurisdiction. No community data is processed on, stored on, or transits through US-owned infrastructure. This architecture was chosen for data sovereignty reasons, but in a post-Mythos world it also functions as a security posture: a small, well-defended target is not in the blast radius of the mass-exploitation scenarios that Mythos-class capabilities enable.

Specific hardening measures have also been taken since the Mythos disclosure.

What cannot be defended

It would be irresponsible to claim that any small platform can defend against a Mythos-class model directly targeting it. No small platform can. The defence is architectural: a small attack surface with no US cloud dependencies, strong encryption, rapid patching, and the structural isolation that makes mass-exploitation tools inefficient against a target that does not look like the targets they were designed to scan. Village is not immune. It is positioned where the blast is weakest.

The CLOUD Act compounds the problem

The United States CLOUD Act gives US authorities the legal power to compel disclosure of data held by US companies, regardless of where the data is physically stored. In a post-Mythos world, US-controlled infrastructure is simultaneously subject to legal compulsion AND a priority target for AI-driven exploitation. These are not separate risks — they compound. Sovereign hosting on EU and NZ infrastructure addresses both vectors. Village's refusal to use US cloud services, which was originally a policy decision grounded in data sovereignty principles, now carries security weight that it did not carry six months ago.

9. Village in Practice — Connecting to Action

The previous article in this series — A Smaller Room Than You Think — concluded with six concrete directions for readers who want to act on the governance questions this series raises. This section shows what those actions look like when a community actually implements them.

1. Treat the ownership of your tools as a serious question.

Village is hosted on two sovereign servers — OVH (France, EU jurisdiction) and Catalyst Cloud (New Zealand). No US-owned cloud services process, store, or transit community data. Payment processing uses Airwallex (NZ) Limited — a New Zealand-regulated entity, not a US corporation. Every piece of infrastructure is documented and auditable.

2. Attend to scholars and frameworks thinking about this for longer.

Village's architecture is not built in a vacuum. The consent model implements the CARE Principles (Collective benefit, Authority to control, Responsibility, Ethics). The cultural boundaries system enables tenant moderators to define tikanga-specific handling requirements for AI — tapu, restricted, contextual, seasonal, relationship-based — following the frameworks articulated by Te Mana Raraunga and Dr. Karaitiana Taiuru. The system deliberately does NOT train on WAI 262 material without explicit permission, treating that restraint as a governance commitment rather than a limitation.

3. Reduce dependence on the systems you are most uneasy about.

Village offers a self-contained alternative for the functions communities most commonly depend on external platforms for: messaging (Matrix protocol, self-hosted), document management (Collabora, self-hosted), storage (OpenStack Swift on NZ sovereign infrastructure), email (ProtonMail Bridge), and AI assistance (locally-hosted SLL models). Every function runs on infrastructure the community can audit, and every function continues to work if the AI component is temporarily unavailable.

4. Participate in deliberation rather than only in consumption.

Every thumbs-down rating on a Village AI response enters the feedback pipeline. The system investigates the complaint automatically, checks the response against its training data, and if the complaint is substantiated, generates a training candidate for the next model update. Community members are not consumers of AI — they are participants in its ongoing governance, and the architecture makes that participation consequential rather than decorative.

5. Bring the question into conversations that already exist.

Village serves thirteen distinct community types — parishes, whanau, governance boards, conservation groups, clubs, alumni networks, committees, businesses, and more. Each type receives its own product vocabulary, its own AI specialisation, and its own governance defaults. A parish vestry does not need to learn "tenant isolation" — it sees "Parish Members" and "Parish Moderators" in its own language. The governance question is embedded in the tools communities already use, not presented as a separate concern.

6. Honour the obligations that come with knowledge.

The AI consent architecture implements per-purpose consent: members choose separately whether their content may inform AI triage, AI OCR processing, and AI summarisation. Consent is opt-in, granular, revocable, and explained in plain language. Content designated as culturally loaded — through the cultural boundaries management system — receives additional protections, including the ability to block AI processing entirely, escalate to a named cultural authority, or require explicit contextual consent before the AI engages with the material.

These are not aspirational features on a roadmap. They are in production, running on sovereign infrastructure, serving real communities. The gap between the principles articulated in this series and the systems that embody them is, for once, smaller than it usually is.

10. Training a Situated Language Layer — What Is Actually Involved

The perception that AI development is impossibly complex serves the interests of those who want communities to remain consumers rather than creators. Training a Situated Language Layer is not trivial, but neither is it beyond the reach of community organisations with modest technical capacity and clear governance processes.

a) Define Goals and Scope

Every AI training process begins with governance questions, not technical ones.

What should the model know? Community history, organisational policies, service directories, cultural protocols, governance documents. The scope of knowledge is a governance decision.

What should it refuse? Making decisions on behalf of members, offering medical or legal advice, disclosing private information, engaging with topics the community considers harmful. The scope of refusal is also a governance decision.

How small can we keep it? There is a powerful governance incentive to keep models small. A smaller model is easier to inspect, cheaper to run, faster to retrain, and more predictable in its behaviour. The question is not "how big can we make it?" but "how small can we keep it while still being useful?" The answer depends on the community's specific needs, and finding it requires clear-eyed assessment rather than aspiration.

b) Choose Base Model and Method

The process starts from an open-weight foundation model that already understands language, reasoning, and general knowledge. The community adds a thin layer of community-specific knowledge on top.

The method — parameter-efficient fine-tuning — adjusts only a small fraction of the model's parameters. Think of it as adding a custom lens to a camera rather than building a new camera. Training a full model from scratch requires resources only large technology companies possess. Training an adapter requires a fraction of those resources — feasible for community organisations, especially when infrastructure costs are shared.

c) Collect and Filter Training Data

Training data is drawn from sources the community controls: platform documentation, community content (with consent), governance documents, and cultural context. Every piece of training data is a governance decision — including a document means the AI may draw on it; excluding it means the AI will not have that knowledge.

d) Respect Consent and Privacy

Not all community content should train the AI. Per-content consent flags allow members to control which contributions are used. Tenant-level settings allow the governing body to set community-wide defaults. Visibility rules ensure that content shared with a limited audience is not used to train an AI that responds to all members. Consent is not a checkbox presented once and forgotten — it is an ongoing relationship.

e) Transform Data into Training Examples

Raw community content must be transformed into formats the training process can learn from: question-answer pairs, persona examples, governance boundary demonstrations.

This is where community values become model behaviour. A policy document becomes examples showing how the AI responds to policy questions. A governance boundary becomes examples of the AI refusing to cross it. The quality of this transformation directly determines the quality of the resulting model.

f) Biases in the Base Model

It is essential to understand that the training process does not start from neutral. The base model — trained on internet-scale data — carries biases that reflect its training corpus:

Community training does not eliminate these biases. It works with them — reinforcing where the base model's tendencies align with community values, and countering where they do not.

g) How Debiasing Works in Practice

Debiasing is not a one-time process or a magic filter. It is deliberate counter-training: constructing specific examples that correct for known biases in the base model.

For a community AI, this means including training examples that:

The goal is not to eliminate bias — that is impossible in any system built by humans on human data. The goal is to make biases known, documented, and actively countered. The community should understand what biases exist in its AI, what steps have been taken to address them, and what residual biases remain.

h) Configure, Run, and Monitor Training

The training process takes the community's curated data and adjusts the base model's behaviour. Monitoring ensures the model learns what we want without losing what it already knows — a model that learns governance boundaries but forgets how to answer simple questions has traded one problem for another. The community's technical partner monitors this balance and reports to the governing body.

i) Evaluate

Evaluation has two components, both non-negotiable:

Automated evaluation tests whether the model meets defined standards:

The model achieves high accuracy on governance boundary tests — the hard constraints that define what the AI must never do. This is the most critical evaluation dimension, because a model that crosses governance boundaries is worse than no model at all.

Human evaluation tests whether the model's responses feel right:

Human evaluation involves community members, not just technical staff. The people who will use the AI daily are the people best positioned to judge whether it serves the community well.

j) Export and Deploy

The trained model is converted to an efficient format and deployed on community infrastructure. The principle is straightforward: the model runs where the data lives. No cloud dependency. The community is not subject to any vendor's uptime, pricing changes, or policy decisions. If the community decides to retrain the model or shut it down entirely, it can do so without anyone else's permission.

k) Two-Model Routing in Practice

The fast model handles everyday questions — "When is the next meeting?", "How do I upload photos?" — with quick, predictable responses. Most interactions flow through this model. The deeper model handles complex synthesis — "What themes have emerged from community feedback this year?" — with more nuanced outputs and additional governance scrutiny. The routing criteria are configurable by the governing body.

l) Per-Community Adapters

Each Village can have its own knowledge layer — its own fine-tuned adapter — without retraining the base model. A parish community's AI assistant knows its liturgical calendar and pastoral care protocols. A neighbourhood association's knows its local services and planning processes. Both share the same base model but each reflects its own community's knowledge and values.

Each adapter is auditable, modifiable, and controlled by the community's governing body. When knowledge changes, the adapter is updated. When values evolve, it is retrained. The community is never stuck with an AI that no longer reflects who they are.

m) Handle Failures and Rollbacks

Training does not always produce better results. Village AI maintains versioned models. Every training run produces a new version, tested before deployment. If a new version fails evaluation or produces unacceptable results, the previous version is restored.

This rollback capability is a governance guarantee: the community can always return to a known-good state. Experimentation and improvement are possible precisely because failure is reversible.

11. Use Cases in Community and Nonprofit Settings

Abstract principles become meaningful through concrete application. The following use cases illustrate how a Situated Language Layer serves community needs in ways that generic AI products cannot.

Hui Preparation

A committee chair asks: "Summarise the key discussion points from the last three committee meetings and identify any unresolved actions." A generic AI cannot do this — it has no access to the community's records. Village AI, trained on the community's own content, produces a summary that uses the community's terminology and identifies actions in governance context.

Service Navigation

A new family asks: "What support services are available for families with young children in our area?" Village AI, drawing on the community's own service directory, provides information that is specific, current, and relevant — not a generic list of national services but a curated guide to what is actually available locally.

Knowledge Stewardship

A member asks: "Help me write up Nana's stories about growing up in Ponsonby in the 1950s." Village AI understands the community's storytelling traditions and the cultural significance of preserving elder knowledge. It assists with structure and language while preserving the member's voice. It does not impose a generic template.

Community Planning

A governance committee asks: "What themes have come up most in our community feedback this year?" Village AI synthesises across meeting minutes, surveys, and informal conversations to identify patterns — presented as observations, not recommendations, preserving the committee's authority to interpret and act.

What Village AI Explicitly Does Not Do

Equally important is what Village AI refuses to do:

12. Implementation Pathways and Capacity Building

For most community organisations, implementing a Situated Language Layer will be a gradual process. The following pathways are designed to meet communities where they are, not where the technology wishes they were.

Pilot Programmes

Start small. A pilot might involve a single use case — AI-assisted meeting summaries — with a small group of consenting members. Pilots should be time-bounded, with clear evaluation criteria. At the end, the community decides whether to continue, expand, or discontinue. This is governance in action: experimentation with accountability.

Shared Infrastructure

The economics of community AI improve significantly when infrastructure costs are shared. A consortium of community organisations can share hosting infrastructure while maintaining separate, sovereign models through per-community adapters. Organisations already share meeting spaces and administrative resources. Sharing AI infrastructure is a natural extension, governed by the same principles of mutual benefit and sovereign decision-making.

Local Training Programmes

AI literacy is essential for meaningful governance. Members who understand what AI can and cannot do are better positioned to participate in governance decisions. Local training programmes — workshops, discussion groups, hands-on demonstrations — are governance education, not technical training. The goal is informed participants, not AI engineers.

Sovereignty-Respecting Partnerships

Most communities will need technical partners. A sovereignty-respecting partner serves the community's interests, not their own. They provide expertise without extracting control, build capacity rather than creating dependency, and accept accountability for their work. The relationship should be governed by the same principles that govern the AI itself: transparency, accountability, and the community's ultimate authority.

13. Risks, Open Questions, and Future Directions

Responsible governance requires acknowledging what we do not know and what could go wrong. The following risks and open questions deserve ongoing attention.

Re-centralisation Risk

There is a real risk that community AI infrastructure could become a new form of centralised dependency. If a single hosting provider serves many communities, or a single governance framework becomes dominant, the sovereignty gains could be eroded by new concentrations of power. Mitigations include maintaining multiple hosting options, building technical capacity within communities, and ensuring no single entity becomes indispensable.

Inter-Village Interoperability

As more communities develop their own Situated Language Layers, questions of interoperability will arise. How can sovereign communities collaborate on AI without compromising sovereignty? Can federated learning allow models to improve through shared insights while keeping data local? The federation model governing data sharing between Villages provides a starting point — bilateral agreements, explicit consent, preserved sovereignty — but extending it to AI collaboration will require new thinking.

Influence on Broader AI Governance Debates

If communities can demonstrate that effective, values-aligned AI is possible without surrendering data sovereignty, this challenges the assumption that scale and centralisation are prerequisites for AI capability. Community-sovereign AI provides a working example for policy-makers and regulators — one that prioritises human agency and democratic governance over scale and centralised control.

The Political Economy of AI

It is naive to discuss AI governance without acknowledging the political economy that shapes it. A company whose revenue depends on user engagement has no incentive to build AI that encourages users to spend less time on screens. A company whose business model depends on data collection has no incentive to minimise data flows. A company whose competitive advantage depends on proprietary models has no incentive to support open-weight alternatives.

Community-sovereign AI is not just a technical alternative. It is a political statement about who should control the AI systems that increasingly mediate human knowledge, communication, and decision-making. The answer this project offers is: the communities they serve.

14. Conclusion: From Users to Co-Stewards of AI

The dominant narrative about AI positions communities as consumers — passive recipients of capabilities developed elsewhere, according to values they did not choose, under governance they cannot influence.

This article has presented an alternative. Communities are not consumers of AI but co-stewards of it. The community decides what its AI should know and what it should refuse. The community governs how the AI behaves and holds it accountable when it falls short. The community controls the infrastructure, the data, and the values that shape every interaction.

This is not a utopian vision. It is a working architecture, with real constraints, real costs, and real trade-offs. A Situated Language Layer will not match the raw capability of the largest commercial systems. What it will have is something those systems cannot offer: genuine alignment with a specific community's values, under that community's governance, on infrastructure that community controls.

Not every community will choose this path. Some will decide that commercial AI capabilities outweigh the governance concerns. Some will lack the resources. These are legitimate choices, made by sovereign communities exercising their own judgment.

But for communities that do choose this path, the possibility is real. The technology exists. The governance frameworks exist. The philosophical foundations exist. What remains is the decision to begin.

The invitation is not to build your own AI. It is to govern AI — whatever AI you use — with the same seriousness and commitment to human agency that you bring to every other governance decision your community faces.

Because in the end, the question is not whether AI will shape the future of community life. It will. The question is whether communities will shape the AI — or whether the AI will shape them.

References

Alexander, C. (1977). A Pattern Language: Towns, Buildings, Construction. Oxford University Press.

Alexander, C. (1979). The Timeless Way of Building. Oxford University Press.

Berlin, I. (1969). Four Essays on Liberty. Oxford University Press.

Global Indigenous Data Alliance. (2019). "CARE Principles for Indigenous Data Governance." https://www.gida-global.org/care

First Nations Information Governance Centre. (2014). Ownership, Control, Access and Possession (OCAP): The Path to First Nations Information Governance.

Kukutai, T., & Taylor, J. (Eds.). (2016). Indigenous Data Sovereignty: Toward an Agenda. ANU Press.

Ostrom, E. (1990). Governing the Commons: The Evolution of Institutions for Collective Action. Cambridge University Press.

Stanford Encyclopedia of Philosophy. "Value Pluralism." https://plato.stanford.edu/entries/value-pluralism/

Te Mana Raraunga. (2018). Principles of Maori Data Sovereignty. https://www.temanararaunga.maori.nz/

Wittgenstein, L. (1921). Tractatus Logico-Philosophicus. Translated by C. K. Ogden (1922). Routledge & Kegan Paul.

This article is part of the AI Governance for Communities series by My Digital Sovereignty Limited. The series examines how community and nonprofit organisations can engage with artificial intelligence as active governors rather than passive consumers.

The Village platform and the Tractatus governance framework are developed by My Digital Sovereignty Limited. For more information, visit mysovereignty.digital and agenticgovernance.digital.

Contact: john@mysovereignty.digital

This work is licensed under a Creative Commons Attribution 4.0 International Licence (CC BY 4.0). You are free to share and adapt the material, in any medium or format, for any purpose, provided you give appropriate credit to the author and link back to the original.

Series navigation:

Previous: Part 4 Series Hub