Free course: AI for nonprofit organizations

Lesson 2: What AI can (and can’t) do in 2026

Lesson 3: How to get what you want from AI

Lesson 4: Using AI responsibly and reducing risks

Lesson 1: Understanding how AI works

If you want to use AI well, you need to understand what it actually is. Not at an engineering level. But well enough to make smart decisions about which tools to adopt and which risks to watch out for.

Generative AI and other types of AI

When most people talk about AI nowadays, they are usually talking about one specific branch of it: generative AI. But AI is a much broader field, and understanding the landscape helps you avoid confusion.

Artificial intelligence is the general term for computer systems designed to perform tasks that would normally require human intelligence: recognizing patterns, making decisions, translating languages, identifying images, and so on. AI has existed in various forms since the 1950s. The spam filter in your email inbox is AI. The algorithm that decides which pages appear on Google or which posts appear in your Facebook feed is AI. The software that flags unusual credit card transactions is AI.

Some people say “I don’t use AI”, but it’s not true for 99% of them. We all use AI-assisted tools every day. You are probably even using generative AI without knowing it on platforms like Google or Youtube (automatic AI responses, automatic captions or translations, etc.).

Here is a quick map of the main types of AI you are likely to encounter as a nonprofit professional:

Generative AI creates new content (text, images, audio, video, code). Examples: ChatGPT, Claude, Gemini, Midjourney, Suno.
Predictive AI uses historical data to forecast outcomes. Examples: donor churn models, grant success predictors, fraud detection. Many nonprofits are already using this type of AI without calling it AI.
Classification AI sorts inputs into categories. Examples: sentiment analysis on survey responses, spam filtering, content moderation.
Recommendation AI suggests content or actions based on behavior patterns. Examples: the suggested resources on your CRM, the YouTube algorithm.

For most of this course, we will focus on generative AI, because it is the category that is changing day-to-day work most visibly. It presents both the greatest opportunities and the most significant risks for nonprofit teams.

How LLMs work

Generative AI relies heavily on large language models (LLMs). The easiest way to understand an LLM is to think of it as a highly advanced autocomplete.

LLMs do not “think” or “understand” text the way humans do. They use complex statistics to predict which word should come next based on the prompt you gave them. If you type “Thank you for your generous…”, the LLM predicts the next word is likely “donation” or “support”. Because these models are massive, this simple prediction mechanic allows them to write highly coherent, complex, and creative text.

This is important to understand because it explains both the power and the limitations of LLMs:

The power: because these models have processed so much human knowledge, they can synthesize ideas, explain complex topics, write in different styles, translate between languages, help debug problems, and assist with an enormous range of tasks.
The limitations: because they are predicting plausible text, they can generate confident-sounding errors (usually called ”hallucinations”). Also, sometimes they can’t explain why they did something wrong (they can write a plausible explanation if you ask for it, but many times it’s not really explaining why their complex internal systems chose a response).

The latest AI platforms give reasoning capabilities, multimodal understanding (image, audio & video) and tool access to LLMs, so they are not just simple “word predictors” anymore, they can generate more complex ideas and act on them. But they still don’t think like us, we should never forget that.

The difference between AI models and AI products

You need to know the difference between the engine and the car.

The AI model: This is the underlying engine. Examples include GPT-5, Claude 4, or Gemini 3. These are the mathematical frameworks trained mostly by major tech companies. Smaller companies sometimes fine-tune the major models for specific tasks.
The AI product: This is the software application (the car) built on top of that engine. ChatGPT is a product built on the GPT model. Your organization’s customized donor CRM might also be a product that uses the exact same GPT engine under the hood.

When you evaluate an AI tool for your organization, you should ask both product-level questions (features, pricing, support) and model-level questions (models included, training policies, knowledge cutoff, data handling). Products based on the same model may offer very different results.

Ways to use AI

AI tools come in many shapes, and the way you access them has real implications for your organization’s privacy, budget, and flexibility. Here are the most important distinctions.

Cloud vs. Local

Most AI tools you use today are cloud-based: your input is sent to a server owned by the AI company, processed there, and the response is sent back to you. This is convenient and gives you access to the most powerful models, but it means your data leaves your organization’s systems.

Local AI means running a model on your own hardware (a laptop or a server your organization controls). The tradeoff is that local models are generally less capable than the largest cloud models, and running them requires some technical setup. For highly sensitive data (client records, confidential donor information), local AI is worth exploring.

Closed-source vs. Open-source

Closed-source AI (also called proprietary AI) means the underlying model is owned by a company and you cannot examine, modify, or run it yourself. ChatGPT, Claude, and Gemini are all closed-source. They offer the most powerful models, but you are subject to their terms, data handling, price changes, etc.

Open-source AI means the models are publicly available. Anyone can download and run them. Examples include Llama, Mistral, Qwen, Deepseek, and many others. Open-source models can be run locally, modified, and deployed on your own infrastructure, which gives you more control but requires more IT skills.

For nonprofits with limited IT resources, closed-source cloud tools are often the practical starting point. But keeping an eye on open-source options is smart for long-term independence from big tech vendors.

API vs. UI

A UI (user interface) is what you interact with when you go to Claude.ai or ChatGPT.com and type in a message. It is designed for humans. You usually pay a monthly subscription per user if you want the best models and features.

An API (Application Programming Interface) is a way for software to talk to software. If you connect your CRM directly to an AI model to automatically summarize donor notes, that is API access. It usually requires either technical staff or no-code integration platforms like Make, Zapier, or n8n. You pay APIs by usage (based on how many words/tokens your apps request to AI systems), not a fixed price per user or app.

Most nonprofits start with UI and move toward API to explore new opportunities (automation, custom tools, using AI frontier models from different companies without paying many different subscriptions, etc.).

Chatbots vs. Agents

Chatbots like ChatGPT just responds to your messages. They are reactive. They don’t make changes on your systems or do anything without your input.

AI agents can take actions autonomously over multiple steps and using multiple tools if necessary. You give an agent a goal and it can browse the web, open documents, run searches, access your email or perform changes in external tools, all without you guiding each step. Some agents can even do recurring tasks on a schedule, without human input.

Agents are powerful (and probably the future for many tasks) but require more careful oversight. An agent that has access to your email, calendar, or donor database can cause real damage if it misunderstands an instruction or acts on incorrect information. You should probably start by using agents only for low-risk, reversible tasks.

The “jagged frontier” of AI

Researchers use the term “jagged frontier” to describe how uneven AI capabilities are. Some complex tasks for humans are surprisingly easy for AI, while some simple tasks for humans can’t be done correctly by AI.

The tricky part is that the jagged frontier does not follow obvious rules.

For nonprofits, this means:

You cannot assume that if AI handles one task well, it will handle all similar tasks equally well.
You must test AI on your specific use cases and treat any new use case as an experiment, not as an automatic green light.

What is training data?

An AI model learns from training data: the collection of text, images, audio and/or video (depending on the type of AI) that it was exposed to during the training process.

For nonprofits, training data has three main implications:

Bias: If certain communities, languages, or regions are underrepresented or misrepresented in the training data, the model will reproduce that bias in its outputs.
Gaps: If your country, language, field, or type of work is not well covered, the model may be very weak on those topics.
Privacy and ethics: You should understand whether the model was trained on data similar to yours and whether sensitive or copyrighted content might have been included.

What is a knowledge cutoff?

A knowledge cutoff is the point in time after which the model has not seen new training data. In other words, it knows about events and information up to that date, but not after, unless the product adds real-time search or retrieval on top.

Because training runs are expensive and slow, models are typically trained on data that stops several months before the model is released. Big companies are always training new models, but the specific model you are using now will never learn new info. You should not assume the AI knows what happened last month, or even last year.

For nonprofits, knowledge cutoff means:

The model may not know about your latest campaign, recent conflicts or disasters, policy changes, or new regulations. AI may give you information that was accurate a year ago but is no longer correct.
When you need up to date information, you must provide it in the prompt, connect the AI to a live data source, or use web research tools instead of relying on its “memory”.

What is a context window?

The context window is the amount of text an AI model can “hold in mind” at once during a conversation. It includes your messages, the AI’s responses, any documents you have uploaded, and any system instructions that have been provided.

If you are working with a long document or a very extended conversation, the AI may start to “forget” things. This can lead to situations where the user believe the AI has read a full 100 page report, but in reality it only remembers part of it.

Context windows have grown dramatically in recent models, but you should still be careful. There can be “context degradation” even if you are not above the context window (models may lose focus on relevant information for long contexts, leading to more hallucinations or ignored instructions).

For nonprofits, this affects how you design workflows:

For long documents, break them into sections or work with summaries.
For ongoing projects, create concise briefs that capture the essential context instead of pasting entire email chains or huge documents.
For critical tasks, ask the model to repeat back key facts it is using, so you can check whether it actually saw what matters.

What is fine-tuning?

Fine-tuning means taking a pre-trained model and training it further on a smaller specialized dataset so it behaves differently for your use case. It is like giving the model extra lessons focused on your organization’s language, policies, or workflows.

Fine-tuning can help with tasks where you want consistent, organization-specific behavior, such as answering FAQs about your services or following your tone of voice. However, it has additional costs and requires high quality examples and careful evaluation.

For most nonprofits, the first step is not fine-tuning. You usually get more returns by:

Writing better prompts that include your policies and examples.
Providing detailed context and instructions in what is called a system prompt or custom instructions (ChatGPT and similar tools have that feature). For many use cases, a well-crafted system prompt can get you most of the benefits of fine-tuning without the technical overhead.
Using tools that let you upload or connect your own documents for retrieval (e.g. NotebookLM).

Only consider fine-tuning later for high volume repetitive tasks where the extra investment pays off.

What are hallucinations?

An AI hallucination is when the model generates information that is factually wrong, completely fabricated, or internally inconsistent, and does so with the same confident tone it uses when it is correct. The model does not flag the error. It does not say “I am not sure about this.” It just states the wrong thing as if it were true.

In LLMs, hallucinations often look like invented statistics, fake sources, incorrect legal or medical advice, or plausible sounding but wrong descriptions of events.

Several factors contribute to hallucinations: gaps or errors in training data, the probabilistic way models generate text, and the fact that models are optimized for plausible language rather than truth.

You can minimize hallucinations using the latest frontier models (less prone to hallucinations) and web research features included in ChatGPT and similar tools (designed to search for reliable sources and include links), but they are never going to be 0% with the current LLM technology.

That’s why you should develop consistent verification habits. Every factual claim in an AI-assisted document should be checked against a reliable source before it goes out. Statistics, citations, legal references, and any specific claims about other organizations or people should be treated as unverified until confirmed.

Summary

Generative AI, the technology behind ChatGPT and similar tools, is the most relevant branch of AI right now. But AI is not only GenAI.
How you access AI (cloud or local, closed-source or open-source, UI or API, chatbots or agents) has important implications for privacy, cost, and control.
The jagged frontier means AI is unpredictably good and bad. Test it for your specific use cases. Do not generalize.
Training data shapes the model’s knowledge and its biases. Be aware of both.
Knowledge cutoffs mean AI may be out of date. Use research tools and verify time-sensitive information.
Context windows shape what AI can “remember”. Don’t assume AI remembers everything you write or upload.
Fine-tuning is an advanced option for customizing AI to your needs. Providing examples and using system prompts are more accessible alternatives.
Hallucinations are real and common. Build verification into every workflow that uses AI for factual claims.

Lesson 2: What AI can (and can’t) do in 2026

If you believe the marketing hype from big tech companies, AI can run your entire nonprofit while you sit back and relax. If you believe the doomsayers, AI is completely useless and untrustworthy. The reality in 2026 is right in the middle.

AI is an incredibly powerful tool, but it is a tool with specific strengths and glaring weaknesses. To use it effectively, you have to know exactly when to hand a task over to AI and when to keep it strictly in human hands.

Where AI is strong today

Here are the areas where you should be relying heavily on AI right now:

Drafting and editing text

Modern language models are very good at producing clear, structured text when you give them a specific role, audience, and examples.

Typical nonprofit uses:

Drafting first versions of donor emails, newsletters, social media posts, blog entries, and web copy.
Rewriting communications for different audiences: for example simplifying technical text for beneficiaries or policymakers.
Turning bullet points or messy notes into polished paragraphs, FAQs, or scripts for volunteers.

Summarizing and reorganizing information

AI is very effective at compressing and restructuring text: this plays directly to its pattern recognition strengths.

Useful nonprofit examples:

Summarizing long reports, meeting transcripts, or consultation notes into short briefs.
Creating one page overviews for the board from 50 page strategy documents.
Turning case notes into anonymized thematic summaries for internal learning.

The risk is that summaries can omit subtle but important details, so humans must still spot check content, especially around sensitive programs.

Translation and language support

Modern AI systems provide strong translation and cross language support for many major languages.

For nonprofits, this can enable:

Quick translation of outreach materials into multiple languages for initial drafts.
Checking tone and clarity of messages for non native speakers.
Helping staff who speak different languages collaborate on shared documents.

However, for legal texts, sensitive communications, or minority languages, you should still involve fluent human reviewers.

Basic data analysis

AI is good at scanning large datasets or text corpora and suggesting patterns, outliers, and simple correlations.

Nonprofit examples:

Exploratory analysis of donor data to identify segments, giving patterns, or churn risk.
Grouping open ended survey responses into themes.
Detecting anomalies in transaction logs that might suggest fraud or reporting errors.

You should treat AI insights as hypotheses, not conclusions: staff still need to validate results with proper statistical and domain checks.

Brainstorming and ideation

Studies comparing humans and models on standard creativity tests find that AI models often generate more ideas and higher average originality scores in divergent thinking tasks.

In practice, this makes AI useful for:

Brainstorming event ideas, campaign concepts, or partnership angles.
Generating alternative headlines, slogans, or stories to test.
Proposing variations on existing programs, such as new workshop formats or volunteer engagement tactics.

AI is very good at “quantity of ideas” and remixing known patterns. Humans must still choose which ideas fit mission, constraints, and community reality.

Coding and technical tasks

For staff who work with websites, databases, or any kind of technical infrastructure, AI has become an extraordinary assistant.

You do not need to be a developer to use AI to write a simple script, create a form, fix a broken piece of code, build a data visualization, or troubleshoot a technical problem. AI explains what it is doing in plain language and can iterate based on your feedback.

This is democratizing technical capacity in organizations that cannot afford dedicated IT staff.

Where AI often fails or is high risk

Now for the harder conversation. In the same way that overestimating AI leads to embarrassing mistakes, underestimating its failure modes leads to real harm.

Automated decisions that affect people

AI systems can amplify bias from their training or input data, leading to unfair outcomes across demographic groups.

High risk nonprofit examples include:

Automated screening of beneficiaries or applicants for services.
Automated prioritization of who receives limited support first.
Automated scoring of staff or volunteers for performance or promotion.

Because these decisions are morally and politically sensitive, you should keep humans in charge and use AI only as a transparent support tool, if at all.

Generating novel insights

AI is very good at recombining and synthesizing existing ideas. It is much weaker at genuine originality: identifying a pattern no one has noticed, developing a truly new theoretical framework, or producing creative work that breaks from established conventions in a meaningful way.

For nonprofits doing innovation work, community organizing that depends on local insight, or program design that requires deep sector expertise, AI is a tool that can accelerate execution of ideas but is not a reliable source of breakthrough thinking.

Fully autonomous agents connected to real systems

Organizations experimenting with AI agents that can act on email, CRMs, and databases reported that small AI mistakes could propagate into large scale problems: for example, sending incorrect messages to many contacts or corrupting data records.

This is especially risky for nonprofits:

An unsupervised agent could email donors with wrong amounts or misaligned messaging.
It could change records in case management systems based on misinterpretations.
It could accidentally share sensitive information if tool boundaries are not carefully designed.

Guideline: If an AI agent can click, send, or delete on your behalf, treat it as you would a new staff member with system access. Start with very limited permissions, logs, and human approvals.

Where AI will never do as well as humans

Beyond the current limitations that may improve over time, there are categories of work where human involvement is not just currently necessary but will remain so by nature. Understanding this is important for making good decisions about where to invest in AI integration and where to protect human roles.

Treat AI like a brilliant but reckless intern. You can give the intern complex tasks, huge amounts of reading, and creative brainstorming assignments. But you would never let that intern publish a legal document, handle sensitive medical files, or speak on behalf of your organization without you reviewing their work first.

Empathy and trust

While some studies show that AI can be rated as “more empathetic” than doctors in written responses, critics point out that these tests reduce empathy to text style and ignore the real context of human relationships.

In nonprofit work, trust is built through:

Being physically or emotionally present with people over time.
Understanding community history, trauma, and power dynamics.
Sharing vulnerability and accountability.

AI can help draft empathetic language, but it cannot participate in real relationships or be held morally responsible when things go wrong.

Ethical judgment

Decisions about who receives limited aid, how to balance donor wishes and community needs, and how to respond to political pressures are fundamentally ethical and political, not technical.

AI can:

Help clarify options and summarize arguments.
Simulate stakeholder reactions based on patterns in text.

AI cannot:

Decide what your organization stands for.
Own responsibility when harm occurs.

Regulatory trends such as the EU AI Act and other national frameworks reinforce that humans remain legally and ethically responsible for AI assisted decisions.

Strategic vision and accountability

Analyses of future of work emphasize that uniquely human capabilities such as meaning making, purpose setting, and leadership become more important as AI automates routine tasks.

For nonprofits, this includes:

Articulating a compelling mission and narrative that mobilizes supporters.
Holding hope in difficult circumstances and helping communities imagine better futures.
Navigating ambiguity and conflict within coalitions and movements.

AI can suggest slogans and scenarios, but it cannot truly care whether a community thrives or fails.

A practical decision framework: should I use AI for this?

What are the stakes if this output is wrong? Low stakes (internal brainstorm, first draft for your eyes only): AI is probably fine. High stakes (public communication, legal document, client-facing decision): AI requires careful human verification and oversight.
Does this task require deep knowledge of our specific context, community, or relationships? If yes, AI can assist but should not lead. The human with that knowledge needs to be central.
Is there a vulnerable person on the receiving end of this output? If yes, apply a higher standard of review and ensure human accountability.
Am I comfortable being fully transparent with our stakeholders about AI involvement here? This is a useful gut check. If the answer is no, that discomfort is worth paying attention to.
Is the efficiency gain worth the risk in this specific case? For a one-time task that takes 10 minutes and requires high accuracy and human judgment, AI may not add value. For low-stakes & repetive tasks, the ROI calculations could be very different.

Lesson 3: How to get what you want from AI

This lesson is the most practical in the course. It is about skill-building: how to prompt well, how to give AI the right context, how to build tools your whole team can use, how to verify what AI gives you, how to manage costs, and how to set up systems that improve over time.

The anatomy of a good prompt

A prompt is anything you type or say to an AI system. A bad prompt produces a generic, unhelpful, or wrong response. A good prompt produces something you can actually use. The difference is usually not about length or complexity. It is about whether you have given the AI what it needs to understand your task clearly.

A good prompt typically contains some combination of the following elements:

Goal: What do you want and for whom.
Output: Format, length, tone, language.
Constraints: What to avoid, policies, word limits, budget or token limits.
Context: The minimum information the AI needs to do the task well.
Examples: One or two samples that show what “good” looks like.

Context engineering > prompt engineering

Most AI guides focus on “prompt engineering”: the specific words and techniques you use to structure your request. And it’s true that some small “tricks” in prompts used to have big impacts.

But as models get more “intelligent”, the wording and structure of your prompts is not so important. They will probably understand what you want regardless of the specific words you use. And they need less explanations and handholding to complete tasks successfully.

What is still very important (and will always be) is context engineering: deciding which information you give an AI system for a specific task. If you give too much or too little information, you will get bad results.

If the AI knows nothing about your organization, your voice, your programs, or your audience, it will always produce generic responses that are not very useful.

But dumping huge amounts of text or documents (“context dumping”) often reduces quality and increases costs, loading times, privacy risks, etc.

So we have to be selective about the info that we give AI.

What does context engineering look like in practice?

Building rich context documents. Create documents that you (or your team) paste into AI conversations to provide background: your organization’s mission, programs, and impact data; your key messages and brand voice; your audience descriptions; your current priorities. These do not need to be elaborate. Even a one-page organizational brief that you paste at the start of a relevant AI session dramatically improves results.
Using system prompts in tools that allow them. Many AI tools allow you to set persistent instructions (called a system prompt or custom instructions) that apply to all conversations.
Curating what you include. Context engineering is not about giving the AI everything. It is about giving it the right things. A 100-page document pasted indiscriminately into a prompt may actually produce worse results than a focused two-page summary because the AI has to work harder to identify what is relevant. Select and prepare your context deliberately.

Using system prompts and custom instructions

Most serious AI tools allow you to set instructions that apply persistently, not just within a single conversation. Depending on the tool, these are called system prompts, custom instructions, or similar. Understanding and using them well is one of the highest-leverage things you can do to improve your team’s AI results.

A system prompt is a set of instructions given to the AI that shapes how it behaves in every conversation. It can specify: who the AI is in the context of your organization, what it knows about your organization, what tone and style it should use, what it should and should not do, and how it should handle common situations.

Here is an example of a basic system prompt a nonprofit communications team might set:

You are a writing assistant for [Organization Name], a nonprofit that provides free legal services to undocumented immigrants in [Region]. Our voice is warm, clear, and grounded in the experiences of the people we serve. We avoid legal jargon in public communications. 

Our audiences include: the families we serve (primarily Spanish-speaking, limited English proficiency); donors and foundations (professional, values-driven); and the general public (interested but not expert). When helping with writing tasks, always ask which audience this is for if I have not specified. 

Our key messages are: [insert]. 

Never describe our clients as 'illegal' or 'aliens.' Preferred terms: 'undocumented immigrants,' 'immigrant families,' 'the people we serve.'

This kind of system prompt means every conversation with the AI starts from a shared understanding of who you are and what you need, without repeating it every time. It improves consistency across your team and reduces the cognitive load on individual staff members who use the tool.

Choosing the right tool for the right task

There are literally thousands of tools with AI features now in the market, so it’s not easy to choose. A few recommendations:

A common mistake is either using a single tool like ChatGPT for everything (missing out on tools that are significantly better for specific tasks) or accumulating too many tools without a clear rationale for each one (creating cost overlap, governance complexity, and staff confusion).
You have to test different tools and then decide which ones makes more sense, not just based on the quality of its results, but also considering easy-of-use, speed, integrations, cost, etc. If a tool delivers great results but it’s difficult or slow to use (requires copy-pasting between tools, waiting for results, changing formats, etc.), your staff might never use it or get tired of it in a few months.
Before adding new AI tools, review what you already have. Most major software platforms that nonprofits already use have built AI features into their existing products: CRM platforms, email marketing tools, project management software, document editing suites, and more. They are often a better starting point than adding a separate AI tool, because they integrate directly into workflows staff are already using, often without additional cost (if your subscription already includes them), and with data handling governed by agreements you have already reviewed.

Build tools for repetitive tasks

Once your team is comfortable using AI for individual tasks, the next level of value comes from building reusable tools.

There are different options:

Custom AI configurations (custom GPTs, projects, and similar)

Many AI platforms now allow you to create configured versions of the AI assistant with specific instructions, context, and capabilities built in. OpenAI’s custom GPTs, Claude’s Projects feature, and similar functionality in other platforms let you create a dedicated “Grant Writing Assistant” that knows your organization’s programs and voice, a “Donor Communication Helper” pre-loaded with your communication guidelines, or a “Policy Research Tool” configured to focus on your specific issue area. These configured tools lower the barrier for staff to get great AI results without needing deep prompting expertise.

Automated workflows

Workflow automation platforms like Make, Zapier, and n8n allow you to connect AI capabilities to your other software tools without writing code (or with minimal code). Common nonprofit automation workflows include:

Automatically summarizing incoming emails or messages that meet certain criteria and routing them to the appropriate staff member.
Generating a first-draft follow-up email when a new donor makes their first gift, pre-populated with the donor’s name, gift amount, and the specific program they supported.
Transcribing and summarizing meeting recordings and posting the summary to a shared workspace.
Generating a weekly digest of mentions of your organization’s name across news and social media.
Processing incoming grant application materials, extracting key fields, and populating them into a tracking spreadsheet.

These automations are not complex to build (especially with AI assistance in the building itself), but they require some upfront investment of time. The payoff is ongoing: hours saved every week on tasks that previously required manual staff time.

Tailored AI agents

AI agents can carry out multi-step tasks autonomously. Building agents requires more technical capacity than basic automation, but the potential time savings for complex and/or high-volume tasks are significant.

A key principle for all of these tools: build them collaboratively with the staff who will use them. The people who do a task every day know its nuances better than any manager or AI specialist. Tools built without their input tend to miss important edge cases and get low adoption.

Set up your own evaluations & benchmarks

To get the best possible results from AI, you should develop your own evaluation systems. It’s the only way to take reliable decisions, based on your own goals and data.

There are many public AI benchmarks (e.g. Artificial Analysis, Epoch). They are a good reference, but you still have to do your own testing. The best model in a general benchmark could be pretty bad for your specific tasks. And not all models are included in these public benchmarks, only the most popular ones.

You can start with a simple document with 10 key prompts you use frequently. Every time you want to try a new AI tool or model, you test it with those prompts and compare the results (accuracy, appropriate tone, absence of harmful or incorrect information, etc.).

You should also test again your key AI workflows every 6 months at least, even if you are not planning to change their setup or models (sometimes providers change the underlying models or their performance degrades without you noticing).

For high-volume or high-stakes AI workflows, you might want to create automated eval setups, where you automatically test a lot of prompts and/or large datasets (including edge cases and risky examples), using another AI model and/or custom code to judge the results without requiring hundreds of hours of human work.

Allocate budget for paid AI tools

Free AI tools exist and can be genuinely useful. But organizations that rely exclusively on free tiers are accepting significant tradeoffs: weaker models, less privacy protection, and usage limits that cap your team’s ability to work at full capacity.

The cost of good AI tools is modest relative to the productivity value they provide. A small nonprofit team with access to paid tiers of one or two well-chosen AI tools will significantly outperform the same team trying to get by on free tiers.

Consider the cost of not investing. The relevant comparison is the cost of paid AI tools versus the staff time, capacity, and quality of output you would need to achieve the same results without them.

Cost forecasting and control

AI costs can be difficult to predict, especially when usage is growing, when you are experimenting with new tools, or when you begin using AI via API (where you pay per use rather than a flat subscription). Developing basic cost management practices early prevents unpleasant surprises and helps you make smarter decisions about which tools to use for which tasks.

Understand the pricing model for each tool you use. Subscription tools (flat monthly or annual fee) are simpler to budget. API-based tools charge by usage, typically measured in tokens. Usage-based pricing can scale significantly as your team’s AI use grows, which can be a good thing (you pay for what you use) or a challenging one (a workflow that runs unexpectedly often can generate unexpectedly large bills).
Set usage alerts and spending caps for API-based tools. Most API providers allow you to set alerts when spending reaches a threshold and hard caps that prevent usage beyond a set limit. Configure these from the start, not after you receive your first surprising bill.
Right-size model usage to task complexity. Using the most powerful (and expensive) AI model for every task is like hiring your most experienced consultant to answer basic questions. For routine tasks (drafting a simple email, summarizing a short document, generating a list of options), a smaller, faster, cheaper model is often sufficient.
Audit usage periodically. Every few months, review which tools are being used, how much, and by whom. Are there subscriptions nobody is using? Are there tasks being done with expensive API calls that could be handled with cheaper models? Could we save money by moving from user subscriptions to API calls or viceversa?
Make the ROI case explicitly. When AI costs come up in budget discussions, come prepared with a concrete estimate of the staff time saved or other KPIs.

Create a shared AI library for your team

One of the highest-leverage investments you can make in your team’s AI effectiveness is building and maintaining a shared AI library: a documented, accessible collection of the prompts, workflows, context documents, and guidelines that your team uses most.

Without a shared library, every staff member has to rediscover for themselves how to get good results from AI. And when someone leaves the organization, their AI expertise goes with them.

A shared library changes this. It means collective learning accumulates rather than evaporates. It means new staff can be productive with AI tools faster. It means your organization’s prompts and workflows improve systematically rather than depending on individual initiative.

Your AI library can include:

Curated prompts for your most common tasks: Each prompt should include context about when to use it, what inputs to provide, and what to watch out for in the outputs.
Organizational context documents: the one-pagers about your programs, your mission, your key messages, your brand voice, your audience descriptions, that you paste into AI sessions to provide background.
Workflow documentation: step-by-step descriptions of AI-assisted workflows your team has developed for specific tasks. Documented workflows make it possible for anyone on the team to replicate what works.
Tool guidance: which tool to use for which task, with brief rationale. Your team should not have to figure out from scratch each time whether to use the general-purpose LLM, the built-in CRM AI feature, or the specialized writing tool for a particular task.
Lessons learned: brief notes on AI approaches that did not work, edge cases to watch out for, common errors in specific output types, and other practical wisdom accumulated through experience.

Keep this AI library in a place everyone can access and that is already part of your team’s workflow: a shared Google Drive folder, a Notion workspace, or wherever your team’s shared documents actually live. It should be easy to access and update for everyone.

Assign ownership. A library without a designated maintainer decays. Someone needs to be responsible for keeping it organized, removing outdated content, and prompting the team to add new learnings.

Summary

Context engineering is more important than prompt engineering. Give AI the organizational background it needs to produce outputs that actually sound like you and reflect your real work.
System prompts are high-leverage tools for making AI consistently useful across your team without requiring everyone to be an expert prompter.
Audit your existing software for AI features before adding new tools.
Build reusable configurations (like custom GPTs) and automations for your most common and repetitive tasks.
Build a shared AI library so your team’s collective learning accumulates rather than disappearing into individual silos.

Lesson 4: Using AI responsibly and reducing risks

Nonprofits depend on trust from beneficiaries, communities, donors, and regulators. AI can help your mission, but it can also damage that trust if you use it carelessly.

You will learn how to put basic guardrails in place so that AI serves your values, your people, and your legal obligations.

Train your staff continuously

The single most important thing you can do to reduce AI risk in your organization is invest in ongoing AI education for all your team. Not a one-time workshop. Not a PDF policy document circulated via email. Continuous, practical, role-relevant learning.

AI tools change fast. A staff member who learned how to use ChatGPT in 2024 may have outdated mental models about what current tools can and cannot do. An organization that treats AI training as a one-time event will have a team whose knowledge drifts further out of date every month.

Training should also address risks explicitly, not just capabilities. Staff who only learn what AI can do, without understanding hallucinations, data privacy risks, bias, and the importance of verification, are more likely to make consequential mistakes.

Some practical approaches:

Create a shared channel (in Slack, Teams, or whatever you use) where staff post AI tips, experiments, and cautionary tales.
Designate a monthly “AI learning” slot in an existing all-staff meeting where someone shares something new they have tried or learned.
Prioritize AI trainings designed for nonprofits. The risks and opportunities for a nonprofit organizations are not the same as for businesses or other AI users. Also, different departments have different risks (programs, development, comms, etc.). Generic AI training is better than nothing, but not ideal. The most effective training shows people how AI applies to the specific tasks they actually do.

Develop an organizational AI policy

Your organization needs a written AI policy. Not because a policy document magically prevents all harm, but because the process of developing it forces important conversations, and having it in place creates clarity and accountability.

An AI policy does not need to be long or technically complex. But it should address the following questions clearly:

What AI tools are approved for use at your organization? Staff should not have to guess whether a particular tool is acceptable. You can mention specific tools, providers or categories (e.g. ChatGPT, OpenAI, AI image generators, chinese open-source models or any tools without a certain certification). You can list only approved tools/categories, only banned ones or both.
Who approves new AI tools before staff adopt them? Without a clear process, you end up with shadow AI use and bigger risks. Designate who is responsible for evaluating and approving tools, and make it easy for staff to bring new tools forward for review.
What data can and cannot be entered into AI tools? Clearly specify that certain categories of data (e.g. personal information, sensitive financial data) should not be entered into external AI tools without explicit approval or appropriate safeguards (e.g. anonymization).
What are the expectations around disclosing AI use? In what contexts and tasks should staff disclose to external parties (funders, clients, partners) that AI was used?
What are the consequences of policy violations? Staff should understand that mishandling data or using unapproved AI tools is a serious matter. But be careful with your limits and punishments. If people fear punishment, they may hide their usage (”shadow AI use”), actually increasing risks. If they feel invited to share experiments, you can manage risks together.
How will the policy be reviewed and updated? Given how fast AI evolves, a policy may be significantly out of date in less than a year. Build in a regular review cycle and assign someone responsibility for monitoring developments and flagging when the policy needs updating.

Getting staff input during policy development is important both practically (they know the real workflows that need to be addressed) and culturally (they are more likely to follow a policy they helped shape). Consider forming a small working group that includes representatives from different teams and roles.

Establish clear ownership & accountability

AI tools are powerful, but they do not absolve humans of responsibility. Clear ownership and accountability are crucial for managing AI risks. This involves:

Designated AI leads: Appointing individuals or teams responsible for overseeing AI strategy, policy implementation, and risk management.
Process ownership: Ensuring that every AI-augmented workflow has a clear human owner who is accountable for the process and its outcomes.
Incident response: Integrating AI-related incidents (e.g., hallucinations, data breaches) into the organization’s broader incident response plan, with clear roles and procedures.

Review AI tools and vendors carefully

Not all AI tools are created equal when it comes to privacy, security, ethics, and long-term reliability. Before your organization adopts any AI tool for regular use, especially one that will handle sensitive data or be used in client-facing contexts, you should conduct a meaningful evaluation.

Here is what to examine:

Privacy and data use policy. Does the vendor use your inputs to train their models? Under what circumstances is your data retained, and for how long? Who has access to it? Can you request deletion? Read the actual terms of service, not just the marketing copy. If the terms are unclear or evasive, that is itself a signal.
Security practices. Does the vendor use encryption in transit and at rest? Do they have relevant security certifications? What is their track record on security incidents? What happens to your data in the event of a breach?
Ethical track record. Has the vendor been involved in public controversies related to bias, manipulation, misinformation, or harmful use of their technology? Do they publish transparency reports?
Compliance with relevant regulations. Depending on your jurisdiction and sector, you may be subject to regulations (GDPR, HIPAA, state privacy laws, sector-specific requirements) that constrain which tools you can use and how.
Vendor lock-in. Can you export your data, your custom configurations, and your institutional knowledge if you decide to switch tools?

A practical approach for smaller organizations: develop a short vendor assessment checklist that anyone proposing a new AI tool must complete before it is approved. This creates a consistent evaluation process without requiring legal expertise every time someone wants to try a new tool.

Avoid risky and unethical uses

Some AI uses are clearly high risk for nonprofits even if they are technically possible. Global compliance guides and the EU AI Act highlight especially sensitive areas such as biometric identification, eligibility screening, and systems that affect fundamental rights.

You should be very cautious or avoid AI for:

Automated decisions that determine access to services, benefits, or support.
Predictive policing, surveillance, or biometric systems that could harm civil liberties.
Manipulative targeting of vulnerable groups with fundraising or political messages.
AI-generated content that is presented as authentic human experience (e.g. a donor story or a testimonial from a beneficiary).

When in doubt, ask three questions:

Could this harm vulnerable people?
Could it affect fundamental rights?
Would we be comfortable explaining this use face to face with beneficiaries and regulators?

Prevent prompt injection and other attacks

Prompt injection is a technique where malicious or unexpected content inside documents or web pages tricks an AI into ignoring its original instructions and doing something else. Recent security and risk analyses show that AI systems which read external content or connect to tools are vulnerable to this class of attack.

For nonprofits, this matters if:

Your AI tools read emails, PDFs, or web pages from outside your organization.
You connect an AI assistant to systems that can send messages, update records, or access files.

Basic defensive measures include:

Limiting what external content the AI can execute as instructions, for example treating unknown text strictly as data to summarize.
Restricting what tools or actions an AI agent can perform automatically and requiring human approval for sensitive actions.
Keeping detailed logs and alerts for unusual behavior, such as an AI trying to exfiltrate data or send many emails.

For most nonprofits, the safest approach is keeping AI agents in “read and draft” mode and not giving them direct write access to critical systems without strong security support.

Minimize data sharing and use anonymization

Every piece of information you enter into an AI tool is potentially at risk: from data retention by the vendor, from security incidents, from policy changes by the platform, and from your own staff’s future mishandling. The simplest risk reduction strategy is to share less data in the first place.

Practical principles for data minimization:

Default to not entering sensitive data into external AI tools. Client personal information, staff HR records, confidential donor details, sensitive financial information, and anything covered by a confidentiality agreement should not go into external AI tools without a specific, evaluated reason and appropriate safeguards.
Anonymize before you enter. For many tasks, you can get AI assistance without entering identifiable information. Instead of pasting a client’s case notes with their name and identifying details, replace those details with placeholders (“Client A, a 34-year-old single parent in a major urban area”) before using AI to help you draft a summary or recommendation.
Use on-premise or local deployment options for sensitive workloads. For organizations handling particularly sensitive data (health information, immigration status, legal matters, financial vulnerability), it is worth exploring AI tools that can run locally or within a private cloud environment where your data does not leave your control.
Educate staff on what counts as sensitive. Not everyone instinctively recognizes all categories of sensitive information. Building explicit guidance into your AI policy (and training) about what kinds of data require special handling helps staff make better decisions in the moment.

Minimize bias

AI systems can perpetuate, amplify, and in some cases introduce new forms of bias: racial bias, gender bias, economic bias, cultural bias, and more. For nonprofits working with communities that have historically been marginalized, ignored, or harmed by institutional systems, this is not an abstract concern.

What can your organization do?

Test AI tools for biased outputs before deploying them in any context that affects people.
Ask AI tools to generate content for different demographic groups and compare the results.
Involve people from affected communities in the evaluation of AI tools used to serve those communities.
Choose vendors who publish information about bias testing and mitigation in their systems.

If you detect biased behavior, you may need to adjust prompts, switch models, change your examples/data, or in some cases stop using AI for that task.

Minimize environmental cost

Training large AI models consumes enormous quantities of energy and other resources. Running these models at scale continues to consume significant resources.

Nonprofits can respond in several ways:

Use AI where it clearly advances your mission and avoid frivolous usage.
Consolidate workflows to reduce unnecessary repeated calls, for example summarizing a document once and sharing the result with your team instead of many people running separate queries.
Prefer more efficient models when they are good enough. Consider using only small and/or local models if you prioritize energy consumption over results.
Favor vendors who are transparent about their environmental impact and have credible commitments to renewable energy.

Consider intellectual property and copyright

Generative AI systems are trained on a mix of licensed, public domain, and possibly copyrighted material, and legal debates about how copyright applies are active in many jurisdictions.

Practical guidelines for nonprofits:

Treat AI generated content as material that might embed patterns from copyrighted works. Avoid using it as “exclusive” content that you claim as fully original without review.
Be careful with logos, images of real people, or close imitations of existing art or text. When in doubt, use your own assets or clearly licensed material.
Check vendor terms about ownership of generated content and training data, and how they handle copyright claims.

For high visibility campaigns, consider asking legal counsel to review your use of AI generated content.

Summary

Responsible AI use is not a one-time compliance exercise. It is an ongoing organizational practice that requires investment in training, governance, culture, and continuous monitoring.
The most important structural elements are: a written AI policy that is actually used and regularly updated; clear human ownership and accountability for every AI system; and a rigorous vendor evaluation process.
The most important cultural elements are: a team that understands both AI’s capabilities and its risks; an environment where staff can raise concerns and ask questions without fear; and genuine transparency with the people your organization serves and works with.

Lesson 5: How to prepare for the future

AI will keep changing over the next decade. Your goal is not to predict everything, but to build a nonprofit that is flexible, skilled, and ready for different futures.

Invest in AI fluency as a core competency for all staff

In 2026, AI fluency is no longer a specialized technical skill. It is a basic professional competency, the same way spreadsheet literacy became a baseline expectation for office workers in the 1990s and internet literacy became one in the 2000s.

What does investing in AI fluency actually look like in practice?

Allocate budget for AI training. Depending on the skills and needs of your organization, it could be custom training developed in-house, external courses or a mix.
Make it part of onboarding. New staff should receive AI training as part of their orientation, not as an optional add-on. Cover your organizational AI policy, the approved tools, the data handling expectations, and the practical skills relevant to their role.
Encourage continuous learning: Leaders should treat time spent learning and experimenting with AI as part of the job, not something staff must do “after hours”.
Create space for peer learning. Build structures that let people share their learnings (specific AI channels or forums, brief show-and-tell moments in meetings, documented experiments in a shared library).

Build a culture of experimentation and safe failure

Organizations that will navigate AI well over the coming years are those that are genuinely curious about new tools and approaches, willing to try things that might not work, and structured to learn from failures rather than hide them.

A culture of excessive caution around technology is itself a risk: it means your organization learns slowly, misses opportunities, and eventually finds itself significantly behind peer organizations.

Safe experimentation does not mean careless experimentation. It means:

Creating safe spaces for trying new things. Low-stakes internal projects, pilot programs with defined scope, or explicitly experimental initiatives where staff understand the goal is to learn, not just to succeed. An experiment that reveals AI is not useful for a particular task is just as valuable as one that reveals it is.
Rewarding learning and honesty about failure. When someone tries an AI approach that does not work and they report honestly on what happened and what they learned, that should be recognized as a contribution, not treated as an embarrassment. Organizations where failure is punished develop cultures where people either stop experimenting or hide the results of experiments that did not go well. Both outcomes are bad.
Documenting and sharing what you learn. Experiments that are not documented do not generate institutional learning. Build the habit of capturing what was tried, what happened, and what you would do differently.

Store internal data and knowledge in AI-friendly formats

AI tools are only as useful as the knowledge and data you can give them to work with.

Organizational knowledge that is locked in inaccessible formats, scattered across personal hard drives, buried in email threads, or stored only in people’s heads cannot be leveraged effectively by AI tools.

What does “AI-friendly” mean in practice?

Text-based formats over image-based ones. A scanned PDF of a document is much harder for AI to use than the same document as a Word file or a properly formatted PDF with selectable text. Where possible, digitize and convert documents into formats that AI can read directly.
Structured storage over scattered storage. Documents and data that live in a shared, organized system (a knowledge base, a shared drive with consistent naming conventions, a CRM with complete records) are much more useful to AI tools than the same information scattered across individual email inboxes, personal Dropbox folders, and the memories of long-tenured staff.
Documented processes over undocumented ones. If your organization’s knowledge about how things are done lives primarily in the heads of particular staff members, it is both organizationally fragile (what happens when those people leave?) and AI-inaccessible. Investing in process documentation (SOPs, checklists, guides) creates an asset that AI tools can use to help onboard new staff, answer operational questions, and assist with consistent execution.

Appoint internal AI champions

Every organization needs people who are specifically tasked with staying current on AI developments, testing new tools, sharing learnings with colleagues, and helping translate between the fast-moving AI landscape and your organization’s specific needs and context.

What should an AI champion actually do?

Monitor developments in AI relevant to your work.
Test new tools before recommending them.
Support teams in designing prompts, workflows, and evaluations.
Coordinate updates to your AI policy, library, and training.

The AI champion role should be recognized and supported: given dedicated time, access to paid tool subscriptions for testing, occasional budget for relevant training or conferences, and visibility as a valued function rather than an extracurricular hobby.

Monitor legislation and compliance rules

The regulatory environment around AI is changing rapidly and will continue to do so. Organizations that are not paying attention to relevant legal developments risk finding themselves out of compliance with requirements they did not know existed.

Regulatory developments to watch include:

Data privacy law. AI tools that process personal data are subject to existing data privacy regulations (GDPR in Europe, CCPA in California, and a growing patchwork of state-level US laws, plus sector-specific regulations like HIPAA).
AI-specific legislation. The European Union’s AI Act is the most comprehensive AI regulatory framework currently in force, and its provisions apply to many organizations that process data about EU residents, even those based outside Europe. In the United States, a patchwork of state-level AI regulations is emerging, with requirements around algorithmic transparency, bias auditing, and disclosure in specific sectors.
Sector-specific requirements. Organizations in healthcare, education, legal services, housing, or financial services may face AI-specific requirements from their sector’s regulators.
Employment law implications. Using AI tools to assist with hiring, performance evaluation, or compensation decisions may trigger requirements under emerging AI employment laws in several jurisdictions.

A few practical tips:

Designate someone to monitor AI regulatory developments relevant to your jurisdiction and sector.
Subscribe to a nonprofit technology or legal newsletter that covers these developments.
Build a relationship with a legal advisor who is following AI law, even if you only consult them occasionally.
When you evaluate new AI tools or expand your use of existing ones, include a question about compliance implications in your review process.

Monitor open-source AI developments

Open-source models can be downloaded, run locally, and modified by anyone.

In 2026, open-source models can perform at or near the level of proprietary models for many common tasks. And they can be run locally, on hardware your organization controls, without sending data to an external server.

Why does this matter for nonprofits?

Privacy and data sovereignty. Running AI locally means sensitive data never leaves your systems. For organizations that handle client information, health data, immigration status, or other sensitive information, this is a significant advantage.
Independence from big tech vendors. Reliance on proprietary AI tools from a small number of large corporations creates dependencies: on their pricing decisions, their terms of service, their political choices, and their continued existence. Open-source models reduce this dependency. An open-source model that exists today will continue to exist even if the company that created it changes direction or ceases operations.
Cost. Running open-source models on your own hardware has costs (the hardware itself, energy, maintenance), but for organizations that have significant AI usage or powerful hardware already, it can be cheaper than subscriptions or API fees for proprietary tools.
Community accountability. Open-source models can be evaluated, tested for bias, and audited by researchers, civil society organizations, and the broader public in ways that proprietary models cannot.

The #1 barrier for most nonprofits today is technical skills. But this is changing, with increasingly user-friendly tools for deploying open-source models without deep technical expertise (LMstudio, GPT4all, Ollama, etc.).

Communicate change: prepare stakeholders and beneficiaries

AI usage will continue to grow in the next years, touching more tasks and areas. You have to plan for it and communicate correctly to avoid damaging your reputation.

Stakeholders who feel AI was introduced into their relationship with your organization without transparency or consultation are likely to feel disrespected. Communities that have historically been harmed by algorithmic systems are especially likely to have legitimate concerns that deserve genuine engagement.

Good practices:

Explain clearly why you are using AI, what tasks it supports, and what remains human led.
Be transparent when beneficiaries interact with chatbots or AI drafted content.
Invite feedback and concerns from communities, and adjust your approach where it conflicts with their expectations or rights.

This is also an opportunity: showing that you are using AI thoughtfully and ethically can strengthen your credibility with donors and partners who worry about irresponsible deployments.

Plan for different AI future scenarios

No one knows exactly how fast AI capabilities will advance or how markets and regulation will react. But it can be very useful to do some scenario planning: imagining several plausible futures and asking what each would mean for strategy.

You can run a simple scenario exercise with your team. For example, discuss three futures:

Fast AI acceleration: models become much more capable (close to AGI) and integrated into most tools by 2030, with AI agents doing autonomously +50% of our repetitive tasks. What new opportunities and risks appear for your programs? What will happen to your staff?
Slowdown or backlash: technical limits or public pushback slow AI adoption, and funders become skeptical of AI hype. How do you justify and adjust past investments? Can you keep using the same AI tools (or maybe open-source alternatives that will be available forever)?
Fragmented AI world: heavy regulation and geopolitics create different AI ecosystems by region, with varying access to models and data. How do you manage cross border programs and data flows?

You don’t need a detailed plan for each scenario. You need to:

Avoid decisions that are catastrophic in some plausible scenarios (e.g. extreme vendor dependency, complete deskilling of your team in a particular area because AI handles it).
Favor decisions that hold value across multiple scenarios (e.g. strong learning culture, documented processes, diverse tool portfolio including open-source solutions).
Have honest conversations at the leadership and board level about which scenarios seem most relevant to your context and what your organization’s response would be. Revisit those conversations periodically as the landscape becomes clearer.

The goal is not to predict the future. It is to build a resilient organization that can respond to a range of futures and adapt quickly to key changes.

Summary

Invest in AI fluency as a universal staff competency, not a specialty skill.
Build a culture where experimentation and honest reporting on failure are both valued.
Store your organizational knowledge in forms that AI tools can use easily.
Appoint AI champions who have the time and mandate to stay current and support their colleagues.
Communicate proactively and honestly with stakeholders about AI use.
Preparing for the future is less about guessing which model will win and more about building good governance and learning cultures that will serve you under many possible AI futures.

Next steps

Get free “AI Superpowers”. If you want premium tools & support, join our membership.

Get help from IA experts. Request a free consultation and get custom recommendations.

Receive new AI tools for nonprofits. Subscribe to our newsletter and follow on Linkedin.