MwalimuPLUS started as a tutoring product for Kenyan primary-school students. Three years later, it's used by more than 200,000 students across eight counties and runs as one of the largest production AI deployments in East African education. This article is the honest field-notes version: what worked, what didn't, and what we'd do differently if we were starting over.
It's written with the cooperation of the MwalimuPLUS team and reflects the system as it operates in late 2025.
What we built
At its core, MwalimuPLUS is an adaptive learning system. A student logs in, takes a short diagnostic, and the system serves a sequence of curriculum-aligned questions, explanations, and worked examples calibrated to their current level. As they work, the system updates its model of where they are and adjusts what it serves next.
Underneath, it's a collection of focused services: an adaptive engine that selects the next item, a content authoring system used by Kenyan teachers to maintain the question bank, an LLM-driven explanation generator, a teacher-facing dashboard, and a parent app.
What worked
Curriculum-anchored content. Every question is mapped to a specific outcome in Kenya's Competency-Based Curriculum (CBC). Teachers trust the system because it speaks their curriculum, not a generic global one. This is the single most important design decision we made.
Low-bandwidth-first design. Pages are under 200 KB and lessons load in 2-3 seconds on 3G. The app keeps working in poor connectivity, which is the reality across most of the country.
Adaptive engine over flashy AI. The recommendation engine is mostly classical (Bayesian Knowledge Tracing with hand-tuned mastery thresholds), not LLM-based. It's predictable, debuggable, and parents and teachers can understand the logic when they ask.
LLMs for explanation, not for sequencing. We use LLMs for one specific job: generating worked-example explanations in the student's preferred language and reading level. Sequencing decisions stay with the deterministic engine. This separation has held up in production for two years.
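To make the last two points concrete, here is a minimal sketch of the kind of update a Bayesian Knowledge Tracing engine performs after each answer, with sequencing kept deterministic and the LLM confined to explanation text. The parameter values, names, and threshold below are illustrative assumptions, not MwalimuPLUS's production values.

```python
from dataclasses import dataclass

@dataclass
class SkillState:
    """Per-student, per-skill estimate. All values are illustrative assumptions."""
    p_known: float = 0.20  # p(L): probability the skill is currently mastered
    p_learn: float = 0.15  # p(T): probability of learning on each attempt
    p_slip: float = 0.10   # p(S): probability of answering wrong despite mastery
    p_guess: float = 0.20  # p(G): probability of answering right without mastery

MASTERY_THRESHOLD = 0.95   # hand-tuned per skill in a system like this

def bkt_update(s: SkillState, correct: bool) -> SkillState:
    """Standard BKT posterior update after observing one answer."""
    if correct:
        num = s.p_known * (1 - s.p_slip)
        posterior = num / (num + (1 - s.p_known) * s.p_guess)
    else:
        num = s.p_known * s.p_slip
        posterior = num / (num + (1 - s.p_known) * (1 - s.p_guess))
    # Fold in the chance the student learned the skill on this attempt.
    s.p_known = posterior + (1 - posterior) * s.p_learn
    return s

def next_action(s: SkillState) -> str:
    """Sequencing stays deterministic; an LLM would only be asked to phrase
    the worked-example explanation, never to choose what comes next."""
    if s.p_known >= MASTERY_THRESHOLD:
        return "advance_to_next_skill"
    return "serve_another_item"
```

With these illustrative priors, one correct answer moves p_known from 0.20 to roughly 0.60, a legible, step-by-step update that a teacher or parent can have explained to them. That legibility is exactly what the classical engine buys.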
What was harder than expected
- Content velocity. Maintaining a quality question bank against a curriculum the regulator keeps revising is genuinely hard. We spent more time on the authoring tools than on the AI.
- Parent engagement. Parents are hard to onboard. The pattern that worked was making the parent app feel like a natural extension of the student experience rather than a separate product.
- Device diversity. Kids share family phones running Android 8 through Android 14. Test matrices got large. Performance optimisation paid for itself ten times over.
- Teacher adoption. Teachers want to maintain a sense of professional authority. Tools that "do the teaching" face resistance. Tools that "help the teacher do the teaching" get adopted. We rebuilt the teacher dashboard twice before getting this right.
What we got wrong
We over-invested in the algorithm and under-invested in the content tooling for the first 18 months. The system's quality is bottlenecked by content quality — and content quality is bottlenecked by how easy it is for teachers to author and review questions.
We also underestimated the operational cost of running the LLM-based explanation generator. At scale, those tokens cost real money, and the cost-optimisation work that closed the gap took us two quarters. Operators new to LLMs should plan for that.
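To give a feel for the order of magnitude, here is a back-of-the-envelope spend model. Every number in it is an assumption chosen for illustration, not MwalimuPLUS's actual traffic, token counts, or vendor pricing.

```python
# Back-of-the-envelope LLM spend model. All figures are illustrative
# assumptions, not MwalimuPLUS's real usage or pricing.
students = 200_000
explanations_per_student_per_week = 10
tokens_per_explanation = 600             # prompt + completion, assumed
usd_per_million_tokens = 1.00            # assumed blended rate
weeks_per_month = 4.33

weekly_tokens = students * explanations_per_student_per_week * tokens_per_explanation
monthly_cost = weekly_tokens * weeks_per_month / 1_000_000 * usd_per_million_tokens
print(f"~${monthly_cost:,.0f}/month")    # ~$5,196/month under these assumptions
```

The point of the sketch is that spend scales linearly with enrolment, so per-explanation cost is the lever that matters.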
The fact that something is "AI" is the least interesting thing about it once it's in production. What matters is whether the question bank is high quality, whether the teacher trusts what they're seeing, and whether the parent feels their kid is making progress.

— Titus Wangusi
What we'd do now
- Start with content tooling, not the algorithm.
- Bring teachers in as designers, not just users.
- Use LLMs for narrow jobs (generation, translation, summarisation) — not as the brain.
- Instrument from day one so you can defend every claim about learning gains with data.
- Plan for the parent app from week one, not month nine.
Closing
The temptation in education AI is to assume that bigger models will solve everything. They don't. The work that compounds is the unsexy stuff — clean curriculum mapping, fast-loading pages, teacher-friendly tooling, honest measurement. Get those right, and the AI parts mostly take care of themselves.
If you're working on AI in African education and want to compare notes, we're always interested in conversations.