AI Tutor or Human Tutor? A Practical Rubric for UK Schools Evaluating Maths Tools Like Skye
A practical rubric for UK schools comparing AI maths tutoring tools like Skye with human tutors on cost, safeguarding, alignment and impact.
Choosing between AI tutoring and human tutor programmes is no longer a branding decision or a simple price comparison. For UK school leaders, it is a procurement and impact question: which intervention best fits the curriculum, protects pupils, scales reliably, and produces evidence you can actually use in school improvement planning? That is especially true in maths, where time pressure, staffing shortages, and uneven pupil progress often make intervention choices feel urgent. If you are comparing AI-first tools such as Skye from Third Space Learning with human-led tuition, the right answer is rarely “always AI” or “always human”. It is “match the model to the problem”.
This guide gives school leaders a practical tutor evaluation rubric built around the criteria that matter most in schools: curriculum alignment, safeguarding, scalability, cost of intervention, progress evidence, and teacher oversight. It also shows how to use the rubric in school procurement, from initial shortlist to pilot review, so you can decide when AI tutoring is the best fit and when human tutors remain the safer, stronger option.
Pro tip: The cheapest per-hour tuition is not always the lowest-cost intervention. Schools should compare the total cost of delivery, administration, staffing time, safeguarding checks, and evidence of impact—not just hourly rate.
For many school leaders, this decision is happening in a more scrutinised market than ever. Since the National Tutoring Programme ended, schools have become more cautious about value for money and measurable impact, and that is pushing procurement teams to ask sharper questions. If you are building a broader intervention strategy, it helps to review related procurement and evaluation thinking in resources like student data privacy in assessments, performance optimisation for sensitive-data websites, and monitoring AI vendor signals before you commit to a platform.
1. The Real Question: What Problem Are You Trying to Solve?
Different interventions solve different problems
Before comparing products, define the intervention need. A Year 6 pupil who needs fluent arithmetic practice, low-friction repetition, and steady feedback is a different case from a GCSE pupil who needs conceptual explanation, emotional reassurance, and adaptive questioning. AI tutors can work extremely well when the school needs consistent, high-volume practice aligned to a narrow outcome. Human tutors tend to be stronger when the barrier is motivational, confidence-related, or cognitively complex. That distinction matters because the wrong intervention can look efficient on paper while producing weak progress in practice.
The most common error in school procurement is to treat all tuition as interchangeable. A school may choose a premium human tutor programme for a problem that is actually routine fluency, or deploy AI where a pupil needs relational support and guided reasoning. For a wider lens on matching tools to use cases, see how other leaders make shortlists in budget-friendly tool comparisons and how buyers assess quality in AI-designed products. The same procurement discipline applies in education: define the job first, then select the model.
Intervention aims should be measurable
Good interventions are specified in outcome language. Instead of saying “we need tutoring”, schools should say “we need 20 Year 8 pupils to improve algebraic manipulation so they can close a one-grade gap by summer”, or “we need targeted support for low prior attainers to secure foundational multiplication facts”. AI tutoring platforms often excel here because they can standardise the practice routine, ensuring every pupil receives the same sequence of tasks and feedback. Human tutoring can also be targeted, but the variation between tutors may be greater unless the programme is tightly managed.
For more on evaluating educational and content quality at scale, it is useful to look at the logic behind curation in AI-flooded markets and the thinking in bot-directory strategy for enterprise workflows. In both cases, the winner is usually the system that aligns better to the user journey and desired outcome, not the one with the loudest marketing.
AI tutoring is a model, not a magic wand
AI tutoring works best when it is constrained, curriculum-linked, and monitored. It is not the same thing as letting pupils chat freely with a general-purpose chatbot. A school-approved AI maths tutor should have guardrails, a known pedagogy, and reporting that helps staff interpret pupil activity. That is why schools should ask not simply “is it AI?” but “how is the AI controlled, evidenced, and aligned to classroom practice?”
2. Curriculum Alignment: The First Non-Negotiable
Match to the English national curriculum, not generic skill tags
Curriculum alignment is the backbone of any effective maths intervention. A platform may be impressive technically but useless if it does not map to the topics pupils are currently studying. Schools should ask whether the tool aligns to the English national curriculum, KS2/KS3 progression, GCSE strands, and the sequencing your department actually follows. In practice, the best systems do not merely label content as “fractions” or “algebra”; they break learning into small, teachable objectives that fit the school’s scheme of work.
Human tutor programmes can be adapted to the curriculum, but only if the provider has subject expertise and training routines that keep tutors on script. AI tutoring solutions such as Skye from Third Space Learning may offer stronger consistency at scale because each pupil experiences the same curriculum-matched pathway. That consistency matters in schools where intervention groups are spread across multiple year groups or where staff want to avoid tutor drift. For a broader example of subject-specific matching, see also personalised learning analogies and structured AI research pathways, where sequencing and relevance drive performance.
Look for diagnostic-to-practice coherence
The best maths tools do not just assign content; they diagnose gaps and route pupils to the right practice. A Year 5 pupil struggling with fraction equivalence needs different work from a pupil who can identify equivalents but cannot compare fractions on a number line. Good curriculum alignment means the diagnostic assessment, teaching input, practice tasks, and review questions all point in the same direction. If the platform cannot explain how it moves from diagnosis to intervention, the alignment may be superficial.
Ask for examples of how the system handles misconceptions. For example, does it distinguish between procedural errors and conceptual misunderstandings? Does it re-test after a short delay to check retention? Does it adapt to prior responses or simply repeat the same exercise with different numbers? These questions help schools avoid “busy work” interventions that create activity without genuine progress.
Curriculum fit should be visible to teachers
Teachers should be able to see what a pupil worked on, why that item was assigned, and how it connects to current classroom teaching. If the system is opaque, teachers lose confidence and adoption falls. The best platforms provide teacher-facing language that mirrors classroom vocabulary and shows how intervention work supports whole-class instruction. That makes it easier for teachers to integrate the tool into lesson planning and follow-up support.
3. Safeguarding and Data Protection: The School Cannot Delegate Risk
Safeguarding is more than DBS checks
Human tutor programmes typically lead with safeguarding credentials: enhanced DBS checks, school DSL liaison, code of conduct requirements, and escalation processes. These are essential, but they are not the whole story. Schools should also examine session recording, communication controls, identity verification, tutor supervision, and how the provider responds to concerns about language, behaviour, or inappropriate content. The strongest human tutoring providers make safeguarding operational, not just documentary.
AI tutoring changes the risk profile. There may be no adult in the loop during every interaction, but there are still safeguarding and compliance responsibilities around content safety, prompt handling, pupil identity, and privacy. Schools should ask how the vendor prevents off-topic or unsafe interactions, what data is stored, who can access it, and whether pupil information is used to train models. For a useful adjacent perspective, read about student data collection in assessments and AI-driven cyber threats. Educational technology teams should be applying the same defensive mindset.
Safeguarding must fit school workflows
A platform can have strong policies and still fail in practice if it does not fit school workflows. Leaders need to know how pupils log in, who monitors usage, how alerts are surfaced, and what teachers or safeguarding leads are expected to do if a concern arises. A good system should reduce hidden administrative burden, not create another set of inbox tasks. If teachers cannot quickly identify unusual behaviour or session anomalies, the safeguarding advantage disappears.
Human tutors have the advantage of relational detection. A skilled tutor may notice disengagement, anxiety, or personal issues in a way AI cannot. Yet that benefit depends on tutor quality and communication with the school. AI tutoring, by contrast, offers consistency and log-level visibility, but fewer human cues. Schools should weigh these differences honestly rather than assuming one model is categorically safer.
Data minimisation and role-based access matter
Good school procurement now requires clear answers on data minimisation and access control. Which fields are collected? How long are they retained? Can staff export progress data without exposing unnecessary personal information? Are dashboards role-based for class teachers, heads of department, DSLs, and senior leaders? These are practical questions, not legal trivia, because they affect how confidently the platform can be used at scale.
4. Scalability: Can the Model Deliver at School-Wide Volume?
AI tutoring is built for consistency at scale
Scalability is where AI-first solutions often make their strongest case. A fixed annual subscription can support unlimited or near-unlimited one-to-one sessions without the staffing bottleneck that human tuition creates. That means a school can expand intervention access without spending months recruiting, vetting, and scheduling tutors. For schools with large cohorts or recurring intervention needs, this can dramatically change the number of pupils served per pound spent.
That said, scale is not only about headcount. It is also about reliability, timetable fit, and consistency across terms. AI tutoring can be deployed across multiple year groups with predictable delivery, whereas human tutoring often depends on tutor availability and session matching. If your school values standardisation and wants a repeatable model across cohorts, AI may be the better operational choice. For a broader view of system design under constraints, see cloud-vs-on-prem decision making and TCO modelling, both of which emphasise the same principle: scalable infrastructure wins when demand is uneven and predictable service quality matters.
Human tutoring scales through people, but at a cost
Human tutor programmes can scale regionally or subject-by-subject, but the limiting factor is always people. Recruitment, vetting, training, and scheduling add friction. This is not a flaw in human tutoring; it is simply the nature of the delivery model. When a school wants a deep, relational intervention for a small group of pupils, that cost can be worthwhile. But when the need is broad and recurring, the human model often becomes difficult to sustain without budget growth.
That challenge is similar to what many service businesses face when comparing bespoke delivery to automation. In education procurement terms, the question becomes: do you need high-touch expertise for a limited population, or a system that can reliably serve dozens or hundreds of pupils without expanding headcount each term?
Choose scale based on intervention frequency
Interventions that need daily or near-daily practice are excellent candidates for AI tutoring. Interventions that depend on rich discussion, confidence-building, or bespoke academic coaching may need human tutors even if the scale is smaller. Schools should avoid forcing one model to do the other’s job. A good rubric uses frequency, cohort size, and the complexity of support as the main scale indicators.
5. Cost of Intervention: Compare Total Cost, Not Just Price Tags
What schools should include in cost calculations
When comparing AI tutoring and human tutors, the headline price can be misleading. School leaders should calculate the total cost of intervention: direct tuition cost, admin time, tutor management time, safeguarding overhead, reporting time, onboarding, and any extra staff work needed to coordinate the programme. An apparently cheap hourly rate can become expensive if it requires substantial internal administration or produces weak attendance. Conversely, a fixed-cost AI platform can look expensive at first glance but become highly efficient once multiple cohorts use it repeatedly.
This is why tools such as ROI scenario planners are useful as thinking models, even outside immersive tech. School leaders should run “what if” scenarios: What happens if uptake increases by 30%? What if one intervention cycle is repeated three times a year? What if the platform saves 20 hours of staff coordination? Those hidden variables often determine true value.
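To make those scenarios concrete, here is a minimal cost-of-intervention sketch in Python. Every figure in it is an illustrative assumption, not a quote from Skye or any tutoring provider; substitute your own pricing, session counts, and staffing estimates.

```python
# Illustrative total-cost-of-intervention comparison. All figures are
# placeholder assumptions; replace them with your own quotes and estimates.

STAFF_HOURLY_RATE = 35.0  # assumed cost of leader/admin time per hour (GBP)

def total_cost(direct_cost, admin_hours_per_term, terms=3):
    """Direct tuition cost plus the hidden cost of staff coordination time."""
    return direct_cost + admin_hours_per_term * terms * STAFF_HOURLY_RATE

def ai_platform_cost(annual_subscription, admin_hours_per_term=4):
    # Fixed subscription: cost does not grow as more pupils use it.
    return total_cost(annual_subscription, admin_hours_per_term)

def human_tutor_cost(rate_per_hour, sessions_per_pupil, pupils, admin_hours_per_term=12):
    # Per-hour model: cost grows with every extra pupil and session.
    return total_cost(rate_per_hour * sessions_per_pupil * pupils, admin_hours_per_term)

# "What if" scenarios: vary uptake and watch the per-pupil cost move.
for pupils in (30, 60, 90):
    ai = ai_platform_cost(annual_subscription=9000)
    human = human_tutor_cost(rate_per_hour=40, sessions_per_pupil=12, pupils=pupils)
    print(f"{pupils} pupils: AI £{ai / pupils:,.0f}/pupil vs human £{human / pupils:,.0f}/pupil")
```

Running the same loop with real quotes makes the hidden variables, such as coordination time, part of the comparison rather than an afterthought.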
Per-pupil cost depends on utilisation
AI tutoring can be particularly strong where utilisation is high, because the cost is fixed while the number of sessions can increase. That is why products like Skye are attractive to schools looking for predictable budgeting. Human tutoring may have excellent quality but is usually priced per hour, which means cost grows with every extra session. A single-pupil premium programme may justify that spend; a whole-year intervention strategy often will not.
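As a rough sanity check, the break-even point between the two pricing models is simple arithmetic: divide the fixed subscription by the per-session rate. The prices below are assumptions for illustration, not vendor figures.

```python
# Break-even utilisation: the total session count at which a fixed annual
# subscription undercuts per-hour tuition. Prices are illustrative only.
subscription = 9000   # assumed fixed annual platform cost (GBP)
hourly_rate = 40      # assumed cost of one human tutoring session (GBP)

break_even_sessions = subscription / hourly_rate  # 9000 / 40 = 225
print(f"Fixed cost wins above {break_even_sessions:.0f} sessions per year")
```

Past that point, every additional AI session adds learning time at no marginal cost, which is exactly where high utilisation pays off.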
Schools should also account for opportunity cost. If a human tutor programme takes several hours a week of leader coordination, that time is not free. It is management capacity that could have gone into attendance work, curriculum development, or assessment analysis. The best procurement decisions often win not because they are cheaper per hour, but because they release school leadership time.
Beware false economies
Some schools choose the lowest-cost option, then discover poor attendance, weak alignment, or limited evidence. That is a false economy. A more expensive programme that produces measurable progress and requires little staff chasing may be the better financial decision. When budgeting, use three bands: minimum spend, realistic spend, and best-value spend. The “best-value” band is usually where decision-making becomes most honest.
6. Progress Evidence: Can You Prove It Worked?
Dashboards are useful only if the data is interpretable
One of the strongest arguments for AI tutoring is the potential for progress dashboards that show session frequency, accuracy, response patterns, and knowledge gaps. But dashboards are not impact unless they can be translated into school action. A good dashboard should help teachers identify who is stuck, what they are stuck on, and whether the intervention is accelerating progress relative to classroom learning. Without that interpretability, the platform may generate impressive charts that never influence instruction.
Human tutor programmes can also provide progress evidence, but it often relies on tutor notes and session summaries. Those can be rich in qualitative detail, especially around confidence and engagement. The trade-off is that summaries vary by tutor and may be harder to compare across groups. Schools should therefore ask whether the provider offers consistent metrics, not just narrative feedback.
Evidence should include leading and lagging indicators
Do not rely solely on end-of-programme test scores. Strong evaluation uses both leading indicators, such as attendance, task completion, and response accuracy, and lagging indicators, such as assessment improvement and class performance. This matters because end-point scores may take time to move, especially in maths where conceptual mastery develops gradually. If you only measure the final test, you may miss whether the intervention was working partway through.
For a similar mindset on separating signal from noise, see how AI analytics improve reporting without overcomplication and how data can separate real skill from hype. The lesson is the same in schools: track the indicators that predict change, not just the final scoreboard.
Ask for cohort-level and pupil-level reporting
School leaders need both views. Senior leaders want a whole-cohort picture to justify spend and evaluate strategic impact. Teachers need pupil-level detail to target follow-up and adapt classroom teaching. A platform that only delivers one of these views is incomplete. The best systems make it easy to report to governors, include in SEND or Pupil Premium reviews, and inform department planning.
Pro tip: Require vendors to show a sample impact report during procurement. If the report cannot be explained in plain English by a teacher in under two minutes, it is probably too complex to drive real school action.
7. Teacher Oversight: Who Is in Control of the Learning?
Teachers should remain the instructional decision-makers
Whether you choose AI tutoring or human tutoring, teacher oversight is essential. The platform should support the teacher’s professional judgement, not replace it. Teachers need to know which pupils have been assigned, what content has been covered, and what follow-up is required in class. If the intervention becomes a black box, teachers disengage and impact weakens. The most successful schools use tutoring as part of a wider instructional cycle, not as a standalone service.
AI-first solutions can strengthen oversight if they surface clear patterns and reduce repetitive planning. Human tutor programmes can also support oversight when there is structured communication with the school. The key question is which model gives teachers the right level of visibility with the least friction. That is especially important for secondary schools, where subject teachers and intervention leads may need different views of the same student data.
Oversight should include intervention review points
Schools should build scheduled review points into the programme: after the first two weeks, mid-cycle, and at the end of the cycle. These reviews should test whether the intervention is being delivered as designed, whether pupils are attending and engaging, and whether progress data matches teacher observation. If not, the school should be able to pause, adapt, or exit the programme quickly. Oversight is valuable only when it creates action.
Human support may be better for complex motivational barriers
When pupils are anxious, disengaged, or dealing with inconsistent attendance, a human tutor can sometimes provide the emotional and relational encouragement that an AI system cannot. That does not mean AI is ineffective; it means the form of support should match the barrier. The school leader’s job is to decide whether the main problem is practice volume, misconception correction, or relationship-building. The right answer may differ across pupil groups.
8. A Practical Tutor Evaluation Rubric for UK Schools
Use weighted criteria, not gut instinct
Below is a simple rubric school leaders can use during procurement and pilot evaluation. Score each provider out of 5 on each criterion, then apply weights according to your priorities. A school with a large intervention need may weight scalability heavily, while a smaller school with vulnerable pupils may weight safeguarding and oversight more heavily. The aim is to move from opinion to evidence-based selection.
| Criterion | What good looks like | AI-first solution like Skye | Human tutor programme |
|---|---|---|---|
| Curriculum alignment | Maps to scheme of work and topic sequence | Strong when tightly engineered and standardised | Strong when tutors are trained to follow school plans |
| Safeguarding | Clear controls, escalation, and data protection | Strong content controls; less human observation | Strong human oversight; depends on tutor compliance |
| Scalability | Can serve many pupils consistently | Excellent at high volume and fixed cost | Limited by tutor supply and scheduling |
| Cost of intervention | Total cost is transparent and sustainable | Often strong value at scale | Can be high but justified for complex cases |
| Progress evidence | Clear, actionable dashboards and reports | Often strong quantitative reporting | Often richer qualitative notes |
| Teacher oversight | Teachers can monitor, adapt, and act | Strong if dashboards are usable | Strong if communication is structured |
Suggested weighting model
If you want a simple weighting framework, use this starting point: curriculum alignment 25%, safeguarding 20%, progress evidence 20%, scalability 15%, teacher oversight 10%, and cost of intervention 10%. For a vulnerable cohort, increase safeguarding and teacher oversight. For a large upper-primary or KS3 intervention cohort, increase scalability and cost efficiency. For GCSE groups with uneven prior knowledge, increase curriculum alignment and progress evidence. The key is to adapt the rubric to your actual school need.
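If it helps to make the weighting mechanical, the short sketch below applies the starting-point weights above to two hypothetical providers. The scores are invented placeholders for illustration, not an assessment of Skye or any real programme.

```python
# Weighted rubric scoring: each criterion is scored out of 5 and the
# weights (which must sum to 1.0) mirror the starting point above.
WEIGHTS = {
    "curriculum_alignment": 0.25,
    "safeguarding": 0.20,
    "progress_evidence": 0.20,
    "scalability": 0.15,
    "teacher_oversight": 0.10,
    "cost_of_intervention": 0.10,
}

def weighted_score(scores: dict[str, int]) -> float:
    """Return a 0-5 weighted score for one provider."""
    assert abs(sum(WEIGHTS.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(WEIGHTS[criterion] * score for criterion, score in scores.items())

# Hypothetical pilot scores for two shortlisted providers.
ai_provider = {"curriculum_alignment": 5, "safeguarding": 4, "progress_evidence": 5,
               "scalability": 5, "teacher_oversight": 4, "cost_of_intervention": 5}
human_programme = {"curriculum_alignment": 4, "safeguarding": 5, "progress_evidence": 3,
                   "scalability": 2, "teacher_oversight": 4, "cost_of_intervention": 3}

print(f"AI-first provider: {weighted_score(ai_provider):.2f} / 5")
print(f"Human programme:   {weighted_score(human_programme):.2f} / 5")
```

Re-running the calculation with the cohort-specific weights suggested above shows quickly how the ranking can flip when safeguarding or scalability dominates.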
Decision thresholds
A practical threshold is this: if a provider scores highly on curriculum alignment, safeguarding, and progress evidence but only moderately on relational support, it may be ideal for routine maths intervention. If the group needs confidence-building, complex feedback, or significant adult encouragement, human tutoring may deserve a higher score despite higher cost. Use the rubric to make those trade-offs explicit rather than intuitive. This makes governing body and leadership decisions easier to defend.
9. When AI Tutoring Is the Better Choice
Large cohorts with clear learning gaps
AI tutoring is often the best choice when the school needs to support many pupils with similar gaps. Think of fluency issues in arithmetic, standard methods practice, or common misconceptions in algebra. These are areas where repetition, instant feedback, and standardisation can be highly effective. If the objective is to increase time-on-task without multiplying staff workload, AI-first tools are compelling.
Budgets need predictable, fixed-cost planning
Schools that need budget certainty may prefer AI tutoring because the model is easier to forecast over a full year. With human tuition, costs can escalate as demand rises or as additional sessions are added. A fixed annual price can support more confident planning for Pupil Premium, catch-up, or subject recovery. This does not automatically make AI the right educational choice, but it does reduce procurement volatility.
Teachers need consistent reporting and easy oversight
When the school wants a standardised view of progress across multiple pupils and year groups, AI dashboards can be a major advantage. They can help intervention leaders spot non-attendance, track session count, and see where pupils are falling behind. If the reporting is clear, teacher oversight becomes lighter and faster. That makes AI appealing where staffing is tight.
10. When Human Tutors Still Win
Complex barriers to learning
Human tutors often outperform AI when a pupil’s main barrier is not knowledge alone. Anxiety, poor self-belief, inconsistent attendance, or low trust can all make relational teaching essential. A skilled tutor can adjust pace, build rapport, and detect confusion through tone and body language. That kind of support is difficult for even the best AI systems to replicate.
High-stakes exam preparation with nuanced feedback
For some GCSE and A level learners, especially those aiming for top grades, human tutors can offer nuanced feedback on reasoning, exam technique, and confidence under pressure. They can probe why a pupil made an error and reshape explanation in real time. While AI systems can help with practice and knowledge gaps, there are moments when expert human judgement remains the most valuable intervention.
Small groups where customisation matters more than scale
If the cohort is very small and the support needs are highly varied, a human tutor may be more efficient than designing a highly specific AI workflow. In those cases, the school is buying expertise and flexibility rather than throughput. That may be the right choice if the aim is intensive, bespoke progress rather than mass intervention.
11. How to Run a Fair Procurement Process
Shortlist against the same criteria
To keep procurement objective, every provider should answer the same questions. Ask for curriculum maps, safeguarding policies, sample dashboards, case studies, pricing, and implementation timelines. If possible, involve a head of maths, a DSL, a data lead, and a senior leader in scoring the responses. That prevents the process from being captured by one viewpoint, such as price alone or technology enthusiasm alone.
It can also help to study how other sectors compare vendors through structured due diligence. Resources such as alternative data shortlisting and platform lock-in avoidance are not education-specific, but they reinforce the same discipline: compare how systems perform in real workflows, not just in sales decks. Use a pilot where possible, and require a review meeting at the end.
Pilot design should include baseline and exit criteria
A good pilot has a clear start and finish. Define baseline attainment, expected usage, success metrics, and a decision date. Then judge the pilot on actual outcomes, not anecdotes. Schools should also ask whether the provider can support implementation coaching in the first month, because early usage patterns often determine whether the intervention sticks.
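One way to keep the exit decision honest is to write the success criteria down as numbers before the pilot starts. The sketch below shows the idea; the metrics and thresholds are illustrative assumptions that each school should set against its own baseline.

```python
# Pilot review sketch: compare end-of-pilot figures against pre-agreed
# exit criteria. Thresholds are illustrative; agree them before the pilot.
EXIT_CRITERIA = {
    "attendance_rate": 0.85,  # share of scheduled sessions attended
    "completion_rate": 0.80,  # share of assigned tasks completed
    "score_gain": 5.0,        # minimum percentage-point gain vs baseline test
}

def review_pilot(measured: dict[str, float]) -> bool:
    """Print a pass/fail line per criterion and return the overall decision."""
    passed = True
    for metric, threshold in EXIT_CRITERIA.items():
        value = measured.get(metric, 0.0)
        ok = value >= threshold
        print(f"{metric}: {value} (need >= {threshold}) -> {'PASS' if ok else 'FAIL'}")
        passed = passed and ok
    return passed

# Hypothetical end-of-pilot figures.
keep_going = review_pilot({"attendance_rate": 0.91,
                           "completion_rate": 0.76,
                           "score_gain": 6.2})
print("Decision:", "continue" if keep_going else "pause and adapt")
```

Judged this way, an anecdote-rich but under-attended pilot fails on its own pre-agreed terms, which is far easier to explain to governors than a change of heart.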
Governance should be visible
Governors and trustees should see the rationale for the procurement, the evaluation method, and the expected impact. This is particularly important for AI tutoring, where questions about quality, privacy, and safety are likely to arise. A clean evaluation framework helps leaders explain why they chose a given model and how they will monitor it. Transparency builds trust.
12. Final Decision Guide: A Simple Rule of Thumb
Choose AI tutoring when the need is standardised, scalable, and measurable
If your core need is curriculum-linked practice, predictable delivery, clear dashboards, and broad access for many pupils, an AI-first solution like Skye from Third Space Learning may be the strongest fit. It is especially persuasive when the school needs fixed-cost planning and wants to reduce dependency on tutor availability. In those cases, AI tutoring is not a compromise; it is often the most strategically sensible option.
Choose human tutors when the need is relational, complex, or highly bespoke
If the main barriers are confidence, motivation, advanced exam feedback, or safeguarding-sensitive support for a small group, human tutoring still has a major role. Schools should not treat human tutors as outdated. Instead, they should treat them as specialist tools for the right learning problem. In a mature intervention strategy, both models may coexist.
Use the rubric to make the trade-off explicit
The best decision is the one you can justify to pupils, staff, governors, and parents. That means showing why the chosen model fits the curriculum, how it protects learners, what it costs in total, how progress will be measured, and where teacher oversight sits. If you can answer those questions clearly, you are ready to procure with confidence.
For additional context on careful value assessment and operational planning, you may also find it useful to review long-term TCO thinking, AI signal monitoring, and curation strategies in crowded markets. In education procurement, as in other sectors, the best choices are rarely the flashiest; they are the ones that prove durable under real-world conditions.
Related Reading
- 7 Best Online Tutoring Websites For UK Schools: 2026 - Compare leading school tutoring options and see how they differ on cost, safeguarding, and delivery.
- Navigating Privacy: How to Address Student Data Collection in Assessments - A practical guide to handling student data responsibly in digital learning systems.
- Building an Internal AI News Pulse - Learn how leaders track model, regulation, and vendor changes before making decisions.
- Curation as a Competitive Edge - Why careful selection matters more than ever in crowded AI markets.
- ROI & Scenario Planner for Immersive Tech Pilots - A useful framework for comparing cost, adoption, and impact before committing to a new tool.
FAQ: AI Tutor or Human Tutor?
1. Is AI tutoring safe for UK schools?
It can be, provided the vendor has strong safeguarding controls, appropriate data protection practices, and clear school-facing oversight. Schools should review privacy policies, content filters, login controls, and escalation routes before approving any AI tutoring tool.
2. Does an AI tutor replace a human tutor?
Not always. AI tutors are often best for scalable, curriculum-aligned practice, while human tutors are better for relationship-building, complex misconceptions, and highly bespoke support. Many schools may use both for different intervention needs.
3. What should schools look for in a tutor evaluation rubric?
At minimum, schools should evaluate curriculum alignment, safeguarding, scalability, cost of intervention, progress evidence, and teacher oversight. A strong rubric also includes implementation support and the quality of reporting dashboards.
4. How do progress dashboards help school leaders?
Progress dashboards help leaders see attendance, engagement, accuracy, and topic-level trends. The best dashboards support decisions, such as which pupils need follow-up, whether the intervention is working, and how to report impact to governors.
5. Why might AI tutoring be cheaper than human tutoring?
AI tutoring usually runs on a fixed or predictable subscription model, so schools can serve more pupils without hiring more tutors. That said, schools should compare total cost, including staff time and administrative overhead, not just the headline price.
6. When should a school prefer a human tutor?
Human tutors are often the better choice when pupils need emotional support, nuanced feedback, or highly individualised academic coaching. They can also be useful for small cohorts where the school wants depth rather than scale.