Applied AI Governance in Government: The 2026 Leader's Playbook for the Gulf & Jordan
Applied Artificial Intelligence and Its
Governance in Government Work: A Practical Guide for Government Institution
Leaders in the Gulf and Jordan
Government sectors across the Gulf Cooperation Council (GCC)
states and the Hashemite Kingdom of Jordan are undergoing a fundamental
transformation in how public services are delivered and decisions are made,
driven by accelerated investment in applied artificial intelligence technologies
and by the establishment of the governance frameworks required to regulate
them. PwC estimates indicate that AI will contribute approximately USD 320
billion to Middle Eastern economies by 2030, equivalent to 11% of GDP, with
Saudi Arabia expected to capture USD 135.2 billion and the United Arab Emirates
USD 96 billion (nearly 14% of its GDP), while the four other Gulf states
(Bahrain, Kuwait, Oman, and Qatar) account for approximately USD 45.9 billion,
or 8.2% of their GDP. In terms of readiness, Saudi Arabia leads the Middle East region and
ranks seventh globally on the Government AI Readiness Index 2025, while the UAE
has recorded AI adoption of 97% among government entities — the highest
worldwide. Bahrain ranks fifteenth globally and second regionally on the World
Bank’s 2025 Government Technology Maturity Index with a score of 93.6%. Jordan has launched the National Artificial
Intelligence Strategy 2023–2027, encompassing 68 projects across 12 priority
sectors. These indicators shift the conversation from “Should we adopt AI?” to
“How do we govern it and measure its impact?” — the question this practical
guide answers through steps and tools ready for direct application.
This guide presents an integrated model that connects practical
applications, international governance frameworks, and risk
management within the government environment, with reference tables and
measurement tools that leaders can use directly inside their institutions.
The Conceptual Framework: From General AI to
Applied Government AI
Applied AI is the deployment of specific models to solve
tangible operational problems. The essential difference from research AI is
that applied AI is measured by its impact on institutional performance
indicators, not by its technical sophistication. A government leader who
conflates the two categories often misjudges both cost and risk.
A Practical Classification of AI Types by
Government Task
Before selecting any solution, the official needs to match
the type of technology to the nature of the task:
• Intelligent Automation: For high-volume repetitive operations such as request
classification, data extraction from forms, and document verification. It is
characterized by low cost, limited risk, and rapid return.
• Predictive Machine Learning (Predictive ML): For predicting a behavior or
event such as demand for services, the likelihood of fraud, or infrastructure
outages. It requires at least 24 months of clean historical data.
• Generative AI: For linguistic and analytical tasks such as drafting
correspondence, summarizing reports, and answering inquiries. Its risks are
high, chiefly around output accuracy and hallucination, and it requires careful
human review.
• Agentic AI: For executing composite, multi-step tasks autonomously. The United
Arab Emirates has announced plans to move 50% of its government services to
agentic AI within two years. It offers the highest potential returns and the
highest operational risks.
The Application Feasibility Equation: When to
Choose, and When to Reject
An application is worth pursuing when three conditions are
met together:
• A documented problem with a transaction volume exceeding 1,000 per month or a
completion time exceeding five business days.
• Historical data that can be governed, covering no less than 18 months.
• A Business Owner accountable for the outcome indicator, not for the technology.
The absence of any one condition exposes the project to
failure, because the most well-known government failures in adopting AI are
attributable to weak data or absence of accountability — not to the technology
itself.
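The three conditions can be expressed as a simple screening check. The following is a minimal sketch, not a prescribed tool: the `UseCaseCandidate` record, its field names, and the two sample candidates are hypothetical and exist only to illustrate the thresholds stated above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UseCaseCandidate:
    """Hypothetical record describing a proposed AI use case."""
    name: str
    monthly_transactions: int
    completion_days: float            # average completion time, in business days
    governed_history_months: int      # months of governable historical data
    business_owner: Optional[str]     # person accountable for the outcome indicator


def feasibility_gaps(c: UseCaseCandidate) -> List[str]:
    """Return the unmet conditions; an empty list means all three are met."""
    gaps = []
    # Condition 1: more than 1,000 transactions per month OR more than five business days.
    if not (c.monthly_transactions > 1000 or c.completion_days > 5):
        gaps.append("no documented high-volume or slow-cycle problem")
    # Condition 2: no less than 18 months of governable historical data.
    if c.governed_history_months < 18:
        gaps.append("less than 18 months of governable historical data")
    # Condition 3: a named business owner accountable for the outcome.
    if not c.business_owner:
        gaps.append("no accountable business owner")
    return gaps


# Illustrative candidates (invented values).
licensing = UseCaseCandidate("licence renewals", 4200, 3, 30, "Director of Licensing")
chatbot = UseCaseCandidate("visitor chatbot", 600, 1, 6, None)

print(feasibility_gaps(licensing))
print(feasibility_gaps(chatbot))
```

Returning the list of gaps, rather than a bare yes/no, gives the governance committee a record of why a candidate was rejected or deferred.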
Applied Use Cases in the Gulf and Jordanian
Government Sectors
The target countries are moving from initial pilots to
institutional scale-up. A study by the Saudi Digital Government Authority
indicates that the Kingdom is expected to reap USD 56 billion annually from
deploying generative AI in the public sector. In Qatar, the Qai entity has been
established as a unified national platform for AI development that links policy
with implementation. In Bahrain, the National Policy for the Use of Artificial
Intelligence was launched in July 2025 through the Information and eGovernment
Authority. In Jordan, the Ministry of Digital Economy has conducted readiness
assessments on 18 government entities as part of a five-year action plan. What
follows is an applied dissection of the most mature use cases, which can be
adapted within any government entity.
Use Case One: Automating Licensing and
Approval Cycles
The practical model proceeds across four layers: receiving
and automatically classifying the request, verifying document completeness
through image and text processing, classifying risk into three categories (low
/ medium / high), and then routing — low-risk requests to a fully automated
track, medium-risk to an employee with an analytical recommendation, and
high-risk to a human review committee. Expected success indicators after six
months include reducing processing time by 40–60%, reducing human errors by
30–50%, and improving beneficiary satisfaction by at least two points on
service surveys.
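The routing layer described above can be sketched in a few lines. This is an illustrative outline only: the track names are hypothetical labels, and returning incomplete files to the applicant before routing is an assumption, not something the model above specifies.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Routing rule from the text; the track names are illustrative labels.
ROUTES = {
    Risk.LOW: "automated_track",
    Risk.MEDIUM: "employee_with_recommendation",
    Risk.HIGH: "human_review_committee",
}


def route_request(documents_complete: bool, risk: Risk) -> str:
    """Pick the processing track for a licensing request."""
    # Assumption: incomplete files go back to the applicant before any risk routing.
    if not documents_complete:
        return "returned_to_applicant"
    return ROUTES[risk]


print(route_request(True, Risk.MEDIUM))
```

Keeping the routing table as data rather than nested conditions makes it easy for the committee to review and change the rule without touching the classification model.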
Use Case Two: Legislative and Policy Analysis
In 2025, the United Arab Emirates launched the world’s first
regulatory AI ecosystem for analyzing laws and their impacts. The practical
application is based on three key functions: comparing legislative texts to
detect conflicts, analyzing the impact of policy before its enactment through
data-driven scenario simulation, and reviewing the consistency of executive
regulations with parent statutes. The most important benefit is not speed
alone, but rather the discovery of hidden conflicts that a human team might
miss in lengthy legislation.
Use Case Three: The Internal Smart Assistant
for Employees
This is one of the fastest routes to a return: generative AI applications
that serve government employees themselves by answering human resources
inquiries, drafting correspondence, and
summarizing meetings. The United Arab Emirates announced the launch of a smart
HR assistant serving more than 50,000 government employees and simplifying 108
services. The decisive condition for success is linking the model to an
up-to-date institutional knowledge base (Retrieval-Augmented Generation, or
RAG), rather than relying on the model’s general knowledge, in order to reduce
hallucination and ensure output accuracy.
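The retrieval step behind RAG can be illustrated with a toy example. This sketch makes several assumptions: the knowledge-base entries are invented, a simple word-overlap ranking stands in for the embedding search a real deployment would likely use, and the call to the language model itself is left out.

```python
import re

# Hypothetical knowledge-base entries; a real deployment would load approved policy documents.
KNOWLEDGE_BASE = {
    "leave-policy": "Annual leave is 30 working days and requests are submitted through the HR portal.",
    "travel-policy": "Official travel requires approval from the department director before booking.",
    "it-policy": "Passwords must be changed every 90 days using the self-service portal.",
}


def tokens(text):
    """Lower-case word set used for the toy overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, k=1):
    """Rank entries by the number of words they share with the question."""
    q = tokens(question)
    ranked = sorted(
        KNOWLEDGE_BASE.items(),
        key=lambda item: len(q & tokens(item[1])),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question):
    """Assemble a grounded prompt from the retrieved entries."""
    context = "\n".join(f"[{doc_id}] {text}" for doc_id, text in retrieve(question))
    return (
        "Answer using only the sources below and cite the source id. "
        "If the sources do not contain the answer, say so.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )


print(build_prompt("How many days of annual leave do I get?"))
```

The instruction to answer only from the supplied sources, and to say so when they are silent, is what ties the output to the institutional knowledge base rather than to the model's general knowledge.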
Use Case Four: Demand Forecasting and Risk
Detection
Predictive machine learning applications are used to forecast
demand on public utilities, detect fraud in subsidy applications, predict the
spread of epidemics, and manage emergencies. Their most prominent gain is the
shift from responding after an event occurs to preventing it beforehand.
However, their success is conditional on the quality of historical data and its
continuous updating; otherwise, the model gradually deteriorates (Model Drift)
and produces misleading recommendations.
Pillars of AI Governance: Applying the NIST
Framework in the Government Environment
The most widely used international framework is the
NIST AI Risk Management Framework (AI RMF
1.0), which is built on four integrated functions operating cyclically rather
than sequentially: Govern at the center, surrounded by Map, Measure,
and Manage. The international standard ISO/IEC 42001:2023 adds an
auditable and certifiable management system that can be used alongside the NIST framework.
The Govern Function: Building the
Organizational Structure and Policies
The practical implementation elements within the government
entity include:
• An AI policy approved at the highest level, defining the scope of use,
prohibited uses, and responsibilities. The Saudi Data and Artificial
Intelligence Authority (SDAIA) has issued a national guide for AI adoption and
its ethical principles for government entities, which can serve as a foundation
to build upon.
• A high-level governance committee comprising the business owner, the IT
director, the compliance and risk director, a legal counsel, and an external
advisor to ensure objectivity.
• A RACI roles matrix for each model: who is Responsible (R), who is Accountable
(A), who is Consulted (C), and who is Informed (I).
• Inclusion of AI clauses in procurement contracts: the right to audit, the
right to explanation, reliability commitments, and data ownership.
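A RACI matrix for a model can be kept as plain data and checked automatically. The sketch below is illustrative: the role assignments are hypothetical, and the rule of exactly one Accountable role and at least one Responsible role reflects common RACI practice rather than a requirement stated in this guide.

```python
def raci_problems(matrix):
    """Check a mapping of role name -> 'R', 'A', 'C' or 'I' and list any problems."""
    problems = []
    codes = list(matrix.values())
    invalid = sorted(set(codes) - {"R", "A", "C", "I"})
    if invalid:
        problems.append(f"unknown codes: {invalid}")
    # Assumption (common RACI practice): exactly one Accountable role per model.
    if codes.count("A") != 1:
        problems.append("exactly one role should be Accountable")
    # Assumption (common RACI practice): at least one Responsible role.
    if codes.count("R") < 1:
        problems.append("at least one role should be Responsible")
    return problems


# Hypothetical assignment for a single model.
matrix = {
    "business owner": "A",
    "technical model owner": "R",
    "legal counsel": "C",
    "compliance and risk director": "C",
    "IT director": "I",
}
print(raci_problems(matrix))
```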
The Map Function: Context Profiling and Prior
Impact Assessment
Before launching any application, a written AI Impact
Assessment is conducted that answers seven fundamental questions:
• What specific decisions will the model influence, and who is the direct
beneficiary?
• Who is the potential party harmed by errors, and which groups are most
vulnerable?
• What data is being used, what is its source, what is its sensitivity level,
and who owns it?
• What alternatives have been studied (manual, semi-automated), and why was AI
chosen?
• What is the remediation plan in the event of error, and how can an aggrieved
party file a grievance?
• What is the model's review cycle after launch?
• What are the disclosure requirements for the citizen: does the citizen know
that they are dealing with an automated system?
The Measure Function: Measurement Indicators
Before and After Operation
Measuring models goes beyond “accuracy” to a multidimensional
set of indicators: overall accuracy and accuracy for each demographic group (to
detect bias), the rate of false positives and false negatives,
interpretability, robustness against anomalous inputs, and temporal stability.
These indicators are measured before launch and monitored monthly thereafter. A
deterioration of more than 10% in any of them relative to its pre-launch value warrants immediate review.
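Several of these indicators can be computed from a labelled evaluation set. The following is a minimal sketch under stated assumptions: labels are binary (1 = positive), the sample data and group names are invented, and the 10% threshold is read as a relative drop against the pre-launch baseline for indicators where higher is better.

```python
def rates(y_true, y_pred):
    """Accuracy, false-positive rate and false-negative rate for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
        "fnr": fn / (fn + tp) if fn + tp else 0.0,
    }


def per_group_accuracy(y_true, y_pred, groups):
    """Accuracy per demographic group, to surface uneven performance."""
    totals, correct = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] = totals.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (1 if t == p else 0)
    return {g: correct[g] / totals[g] for g in totals}


def needs_review(baseline, current, threshold=0.10):
    """Assumption: the threshold is a relative drop for a higher-is-better indicator."""
    return (baseline - current) / baseline > threshold


# Invented evaluation data for two hypothetical groups.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 1, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

print(rates(y_true, y_pred))
print(per_group_accuracy(y_true, y_pred, groups))
print(needs_review(baseline=0.90, current=0.80))
```

In this invented sample the overall accuracy hides a gap between the two groups, which is the reason the per-group breakdown is reported alongside the headline figure. For indicators where lower is better, such as the false-positive rate, the comparison in `needs_review` would need to be inverted.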
The Manage Function: The Model Registry and
Incident Response
The institutional Model Registry (AI Inventory) is the
cornerstone of operational governance; such inventories are now mandatory for every
U.S. federal agency and in at least 13 states and 2 cities. The minimum fields the
registry must contain for each model are as follows:
• Model identifier: a unique serial number, the model name, and the version
release.
• Operational purpose: the decision it supports, the beneficiary, the use case.
• Risk classification: prohibited / high / limited / low, with the
classification date.
• Data used: its source, its sensitivity, the legal basis for processing.
• Owners: the business owner, the data owner, the technical model owner.
• Performance indicators: baseline values, targets, the latest measurement, and
the date.
• Review cycle: the date of the last review and the date of the next review.
• Incidents: a log of incidents, responses, and adjustments.
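The minimum fields listed above map naturally onto a structured record. This is one possible sketch, assuming the registry is kept as structured data; the class name, field names, and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class ModelRecord:
    """One registry entry covering the minimum fields listed above."""
    model_id: str                  # unique serial number
    name: str
    version: str
    purpose: str                   # decision supported, beneficiary, use case
    risk_class: str                # "prohibited", "high", "limited" or "low"
    risk_classified_on: date
    data_source: str
    data_sensitivity: str
    legal_basis: str
    business_owner: str
    data_owner: str
    technical_owner: str
    kpis: dict = field(default_factory=dict)       # indicator -> baseline/target/latest/date
    last_review: Optional[date] = None
    next_review: Optional[date] = None
    incidents: list = field(default_factory=list)  # log of incidents, responses, adjustments

    def review_overdue(self, today: date) -> bool:
        """A record with no scheduled review is treated as overdue."""
        return self.next_review is None or self.next_review < today


# Invented example entry.
record = ModelRecord(
    model_id="AI-0001", name="licence-triage", version="1.2",
    purpose="risk triage of licence applications",
    risk_class="high", risk_classified_on=date(2025, 9, 1),
    data_source="licensing system", data_sensitivity="personal data",
    legal_basis="statutory mandate",
    business_owner="Director of Licensing", data_owner="Data Office",
    technical_owner="AI Unit",
    next_review=date(2026, 3, 1),
)
print(record.review_overdue(date(2026, 4, 1)))
```

Treating a missing review date as overdue is a deliberate design choice in this sketch: it makes incomplete entries visible instead of letting them pass silently.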
Complementing this, the incident response plan covers four
categories that must be prepared for: model drift, data leakage, adversarial
attacks on inputs, and systematic errors. For each category, the following must
be documented: what happens, who is informed and within how many hours, who decides
to halt the model, and how operation is restored.
Risk Classification and Impact Assessment:
The Practical Approach
Regional cybersecurity reports indicate a 58% increase in
ransomware attack activity driven by AI-enhanced threats including social engineering
and deepfakes. This makes risk classification a daily operational matter, not
an annual document.
The Four-Tier Risk Classification Matrix
The internationally adopted approach classifies AI
applications into four categories, each with different governance requirements:
• Prohibited Category: Use is not permitted under any circumstances. Examples:
social behavior scoring and emotion recognition in workplace and education
environments.
• High-Risk Category: Requires a prior impact assessment, continuous human
oversight, an independent annual audit, and transparency to the citizen.
Examples: assessing entitlement to benefits, public security analysis, and
assistive judicial decisions.
• Limited-Risk Category: Requires disclosure to the user that it is an automated
system and the ability to escalate to a human. Examples: official chatbots and
service recommendation systems.
• Low-Risk Category: Requires a general policy for responsible use and employee
training. Examples: internal document classification and productivity tools for
employees.
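The matrix above can be encoded as a lookup table so that missing controls are flagged before launch. This is a minimal sketch; the tier keys and control labels are shorthand for the requirements listed above, and the function name is hypothetical.

```python
# Governance requirements per tier, abbreviated from the matrix above.
REQUIRED_CONTROLS = {
    "prohibited": None,  # use is not permitted under any circumstances
    "high": [
        "prior impact assessment",
        "continuous human oversight",
        "independent annual audit",
        "transparency to the citizen",
    ],
    "limited": ["disclosure of automation", "escalation to a human"],
    "low": ["responsible-use policy", "employee training"],
}


def missing_controls(tier, in_place):
    """Return the controls still missing for an application in the given tier."""
    controls = REQUIRED_CONTROLS[tier]
    if controls is None:
        raise ValueError("prohibited use: no set of controls makes deployment acceptable")
    return [c for c in controls if c not in in_place]


print(missing_controls("high", {"prior impact assessment", "transparency to the citizen"}))
```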
How the Impact Assessment Is Conducted in
Practice in Five Steps
• Step one: Profiling. The business owner writes a two-page document describing
the problem, the proposed alternative, and the rejected alternatives, with the
reasons for rejecting them.
• Step two: Identifying the Affected Parties. A list of the groups likely to be
affected (citizens, employees, partners, vulnerable groups), and how each is
affected.
• Step three: Data Analysis. The source of the data, the legal basis, the extent
of fair representation of demographic groups, and protection mechanisms.
• Step four: Scenario Analysis. Three error scenarios (optimistic, pessimistic,
worst-case), with a remediation plan for each.
• Step five: Committee Decision. Approval, conditional approval with controls,
postponement pending fulfillment of requirements, or rejection, with the
decision and its rationale retained as an audit reference.
An Implementation Roadmap with Practical
Measurement Indicators
Moving from intent to implementation requires sequential
steps, not leaps, with each phase linked to a clear measurement indicator. The
Jordanian experience offers a practical model: the Ministry of Digital Economy
conducted readiness assessments on 18 government entities and recorded a 25%
increase in staff awareness following capacity-building workshops — a
confirmation of the importance of the foundational phase.
Phase One: Diagnosis (90 Days) — Success
Indicator: An Approved Readiness Report
Diagnosis of the situation across four axes: data maturity
(quality, recency, availability), technical infrastructure maturity, skills
maturity, and the existing regulatory framework. The outcome: a written report
that identifies the starting point, ranks 5–7 candidate use cases by
feasibility and impact, and selects one as a pilot project.
Phase Two: Foundation and Piloting (6 Months)
— Success Indicator: An Operational Model
Three work streams run in parallel in this phase:
(1) building policies and forming the governance committee, (2) executing the
pilot project within a defined budget and timeframe, and (3) qualifying staff.
The Sultanate of Oman has followed a similar track with Oman GPT, a language model
built to reflect local content before scaling up. Success indicator: a model operating in
production with documented performance indicators.
Phase Three: Institutional Scale-Up (12–18
Months) — Success Indicator: Institutional Integration
Expanding the application to other units, integrating outputs
into core systems, and building sustainable in-house capabilities. In this
phase, governance shifts from paper to a daily practice within the life cycle
of each model, and the model registry becomes a living document — not a dormant
archive.
Success Indicators and Requirements for
Sustainability
Measuring success must go beyond technical indicators to
indicators of institutional impact. The “Tahawul” program for government
digital transformation in the Sultanate of Oman has achieved a performance rate
of 94% and the digitization of 90% of essential services, while the Kingdom of
Saudi Arabia has secured second place globally on the World Bank’s 2025
Government Technology Maturity Index with a score of 99.64%. These are outcome
indicators, not activity indicators.
The Four Core Operational Indicators
• Transaction Cycle Time: Before and after implementation, with a target
reduction of no less than 30% during the first year.
• Transaction Cost: Inclusive of infrastructure cost, licensing, and
maintenance, divided by the number of transactions.
• Error Rate: Human errors avoided, automated errors detected, and the grievance
rate.
• Beneficiary Satisfaction: Pre- and post-measurement via short surveys, with
tracking of the change on a monthly basis.
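The first two indicators reduce to simple arithmetic. The sketch below shows the calculation; all figures are invented for illustration and the function names are hypothetical.

```python
def cycle_time_reduction(before_days, after_days):
    """Relative reduction in transaction cycle time (0.40 means 40%)."""
    return (before_days - after_days) / before_days


def cost_per_transaction(infrastructure, licensing, maintenance, transactions):
    """Total running cost divided by the number of transactions."""
    return (infrastructure + licensing + maintenance) / transactions


# Invented figures: cycle time falls from 10 to 6 business days.
reduction = cycle_time_reduction(before_days=10, after_days=6)
meets_target = reduction >= 0.30   # first-year target of no less than 30%

# Invented figures: annual costs spread over 50,000 transactions.
unit_cost = cost_per_transaction(120_000, 60_000, 20_000, 50_000)

print(f"reduction={reduction:.0%}, target met={meets_target}, unit cost={unit_cost:.2f}")
```

Both figures are only meaningful against a baseline measured before implementation, which is why the pre-launch measurement needs to be recorded at the outset.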
Governance Indicators: What Separates Use
from Actual Governance
• The number of models registered in the institutional model registry versus
the models actually operating (the gap is a warning indicator).
• The percentage of models whose impact assessment was completed before launch.
• The number of AI incidents and the average response time.
• The percentage of staff trained on policies and evaluation frameworks.
• The number of independent audits and the recommendations applied as a result.
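The first two governance indicators can be derived by comparing three lists of model identifiers. This is a rough sketch, assuming such lists are available from the registry and from operations; the identifiers and the function name are hypothetical.

```python
def governance_indicators(registered, operating, assessed_before_launch):
    """Compare the registry with what is actually running."""
    unregistered = operating - registered
    assessed = operating & assessed_before_launch
    return {
        "unregistered_models": sorted(unregistered),   # running but not in the registry
        "registry_gap": len(unregistered),
        "impact_assessment_rate": len(assessed) / len(operating) if operating else 1.0,
    }


# Invented identifiers.
result = governance_indicators(
    registered={"M1", "M2", "M3"},
    operating={"M1", "M2", "M4", "M5"},
    assessed_before_launch={"M1", "M4"},
)
print(result)
```

A non-zero `registry_gap` is the warning sign described in the first indicator: models are in operation that the registry does not know about.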
Investing in Human Capital: The Real Gap
A study published in the journal Humanities and
Social Sciences Communications in 2025, analyzing 47 AI initiatives across the
six Gulf States, indicates that the largest gap is not in infrastructure, but
in the availability of skills and the alignment of organizational incentives
with transformation requirements. Any plan that does not include a clear
pathway for qualifying three functional segments — senior leadership (strategic
awareness), middle leadership (change management), and technical staff (applied
skills) — will remain deficient.
The integrated model that combines application, governance,
and risk management is the only framework that protects the government
institution from three recurring traps: rushed adoption without governance,
excessive governance that freezes innovation, and technical investment without
building human capacity. What the government leader lacks today is not national
references or international standards, but rather the transition from reading
these references to applying them with methodological discipline through: an
approved policy, an active governance committee, a living model registry, a
prior impact assessment for every application, monthly measurement indicators,
and an incident response plan. This pathway transforms AI from a slogan into a
daily government work tool that serves the citizen and controls risk in equal
measure.
Mature government experiences converge on a fundamental
truth: the applied AI ecosystem is not completed by policies and infrastructure
alone, but by qualified human cadres capable of translating these policies into
disciplined daily practice. Such qualification requires a graduated approach
that accommodates the differing needs of three pivotal functional segments:
senior leadership, which needs strategic awareness and the ability to make
decisions in a highly complex environment; middle leadership, which is
responsible for managing change and translating directions into measurable
operational plans; and the staff working in administrative and technical roles,
who represent the direct point of contact with the tools and applications in
the field.
Since the actual return on investment in AI is determined by
the professionalism of the cadres that operate and govern it, choosing a
training partner that moves beyond theoretical delivery to applied building is
a strategic decision, not an operational one. This is where specialized
institutes with accumulated practical experience in the region come in,
including The Only Solution for Training and Consulting,
which, through its training programs and specialized workshops in the field of
applied AI and its governance, offers a methodology that combines the
reference theoretical framework with practical application grounded in regional
case studies and applied measurement tools. This enables government entities to
shorten the learning curve, avoid the cost of trial and error, and build
sustainable institutional capabilities that ensure the continuity of impact
after the training program concludes.
...