The Only GCC Framework for Testing Agentic AI Systems.
APEX (Agentic Penetration Exercise) is pentest.ae's proprietary methodology for systematically testing AI agents, LLM applications, and autonomous systems against real-world attack vectors.
You might be experiencing...
Traditional penetration testing methodology was designed for a world where software was static, APIs were synchronous, and nothing acted autonomously. That world is gone.
APEX — the Agentic Penetration Exercise methodology — was built from first principles for the AI era. It systematically tests the attack vectors that emerge when software can read instructions, call tools, maintain memory, and take autonomous actions.
The Five Phases
PLAN — Every APEX engagement begins with a structured threat model. We identify your AI agent architecture, map trust boundaries between components, define the rules of engagement, and run automated OSINT to understand your AI stack’s external exposure. AI agents gather public intelligence in parallel; human researchers synthesize it into a focused attack plan.
SURFACE — We enumerate your complete AI attack surface: every agent, every tool connection, every API endpoint, every MCP server, every privilege scope. Most enterprises discover agents and tool integrations they didn’t know existed. This phase produces the attack surface map that drives the exploitation phase.
EXPLOIT — Human senior researchers drive creative attack chaining while AI agents run automated prompt injection sweeps (Garak, PyRIT, PromptBench) in parallel. This combination covers attack surface that manual testing alone cannot enumerate — at the speed that automated tools alone cannot contextualize. Critical findings are reported within 48 hours of discovery.
PERSIST — We simulate long-term adversarial presence: persistent agent compromise, privilege escalation through agent tool chains, exfiltration path mapping. This phase answers the question that standard pentests don’t ask: if an adversary got in, how long could they stay, and what would they do?
REPORT — Every finding is documented with business impact, CVSS score, reproduction steps, and remediation guidance. The report maps findings to OWASP LLM Top 10, NESA, and relevant UAE regulatory frameworks — structured for use as compliance evidence.
Human-Led, AI-Augmented
APEX is not an automated scanner. AI agents in APEX automate the high-volume, systematic work — enumeration, fuzzing, injection sweeps. Human senior researchers drive creative attack chaining, business logic exploitation, and findings narrative.
This combination eliminates the false-positive noise that purely automated tools produce, while covering attack surface that purely manual testing cannot enumerate in a reasonable timeframe.
Engagement Phases
PLAN
Scope definition, threat model development, AI architecture review, trust boundary mapping, rules of engagement, and automated OSINT gathering on AI stack exposure.
SURFACE
Asset discovery, tool connection mapping, privilege scope enumeration, MCP server inventory, API endpoint discovery, and agent interaction pattern analysis.
EXPLOIT
Manual prompt injection chaining, tool poisoning simulation, memory manipulation attempts, and unauthorized lateral movement. AI agents run Garak and PyRIT fuzzing sweeps in parallel.
PERSIST
Persistent agent compromise simulation, privilege escalation through agent tool chains, exfiltration path mapping, and long-term persistence mechanism identification.
REPORT
Narrative findings report with business impact, CVSS scores, OWASP LLM Top 10 compliance mapping, remediation roadmap, and regulatory alignment assessment.
Deliverables
Before & After
| Metric | Before | After |
|---|---|---|
| vs Traditional Pentest | OWASP Top 10 — no AI-specific methodology | OWASP Top 10 + OWASP LLM Top 10 + Agent-specific vectors |
| vs Automated Scanners | High false positive rate, no creative chaining | Human-led with AI augmentation — real attack chains |
| Regulatory Evidence | Generic penetration test report | APEX attestation letter + structured compliance mapping |
Tools We Use
Frequently Asked Questions
What makes APEX different from standard penetration testing methodology?
Standard penetration testing methodology (PTES, OWASP Testing Guide) was designed for web applications, APIs, and network infrastructure. APEX adds five AI-specific testing domains: prompt injection systematization, tool chain attack surface mapping, memory and context manipulation, agentic privilege escalation path analysis, and AI-augmented attack automation. APEX is the only documented methodology in the GCC that covers these domains.
How does AI automation work within APEX?
In the SURFACE phase, AI agents run automated asset enumeration in parallel with human reconnaissance — covering attack surface that would take a human team weeks to enumerate manually. In the EXPLOIT phase, AI agents run continuous fuzzing sweeps (Garak, PyRIT) while human researchers focus on creative attack chaining and business logic exploitation. Human researchers review and validate all AI-generated findings before inclusion in the report.
Is APEX recognized by regulators?
APEX maps explicitly to OWASP LLM Top 10, NESA Information Assurance Standards, and DFSA Technology Risk Framework requirements. We provide an APEX attestation letter that your compliance team can present to regulators as evidence of AI-specific security testing. CREST accreditation (targeted Q4 2026) will add a third-party quality assurance layer.
Can we get APEX methodology documentation for our records?
Yes. All APEX engagements include the methodology documentation as a deliverable — structured for inclusion in your information security management system (ISMS) as evidence of systematic AI security testing. The documentation covers the five phases, tools used, and testing coverage against OWASP LLM Top 10.
Find It Before They Do
Book a free 30-minute security discovery call with our AI Security experts in Dubai, UAE. We identify your highest-risk AI attack vectors — actionable findings in days.
Talk to an Expert