Code Quality in the Agentic Coding Age: Testing and Verification Are No Longer Optional
An AI agent just rewrote your entire service layer in 30 seconds. It looks clean. The types seem right. You merge it.
Two days later, production is on fire.
No tests. No static analysis. No formatting checks. No code review. Just vibes and a fast merge button.
This is happening everywhere right now. AI-generated code is flooding codebases faster than teams can review it. And the projects that survive this era won’t be the ones that write the most code. They’ll be the ones that verify it.
Here’s the toolkit that makes that possible: PHPStan, Larastan, Pest, Rector, Laravel Pint, PHP-CS-Fixer, CodeRabbit, and GrumPHP. If you’re shipping PHP in 2026 without these, you’re gambling with your production environment.
Why This Matters More Now Than Ever
Before AI-assisted coding, a developer would write a function, mentally trace the logic, maybe run it a few times, and push it. The pace was human. Mistakes happened, but they happened at a human rate.
Now? An AI agent can generate hundreds of lines of code in seconds. It can touch dozens of files in a single session. It can refactor an entire module before you’ve finished reading the diff. The volume of change has exploded.
And that’s the problem. The more code that gets generated, the more surface area there is for bugs, regressions, and subtle issues that no one catches until production breaks.
AI-generated code isn’t inherently bad. In many cases, it’s quite good. But it’s not infallible. It can introduce type mismatches, miss edge cases, break existing conventions, or quietly change behavior in ways that pass a quick glance but fail under real-world conditions.
The only reliable defense? Automated tooling that catches problems before they reach your users.
The PHP Quality Toolkit
If you’re working in the PHP ecosystem, particularly with Laravel, you have access to an excellent set of tools that, when combined, create a safety net that catches most of what slips through. Here’s what I recommend as a minimum.
PHPStan - Static Analysis That Catches Bugs Before Runtime
PHPStan analyzes your code without running it. It finds type errors, undefined variables, impossible conditions, dead code, and dozens of other issues that would otherwise only surface at runtime, or worse, in production.
What makes PHPStan particularly valuable in the agentic coding age is its ability to enforce type safety at scale. When an AI agent refactors a method signature, the next PHPStan run tells you if any callers are now passing the wrong types. When it generates a new service class, PHPStan checks the declared parameter and return types against how that class is actually used.
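Run it at level 5 or higher. Level 5 is where PHPStan starts checking the types of arguments passed to methods and functions, which is exactly the kind of mistake a large refactor introduces. The stricter you are, the more bugs you catch early.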
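To make that concrete, here is a minimal sketch of a phpstan.neon file placed in the project root. The app path is an assumption about your project layout, so adjust it to wherever your code lives.

```neon
parameters:
    # Level 5 adds checks on the types of arguments passed to methods and functions.
    level: 5
    paths:
        # Assumed source directory; replace with your own.
        - app
```

With that file in place, running vendor/bin/phpstan analyse from the project root should pick it up without extra flags.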
Larastan - PHPStan for Laravel
If you’re using Laravel, Larastan is essential. It extends PHPStan with Laravel-specific knowledge, understanding Eloquent models, facades, route parameters, validation rules, and other framework patterns that vanilla PHPStan can’t fully analyze.
Without Larastan, PHPStan might flag perfectly valid Laravel code as errors, or miss actual issues hidden behind the framework’s magic methods. Larastan bridges that gap.
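Wiring it in is mostly a matter of including its extension file. The following is a minimal sketch, assuming Larastan was installed with composer require --dev larastan/larastan and that your code lives in app. Older releases shipped under the nunomaduro/larastan package, where the include path differs.

```neon
includes:
    # Loads Larastan's Laravel-aware rules and type extensions.
    - vendor/larastan/larastan/extension.neon

parameters:
    level: 5
    paths:
        # Assumed Laravel application directory.
        - app
```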
Pest - Testing That Developers Actually Enjoy Writing
Pest is a testing framework built on top of PHPUnit, with an expressive, minimal syntax that makes writing tests feel less like a chore and more like writing documentation for your code.
Here’s why this matters in the context of AI-generated code: if your testing framework is pleasant to use, you’ll actually write tests. And if you have good test coverage, you can let AI agents make sweeping changes with confidence, because your test suite will catch regressions immediately.
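As a rough illustration of that syntax, here is what a small Pest test file might look like. The applyDiscount function is a hypothetical helper defined inline so the example stands on its own; in a real project it would live in your application code.

```php
<?php

// Hypothetical helper under test, defined here only to keep the example self-contained.
function applyDiscount(int $priceInCents, int $percent): int
{
    return (int) round($priceInCents * (100 - $percent) / 100);
}

it('applies a percentage discount', function () {
    // 25% off 10,000 cents leaves 7,500 cents.
    expect(applyDiscount(10_000, 25))->toBe(7_500);
});

it('leaves the price unchanged for a zero discount', function () {
    expect(applyDiscount(10_000, 0))->toBe(10_000);
});
```

Run it with vendor/bin/pest. If an agent later "simplifies" the rounding or flips the percentage logic, these two cases fail right away.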
Pest’s architecture-testing features are particularly useful. You can enforce rules like “controllers should not directly access the database” or “all models must belong to the Domain namespace.” These architectural guardrails prevent AI agents from violating your project’s design principles, even when they technically produce working code.
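Here is a sketch of what such guardrails can look like. The file path and namespaces are assumptions based on a default Laravel layout. The first rule only approximates "no direct database access" by banning the DB facade in controllers, and the second is an extra rule (not mentioned above) that I find useful alongside it.

```php
<?php

// tests/ArchTest.php (hypothetical location)

// Controllers may not reach for the DB facade directly.
// Note: this does not stop controllers from using Eloquent models.
arch('controllers do not use the DB facade')
    ->expect('App\Http\Controllers')
    ->not->toUse('Illuminate\Support\Facades\DB');

// Debugging helpers should never survive into committed code.
arch('no debugging calls are left behind')
    ->expect(['dd', 'dump'])
    ->not->toBeUsed();
```

These run as part of the normal Pest suite, so a violation fails the build the same way a broken unit test does.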
Rector - Automated Refactoring and PHP Upgrades
Rector is an automated refactoring tool that applies code transformations based on configurable rules. It can upgrade your codebase from PHP 7.4 to 8.3, replace deprecated function calls, enforce coding patterns, and modernize legacy code, all automatically.
In the agentic coding world, Rector serves as a code normalizer. AI agents might generate code using older patterns or inconsistent styles. Rector brings everything in line with your standards, automatically. It’s not a linter. It’s a tool that actually rewrites your code to match your rules.
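A conservative starting configuration might look like the sketch below. It assumes a recent Rector release that provides the RectorConfig::configure() builder, a target of PHP 8.3, and app and tests as the directories to process. Treat all three as assumptions to adjust.

```php
<?php

declare(strict_types=1);

// rector.php in the project root

use Rector\Config\RectorConfig;

return RectorConfig::configure()
    // Directories Rector is allowed to rewrite (assumed layout).
    ->withPaths([
        __DIR__ . '/app',
        __DIR__ . '/tests',
    ])
    // Apply the upgrade rule sets up to and including PHP 8.3.
    ->withPhpSets(php83: true)
    // Start small: only the dead-code set, add more once the diff is manageable.
    ->withPreparedSets(deadCode: true);
```

Running vendor/bin/rector process --dry-run shows the changes Rector would make without writing them, which is a sensible first step before letting it touch files.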
Laravel Pint - Opinionated Code Formatting
Laravel Pint is Laravel’s official code style fixer, built on top of PHP-CS-Fixer. It’s zero-config by default, following Laravel’s own coding conventions.
Why is formatting important when AI writes code? Because consistency is readability, and readability is maintainability. When every file follows the same formatting rules, regardless of whether a human or an AI wrote it, code review becomes faster, diffs become cleaner, and onboarding new developers is easier.
For non-Laravel PHP projects, PHP-CS-Fixer provides the same capabilities with fully customizable rule sets.
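Pint needs no configuration to run, but if you want to adjust it, a pint.json file in the project root is where that happens. The sketch below keeps the Laravel preset and switches on one additional PHP-CS-Fixer rule. The choice of rule is purely illustrative.

```json
{
    "preset": "laravel",
    "rules": {
        "simplified_null_return": true
    }
}
```

Running vendor/bin/pint fixes files in place, while vendor/bin/pint --test only reports violations, which is the mode you want in hooks and CI.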
CodeRabbit - AI-Powered Code Review
CodeRabbit adds an AI-powered code reviewer to your pull requests. It analyzes diffs for bugs, security vulnerabilities, performance issues, and adherence to best practices.
This is where things get interesting. You’re using AI to write code, and then using AI to review that code. It sounds circular, but it works remarkably well. CodeRabbit catches patterns that static analysis tools miss, things like “this function does what was asked, but there’s a simpler way” or “this change introduces a subtle N+1 query.”
It adds another layer to your safety net, one that thinks about code quality from a higher level than line-by-line analysis.
GrumPHP - The Gatekeeper
GrumPHP ties everything together by hooking into your Git workflow. It runs your configured checks (PHPStan, Pint, Pest, Rector, or any other tool) automatically before every commit. If any check fails, the commit is blocked.
This is the first gate code has to pass on its way into your repository. It doesn’t matter if you or an AI agent wrote the code. If it doesn’t pass the checks, it doesn’t get committed.
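A grumphp.yml along these lines is one way to set that up. This is a sketch under assumptions: the phpstan, pest, and rector task names are as I understand them from the GrumPHP documentation, so check them against your installed version, and Pint is wired in through the generic shell task because I am not certain a dedicated Pint task ships with GrumPHP.

```yaml
grumphp:
    tasks:
        # Use each tool's own config file (phpstan.neon, rector.php, and so on).
        phpstan: ~
        pest: ~
        rector: ~
        # Pint run through the generic shell task, in check-only mode.
        shell:
            scripts:
                - ["-c", "vendor/bin/pint --test"]
            triggered_by: [php]
```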
The CI/CD Pipeline: Your Final Safety Net
All of these tools should also run in your CI/CD pipeline (GitHub Actions, GitLab CI, Bitbucket Pipelines, or whatever you use). GrumPHP catches issues locally, but Git hooks are easy to skip, so CI/CD catches everything else: commits made with git commit --no-verify, edits made through the hosting platform’s web UI, or developers who forgot to install GrumPHP.
A solid pipeline for a PHP project looks something like this:
- Laravel Pint / PHP-CS-Fixer checks formatting
- PHPStan / Larastan runs static analysis
- Rector verifies no outdated patterns slipped through
- Pest runs the full test suite
- CodeRabbit reviews the pull request
If any automated check fails, the PR cannot be merged. No exceptions. On GitHub, that means marking the checks as required in your branch protection rules.
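For GitHub Actions, the automated steps above could be sketched roughly as follows. The workflow name, PHP version, and file location (.github/workflows/quality.yml) are assumptions. CodeRabbit is not a step here because it is installed on the repository and comments on pull requests on its own.

```yaml
name: quality

on: [pull_request]

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Assumed PHP version; match it to your production runtime.
      - uses: shivammathur/setup-php@v2
        with:
          php-version: '8.3'

      - run: composer install --no-interaction --prefer-dist

      # Formatting, static analysis, refactoring check, then tests.
      - run: vendor/bin/pint --test
      - run: vendor/bin/phpstan analyse --no-progress
      - run: vendor/bin/rector process --dry-run
      - run: vendor/bin/pest
```

The cheap checks come first so that a formatting failure stops the job before the slower test suite starts.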
The Real Cost of Skipping This
I’ve seen projects where AI-generated code was merged without review, without tests, without static analysis. It worked at first. Then, a few weeks later, edge cases started surfacing. Type errors in production. Broken API contracts. Security vulnerabilities hiding in auto-generated validation logic.
The cost of fixing these issues after deployment is far higher than the cost of setting up proper tooling from day one.
Think of it this way. AI agents are like an incredibly fast junior developer who never gets tired and never complains. But like any junior developer, they need guardrails. Code review, testing, formatting standards, and static analysis are those guardrails.
Getting Started
If you’re starting from zero, here’s the order I’d recommend:
- Install PHPStan/Larastan and fix errors at level 5. This alone will catch a huge number of issues.
- Set up Laravel Pint or PHP-CS-Fixer and run it across your codebase. Consistent formatting makes everything else easier.
- Start writing tests with Pest. Begin with your most critical paths: authentication, payments, data mutations.
- Add Rector with a conservative rule set. Let it modernize your code gradually.
- Configure GrumPHP to run all of the above before every commit.
- Set up CI/CD to run the full suite on every pull request.
- Add CodeRabbit to your repository for automated PR reviews.
You don’t need to do everything at once. But you do need to start.
The Bottom Line
We’re in the most exciting era of software development in decades. AI agents are making us more productive than ever. But productivity without quality is just technical debt accumulating at machine speed.
The tools exist. They’re free or affordable. They’re well-documented. There’s no excuse not to use them.
Your future self, your team, and your users will thank you.