How I Use Claude Code and Codex CLI Together to Write Better Laravel Code

Most comparisons of Claude Code and Codex CLI try to pick a winner. I stopped trying to do that a while ago. Instead, I run both of them together on the same codebase, using each one to challenge the other. The result is code that's better than what either model produces on its own.

Here's how the workflow actually looks, and why it's become the way I develop on Wordsmith, a Laravel/Livewire social caption platform I maintain with real paying users.

The Setup

Both tools install the same way and both support Laravel Boost, so the playing field is level from the start.

composer require laravel/boost --dev
php artisan boost:install

The installer will ask which agents you're using. Select both Claude Code and Codex CLI. Boost auto-configures each one, but if it doesn't, you can add them manually:

# Claude Code
claude mcp add -s local -t stdio laravel-boost php artisan boost:mcp

# Codex CLI
codex mcp add laravel-boost -- php "artisan" "boost:mcp"

Once that's done, both agents have access to the same runtime context: your database schema, your routes, your Artisan commands, and your Laravel docs. They're working from the same information. The difference is what they do with it.

Why Two Models?

Claude Code and Codex CLI are built on different models with different training and different instincts. Claude Code (built on Claude Sonnet/Opus) tends toward careful, well-reasoned output. It'll think through relationships, trace edge cases, and produce code that's thorough. Codex CLI (built on OpenAI's o4-mini and GPT-5 variants) is faster, more direct, and often catches different things.

That divergence is exactly why running them together is useful. Two models trained on different data will have different blind spots. What one misses, the other often catches.

I stumbled onto this workflow by accident. I had Claude Code generate a subscription billing feature in Wordsmith and then asked Codex CLI a question about the same code. Codex spotted a race condition in the webhook handler that Claude Code hadn't flagged. I went back to Claude Code with that feedback. It fixed the issue and also caught a gap in Codex's reasoning, which didn't account for our specific billing model. By the third round the code was noticeably tighter than what either model had produced independently.

That became the workflow.

Round One: Claude Code Builds the Feature

I start almost every feature with Claude Code. Given a plain-English description of what I need, it reads the codebase, understands the architecture, and produces a first implementation.

For a recent Wordsmith feature, adding per-user content scheduling with timezone support, the prompt looked something like this:

Add content scheduling to Wordsmith. Users should be able to schedule a caption
for a specific date and time in their local timezone. Store in UTC, display in
their profile timezone. Follow standard Laravel conventions.
Include the migration, model, FormRequest, controller, and API resource.

Claude Code read the existing models and controllers, matched the conventions, and produced a coherent first pass touching all the relevant files. It's good at this kind of broad, multi-file work.
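To make the "store in UTC, display in the profile timezone" requirement concrete, here is a minimal sketch in plain PHP, so it runs without the framework. The function names are hypothetical and this is not Wordsmith's actual code; in a Laravel app the same conversion would more likely go through Carbon and the model's date casts.

```php
<?php

declare(strict_types=1);

// Hypothetical helpers for illustration only; not taken from the Wordsmith codebase.

/** Convert a wall-clock time in the user's timezone to a UTC instant for storage. */
function toUtcForStorage(string $localDateTime, string $userTimezone): DateTimeImmutable
{
    // The string carries no offset, so it is interpreted in the user's timezone.
    $local = new DateTimeImmutable($localDateTime, new DateTimeZone($userTimezone));

    return $local->setTimezone(new DateTimeZone('UTC'));
}

/** Convert a stored UTC instant back to the user's timezone for display. */
function toUserTimezone(DateTimeImmutable $utc, string $userTimezone): DateTimeImmutable
{
    return $utc->setTimezone(new DateTimeZone($userTimezone));
}

$stored = toUtcForStorage('2024-07-01 09:00:00', 'America/New_York');
echo $stored->format('Y-m-d H:i:s T'), PHP_EOL; // 2024-07-01 13:00:00 UTC

$shown = toUserTimezone($stored, 'America/New_York');
echo $shown->format('Y-m-d H:i:s T'), PHP_EOL;  // 2024-07-01 09:00:00 EDT
```

The point of the round trip is that the stored value is an unambiguous instant, and the timezone only matters at the edges: when parsing user input and when rendering.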

Round Two: Codex CLI Reviews the Output

Once Claude Code has produced something I'm reasonably happy with, I switch terminals and open Codex CLI in the same directory. I describe what Claude Code just built and ask Codex to review it.

The prompt I use is deliberately adversarial:

Review the scheduling feature that was just added. Look for edge cases,
potential bugs, anything that could fail in production. Don't tell me
what's good. Tell me what's wrong or what could be better.

That framing matters. If you ask an AI what it thinks of some code, it'll often tell you it looks great. If you tell it to find problems, it actually looks for them.

Codex CLI, coming at the code fresh without any of the context Claude Code built up during implementation, often sees things differently. In the scheduling example it flagged that the FormRequest wasn't validating that the scheduled time was in the future, and that the API resource was returning the stored UTC time without converting it back to the user's timezone for display.
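The first finding is easy to picture in code. Below is a minimal sketch of a "must be in the future" check in plain PHP; the function name and the injectable `$now` parameter are assumptions made for illustration, not what either agent produced. In a FormRequest, Laravel's built-in `after:now` validation rule covers a similar check, though as far as I know it parses an offset-less value in the app's timezone rather than the user's, which is worth keeping in mind here.

```php
<?php

declare(strict_types=1);

/**
 * Returns true when a wall-clock time in the user's timezone is still in the future.
 * Hypothetical helper; $now is injectable so the check can be tested deterministically.
 */
function isScheduledInFuture(
    string $localDateTime,
    string $userTimezone,
    ?DateTimeImmutable $now = null
): bool {
    $now ??= new DateTimeImmutable('now', new DateTimeZone('UTC'));
    $scheduled = new DateTimeImmutable($localDateTime, new DateTimeZone($userTimezone));

    // Date objects compare by the underlying instant, not by the wall-clock string.
    return $scheduled > $now;
}

$now = new DateTimeImmutable('2024-07-01 12:00:00', new DateTimeZone('UTC'));

// 09:00 in New York is 13:00 UTC, one hour after $now.
var_dump(isScheduledInFuture('2024-07-01 09:00:00', 'America/New_York', $now)); // bool(true)

// 07:00 in New York is 11:00 UTC, one hour before $now.
var_dump(isScheduledInFuture('2024-07-01 07:00:00', 'America/New_York', $now)); // bool(false)
```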

Round Three: Back to Claude Code with Codex's Findings

I take Codex CLI's feedback and bring it back to Claude Code, not as a list of instructions but as a second opinion I want Claude Code to think through.

Codex flagged two issues: the FormRequest isn't validating that scheduled_at
is in the future, and the API resource is returning UTC instead of converting
to the user's profile timezone. Do you agree with both? Fix what's valid
and tell me if anything in Codex's reasoning doesn't account for our setup.

That last part is important. Sometimes Codex's feedback is correct but doesn't account for something specific to the codebase that Claude Code knows from context. Asking Claude Code to evaluate the feedback rather than just implement it catches those cases.

In the scheduling example, Claude Code agreed on the FormRequest validation and fixed it. On the API resource concern, it pushed back: our API clients handle timezone conversion on their end, and the spec explicitly says to return UTC. Codex didn't know that. Claude Code did, because it had read the existing resources and understood the convention.

When to Stop

The loop doesn't go on forever. Usually two or three rounds is enough. The signal to stop is when a round-trip produces no new findings: Claude Code reviews Codex's latest feedback and says there's nothing to act on, or Codex reviews Claude Code's latest changes and doesn't flag anything meaningful.

Occasionally they'll go in circles on something that's genuinely a judgment call rather than a correctness issue. When that happens I step in and make the call myself. The models are advisors, not decision-makers.
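I run this loop by hand, but the stopping rule is simple enough to write down. The sketch below is only a model of it: `adviseUntilQuiet`, `$review`, and `$revise` are hypothetical stand-ins for pasting output between the two terminals, and the toy closures at the bottom exist purely to make the example runnable.

```php
<?php

declare(strict_types=1);

/**
 * Alternate review and revision until a review returns no findings
 * or the round cap is reached. All names here are illustrative.
 *
 * @param callable(string): array $review        returns a list of findings for the code
 * @param callable(string, array): string $revise returns the revised code
 * @return array{code: string, rounds: int, converged: bool}
 */
function adviseUntilQuiet(string $code, callable $review, callable $revise, int $maxRounds = 3): array
{
    for ($round = 1; $round <= $maxRounds; $round++) {
        $findings = $review($code);

        if ($findings === []) {
            // No new findings: the signal to stop.
            return ['code' => $code, 'rounds' => $round, 'converged' => true];
        }

        $code = $revise($code, $findings);
    }

    // Cap reached without agreement: a human makes the call.
    return ['code' => $code, 'rounds' => $maxRounds, 'converged' => false];
}

// Toy stand-ins: the reviewer flags a missing rule until the reviser adds it.
$review = fn (string $code): array => str_contains($code, 'after:now')
    ? []
    : ['scheduled_at may be in the past'];
$revise = fn (string $code, array $findings): string => $code . " 'after:now'";

$result = adviseUntilQuiet("'required', 'date',", $review, $revise);
echo $result['rounds'], ' ', $result['converged'] ? 'converged' : 'needs a human', PHP_EOL; // 2 converged
```

The round cap is what keeps a judgment-call disagreement from looping: if `converged` comes back false, that's the cue to decide it yourself.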

What This Workflow Is Actually Good For

It works best on features with real complexity: anything involving business logic, data integrity, multi-step user flows, or edge cases that matter in production. For Wordsmith that's subscription billing, content scheduling, caption generation, and user permissions.

It's overkill for straightforward scaffolding or boilerplate. If I need a new Eloquent model with a migration and factory, I just ask Claude Code and move on. The advisor loop is for when mistakes would actually hurt.

The Honest Trade-off

This takes longer than just using one tool. Running two agents through two or three rounds of review adds time to the process. It's worth it on features where a bug in production costs you: lost revenue, broken user flows, data integrity issues. It's not worth it on everything.

The other thing worth being honest about: this workflow only works because both models are good enough to challenge each other in useful ways. If one of them consistently produced low-quality feedback, the loop would break down. Right now they're different enough in their instincts that they catch different things, which is what makes it valuable.

Practical Notes

Keep both terminals open in the same directory. Boost gives both agents the same runtime context, so they're working from the same base. When you bring feedback from one to the other, paste the specific output rather than summarizing it, so the receiving agent can read the actual reasoning and respond to it directly.

If you're using Claude Code on Claude's Pro plan and Codex CLI through ChatGPT Plus, the combined cost is about $40/month. For professional Laravel development work, that's not a significant line item, and you're getting two frontier models reviewing each other's work on every meaningful feature.

I haven't seen another workflow write-up that does this specific thing, which is part of why I wrote this. If you're already running both tools, it's worth trying for a week on something real and seeing if the output quality changes.

What does your AI development setup look like these days? I'm always comparing notes with other Laravel developers on this.
