ferrywl.to

Using OpenAI Codex to refactor a legacy codebase

This is a record of my journey refactoring a large legacy backend codebase with OpenAI Codex.


I had been a GitHub Copilot Pro subscriber since its debut. A few months ago, the company I work for started rolling out GitHub Copilot Enterprise.

Because we're allowed to use our personal GitHub accounts for work, I was asked to enrol with mine. However, when I enrolled under the company's Copilot subscription, GitHub automatically refunded my personal subscription, because a GitHub account can only have one active Copilot license. Reference: GitHub Community discussion

To avoid any potential IP ownership disputes, I decided to stop using GitHub Copilot entirely on my personal machine.

At the same time, I'm a ChatGPT Plus subscriber. I use it for learning and daily creative tasks. Even though I often treat it like a technical search engine (and I've tried the web agent), I didn't realize until very recently that Codex could run locally inside VS Code (I tried the Codex CLI long ago, but the experience wasn't great).

So I decided to try it on a genuinely challenging project: refactoring a legacy backend codebase.

The legacy situation

The codebase consisted of:

  • A data object project mixing multiple concerns
  • Thirteen containerized nano-services deployed to GCP Cloud Run
  • Everything targeting .NET Core 3.1 (around five to six years old at the time of writing)

Refactoring objectives

  • Consolidate the scattered nano-service projects into a single API project
  • Migrate from .NET Core 3.1 to .NET 10
  • Move from a poorly organized project structure into a DDD-style structure
  • Enable strict mode across all projects (TreatWarningsAsErrors)
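The migration and strict-mode objectives mostly come down to project-file changes. As a rough sketch (these are standard MSBuild/.NET SDK properties, but the actual files in my repo differ), a shared Directory.Build.props can apply the new target framework and treat-warnings-as-errors to every project at once:

```xml
<!-- Directory.Build.props: automatically applied to all projects in the repo.
     A sketch using standard .NET SDK properties, not my actual file. -->
<Project>
  <PropertyGroup>
    <!-- Migrate everything from netcoreapp3.1 in one place -->
    <TargetFramework>net10.0</TargetFramework>
    <!-- "Strict mode": any compiler warning fails the build -->
    <TreatWarningsAsErrors>true</TreatWarningsAsErrors>
    <!-- Nullable reference types produce most of those warnings
         in code migrated from .NET Core 3.1 -->
    <Nullable>enable</Nullable>
  </PropertyGroup>
</Project>
```

With this in place, a plain `dotnet build` fails on any warning, which surfaces every violation across the consolidated solution at once.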

My Codex setup

[Image: Codex VS Code extension]

Codebase size before and after the refactor:

          C# files   Total LOC   Classes
Before    527        45,927      390
After     394        44,393      408

It also helped fix thousands of errors when I enabled strict mode.

Even with AI, it still took me around 10 nights (about 2 hours per night) to complete.

There was one time Codex told me I'd run out of usage quota and needed to wait 24 hours.

[Image: context usage]

This would have been extremely difficult to do manually without a significant time investment.

[Image: diff on a GitHub pull request]

How I feel about Codex afterwards

Pros

  • The model was able to pause and ask for meaningful confirmation, like:

This is a very large refactor (hundreds of lambdas/short params across all services and tests). Doing it in one shot risks regressions and will take a lot of time.

  • It also felt faster and more accurate (for this kind of refactor) than my typical experience with Copilot.

  • And with a single Plus subscription, it offered strong value: not just for Codex, but also for day-to-day learning and creative productivity.

Cons

  • It's a bit of a pity it doesn't have Next Edit Suggestions (NES) like GitHub Copilot, which I personally find more useful for tight, single-file editing.

Conclusion

In the past, the Codex CLI required an OpenAI API account and charged by token consumption; this discovery means my existing monthly subscription now covers it, saving me a few bucks. :)

This post was originally published on [Medium].