A logging bug in Codex wrote the equivalent of 640TB per year to users' local SSDs; the latest version has largely fixed the issue.

Users of Codex, OpenAI's coding assistant, have reported a bug in which SQLite logs stored locally by Codex cause excessive SSD writes, equivalent to approximately 640TB per year. OpenAI has shipped several fixes to reduce log output and says the latest version cuts the write volume by approximately 85%.

Codex SQLite feedback logs can write ~640 TB/year and rapidly consume SSD endurance · Issue #28224 · openai/codex
https://github.com/openai/codex/issues/28224

Codex records its operational status and feedback logs in an SQLite database. According to GitHub user 1996fanrui, on a machine where Codex had been in use for about 21 days, approximately 37TB of data was written to the main SSD, and the Codex SQLite log was the primary source of the continuous writes. 1996fanrui points out that at this pace, approximately 640TB of data would be written and deleted per year. A typical 1TB SSD is rated for around 600TB of total writes (TBW), so it could reach its rated write endurance in less than a year.
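
The extrapolation behind the 640TB figure can be reproduced with a few lines of arithmetic. The sketch below uses only the numbers quoted above (37TB over 21 days, a 600TB TBW rating); the function and variable names are illustrative and do not come from the issue.

```python
# Figures quoted in the report: about 37 TB written over about 21 days.
TB_WRITTEN = 37.0
DAYS_OBSERVED = 21.0
# Endurance rating the article assumes for a typical 1 TB SSD.
SSD_TBW = 600.0


def annual_writes_tb(tb_written: float, days: float) -> float:
    """Linearly extrapolate the observed write volume to one year."""
    return tb_written / days * 365


def days_to_reach_tbw(tbw: float, tb_written: float, days: float) -> float:
    """Days until the rated endurance is used up at the observed daily rate."""
    return tbw / (tb_written / days)


yearly = annual_writes_tb(TB_WRITTEN, DAYS_OBSERVED)
days_left = days_to_reach_tbw(SSD_TBW, TB_WRITTEN, DAYS_OBSERVED)
print(f"{yearly:.0f} TB/year, rated endurance reached in about {days_left:.0f} days")
# prints: 643 TB/year, rated endurance reached in about 341 days
```

This assumes the write rate stays constant, which is the same assumption the reporter's estimate makes.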

The reporter's investigation found that although the database held only about 680,000 log rows, the SQLite row IDs had exceeded 5.5 billion, indicating that Codex was repeatedly inserting large numbers of logs and deleting them almost immediately. In a 15-second measurement alone, approximately 36,211 new log rows were written, and the cycle of inserting rows, updating indexes, writing to the WAL, and deleting the rows continued while the number of stored records stayed the same.
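
The gap between a small row count and a very large row ID is what an insert-then-prune pattern leaves behind in SQLite. The following is a minimal sketch of that effect using Python's built-in sqlite3 module; the table layout and the retention cap `KEEP` are hypothetical and are not Codex's actual schema or settings.

```python
import sqlite3

KEEP = 100  # hypothetical retention cap, not Codex's real limit

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logs (id INTEGER PRIMARY KEY, level TEXT, body TEXT)")

for i in range(1000):
    # Every log event is inserted as a new row...
    conn.execute(
        "INSERT INTO logs (level, body) VALUES (?, ?)", ("TRACE", f"event {i}")
    )
    # ...and the oldest rows are pruned right away to stay under the cap.
    conn.execute(
        "DELETE FROM logs WHERE id <= (SELECT MAX(id) FROM logs) - ?", (KEEP,)
    )
conn.commit()

row_count, max_id = conn.execute("SELECT COUNT(*), MAX(id) FROM logs").fetchone()
print(row_count, max_id)  # prints: 100 1000
```

The table never holds more than 100 rows, yet the highest row ID shows that 1,000 rows were written. Scaled up, the same reasoning is how roughly 680,000 retained rows alongside row IDs above 5.5 billion point to billions of short-lived inserts.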

The analysis showed that most of the writes were detailed TRACE-level logs, which record even function calls and variable values, and that large volumes of internal logs, such as file-monitoring logs, were being stored. The report suggests changing the default so that only important information is recorded instead of persisting TRACE logs, setting a limit on the database size, and storing only the minimum necessary portion of each WebSocket payload rather than the entire payload.

A few days after the report, version 81150 of win-codex was released. Thibault Sottiaux, head of core products at OpenAI and in charge of both ChatGPT and Codex, stated on X, 'There was a bug in Codex. We have fixed it.'

According to GitHub user wfy-op, rough sampling showed between 28,000 and 46,000 writes per 30 seconds before the update and around 6,700 per 30 seconds after it. However, TRACE-level logs are still being stored in the database, and although the write volume is far below its peak, wfy-op says this is still not normal low-frequency logging behavior.
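
For a sense of scale, the sampled figures can be turned into a percentage reduction and a per-second rate. This is a rough sketch based only on wfy-op's numbers; the names are illustrative, and the article does not say how OpenAI measured its own figure of approximately 85%.

```python
# Rough samples reported by wfy-op, in writes per 30-second window.
BEFORE_LOW, BEFORE_HIGH = 28_000, 46_000
AFTER = 6_700
WINDOW_SECONDS = 30


def reduction_pct(before: float, after: float) -> float:
    """Percentage drop going from `before` to `after`."""
    return (1 - after / before) * 100


low = reduction_pct(BEFORE_LOW, AFTER)
high = reduction_pct(BEFORE_HIGH, AFTER)
after_per_second = AFTER / WINDOW_SECONDS
print(f"reduction: {low:.0f}% to {high:.0f}%, still about {after_per_second:.0f} writes/s")
# prints: reduction: 76% to 85%, still about 223 writes/s
```

Measured against the peak sample, the drop is in the same range as the roughly 85% reduction attributed to OpenAI, while the remaining rate of a couple of hundred writes per second is consistent with wfy-op's view that logging is still unusually frequent.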

in AI, Posted by log1e_dh