The internal structure of ChatGPT's bot detection system 'Turnstile' and the full details of the Sentinel Challenge have been revealed.



A detailed analysis of the internal structure of Cloudflare's bot detection system 'Turnstile,' used by ChatGPT, and of the related Sentinel challenges has been published by Buchodi, a technical blog covering security research and reverse engineering. The analysis was carried out by intercepting network traffic and deobfuscating the SDK, and it reveals how bots are detected and how tokens are generated.

ChatGPT Won't Let You Type Until Cloudflare Reads Your React State. I Decrypted the Program That Does It.
https://www.buchodi.com/chatgpt-wont-let-you-type-until-cloudflare-reads-your-react-state-i-decrypted-the-program-that-does-it/

◆Multi-layered check using Turnstile
Every message sent in ChatGPT triggers a Cloudflare Turnstile program that runs silently in the browser. This program goes beyond simple browser fingerprinting and checks 55 properties across three layers: the browser, the Cloudflare network, and the ChatGPT application itself. At the browser layer it collects information such as WebGL, screen resolution, hardware information, font measurement, DOM manipulation, and storage usage. At the network layer it reads Cloudflare edge header information. At the application layer it verifies that the ChatGPT React application is fully rendered and hydrated. Because of this third layer, a bot that spoofs its browser fingerprint but does not actually render the ChatGPT single-page application (SPA) will still be detected and fail.



◆Decryption process
The bytecode for Turnstile arrives encrypted. First, the Base64-encoded field (turnstile.dx) included in the preparation response is decoded and XORed with the token (p) from the preparation request. Decrypting this outer layer is relatively easy because both values travel in the same HTTP exchange.
[code]
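import base64
import json

# dx: turnstile.dx from the preparation response; p_token: p from the preparation request.
raw = base64.b64decode(dx)
key = p_token.encode()  # assumes p_token was captured as a str
# XOR every byte with the repeating key, then parse the result as JSON.
outer = json.loads(bytes(b ^ key[i % len(key)] for i, b in enumerate(raw)))
# → 89 VM instructions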
[code]


Decrypting the outer layer yields approximately 89 virtual machine (VM) instructions. Embedded in these instructions is a 19KB encrypted data block containing the actual fingerprint program, which uses a different XOR key from the outer layer. Initially, this key was thought to be derived from performance.now(), but detailed analysis revealed that it is embedded directly in the VM instructions as a floating-point literal. Because the server generates the key and ships it inside the bytecode, anyone who can read the bytecode can also decrypt the inner program. In the following example instruction, the last value is the key:
[code]
[41.02, 0.3, 22.58, 12.96, 97.35]
[code]


The complete decryption chain requires nothing beyond the HTTP request and response, and comprises the following five steps:

1. Read p from the preparation request.
2. Read turnstile.dx from the preparation response.
3. XOR base64decode(dx) with p to obtain the outer bytecode.
4. Find the 5-argument instruction following the 19KB blob and take its last argument as the key.
5. XOR base64decode(blob) with the string representation of the key to obtain the inner program (417 to 580 VM instructions).
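
The five steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used in the original analysis: the function names, the min_blob_len threshold, and the idea that each instruction is a JSON array whose long string argument is the blob are all hypothetical. Python's str() of a float can also differ from JavaScript's string conversion for some values, so the key string may need adjusting in practice.

```python
import base64
import json


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data against a repeating key (the same call encrypts and decrypts)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def find_blob_and_key(instructions, min_blob_len=1000):
    """Step 4: find the large Base64 blob, then the key that follows it.

    Assumptions (not confirmed by the article): every instruction is a JSON
    array, the blob is the first string argument with at least min_blob_len
    characters, and the key is the last element of the next 5-element
    instruction after it.
    """
    for index, instr in enumerate(instructions):
        blobs = [a for a in instr if isinstance(a, str) and len(a) >= min_blob_len]
        if not blobs:
            continue
        for later in instructions[index + 1:]:
            if len(later) == 5:
                # The key is used in its string form, e.g. 97.35 -> "97.35".
                return blobs[0], str(later[-1])
        break
    raise ValueError("no blob/key pair found")


def decrypt_chain(p, dx, min_blob_len=1000):
    """Steps 1-5: p and dx are the strings captured from the HTTP exchange."""
    # Step 3: outer layer, keyed by the p token.
    outer = json.loads(xor_bytes(base64.b64decode(dx), p.encode()))
    # Step 4: locate the inner blob and its embedded key.
    blob, key = find_blob_and_key(outer, min_blob_len)
    # Step 5: inner layer, keyed by the string form of the float literal.
    inner = json.loads(xor_bytes(base64.b64decode(blob), key.encode()))
    return outer, inner
```

Since XOR with a repeating key is its own inverse, the same xor_bytes helper can build a synthetic payload for testing the chain without any real traffic.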

◆Sentinel Challenges other than Turnstile
Turnstile is one of three challenges offered by the Sentinel system, the other two being as follows:

- Signal Orchestrator (SO): A behavioral biometrics layer. It installs listeners for user interaction events such as key down, pointer movement, and clicks, monitors the window.__oai_so_* properties, and tracks signals such as keystroke timing and mouse speed.
- Proof of Work (PoW): This combines a 25-field fingerprint with a SHA-256 hashcash. It raises the computational cost of each request, but the analysis found that its seven binary detection flags were zero in every sample, suggesting that it is not a primary defense.
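
As a rough illustration of what a SHA-256 hashcash involves, the sketch below searches for a nonce whose hash has a required number of leading zero bits. It is a generic hashcash example and not the Sentinel implementation: the seed, the fingerprint string, the colon-separated input format, and the difficulty expressed in bits are all assumptions made for the example.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()


def pow_digest(seed: str, fingerprint: str, nonce: int) -> bytes:
    """Hash the challenge seed, the client fingerprint, and a nonce together."""
    return hashlib.sha256(f"{seed}:{fingerprint}:{nonce}".encode()).digest()


def solve(seed: str, fingerprint: str, bits: int) -> int:
    """Client side: try nonces until the hash meets the difficulty target."""
    for nonce in count():
        if leading_zero_bits(pow_digest(seed, fingerprint, nonce)) >= bits:
            return nonce


def verify(seed: str, fingerprint: str, nonce: int, bits: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return leading_zero_bits(pow_digest(seed, fingerprint, nonce)) >= bits
```

The asymmetry is the point of the design: solving takes many hash attempts on average, while verifying takes one. A low difficulty keeps the cost small for a real browser, which is consistent with the short solve times reported in the analysis.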

The analysis results across all Sentinel challenges are as follows.

Metric | Value | Notes
Programs decrypted | 377/377 | 100%
Unique users | 32 |
Properties per program | 55 | Identical in every sample
Instructions per program | 417-580 | Average 480
Unique XOR keys | 41 |
SO behavioral properties | 36 |
PoW fingerprint fields | 25 |
PoW solve time | 72% under 5 ms |



◆Purpose of the obfuscation
Obfuscating the bytecode with XOR keys serves the following operational objectives:

- Hide the fingerprint checklist from static analysis.
- Prevent the website operator (OpenAI) from reading raw fingerprint values without reverse engineering.
- Prevent replay attacks by making each token unique.

However, this analysis shows that the protection is merely an XOR operation whose key travels in the same data stream. It may deter casual inspection, but it cannot prevent analysis.

◆Summary
The analysis shows that the Sentinel challenges try to prevent bot access with the following checks:

- Turnstile detects bots through multi-layered checks delivered in doubly encrypted bytecode.
- The other challenges, Signal Orchestrator and Proof of Work, collect behavioral biometrics and fingerprints.

in AI, Web Service, Posted by log1c_sh