The Epstein-research report, which examines 218GB of Epstein files using the AI model Claude Opus 4.6, is now available.



A large body of data from the investigation into Jeffrey Epstein, who was arrested on suspicion of child trafficking and sexual abuse and later died in prison, was made public between December 2025 and the end of January 2026. Known as the 'Epstein Files,' the release caused a major scandal that shook the world. A forensic analysis library called 'Epstein-research' has now been released that uses Claude Opus 4.6 and faster-whisper to structure and analyze the Epstein Files.

rhowardstone/Epstein-research: Distilled documents to assist
https://github.com/rhowardstone/Epstein-research

Epstein-research is a library of over 165 forensic analysis reports based on the massive 218GB dataset released by the US Department of Justice regarding the Jeffrey Epstein investigation.

Claude Opus 4.6 was used for text analysis: a searchable database was built first, and all documents were then examined. Full-text extraction and FTS5 indexing were performed on every page, more than 2.58 million revision records were analyzed, and a register of 1,614 people was compiled. faster-whisper large-v3 handled audio and video transcription, processing 1,628 entries on an A100 GPU.
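To give a sense of what page-level FTS5 indexing looks like, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names, and sample rows are hypothetical and are not taken from the Epstein-research repository. It also assumes the local SQLite build includes the FTS5 extension, which is common but not guaranteed.

```python
import sqlite3

# Hypothetical schema: one row per extracted page. Names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE VIRTUAL TABLE pages USING fts5("
    "doc_id UNINDEXED, page_no UNINDEXED, body)"
)

# Invented sample text standing in for extracted page contents.
sample_pages = [
    ("DOC-0001", 1, "Wire transfer of funds to a holding company."),
    ("DOC-0001", 2, "Flight log listing passengers and dates."),
    ("DOC-0002", 1, "Memo on the transfer of property titles."),
]
conn.executemany(
    "INSERT INTO pages (doc_id, page_no, body) VALUES (?, ?, ?)",
    sample_pages,
)


def search(query, limit=10):
    """Return (doc_id, page_no, snippet) rows, best BM25 match first."""
    sql = """
        SELECT doc_id, page_no, snippet(pages, 2, '[', ']', '...', 8)
        FROM pages
        WHERE pages MATCH ?
        ORDER BY rank
        LIMIT ?
    """
    return conn.execute(sql, (query, limit)).fetchall()


hits = search("transfer")
for doc_id, page_no, snip in hits:
    print(doc_id, page_no, snip)
```

Keeping the document ID and page number as unindexed columns means every search hit can be traced back to a specific page without those values polluting the text index.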



The data comprises approximately 1.38 million documents (over 2.77 million pages) and 3,864 non-PDF files such as videos, audio, and spreadsheets. The analysis traced over $755 million in funds, identified more than 95 shell companies, and documented investigative failures.

Epstein-research also offers a setup called Claude-powered Epstein Investigator, which users can install themselves as a desktop or CLI version. The processing scripts, including more than 36 Python tools used to build the database, are publicly available as well, and all analysis runs locally without using the cloud.

Each factual assertion in the report is linked to a unique EFTA number, ensuring transparency by providing a direct link to the original PDF published by the Department of Justice.
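A citation rule like this can be checked mechanically. The sketch below flags report lines that carry no EFTA identifier. It assumes identifiers take the form "EFTA" followed by digits, which is an assumption made for illustration, and the sample text and ID values are invented.

```python
import re

# Assumption: identifiers look like "EFTA" followed by digits.
EFTA_PATTERN = re.compile(r"\bEFTA\d+\b")


def cited_ids(report_text):
    """Return the sorted, de-duplicated EFTA-style identifiers in the text."""
    return sorted(set(EFTA_PATTERN.findall(report_text)))


def uncited_lines(report_text):
    """Return non-empty lines that contain no EFTA-style identifier."""
    return [
        line
        for line in report_text.splitlines()
        if line.strip() and not EFTA_PATTERN.search(line)
    ]


# Invented sample report text; the IDs are placeholders, not real documents.
sample = """A payment was recorded on the ledger (EFTA00000001).
A second entry repeats the amount (EFTA00000002, EFTA00000001).
This sentence has no source attached."""

print(cited_ids(sample))
print(uncited_lines(sample))
```

A check of this kind only confirms that a citation is present. Whether the cited document actually supports the claim still has to be verified by reading the original PDF.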



University of Connecticut computer scientist Rye Howard Stone, who published Epstein-research, describes himself as a data scientist, not an investigative journalist, and urges readers to treat the content with caution because it has not been independently verified by humans.

The data set also contains extremely shocking content, so the site strongly warns users of the risk of secondary PTSD, advises taking regular breaks, and limits viewing to adults aged 18 and over. The site also notes that documents have been falsified by the Department of Justice and that unusual omissions have been detected; for example, 57% of the data extracted from FBI devices is not included in the public release. It further makes clear that the information provided does not constitute legal advice or the official opinion of any government agency.

in AI, Note, Posted by log1i_yk