Years: 36
Entities: 281
Fields: 1.06M
Canonical Fields: 414
Earliest Edition: 1990
Latest Edition: 2025
01 Mission
The Problem

The CIA publishes only the current edition of the World Factbook. When a new edition replaces the old one, decades of geopolitical data disappear from public access. Historical editions survive only as scattered archives (plain text on Project Gutenberg, cached JSON on GitHub, zip files on the Wayback Machine), each in a different format, with no way to search or compare across years.

The Solution

This archive brings every edition from 1990 to 2025 into a single, normalized, searchable database. Every country, every field, every edition—parsed from original CIA publications, standardized across 36 years of format changes, and made available for research, analysis, and public use.

February 2026
The CIA discontinued the online World Factbook, making this archive one of the last comprehensive preserved copies of the full publication history.
v2 Data Drop — February 19, 2026
Comprehensive validation identified and repaired three data quality issues:
(1) 1996 data for 7 countries (Venezuela, Armenia, Greece, Luxembourg, Malta, Monaco, Tuvalu) was truncated in the Project Gutenberg source. It was replaced with complete data from the CIA's own original text file, recovered from the Wayback Machine.
(2) Zimbabwe 1998 had 176 duplicate/noise fields from HTML parser artifacts. These were cleaned.
(3) Germany GDP 1994–1996 was stored under a self-named field instead of "National product". This was corrected.
See Sources & Methodology for full details.
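As a rough illustration of the kind of check that could surface an issue like (2), the sketch below counts repeated (country, year, field) keys. The record layout (dicts with `country`, `year`, `field`, `value`) and the function name `duplicate_fields` are assumptions made for this example, not the archive's actual validation code.

```python
from collections import Counter

def duplicate_fields(records):
    """Return {(country, year, field): count} for keys that occur more than once."""
    counts = Counter((r["country"], r["year"], r["field"]) for r in records)
    return {key: n for key, n in counts.items() if n > 1}

# Toy records with placeholder values (not actual Factbook data).
TOY_RECORDS = [
    {"country": "Exampleland", "year": 1998, "field": "Population", "value": "a"},
    {"country": "Exampleland", "year": 1998, "field": "Population", "value": "a"},
    {"country": "Exampleland", "year": 1998, "field": "Area", "value": "b"},
    {"country": "Exampleland", "year": 1999, "field": "Population", "value": "c"},
]

if __name__ == "__main__":
    for key, n in duplicate_fields(TOY_RECORDS).items():
        print(key, n)
```

A real validation pass would also need rules for noise fields that are not exact duplicates; this sketch only covers the exact-duplicate case.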
02 Capabilities
Boolean Search

Full-text search across 1,061,522 fields with AND/OR/NOT operators, phrase matching, field and year filtering.
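
Since the deployed database is SQLite, one plausible way to provide this kind of search is SQLite's FTS5 extension, which supports AND/OR/NOT and quoted phrases natively. The following is a minimal sketch under that assumption. The table name `facts`, its columns, and the helper names are hypothetical, and the sample rows are toy text rather than real Factbook entries. It also assumes the Python build's SQLite has FTS5 enabled.

```python
import sqlite3

def build_index(rows):
    """Create an in-memory FTS5 index from (country, year, field, content) rows."""
    conn = sqlite3.connect(":memory:")
    # Only `content` is tokenized; the other columns are stored for filtering.
    conn.execute(
        "CREATE VIRTUAL TABLE facts USING fts5("
        "country UNINDEXED, year UNINDEXED, field UNINDEXED, content)"
    )
    conn.executemany("INSERT INTO facts VALUES (?, ?, ?, ?)", rows)
    return conn

def search(conn, query, field=None, year=None):
    """Run an FTS5 boolean query, optionally restricted to one field and/or year."""
    sql = "SELECT country, year, field FROM facts WHERE facts MATCH ?"
    params = [query]
    if field is not None:
        sql += " AND field = ?"
        params.append(field)
    if year is not None:
        sql += " AND CAST(year AS INTEGER) = ?"
        params.append(year)
    sql += " ORDER BY country, year"
    return conn.execute(sql, params).fetchall()

# Toy rows for illustration only.
SAMPLE_ROWS = [
    ("Norway", 1996, "Natural resources", "petroleum, natural gas, iron ore, copper"),
    ("Poland", 1996, "Natural resources", "coal, sulfur, copper, natural gas"),
    ("Norway", 2010, "Natural resources", "petroleum, natural gas, iron ore"),
    ("Chad", 1996, "Natural resources", "petroleum (unexploited), uranium"),
]

if __name__ == "__main__":
    conn = build_index(SAMPLE_ROWS)
    for row in search(conn, '"natural gas" AND petroleum NOT coal', year=1996):
        print(row)
```

In FTS5 query syntax the operators must be uppercase, a double-quoted string is a phrase, and NOT binds more tightly than AND, which in turn binds more tightly than OR.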

Browse & Compare

Navigate any country in any year. View side-by-side comparisons. Track individual fields across the full 36-year span.
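
To make the two views concrete, here is a small sketch of a per-field timeline and a two-edition comparison over in-memory records. The record layout and the function names `field_timeline` and `side_by_side` are assumptions for illustration, and the values are placeholders. The actual site presumably runs equivalent queries against its database.

```python
from collections import defaultdict

def field_timeline(records, country, field):
    """Return [(year, value), ...] for one country and field, sorted by year."""
    return sorted(
        (r["year"], r["value"])
        for r in records
        if r["country"] == country and r["field"] == field
    )

def side_by_side(records, country, year_a, year_b):
    """Map each field to (value in year_a, value in year_b); None if absent."""
    table = defaultdict(lambda: [None, None])
    for r in records:
        if r["country"] != country:
            continue
        if r["year"] == year_a:
            table[r["field"]][0] = r["value"]
        if r["year"] == year_b:
            table[r["field"]][1] = r["value"]
    return {field: tuple(pair) for field, pair in sorted(table.items())}

# Placeholder records (not actual Factbook values).
TOY_RECORDS = [
    {"country": "Exampleland", "year": 2000, "field": "Population", "value": "p-2000"},
    {"country": "Exampleland", "year": 1990, "field": "Population", "value": "p-1990"},
    {"country": "Exampleland", "year": 1990, "field": "Capital", "value": "c-1990"},
    {"country": "Otherland", "year": 1990, "field": "Population", "value": "x-1990"},
]

if __name__ == "__main__":
    print(field_timeline(TOY_RECORDS, "Exampleland", "Population"))
    print(side_by_side(TOY_RECORDS, "Exampleland", 1990, 2000))
```

Returning None for a field missing from one edition keeps gaps visible in the comparison instead of silently dropping the row.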

Intelligence Analysis

Choropleth maps, regional dashboards, timeline animations, country dossiers formatted per ICD 203 analytic standards.

Data Export

Download as CSV, Excel, or formatted PDF. Bulk export all 36 years for any country. Print-ready paginated reports.
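
A bulk CSV export for one country can be sketched with Python's standard `csv` module, as below. The column names and the function name `export_country_csv` are assumptions for this example. The Excel and PDF paths would need third-party libraries and are not shown.

```python
import csv
import io

def export_country_csv(records, country):
    """Return all records for one country as CSV text, ordered by year then field."""
    out = io.StringIO(newline="")  # let the csv module control line endings
    writer = csv.writer(out)
    writer.writerow(["country", "year", "field", "value"])
    selected = (r for r in records if r["country"] == country)
    for r in sorted(selected, key=lambda r: (r["year"], r["field"])):
        writer.writerow([r["country"], r["year"], r["field"], r["value"]])
    return out.getvalue()

# Placeholder records (not actual Factbook values).
TOY_RECORDS = [
    {"country": "Exampleland", "year": 1991, "field": "Area", "value": "total, land"},
    {"country": "Exampleland", "year": 1990, "field": "Area", "value": "total"},
    {"country": "Otherland", "year": 1990, "field": "Area", "value": "n/a"},
]

if __name__ == "__main__":
    print(export_country_csv(TOY_RECORDS, "Exampleland"))
```

The `csv` writer quotes values that contain commas or newlines, which matters for free-text Factbook fields.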

03 Audience
Audience: Use Case
Researchers: Track how countries evolved over decades, including GDP trajectories, population shifts, and governance changes
Analysts: Compare economic, military, or demographic indicators across time and regions
Journalists: Verify historical claims against primary source data from U.S. Government publications
Archivists: Preserve access to public-domain government publications before they disappear
Students: Study international relations, political science, or intelligence analysis with real-world data
Data Scientists: Build structured, longitudinal country datasets from standardized, canonical field names
04 Architecture
Component: Technology
Backend: Python / FastAPI
Templates: Jinja2
Database: SQLite (deployed) / SQL Server (source)
Charts: Apache ECharts 5
Hosting: Fly.io (Docker)
Source Control: GitHub
ETL Pipeline: Coverage
Plain-text parser: 1990–2001 (Gutenberg + CIA original)
HTML parser (5 variants): 2000–2020
JSON parser: 2021–2025
Field canonicalization: 1,090 → 414 names
Entity standardization: 281 entities, 9 types
Regional classification: 6 COCOM commands
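The first stages of the pipeline above can be sketched as a year-to-parser lookup followed by a field-name alias table. The year ranges come from the coverage table. Everything else here (the names `PARSER_COVERAGE`, `parsers_for`, `FIELD_ALIASES`, `canonicalize`, and the alias entries themselves) is hypothetical and only illustrates the shape of such a mapping, not the project's real 1,090-to-414 table.

```python
# Edition years covered by each parser family, per the coverage table.
PARSER_COVERAGE = {
    "plain_text": range(1990, 2002),  # 1990-2001
    "html": range(2000, 2021),        # 2000-2020
    "json": range(2021, 2026),        # 2021-2025
}

def parsers_for(year):
    """Return the parser families whose coverage includes this edition year."""
    return [name for name, years in PARSER_COVERAGE.items() if year in years]

# Hypothetical aliases: normalized raw field name -> canonical field name.
FIELD_ALIASES = {
    "national product": "National product",
    "natl product": "National product",
}

def canonicalize(raw_name, aliases=FIELD_ALIASES):
    """Collapse whitespace, drop a trailing colon, then look up a canonical name."""
    cleaned = " ".join(raw_name.split()).rstrip(":").strip()
    return aliases.get(cleaned.casefold(), cleaned)

if __name__ == "__main__":
    print(parsers_for(2000))
    print(canonicalize("  National   product: "))
```

Because the plain-text and HTML ranges overlap for 2000 and 2001, `parsers_for` returns a list rather than a single parser. How the real pipeline chooses between sources for those years is not stated here.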
05 Creator
MilkMp

Research analyst with training in information organization, archival methodology, intelligence analysis standards, and historical research methods. This project combines all four disciplines—library science to structure a government publication, historical methods for 36 years of format changes, ICD 203/208 standards for data presentation, and crime/intelligence analysis for the analytic tools.

GitHub Repository
Credentials
Education: MLIS (Library & Info Science)
Education: B.A. in History
Certificate: Crime & Intelligence Analysis
Disclaimer
This project is not affiliated with the Central Intelligence Agency, the Office of the Director of National Intelligence, or the U.S. Government. All data originates from the CIA World Factbook, a public-domain publication. The intelligence community formatting (ICD 203/208 structure, COCOM regional organization, confidence badges) is used for presentation purposes only and does not imply access to classified sources or methods.
Full Sources & Methodology →