The Problem
Government data is public by law but fragmented by design. IRS nonprofit filings, congressional voting records, lobbying disclosures, campaign finance reports, regulatory dockets, and federal spending data all exist in separate databases, maintained by separate agencies in separate formats. No single free platform connects them. The tools that do connect them are paywalled at thousands of dollars per seat per year, limited to a single domain, or dependent on commercial aggregators with closed methodology.
What We Built
DataDawn downloads data directly from federal government APIs, normalizes it, and cross-references it through verified linkage keys. A single member of Congress can be connected to their votes, floor speeches, stock trades, campaign donors, committee assignments, sponsored legislation, and lobbying activity directed at their committees — all in one query.
On the regulatory side, a proposed rule can be traced from the Federal Register through its OIRA review, its public comment docket, and into the final regulation in the Code of Federal Regulations. On the nonprofit side, every foundation grant, donor-advised fund (DAF) disbursement, officer salary, and investment portfolio is searchable across more than five million IRS filings.
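As a rough illustration of what cross-referencing through a linkage key can look like, the sketch below groups records from separate sources by a shared member identifier and assembles them into one profile. It is not DataDawn's actual pipeline: the field names (`bioguide_id`, `roll_call`, `committee`), the placeholder IDs, and the records themselves are made up for the example.

```python
from collections import defaultdict

# Hypothetical, made-up records. Real datasets use their own field names,
# and these IDs are placeholders, not real identifiers.
members = [
    {"bioguide_id": "X000001", "name": "Example Member A"},
    {"bioguide_id": "X000002", "name": "Example Member B"},
]
votes = [
    {"bioguide_id": "X000001", "roll_call": 12, "position": "Yea"},
    {"bioguide_id": "X000002", "roll_call": 12, "position": "Nay"},
    {"bioguide_id": "X000001", "roll_call": 13, "position": "Nay"},
]
committee_assignments = [
    {"bioguide_id": "X000001", "committee": "Example Committee"},
]


def index_by_key(records, key):
    """Group records from one source by the shared linkage key."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record[key]].append(record)
    return grouped


def build_profile(member, **linked_sources):
    """Attach every record from each source that shares the member's key."""
    key = member["bioguide_id"]
    profile = dict(member)
    for source_name, index in linked_sources.items():
        # .get() returns an empty list when a source has nothing for this key.
        profile[source_name] = index.get(key, [])
    return profile


votes_index = index_by_key(votes, "bioguide_id")
committees_index = index_by_key(committee_assignments, "bioguide_id")

profile = build_profile(
    members[0], votes=votes_index, committees=committees_index
)
print(profile["name"], len(profile["votes"]), len(profile["committees"]))
```

The point of the design is that each source is indexed once by the same key, so adding another dataset means adding another index rather than rewriting the join.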
Data sources
Who Built It
DataDawn was built by three collaborators: a human, Claude (Anthropic), and DJ Crabdaddy (Claude Code). All code and data pipelines are published under CC0 (a public domain dedication).
Our Principles
No partisan affiliation, no advocacy agenda, no commercial entanglements.
Every record comes from a primary U.S. federal government source or an open public registry. We do not integrate commercial aggregators (Candid, Bloomberg Government, LegiStorm) or NGO-curated derived data (OpenSecrets, OpenCorporates, ProPublica derived tables). Every join on the platform is reproducible from public APIs — no dependency on third-party maintainers whose methodology is a trade secret.
Every data source, linkage method, coverage rate, and known limitation is documented. Our methodology is as public as our data.
We name people who act in a public capacity: elected officials, lobbyists, nonprofit officers, and Senate-confirmed appointees. For incidental appearances of private citizens in federal datasets (a tourist in a visitor log, a small-dollar donor in an FEC file, a pro se commenter), we aggregate or redact. The record stays; the individual identifier does not.
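The distinction between naming public actors and aggregating or redacting private individuals could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the platform's actual rules: the role labels, the `PUBLIC_ROLES` set, the field names, and the sample records are all hypothetical.

```python
from collections import Counter

# Hypothetical role labels for people acting in a public capacity.
PUBLIC_ROLES = {
    "elected_official",
    "lobbyist",
    "nonprofit_officer",
    "senate_confirmed_appointee",
}


def redact_record(record):
    """Keep the record; drop the name unless the person holds a public role."""
    cleaned = dict(record)
    if record["role"] not in PUBLIC_ROLES:
        cleaned["name"] = None  # the record stays, the identifier does not
    return cleaned


def aggregate_private(records, group_field):
    """Count private-citizen records per group instead of listing people."""
    counts = Counter()
    for record in records:
        if record["role"] not in PUBLIC_ROLES:
            counts[record[group_field]] += 1
    return dict(counts)


# Made-up sample rows for illustration only.
records = [
    {"name": "Example Lobbyist", "role": "lobbyist", "dataset": "visitor_log"},
    {"name": "Private Person 1", "role": "private_citizen", "dataset": "visitor_log"},
    {"name": "Private Person 2", "role": "private_citizen", "dataset": "visitor_log"},
]

cleaned = [redact_record(r) for r in records]
summary = aggregate_private(records, "dataset")
print(cleaned[0]["name"], cleaned[1]["name"], summary)
```

Both functions return new objects, so the source rows are never modified in place.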
We present government records as filed. We don't rate legislators, score voting records, or characterize organizations. The data speaks; users interpret.
No paywall, no freemium tier, no account required. CC0 licensing means this work can never be locked up.
The entire platform can be rebuilt from public government APIs using our published scripts. If DataDawn disappeared tomorrow, anyone could recreate it.
DataDawn exposes a live MCP server, REST APIs, an OpenAPI specification, and llms.txt orientation guides on each subdomain. Journalists and researchers working with AI tools can query live data directly — no scraped summaries, no stale snapshots.
Explore
Visit Explore to search the data, Connections to see cross-referenced accountability queries, or Methodology to review how everything was built.
Questions? Reach us at [email protected].