About — DataDawn

The Problem

Government data is public by law but fragmented by design. IRS nonprofit filings, congressional voting records, lobbying disclosures, campaign finance reports, regulatory dockets, and federal spending data all exist in separate databases maintained by separate agencies in separate formats. No single free platform connects them. The tools that do connect them are paywalled at thousands of dollars per seat per year, cover a single domain, or rely on commercial aggregators with closed methodology.

What We Built

DataDawn downloads data directly from federal government APIs, normalizes it, and cross-references it through verified linkage keys. A single member of Congress can be connected to their votes, floor speeches, stock trades, campaign donors, committee assignments, sponsored legislation, and lobbying activity directed at their committees — all in one query.
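As a rough illustration of what cross-referencing through a linkage key can look like, the sketch below groups three tiny in-memory datasets under a shared member identifier. The field names, the sample records, and the use of a single `member_id` key are assumptions made for this example, not DataDawn's actual schema or pipeline.

```python
from collections import defaultdict

# Hypothetical sample records; every value here is invented for illustration.
votes = [
    {"member_id": "X000001", "roll_call": 101, "position": "Yea"},
    {"member_id": "X000001", "roll_call": 102, "position": "Nay"},
    {"member_id": "X000002", "roll_call": 101, "position": "Nay"},
]
trades = [
    {"member_id": "X000001", "ticker": "ABC", "type": "purchase"},
]
committees = [
    {"member_id": "X000001", "committee": "Example Committee"},
    {"member_id": "X000002", "committee": "Example Committee"},
]


def build_profiles(**datasets):
    """Group each dataset's rows under the member_id they share."""
    profiles = defaultdict(lambda: {name: [] for name in datasets})
    for name, rows in datasets.items():
        for row in rows:
            # Drop the key from the stored row; it is already the profile's key.
            record = {k: v for k, v in row.items() if k != "member_id"}
            profiles[row["member_id"]][name].append(record)
    return dict(profiles)


profiles = build_profiles(votes=votes, trades=trades, committees=committees)
```

Looking up `profiles["X000001"]` then returns that member's votes, trades, and committee assignments together, which is the "one query" idea in miniature.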

On the regulatory side, a proposed rule can be traced from the Federal Register through its OIRA review, its public comment docket, and into the final regulation in the Code of Federal Regulations. On the nonprofit side, every foundation grant, DAF disbursement, officer salary, and investment portfolio is searchable across more than five million IRS filings.
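Tracing a rule across sources can be pictured as filtering records from each system by a shared rule identifier and ordering them in time. The sketch below assumes a Regulation Identifier Number (RIN) as that shared key and uses invented records and stage names; it is an illustration only, not a description of how DataDawn links these sources.

```python
# Hypothetical records; RINs, stages, and dates are invented for illustration.
records = [
    {"rin": "0000-AA00", "stage": "final_rule", "date": "2024-06-01"},
    {"rin": "0000-AA00", "stage": "proposed_rule", "date": "2023-01-15"},
    {"rin": "0000-AA00", "stage": "oira_review", "date": "2023-09-01"},
    {"rin": "0000-AA00", "stage": "comment_docket", "date": "2023-02-01"},
    {"rin": "0000-BB11", "stage": "proposed_rule", "date": "2023-03-10"},
]


def trace_rule(rin, all_records):
    """Return the records for one rule, oldest first."""
    matching = [r for r in all_records if r["rin"] == rin]
    # ISO 8601 date strings sort chronologically as plain strings.
    return sorted(matching, key=lambda r: r["date"])


timeline = trace_rule("0000-AA00", records)
stages = [r["stage"] for r in timeline]
```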

180M+ deployed records

250+ database tables

25+ federal data sources

$0 cost to use

Data Sources

IRS e-File (990 filings)

IRS Business Master File

IRS Group Exemptions (BMF subordinates)

Congress.gov API

Government Publishing Office (GovInfo)

Federal Election Commission (bulk data)

Senate Office of Public Records (LDA Lobbying)

Department of Justice (FARA)

Regulations.gov API

Federal Register API

USAspending.gov

Electronic Code of Federal Regulations

House Clerk Financial Disclosures (PTR)

Senate eFD (Stock Trades)

House/Senate Clerk (Votes)

reginfo.gov (OIRA regulatory reviews)

oversight.gov (Inspector General reports)

gao.gov (GAO reports)

Congressional Research Service (via Congress.gov)

Congressional Budget Office (cost estimates)

House & Senate Appropriations (earmarks)

Bureau of Indian Affairs (federally recognized tribes)

Census Bureau Government Units Survey 2022

SEC EDGAR (CIK crosswalk for stock-trade issuers)

USDA APHIS (Animal Welfare Act)

Who Built It

DataDawn was built by three collaborators: a human, Claude (Anthropic), and DJ Crabdaddy (Claude Code). All code and data pipelines are published under CC0 (public domain).

Our Principles

Independent

No partisan affiliation, no advocacy agenda, no commercial entanglements.

Primary sources only

Every record comes from a primary U.S. federal government source or an open public registry. We do not integrate commercial aggregators (Candid, Bloomberg Government, LegiStorm) or NGO-curated derived data (OpenSecrets, OpenCorporates, ProPublica derived tables). Every join on the platform is reproducible from public APIs — no dependency on third-party maintainers whose methodology is a trade secret.

Transparent about transparency

Every data source, linkage method, coverage rate, and known limitation is documented. Our methodology is as public as our data.

Privacy by choice, not by obligation

We name people who act in a public capacity: elected officials, lobbyists, nonprofit officers, and Senate-confirmed appointees. For incidental appearances of private citizens in federal datasets (a tourist in a visitor log, a small-dollar donor in an FEC file, a pro se commenter), we aggregate or redact. The record stays; the individual identifier does not.
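The "record stays, identifier does not" rule can be sketched as a small filter. The role labels, field names, and sample values below are hypothetical and chosen only to mirror the categories named above; the actual redaction logic may differ.

```python
# Hypothetical role labels mirroring the public-capacity categories above.
PUBLIC_ROLES = {
    "elected_official",
    "lobbyist",
    "nonprofit_officer",
    "senate_confirmed_appointee",
}


def redact(record):
    """Return a copy of the record, dropping the name for private individuals."""
    cleaned = dict(record)  # never mutate the source record
    if cleaned.get("role") not in PUBLIC_ROLES:
        cleaned["name"] = None  # the record stays; the identifier does not
    return cleaned


# Invented example rows.
donor = {"name": "Jane Example", "role": "small_dollar_donor", "amount": 25}
lobbyist = {"name": "John Sample", "role": "lobbyist", "amount": 0}

redacted_donor = redact(donor)
kept_lobbyist = redact(lobbyist)
```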

No editorial layer

We present government records as filed. We don't rate legislators, score voting records, or characterize organizations. The data speaks; users interpret.

Free forever

No paywall, no freemium tier, no account required. CC0 licensing means this work can never be locked up.

Permanent by design

The entire platform can be rebuilt from public government APIs using our published scripts. If DataDawn disappeared tomorrow, anyone could recreate it.

AI-accessible

DataDawn exposes a live MCP server, REST APIs, an OpenAPI specification, and llms.txt orientation guides on each subdomain. Journalists and researchers working with AI tools can query live data directly — no scraped summaries, no stale snapshots.

Explore

Visit Deep Dive to search the data, Connections to see cross-referenced accountability queries, or Methodology to review how everything was built.