Security & data rights
Thesma is a developer API for US government open data — SEC EDGAR, the US Census Bureau, and the Bureau of Labor Statistics. This page covers where our data comes from, how we handle your account data, what we commit to operationally, and what you can do with Thesma data in your own products.
Where our data comes from
SEC EDGAR
Public company filings (10-K, 10-Q, 8-K, Form 4, 13F, DEF 14A, SC 13D/G, and more) published by the US Securities and Exchange Commission. These filings are in the public domain and freely redistributable. We cover ~3,000 US public companies, about 98% of the investable US equity market by market cap.
US Census Bureau
American Community Survey and related federal demographic and economic datasets published by the US Census Bureau as US public-domain works with no downstream license restrictions.
Bureau of Labor Statistics
CES, QCEW, OEWS, and JOLTS series published by the US Bureau of Labor Statistics as US public-domain works, free to use and redistribute.
What you can do with our data
The underlying facts are in the US public domain. Our normalized representation (the schema, derived metrics, and the compiled dataset as a whole) is our work. Our position is broadly permissive; here is the short version.
Broadly allowed
Commercial use in your products, display to your users, caching, derivation of analysis and metrics, inclusion in reports, using responses as grounding for LLM and RAG applications, and training models that are part of your own product.
Not allowed
Re-serving our normalized API responses as a competing data API or dataset, training commercial foundation models whose primary purpose is to replicate Thesma, and circumventing rate limits.
Attribution
Optional. A mention is appreciated but never obligatory. No "Powered by" clause.
Security controls
What we do today. Every line is load-bearing — if you need to hold us to a specific claim for a vendor-risk review, this is the list.
- TLS 1.2 or higher on all public endpoints
  Enforced uniformly across api, portal, screener, thesma.dev, and the MCP server via Railway / Let's Encrypt. Modern clients negotiate TLS 1.3.
- API keys hashed with SHA-256
  API keys are never stored in plaintext. The plaintext key is shown at creation time only; only the hash is persisted.
- Production database encrypted at rest
  Via Railway / AWS infrastructure.
- Access to production infrastructure is restricted
  Only authorized team members hold production credentials. No external collaborators or agents.
- API access logs retained for 30 days
  Via Railway log aggregation.
- Incident response
  Report to security@thesma.dev. Reaches a human.
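For reviewers who want to see what "only the hash is persisted" looks like in practice, here is a minimal Python sketch of the general pattern: generate a key, store its SHA-256 digest, and compare digests when a key is presented. It illustrates the technique only and is not Thesma's implementation; the function names and the `sk_example_` prefix are hypothetical.

```python
import hashlib
import hmac
import secrets


def create_api_key() -> tuple[str, str]:
    """Return (plaintext_key, stored_hash).

    The plaintext is shown to the user once; only the hash would be stored.
    The "sk_example_" prefix is a made-up placeholder.
    """
    plaintext = "sk_example_" + secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, stored_hash


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored digest."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(candidate, stored_hash)


if __name__ == "__main__":
    key, digest = create_api_key()
    print(verify_api_key(key, digest))        # True
    print(verify_api_key(key + "x", digest))  # False
```

Because only the digest is kept, a lost key cannot be recovered from storage and has to be replaced with a new one.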
Uptime target
Target: 99.5% monthly uptime. This is a goal, not a contractual SLA. We do not offer service credits or refunds tied to availability, and we do not publish a status page today.
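To put the 99.5% figure in concrete terms, the short Python sketch below converts a monthly uptime target into a downtime budget. It assumes the target is measured over a full calendar month with every minute counted equally; this page does not define the measurement window, so treat that as an assumption. The function name is illustrative.

```python
def allowed_downtime_minutes(target_pct: float, days_in_month: int) -> float:
    """Minutes of downtime permitted in a month at the given uptime target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - target_pct) / 100


if __name__ == "__main__":
    # 30-day month: 43,200 minutes in total, 0.5% of which is 216 minutes.
    print(allowed_downtime_minutes(99.5, 30))  # 216.0
```

Under those assumptions, a 30-day month at 99.5% leaves a budget of 216 minutes, or 3.6 hours.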
Contact
- security@thesma.dev
  Vulnerability reports and security questions. Reaches a human within one business day.
- privacy@thesma.dev
  Privacy questions and data subject requests under UK GDPR and EU GDPR.
Data rights FAQ
Quick answers to the questions developers ask most. Each question has its own anchor URL — share the link to jump straight to the answer.
- Can I redistribute Thesma data?
- Can I resell products I build with Thesma?
- Can I cache API responses?
- Can I use Thesma data to train an LLM?
- Can I build something that competes with Thesma?
- What happens to data I've already cached if I cancel?
- Do I need to credit Thesma?
- How do I reach you about security?
- What's your uptime commitment?
Can I redistribute Thesma data?
Yes for the underlying facts — they are US public domain, so you can include them in your own reports, research, dashboards, and user-facing content. No for our normalized API responses as a dataset product — please do not publish a dump of our responses on Kaggle or bundle them as "Thesma data" for others to consume.
Can I resell products I build with Thesma?
Yes. Build commercial SaaS, sell reports, sell dashboards, charge the users who see the data. That is exactly the intended use.
Can I cache API responses?
Yes, for as long as makes sense for your product. There is no duration limit.
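Since there is no duration limit, how long to keep a response is a product decision. Here is a minimal Python sketch of an in-memory cache with an optional expiry, where `None` means entries never expire. The class name, the injected `fetch` callable, and the example URL are all hypothetical; nothing here reflects a real Thesma endpoint or client library.

```python
import time
from typing import Callable, Optional


class ResponseCache:
    """In-memory cache keyed by request URL.

    max_age_seconds=None keeps entries indefinitely.
    """

    def __init__(self, fetch: Callable[[str], str],
                 max_age_seconds: Optional[float] = None) -> None:
        self._fetch = fetch
        self._max_age = max_age_seconds
        self._entries: dict[str, tuple[float, str]] = {}

    def get(self, url: str) -> str:
        entry = self._entries.get(url)
        if entry is not None:
            stored_at, body = entry
            age = time.monotonic() - stored_at
            if self._max_age is None or age < self._max_age:
                return body  # still fresh: serve from cache
        # Missing or expired: fetch again and remember the result.
        body = self._fetch(url)
        self._entries[url] = (time.monotonic(), body)
        return body


if __name__ == "__main__":
    def fake_fetch(url: str) -> str:
        # Stand-in for a real HTTP call, so the sketch runs offline.
        return f"response for {url}"

    cache = ResponseCache(fake_fetch)  # no expiry
    # Placeholder URL, not a real endpoint.
    print(cache.get("https://api.example.invalid/v1/companies"))
```

Passing the fetch function in keeps the cache independent of any particular HTTP client and easy to test.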
Can I use Thesma data to train an LLM?
Yes for models that are part of your own product, including RAG systems and application-specific fine-tunes. No for training commercial foundation models whose primary purpose is to replicate a data service like Thesma.
Can I build something that competes with Thesma?
You cannot re-serve our normalized API responses as a competing data API or dataset. You are welcome to do the normalization work yourself — the source data is public. That is how we did it.
What happens to data I've already cached if I cancel?
You can keep using it. The underlying facts are public domain. You just cannot pull new data without an active subscription.
Do I need to credit Thesma?
No. A mention is appreciated, but crediting us is entirely optional.
How do I reach you about security?
Email security@thesma.dev. Your message reaches a human within one business day.
What's your uptime commitment?
We target 99.5% availability as an operational goal. There is no contractual SLA or credit commitment at this time.