SQL Queries: Run Real SQL Against Your CSV Data
A new endpoint that accepts raw SQL SELECT queries against your dataset. Full PostgreSQL syntax with defense-in-depth security. Requires a private API key.
Sometimes you just need SQL
csv-api's filter and aggregation endpoints cover the 80% case: list records, apply some operators, group and sum. But some questions are awkward or impossible to express with filter[col][op]=val. "Show me cities where the average age is above 30 but only if there are at least 5 people." "Bucket customers into tiers based on their balance." "Find the second-highest value per group." Each of these is a short query in SQL, and now each is a single request in csv-api.
The new /records/sql endpoint accepts a raw SQL SELECT query and returns JSON. You get the full PostgreSQL SELECT syntax (WHERE, ORDER BY, GROUP BY, HAVING, DISTINCT, CASE, subqueries, window functions) against a virtual table named data that maps to your dataset.
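To make the three example questions concrete, here is a minimal sketch of the queries, run locally against an in-memory SQLite table named data. SQLite is only a stand-in so the snippet works offline; the endpoint itself runs PostgreSQL, and the sample rows, the column names (name, city, age, balance), the tier cutoff, and the helper run_locally are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical sample rows; the columns (name, city, age, balance) are
# assumptions for illustration, not taken from a real dataset.
ROWS = [
    ("Alice", "Portland", 34, 1200.0),
    ("Bob", "Portland", 28, 300.0),
    ("Charlie", "Portland", 41, 5400.0),
    ("Dana", "Seattle", 25, 800.0),
    ("Eli", "Seattle", 31, 2600.0),
]

# Cities whose average age is above 30, counting only cities with at least
# two people (the threshold is lowered from 5 to fit the tiny sample).
AVG_AGE_SQL = """
SELECT city, COUNT(*) AS cnt, AVG(age) AS avg_age
FROM data
GROUP BY city
HAVING AVG(age) > 30 AND COUNT(*) >= 2
"""

# Bucket customers into tiers by balance (the 1000 cutoff is arbitrary).
TIER_SQL = """
SELECT CASE WHEN balance >= 1000 THEN 'high' ELSE 'standard' END AS tier,
       COUNT(*) AS cnt
FROM data
GROUP BY tier
ORDER BY tier
"""

# Second-highest balance per city, using a window function in a subquery.
SECOND_HIGHEST_SQL = """
SELECT city, name, balance
FROM (
  SELECT city, name, balance,
         DENSE_RANK() OVER (PARTITION BY city ORDER BY balance DESC) AS rnk
  FROM data
) AS ranked
WHERE rnk = 2
ORDER BY city
"""


def run_locally(sql):
    """Run sql against an in-memory SQLite table named data (a local stand-in)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE data (name TEXT, city TEXT, age INTEGER, balance REAL)"
    )
    conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows
```

Each of the three SQL strings could be sent unchanged as the q parameter, provided your dataset has columns with those names.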
How it works
Send a GET or POST to /api/v1/datasets/:id/records/sql with your SQL in the q parameter. Your query runs against a table called data whose columns are the same display names you see in the dashboard.
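For callers who are not using curl, the request described above could be built like this in Python with only the standard library. This is a sketch under assumptions: the helper names build_sql_request and run_sql are hypothetical, and sending q as a form-encoded body on POST is a guess, since the post only says the SQL goes in the q parameter.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://csv-api.com/api/v1"  # host taken from the curl examples


def build_sql_request(dataset_id, api_key, sql, method="GET"):
    """Build (but do not send) a request for the /records/sql endpoint."""
    url = f"{BASE_URL}/datasets/{dataset_id}/records/sql"
    encoded = urllib.parse.urlencode({"q": sql})
    headers = {"Authorization": f"Bearer {api_key}"}
    if method == "GET":
        # q travels in the query string, as with `curl -G`.
        return urllib.request.Request(
            f"{url}?{encoded}", headers=headers, method="GET"
        )
    if method == "POST":
        # Assumption: POST accepts q as a form-encoded body.
        headers["Content-Type"] = "application/x-www-form-urlencoded"
        return urllib.request.Request(
            url, data=encoded.encode("utf-8"), headers=headers, method="POST"
        )
    raise ValueError(f"unsupported method: {method}")


def run_sql(dataset_id, api_key, sql, method="GET", timeout=30):
    """Send the query and return the decoded JSON body."""
    request = build_sql_request(dataset_id, api_key, sql, method)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes the encoding easy to inspect without touching the network. POST may be the better fit for long queries, since the SQL then stays out of the URL.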
A simple query
curl -G "https://csv-api.com/api/v1/datasets/YOUR_ID/records/sql" \
  -H "Authorization: Bearer sk_..." \
  --data-urlencode "q=SELECT name, city FROM data WHERE age > 30 ORDER BY name"

{
  "data": [
    { "name": "Alice", "city": "Portland" },
    { "name": "Charlie", "city": "Portland" }
  ],
  "meta": { "row_count": 2 }
}
GROUP BY with HAVING
curl -G "https://csv-api.com/api/v1/datasets/YOUR_ID/records/sql" \
  -H "Authorization: Bearer sk_..." \
  --data-urlencode "q=SELECT city, COUNT(*) AS cnt, AVG(age) AS avg_age
    FROM data
    GROUP BY city
    HAVING COUNT(*) >= 3
    ORDER BY avg_age DESC"

{
  "data": [
    { "city": "Portland", "cnt": 14, "avg_age": 33.2 },
    { "city": "Seattle", "cnt": 9, "avg_age": 29.8 }
  ],
  "meta": { "row_count": 2 }
}
CASE expressions
curl -G "https://csv-api.com/api/v1/datasets/YOUR_ID/records/sql" \
  -H "Authorization: Bearer sk_..." \
  --data-urlencode "q=SELECT
      CASE WHEN age >= 30 THEN 'senior' ELSE 'junior' END AS tier,
      COUNT(*) AS cnt
    FROM data
    GROUP BY tier
    ORDER BY cnt DESC"
Requires a private key
The SQL endpoint requires a private API key (sk_...). Public keys (pk_...) are rejected with a 403. This is intentional — raw SQL is a power-user feature, and the private-key requirement ensures it never ends up in a frontend bundle by accident.
Reminder: public pk_ keys still work for the list, show, and aggregate endpoints. Only /records/sql requires sk_. For background on API key scopes, see Securing Your API.
Built for safety
Letting users send SQL is a big trust decision. We designed this endpoint with multiple layers of validation and isolation so that your query can only ever read from your own dataset — nothing else. Only SELECT statements are accepted. Writes, data-definition statements, and multi-statement payloads are all rejected before they reach the database. Queries are sandboxed, time-limited, and capped at 1,000 result rows.
Error messages are clean and actionable — you'll see things like column "nope" does not exist rather than raw database stack traces.
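The rules stated above (SELECT only, a single statement, at most 1,000 rows) can also be checked on the client before a request is sent, which saves a round trip during development. The sketch below is deliberately naive and is not how the server validates queries: it uses regular expressions rather than a SQL parser, so a semicolon or the word LIMIT inside a string literal will produce a false positive. The function name preflight is hypothetical.

```python
import re

MAX_ROWS = 1000  # result-row cap stated in the post


def preflight(sql):
    """Return a list of likely problems; an empty list means none were spotted.

    A client-side convenience only. The server remains the authority on what
    is accepted, and this check is no substitute for its validation.
    """
    problems = []
    text = sql.strip()
    if not re.match(r"(?i)select\b", text):
        problems.append("query does not start with SELECT")
    # A semicolon followed by more text suggests a multi-statement payload.
    if re.search(r";\s*\S", text):
        problems.append("query appears to contain more than one statement")
    for match in re.finditer(r"(?i)\blimit\s+(\d+)", text):
        if int(match.group(1)) > MAX_ROWS:
            problems.append(f"LIMIT {match.group(1)} exceeds {MAX_ROWS} rows")
    return problems
```

For example, preflight("SELECT * FROM data LIMIT 5000") returns a single entry about the LIMIT, while a plain SELECT with LIMIT 5 returns an empty list.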
Error messages you'll actually see
Here are the most common errors and what triggers them:
| What you did | Error message |
|---|---|
| Sent an UPDATE | Only SELECT queries are allowed |
| Referenced another table | Only the 'data' table can be queried. Unknown table(s): users |
| Used pg_sleep() | Function 'pg_sleep' is not allowed in SQL queries |
| Requested LIMIT 5000 | LIMIT cannot exceed 1000 rows |
| Typo in column name | Query error: column "nope" does not exist |
| Used a public key | This endpoint requires a private (write) API key. |
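If your code needs to react differently to these errors, one option is to map the message text to a small set of labels. The sketch below matches on fragments copied from the table above. The labels and the function name classify_error are hypothetical, and matching on message text is brittle if the wording ever changes. The post does not show the JSON shape of an error response, so the function takes the message string directly.

```python
# Message fragments copied from the error table, paired with hypothetical labels.
ERROR_KINDS = [
    ("Only SELECT queries are allowed", "not_select"),
    ("Only the 'data' table can be queried", "unknown_table"),
    ("is not allowed in SQL queries", "blocked_function"),
    ("LIMIT cannot exceed", "limit_too_high"),
    ("Query error:", "query_error"),
    ("requires a private", "wrong_key_type"),
]

# Kinds that are resolved by editing the SQL rather than changing the API key.
FIXABLE_IN_SQL = {
    "not_select",
    "unknown_table",
    "blocked_function",
    "limit_too_high",
    "query_error",
}


def classify_error(message):
    """Return the label for the first known fragment found in message."""
    for fragment, kind in ERROR_KINDS:
        if fragment in message:
            return kind
    return "unknown"
```

A caller might surface query_error messages straight to the person writing the SQL, while treating wrong_key_type as a configuration problem.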
When to use SQL vs. filters vs. aggregation
The SQL endpoint doesn't replace the existing API — it complements it. Here's a rough guide:
- Use filters for simple lookups: "customers in Portland," "orders over $100," "names starting with A." Works with public keys, safe to embed in frontends.
- Use aggregation for dashboards: "total revenue by city," "average age by plan." Also works with public keys.
- Use SQL when you need HAVING, CASE, subqueries, window functions, or anything the filter syntax can't express. Requires a private key and is best suited for server-side use.
All three share the same rate limiting and statement timeout, and all authenticate with an API key; only the SQL endpoint insists on a private one. The SQL endpoint is just a wider aperture on the same data.
Try it now
If you already have a dataset, mint a private key from your account page and try a query:
curl -G "https://csv-api.com/api/v1/datasets/YOUR_ID/records/sql" \
  -H "Authorization: Bearer sk_YOUR_KEY" \
  --data-urlencode "q=SELECT * FROM data LIMIT 5"
The endpoint is documented in the API docs and in the per-dataset Swagger UI. For the full feature breakdown, see the SQL Queries feature page.