[ Documentation ]
> Quick Start
MarketParquet serves daily-partitioned OHLCV data as Apache Parquet files. Each file contains all symbols for a single trading day.
- Sign up for a free account
- Browse available data by asset type and timeframe
- Download the parquet files you need (web or API)
- Load into pandas, polars, DuckDB, or any tool that reads Parquet
> File Format
Files are named by trading date and stored under by_date/{asset}_{timeframe}/YYYY/YYYY-MM-DD.parquet.
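For scripting, the path template above can be expressed as a small helper. A minimal sketch; `parquet_path` is our name for illustration, not part of the service:

```python
from datetime import date

def parquet_path(asset: str, timeframe: str, d: date) -> str:
    """Build the by_date storage path for one trading day,
    following the by_date/{asset}_{timeframe}/YYYY/YYYY-MM-DD.parquet template."""
    return f"by_date/{asset}_{timeframe}/{d.year}/{d.isoformat()}.parquet"

print(parquet_path("stock", "1min", date(2024, 1, 15)))
# → by_date/stock_1min/2024/2024-01-15.parquet
```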
Intraday schema (1min, 5min, 30min, 1hour)
timestamp   TIMESTAMP[us]  -- bar open time (US/Eastern)
symbol      STRING         -- ticker symbol
asset_type  STRING         -- "Stock", "ETF", or contract code
open        FLOAT64
high        FLOAT64
low         FLOAT64
close       FLOAT64
volume      FLOAT64
Daily (EOD) schema
date        DATE     -- trading date
symbol      STRING   -- ticker symbol
asset_type  STRING
open        FLOAT64
high        FLOAT64
low         FLOAT64
close       FLOAT64
volume      FLOAT64
Stock and ETF prices are split- and dividend-adjusted. Futures use continuous ratio adjustment. All files are Snappy-compressed.
> Python
pandas
import pandas as pd
# Load one day of stock 1-min bars
df = pd.read_parquet("stock_1min_2024-01-15.parquet")
# Filter to a single ticker (copy to avoid SettingWithCopyWarning on assignment)
spy = df[df.symbol == "SPY"].copy()
# Compute a running VWAP, using close as the price proxy
spy["vwap"] = (spy.close * spy.volume).cumsum() / spy.volume.cumsum()
polars (faster, recommended)
import polars as pl
df = pl.read_parquet("stock_daily_2024-01-15.parquet")
# Top 10 by volume
top = df.sort("volume", descending=True).head(10)
# Lazily scan all 2024 daily files and average close per symbol across the year
df = pl.scan_parquet("stock_daily_2024-*.parquet")
avg_close_2024 = df.group_by("symbol").agg(pl.col("close").mean()).collect()
DuckDB (SQL on Parquet)
import duckdb
con = duckdb.connect()
# Query parquet files directly with SQL
result = con.execute("""
SELECT symbol, AVG(close) AS avg_close, SUM(volume) AS total_vol
FROM 'stock_daily_*.parquet'
WHERE symbol IN ('AAPL', 'MSFT', 'GOOGL')
GROUP BY symbol
""").fetchdf()
> REST API
The API is available to Pro subscribers. Generate an API key from your
account page, then authenticate with the Authorization header.
List available assets
GET https://marketparquet.com/api/v1/assets
curl -H "Authorization: Bearer bt_YOUR_KEY" \
https://marketparquet.com/api/v1/assets
# Returns:
{
"plan": "pro",
"assets": [
{
"asset_type": "stock_1min",
"asset": "stock",
"timeframe": "1min",
"file_count": 6690,
"earliest_date": "2000-01-03",
"latest_date": "2026-04-06",
"access": "full"
},
...
]
}
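A sketch of working with this payload: `summarize` is a hypothetical helper, and the dict below is hardcoded from the example response above rather than fetched live.

```python
# Sample payload, copied from the example /assets response
response = {
    "plan": "pro",
    "assets": [
        {
            "asset_type": "stock_1min",
            "asset": "stock",
            "timeframe": "1min",
            "file_count": 6690,
            "earliest_date": "2000-01-03",
            "latest_date": "2026-04-06",
            "access": "full",
        },
    ],
}

def summarize(assets: list) -> list:
    """One human-readable line per asset entry."""
    return [
        f"{a['asset_type']}: {a['earliest_date']} -> {a['latest_date']} "
        f"({a['file_count']} files, {a['access']} access)"
        for a in assets
    ]

for line in summarize(response["assets"]):
    print(line)
# → stock_1min: 2000-01-03 -> 2026-04-06 (6690 files, full access)
```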
List dates for an asset
GET /api/v1/dates/{asset_type}?page=1
curl -H "Authorization: Bearer bt_YOUR_KEY" \
https://marketparquet.com/api/v1/dates/stock_daily
# Returns up to 100 dates per page (newest first)
{
"asset_type": "stock_daily",
"plan": "pro",
"page": 1,
"count": 100,
"dates": [
{
"date": "2026-04-06",
"file_size_bytes": 232488,
"download_url": "/api/v1/download/stock_daily/2026-04-06"
},
...
]
}
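One way to walk all pages is to keep requesting `?page=1, 2, …` until a page comes back with fewer than 100 entries. The API doesn't expose a total count, so treating a short page as the last one is an assumption; `iter_dates` and `is_last_page` are our names, not part of the API.

```python
import requests

API_KEY = "bt_YOUR_KEY"
BASE = "https://marketparquet.com/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
PAGE_SIZE = 100  # documented maximum dates per page

def is_last_page(body: dict, page_size: int = PAGE_SIZE) -> bool:
    """Assumption: a page with fewer than page_size entries is the last one."""
    return body["count"] < page_size

def iter_dates(asset_type: str):
    """Yield every date entry for asset_type, one API page at a time."""
    page = 1
    while True:
        r = requests.get(f"{BASE}/dates/{asset_type}",
                         headers=HEADERS, params={"page": page})
        r.raise_for_status()
        body = r.json()
        yield from body["dates"]
        if is_last_page(body):
            return
        page += 1
```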
Download a file
GET /api/v1/download/{asset_type}/{date}
curl -H "Authorization: Bearer bt_YOUR_KEY" \
https://marketparquet.com/api/v1/download/stock_daily/2026-04-06
# Returns a presigned R2 URL (valid 60 seconds)
{
"asset_type": "stock_daily",
"date": "2026-04-06",
"file_size_bytes": 232488,
"download_url": "https://...r2.cloudflarestorage.com/...?X-Amz-Signature=...",
"expires_in": 60
}
Asset types
stock_1min, stock_5min, stock_30min, stock_1hour, stock_daily
etf_1min, etf_5min, etf_30min, etf_1hour, etf_daily
futures_1min, futures_5min, futures_30min, futures_1hour, futures_daily
> Examples
Python: download a file via the API
import requests
import pandas as pd
API_KEY = "bt_YOUR_KEY"
BASE = "https://marketparquet.com/api/v1"
headers = {"Authorization": f"Bearer {API_KEY}"}
# 1. Get a presigned URL (valid for only 60 seconds)
resp = requests.get(
    f"{BASE}/download/stock_daily/2026-04-06",
    headers=headers,
)
resp.raise_for_status()
meta = resp.json()
# 2. Download the parquet file before the URL expires
parquet_bytes = requests.get(meta["download_url"]).content
# 3. Load with pandas
import io
df = pd.read_parquet(io.BytesIO(parquet_bytes))
print(df.head())
Bash: download a date range
API_KEY="bt_YOUR_KEY"
for d in 2026-04-01 2026-04-02 2026-04-03; do
URL=$(curl -s -H "Authorization: Bearer $API_KEY" \
"https://marketparquet.com/api/v1/download/stock_daily/$d" \
| python3 -c "import sys,json; print(json.load(sys.stdin)['download_url'])")
curl -s -o "stock_daily_$d.parquet" "$URL"
echo "Downloaded $d"
done
Python: backtest a simple strategy
import pandas as pd
import glob
# Load all daily files for 2024
files = sorted(glob.glob("stock_daily_2024-*.parquet"))
df = pd.concat([pd.read_parquet(f) for f in files])
# SPY-only momentum strategy
spy = df[df.symbol == "SPY"].sort_values("date").reset_index(drop=True)
spy["return"] = spy.close.pct_change()
spy["sma20"] = spy.close.rolling(20).mean()
spy["signal"] = (spy.close > spy.sma20).astype(int)
spy["strategy_return"] = spy.signal.shift(1) * spy["return"]
cumret = (1 + spy.strategy_return).cumprod()
print(f"Final value: {cumret.iloc[-1]:.2f}")
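A natural follow-up is a risk-adjusted summary of the strategy returns. A minimal sketch, assuming a zero risk-free rate and 252 trading days per year; `annualized_sharpe` is our helper, not part of MarketParquet:

```python
import numpy as np
import pandas as pd

def annualized_sharpe(returns: pd.Series, periods: int = 252) -> float:
    """Annualized Sharpe ratio of a daily return series (risk-free rate ~ 0)."""
    r = returns.dropna()
    return float(r.mean() / r.std() * np.sqrt(periods))

# Example on a small hand-made daily return series
daily = pd.Series([0.001, -0.002, 0.0015, 0.0005, -0.001])
print(f"Annualized Sharpe: {annualized_sharpe(daily):.2f}")
```

Pass the backtest's `spy.strategy_return` series to the same function to score the SMA strategy above.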
> Need more help?
Check the FAQ for data quality details, or browse the interactive API docs at /docs.