derive.historical
Documentation for eth_defi.derive.historical Python module.
DuckDB persistence for Derive funding rate and open interest history.
Stores hourly funding rate snapshots and hourly open interest snapshots for perpetual instruments. Incremental sync fetches the full available history (back to instrument inception) and is crash-resumeable.
The sync is crash-resumeable: partial batches are safely re-inserted
on restart via INSERT OR IGNORE on natural primary keys.
Schema
Three tables:
funding_rates– hourly funding rate snapshotsopen_interest– hourly perp snapshots (OI, perp price, index price)sync_state– per-instrument watermarks for incremental sync
Storage location
Default: ~/.tradingstrategy/derive/funding-rates.duckdb
Example:
from pathlib import Path
from eth_defi.derive.session import create_derive_session
from eth_defi.derive.historical import DeriveFundingRateDatabase
session = create_derive_session()
db = DeriveFundingRateDatabase(Path("/tmp/funding-rates.duckdb"))
inserted = db.sync_instrument(session, "ETH-PERP")
print(f"Stored {inserted} new funding rate entries")
df = db.get_funding_rates_dataframe("ETH-PERP")
print(df.tail())
oi_inserted = db.sync_open_interest_instrument(session, "ETH-PERP")
print(f"Stored {oi_inserted} new open interest entries")
oi_df = db.get_open_interest_dataframe("ETH-PERP")
print(oi_df.tail())
db.close()
Module Attributes
Size of each API fetch chunk in days. |
|
Maximum lookback for inception-date binary search (days). |
|
Data type name for sync state tracking |
|
Data type name for open interest sync state tracking |
|
Step size for on-chain perp snapshot backfill. |
|
Block step for hourly reads on Derive Chain (3600s / 2s per block = 1800 blocks). |
|
Checkpoint interval: flush sync state and DuckDB WAL every N hours of data. |
|
Multicall3 deployment block on Derive Chain. |
|
Multicall3 deployment block number on Derive Chain. |
Functions
|
Estimate the Derive Chain block number closest to a Unix timestamp. |
Classes
DuckDB database for storing Derive funding rate history. |
- DEFAULT_FUNDING_RATE_DB_PATH = PosixPath('/home/runner/.tradingstrategy/derive/funding-rates.duckdb')
Default DuckDB path for Derive funding rate history
- CHUNK_DAYS = 28
Size of each API fetch chunk in days.
The Derive API returns empty results for windows >= 30 days, so we use 28-day chunks as the maximum safe window.
- MAX_INCEPTION_PROBE_DAYS = 1100
Maximum lookback for inception-date binary search (days).
We probe up to this many days into the past to find the first day an instrument had funding rate data.
- DATA_TYPE_FUNDING_RATES = 'funding_rates'
Data type name for sync state tracking
- DATA_TYPE_OPEN_INTEREST = 'open_interest'
Data type name for open interest sync state tracking
- OI_STEP = datetime.timedelta(seconds=3600)
Step size for on-chain perp snapshot backfill.
- OI_BLOCK_STEP = 1800
Block step for hourly reads on Derive Chain (3600s / 2s per block = 1800 blocks).
- OI_CHECKPOINT_INTERVAL = 500
Checkpoint interval: flush sync state and DuckDB WAL every N hours of data.
This ensures the scan can resume close to where it left off after a crash rather than restarting from the beginning.
- DERIVE_MULTICALL3_DEPLOYMENT_TS = datetime.datetime(2023, 12, 30, 0, 0)
Multicall3 deployment block on Derive Chain.
Deployed at block 1,935,198 (2023-12-29 23:20 UTC). OI history starts from this date — the ~22 days between ETH-PERP activation and Multicall3 deployment are skipped.
- DERIVE_MULTICALL3_DEPLOYMENT_BLOCK = 1935198
Multicall3 deployment block number on Derive Chain.
- estimate_block_at_timestamp(w3, target_ts, latest_block=None, latest_ts=None)
Estimate the Derive Chain block number closest to a Unix timestamp.
Uses linear interpolation from the current block. Derive Chain has a stable 2-second block time so the estimate is accurate to within a handful of blocks (a few seconds), which is more than adequate for hourly perp snapshots (OI, perp price, index price).
- Parameters
w3 (web3.main.Web3) – Web3 instance connected to Derive Chain.
target_ts (int) – Target Unix timestamp (seconds, UTC).
latest_block (int | None) – Current block number. Fetched automatically if not provided.
latest_ts (int | None) – Timestamp of the current block. Fetched automatically if not provided.
- Returns
Estimated block number (always >= 1).
- Return type
- class DeriveFundingRateDatabase
Bases:
objectDuckDB database for storing Derive funding rate history.
Stores hourly funding rate snapshots for perpetual instruments at the native resolution provided by Derive (one entry per hour).
On first sync, fetches the full available history back to
DEFAULT_LOOKBACK_DAYSusing 1-day API chunks. On subsequent syncs, fetches only new data since the last stored timestamp.The database is crash-resumeable: interrupted syncs can be safely re-run without data loss or duplicates.
Thread safety: all database operations are protected by an internal lock. Multiple threads can call sync methods concurrently — the API calls run in parallel while database writes are serialised.
Open or create a DuckDB database for funding rate storage.
- Parameters
path – Path to the DuckDB file. Parent directories are created automatically if they do not exist.
- __init__(path=PosixPath('/home/runner/.tradingstrategy/derive/funding-rates.duckdb'))
Open or create a DuckDB database for funding rate storage.
- Parameters
path (pathlib.Path) – Path to the DuckDB file. Parent directories are created automatically if they do not exist.
- close()
Close the database connection.
- save()
Force a checkpoint to ensure data is persisted to disk.
- find_inception_date(session, instrument_name, base_url='https://api.lyra.finance', timeout=30.0)
Find the earliest available funding rate data for an instrument.
Uses binary search over day-sized probes to locate the first day with data. Typically requires ~10 API calls.
- Parameters
- Returns
Start of the earliest day with data (naive UTC), or
Noneif no data exists at all.- Return type
datetime.datetime | None
- sync_instrument(session, instrument_name, start_time=None, end_time=None, base_url='https://api.lyra.finance', timeout=30.0, progress=None)
Fetch funding rate data and store it.
Walks the time range in
CHUNK_DAYS-sized windows (1 day by default) to fetch the full available history from the Derive API.On the first run, uses
find_inception_date()to discover how far back data is available, then fetches from inception to now. On subsequent runs, resumes forward from the last stored timestamp.- Parameters
session (requests.sessions.Session) – HTTP session from
create_derive_session().instrument_name (str) – Perpetual instrument name (e.g.
"ETH-PERP").start_time (datetime.datetime | None) – Override start time (naive UTC). Defaults to resume point or instrument inception date.
end_time (datetime.datetime | None) – Override end time (naive UTC). Defaults to now.
base_url (str) – Derive API base URL.
timeout (float) – HTTP request timeout.
progress (tqdm_loggable.tqdm_logging.tqdm_logging | None) – Optional external
tqdmprogress bar to update per chunk. WhenNone, no progress bar is shown (usesync_instruments()for progress tracking).
- Returns
Number of new funding rate entries inserted.
- Return type
- sync_instruments(session, instrument_names, start_time=None, end_time=None, base_url='https://api.lyra.finance', timeout=30.0)
Sync funding rate history for multiple instruments.
Two-phase process:
Probe — determine the effective time range for each instrument (inception date or resume point).
Fetch — walk all instruments day-by-day with a single progress bar showing total days remaining.
- Parameters
session (requests.sessions.Session) – HTTP session.
instrument_names (list[str]) – List of instrument names (e.g.
["ETH-PERP", "BTC-PERP"]).start_time (datetime.datetime | None) – Override start time for all instruments.
end_time (datetime.datetime | None) – Override end time for all instruments.
base_url (str) – Derive API base URL.
timeout (float) – HTTP request timeout.
- Returns
Dict mapping instrument name to number of new entries inserted.
- Return type
- get_funding_rates(instrument_name, start_time=None, end_time=None)
Get stored funding rate entries for an instrument.
- Parameters
instrument_name (str) – Perpetual instrument name.
start_time (datetime.datetime | None) – Optional start time filter (naive UTC).
end_time (datetime.datetime | None) – Optional end time filter (naive UTC).
- Returns
List of FundingRateEntry objects sorted by timestamp ascending.
- Return type
- get_funding_rates_dataframe(instrument_name, start_time=None, end_time=None)
Get stored funding rates as a Pandas DataFrame.
Columns:
timestamp,instrument,funding_rate.- Parameters
instrument_name (str) – Perpetual instrument name.
start_time (datetime.datetime | None) – Optional start time filter (naive UTC).
end_time (datetime.datetime | None) – Optional end time filter (naive UTC).
- Returns
DataFrame with funding rate data.
- Return type
- get_row_count(instrument_name)
Get the number of stored entries for an instrument.
- get_sync_state(instrument_name, data_type='funding_rates')
Get sync state for an instrument.
- sync_open_interest_instrument(session, instrument_name, w3=None, start_time=None, end_time=None, base_url='https://api.lyra.finance', timeout=30.0, progress=None)
Fetch open interest history and store it in DuckDB.
Queries the
openInterest(uint256)view function on the Derive Chain perp contract at hourly intervals. Derive Chain is an archive node so historical state from chain genesis is available.On the first run, fetches hourly snapshots from the instrument’s
scheduled_activationdate to now. On subsequent runs, resumes from the last stored timestamp. UsesINSERT OR IGNOREso the sync is crash-resumeable.- Parameters
session (requests.sessions.Session) – HTTP session from
create_derive_session(). Used to look up the instrument’s on-chain contract address.instrument_name (str) – Perpetual instrument name (e.g.
"ETH-PERP").w3 (web3.main.Web3 | None) – Web3 instance connected to Derive Chain (
https://rpc.derive.xyz). Created automatically if not provided.start_time (datetime.datetime | None) – Override start time (naive UTC). Defaults to last synced timestamp or instrument activation date.
end_time (datetime.datetime | None) – Override end time (naive UTC). Defaults to now.
base_url (str) – Derive API base URL.
timeout (float) – HTTP request timeout for instrument detail lookup.
progress (tqdm_loggable.tqdm_logging.tqdm_logging | None) – Optional external
tqdmprogress bar to update per day.
- Returns
Number of new open interest entries inserted.
- Return type
- sync_open_interest_instruments(session, instrument_names, rpc_url='https://rpc.derive.xyz', start_time=None, end_time=None, base_url='https://api.lyra.finance', timeout=30.0, max_workers=8)
Fetch perp snapshot history for multiple instruments using parallel reads.
Uses
read_multicall_historical()to read on-chain state at hourly intervals across all instruments in parallel. Each worker process runs its own Web3 connection viaTunedWeb3Factory.For N instruments, 3×N subcalls (OI, perp price, index price) are batched into one Multicall3
tryBlockAndAggregatecall per hour. Withmax_workers=8, eight hours are read concurrently, giving roughly 8× throughput over sequential reads.- Parameters
session (requests.sessions.Session) – HTTP session.
instrument_names (list[str]) – List of instrument names (e.g.
["ETH-PERP", "BTC-PERP"]).rpc_url (str) – Derive Chain JSON-RPC URL. A
TunedWeb3Factoryis created from this URL for parallel worker processes.start_time (datetime.datetime | None) – Override start time for all instruments.
end_time (datetime.datetime | None) – Override end time for all instruments.
base_url (str) – Derive API base URL.
timeout (float) – HTTP request timeout.
max_workers (int) – Number of parallel worker processes for RPC reads.
- Returns
Dict mapping instrument name to number of new entries inserted.
- Return type
- get_open_interest(instrument_name, start_time=None, end_time=None)
Get stored open interest entries for an instrument.
- Parameters
instrument_name (str) – Perpetual instrument name.
start_time (datetime.datetime | None) – Optional start time filter (naive UTC).
end_time (datetime.datetime | None) – Optional end time filter (naive UTC).
- Returns
List of
OpenInterestEntryobjects sorted by timestamp ascending.- Return type
- get_open_interest_dataframe(instrument_name, start_time=None, end_time=None)
Get stored open interest as a Pandas DataFrame.
Columns:
timestamp,instrument,open_interest,perp_price,index_price.- Parameters
instrument_name (str) – Perpetual instrument name.
start_time (datetime.datetime | None) – Optional start time filter (naive UTC).
end_time (datetime.datetime | None) – Optional end time filter (naive UTC).
- Returns
DataFrame with open interest and price data.
- Return type
- get_open_interest_row_count(instrument_name)
Get the number of stored open interest entries for an instrument.