HyperliquidDailyMetricsDatabase

Documentation for eth_defi.hyperliquid.daily_metrics.HyperliquidDailyMetricsDatabase Python class.

class HyperliquidDailyMetricsDatabase

Bases: object

DuckDB database for storing Hyperliquid vault daily metrics.

Stores daily share price time series and vault metadata. The share prices are computed using time-weighted returns from the Hyperliquid API portfolio history.

Example:

from pathlib import Path
from eth_defi.hyperliquid.daily_metrics import HyperliquidDailyMetricsDatabase

db = HyperliquidDailyMetricsDatabase(Path("/tmp/metrics.duckdb"))

# Query vault data
df = db.get_all_daily_prices()
print(df)

db.close()

Initialise the database connection.

Parameters

path – Path to the DuckDB file. Parent directories will be created if needed.

Methods summary

__init__(path)

Initialise the database connection.

close()

Close the database connection.

delete_vault_daily_prices(vault_address)

Delete all daily price records for a vault.

detect_broken_vaults()

Detect vaults with anomalous share price data.

get_all_daily_prices()

Get all daily price data across all vaults.

get_all_vault_metadata()

Get metadata for all vaults.

get_existing_dates(vault_address)

Get all dates with existing data for a vault.

get_latest_leader_fractions()

Get the latest leader_fraction for each vault.

get_leader_fraction_history(vault_address)

Get the recorded leader_fraction snapshots for a vault.

get_recently_tracked_addresses([within_days])

Return vault addresses that have price data recorded within the last within_days days.

get_vault_count()

Get the number of unique vaults with price data.

get_vault_daily_price_count(vault_address)

Get the number of daily price records for a vault.

get_vault_daily_prices(vault_address)

Get daily price data for a specific vault.

get_vault_last_date(vault_address)

Get the last date with price data for a vault.

mark_vaults_disappeared(known_addresses)

Set TVL to zero for vaults that have disappeared from the API.

recompute_all_share_prices()

Recompute share prices for all vaults in the database.

recompute_vault_share_prices(vault_address)

Recompute share prices for a vault from stored tvl/pnl data.

save()

Force a checkpoint to ensure data is written to disk.

update_vault_tvl_bulk(updates)

Bulk-update TVL, is_closed, and APR for existing vaults.

upsert_daily_prices(rows[, cutoff_date])

Bulk upsert daily price rows for a vault.

upsert_vault_metadata(vault_address, name, ...)

Insert or update a vault's metadata.

__init__(path)

Initialise the database connection.

Parameters

path (pathlib.Path) – Path to the DuckDB file. Parent directories will be created if needed.

upsert_vault_metadata(vault_address, name, leader, description, is_closed, relationship_type, create_time, commission_rate, follower_count, tvl, apr, allow_deposits=True, flow_data_earliest_date=None)

Insert or update a vault’s metadata.

Parameters
update_vault_tvl_bulk(updates)

Bulk-update TVL, is_closed, and APR for existing vaults.

Only updates rows that already exist in vault_metadata. Does not insert new rows. Used to refresh TVL for vaults that fell below the processing threshold but still appear in the bulk stats-data API.

Parameters

updates (list[tuple[float, bool, float | None, str]]) – List of tuples (tvl, is_closed, apr, vault_address). The vault_address must be lowercased.

mark_vaults_disappeared(known_addresses)

Set TVL to zero for vaults that have disappeared from the API.

For any vault in vault_metadata whose address is NOT in known_addresses, sets tvl=0. Does not change is_closed — disappearing from the bulk listing does not necessarily mean the vault is permanently closed.

Parameters

known_addresses (set[str]) – Set of lowercased vault addresses currently present in the bulk stats-data API response.

upsert_daily_prices(rows, cutoff_date=None)

Bulk upsert daily price rows for a vault.

Parameters
  • rows (list[eth_defi.hyperliquid.daily_metrics.HyperliquidDailyPriceRow]) –

    List of HyperliquidDailyPriceRow items.

    The is_closed, allow_deposits, leader_fraction, and leader_commission fields should be None for historical rows and only set for the latest (today’s) row, so that we track how these values evolve over time.

    The flow columns (daily_deposit_count etc.) should be None for dates outside the backfill window and 0 for dates with no activity within the backfill window. COALESCE preserves existing values when None is passed.

  • cutoff_date (datetime.date | None) – If provided, only store rows up to this date (inclusive). Used for incremental scanning / testing.

get_all_daily_prices()

Get all daily price data across all vaults.

Returns

DataFrame with all daily price records, ordered by vault then date.

Return type

pandas.DataFrame

get_vault_daily_prices(vault_address)

Get daily price data for a specific vault.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to query.

Returns

DataFrame with price records for this vault, ordered by date.

Return type

pandas.DataFrame

get_existing_dates(vault_address)

Get all dates with existing data for a vault.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to query.

Returns

Set of dates that already have data in the database.

Return type

set[datetime.date]

get_all_vault_metadata()

Get metadata for all vaults.

Returns

DataFrame with one row per vault.

Return type

pandas.DataFrame

get_recently_tracked_addresses(within_days=4)

Return vault addresses that have price data recorded within the last within_days days.

Used to identify vaults that recently dropped below the TVL processing threshold but still need a few more daily bars to capture the closing share price at near-zero TVL.

Parameters

within_days (int) – Number of days to look back from today.

Returns

Set of lowercased vault addresses.

Return type

set[str]

get_vault_count()

Get the number of unique vaults with price data.

Return type

int

get_vault_daily_price_count(vault_address)

Get the number of daily price records for a vault.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to query.

Returns

Number of daily price records.

Return type

int

get_vault_last_date(vault_address)

Get the last date with price data for a vault.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to query.

Returns

The latest date, or None if no data.

Return type

datetime.date | None

get_latest_leader_fractions()

Get the latest leader_fraction for each vault.

Queries the most recent row per vault that has a non-NULL leader_fraction value.

Returns

Dict mapping lowercased vault address to leader_fraction.

Return type

dict[str, float]

get_leader_fraction_history(vault_address)

Get the recorded leader_fraction snapshots for a vault.

Returns only rows where leader_fraction was recorded (non-NULL). Each row represents a day when the scanner ran and observed the value. Over time, daily scans build up a sparse history of leader capital ownership that can be used to detect threshold crossings.

Parameters

vault_address (str) – Vault address to query (will be lowercased).

Returns

DataFrame with columns date and leader_fraction, ordered by date ascending. Empty if no snapshots recorded.

Return type

pandas.DataFrame

delete_vault_daily_prices(vault_address)

Delete all daily price records for a vault.

Used by the healing script to clear corrupted share prices before re-computing from fresh API data.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to delete.

Returns

Number of rows deleted.

Return type

int

recompute_vault_share_prices(vault_address)

Recompute share prices for a vault from stored tvl/pnl data.

Uses the stored tvl (total_assets), daily_pnl (pnl_update), and cumulative_pnl columns to reconstruct netflow_update, then feeds the data through _calculate_share_price() with the current chain-linked epoch reset logic.

This allows healing of share price data without re-fetching from the Hyperliquid API.

Parameters

vault_address (eth_typing.evm.HexAddress) – Vault address to recompute.

Returns

Dict with before/after statistics: {"rows": int, "old_sp_min": float, "old_sp_max": float, "new_sp_min": float, "new_sp_max": float, "epoch_resets": int, "changed_rows": int}

Return type

dict

recompute_all_share_prices()

Recompute share prices for all vaults in the database.

Iterates over every vault and calls recompute_vault_share_prices() for each.

Returns

Summary dict: {"total_vaults": int, "vaults_with_changes": int, "vaults_with_epoch_resets": int, "total_changed_rows": int, "per_vault": dict[str, dict]}

Return type

dict

detect_broken_vaults()

Detect vaults with anomalous share price data.

Checks for four classes of breakage:

  1. Extreme daily returns (> 10,000%): typically epoch reset artefacts where share price jumped to/from 1.0

  2. Share price at cap (>= 9,999): overflow from zero total_supply

  3. Share price stuck at 1.0: share price identical for 5+ rows, suggesting failed epoch reset logic

  4. Missing epoch_reset column: rows computed with old code that lacked the epoch_reset column (all NULL)

Returns

DataFrame with columns: vault_address, name, issue_type, affected_rows, example_value

Return type

pandas.DataFrame

save()

Force a checkpoint to ensure data is written to disk.

close()

Close the database connection.