Solana Indexer Database Schema Reference

This reference describes the database schema used by our Solana transaction indexer. Below you'll find comprehensive documentation of all tables, their columns, and relationships. The indexer maintains the following core tables:

  1. solana_mints - New Solana tokens (excluding Pump.fun tokens)
  2. raydium_all_swaps - Raydium transactions
  3. pumpfun_token_creation - New Solana tokens on Pump.fun
  4. pumpfun_all_swaps - Pump.fun transactions
  5. pumpswap_all_swaps - PumpSwap transactions
  6. pfamm_migrations - PumpFun AMM migration transactions
  7. meteora_swaps - Meteora DLMM DEX swap transactions
  8. raydium_launchpad_swaps - Raydium Launchpad transactions
  9. raydium_launchpad_migrations - graduations from the Raydium launchpad.

Shared Column Definitions

Several tables utilize a consistent set of columns, especially for identifying transactions:

Temporal and Transaction Fields

Column NameData TypeDescription
block_timeDateTimeTimestamp when the block containing the transaction was processed
block_date_utc / block_dateDateDate of the block in UTC timezone (truncated to day)
slotUInt32/UInt64Solana blockchain slot number
tx_idxUInt16/UInt32Index of the transaction within the block
signatureStringUnique Solana transaction signature

Wallet-Related Fields

Column NameData TypeDescription
signing_wallet / userStringThe wallet that signed the transaction
fee_payerStringThe wallet that paid for the transaction fees
from_walletStringOriginal wallet that initiated a transaction (when different from signing wallet)

Compute and Fee Details

Column NameData TypeDescription
provided_gas_feeUInt64Gas price set by the user for the transaction
provided_gas_limitUInt64Gas limit set by the user for the transaction
fee / fee_paidUInt64Actual transaction fee paid
consumed_gasUInt64Actual compute units consumed by the transaction

Storage Engine Settings

The majority of tables are configured with identical storage parameters:

  • Engine: MergeTree
  • Partition By: block_date_utc or block_date (daily partitions)
  • Order By: (slot, tx_idx) as primary key
  • TTL: Data is retained for 1 year
  • Settings:
    • index_granularity = 8192
    • min_bytes_for_wide_part = 10485760 (10MB)

Indexing Strategy

Tables employ a unified indexing approach for optimal query performance:

  • Bloom filter indexes on wallet addresses (signing_wallet, from_wallet, etc.)
  • MinMax index on block_time for efficient time-range queries
  • Bloom filter indexes on token addresses and pool identifiers

Individual Table Specifications

PumpSwap All Swaps

This table captures comprehensive swap data from the PumpSwap protocol on Solana.

Trade Direction and Token Addresses

Column NameData TypeDescription
directionLowCardinality(String)Swap direction - "B" for buy or "S" for sell
base_tokenStringSolana address of the base token in the swap
quote_tokenStringSolana address of the quote token in the swap

Swap Quantities

Column NameData TypeDescription
base_token_amountUInt64Amount of base tokens involved in the swap
quote_token_amountUInt64Amount of quote tokens involved in the swap
quote_token_amount_limitUInt64Maximum amount of quote tokens the user is willing to swap
quote_token_amount_without_lp_feeUInt64Quote token amount before LP fees are applied

Trader Account Details

Column NameData TypeDescription
user_base_token_accountStringSolana address of the user's base token account
user_quote_token_accountStringSolana address of the user's quote token account
user_base_token_reservesUInt64User's base token balance
user_quote_token_reservesUInt64User's quote token balance

Liquidity Pool Reserves

Column NameData TypeDescription
pool_idStringPool ID
pool_base_token_reserves_beforeUInt64Pool's base token reserves before the swap
pool_quote_token_reserves_beforeUInt64Pool's quote token reserves before the swap
pool_base_token_reserves_afterUInt64Pool's base token reserves after the swap
pool_quote_token_reserves_afterUInt64Pool's quote token reserves after the swap

Fee Structure

Column NameData TypeDescription
lp_fee_basis_pointsUInt16LP fee in basis points (1 basis point = 0.01%, 1/10000)
lp_feeUInt64LP fee amount in token units
protocol_fee_basis_pointsUInt16Protocol fee in basis points
protocol_feeUInt64Protocol fee amount in token units

Key Implementation Details

  • The direction column encodes buy operations as "B" and sell operations as "S"
  • For multi-instruction transactions, the signing wallet and fee payer may be different accounts
  • Swap data is extracted via the PumpSwapAnySwapDetector, which scans for PumpSwap program instructions
  • The parser handles both top-level and nested (inner) instructions

PumpFun AMM Migrations

This table records token migration events occurring within the PumpFun AMM ecosystem.

Field Reference

Column NameData TypeDescription
userStringBase58-encoded address of the user who initiated the migration
mintStringBase58-encoded address of the token mint being migrated
mint_amountUInt64Amount of tokens being migrated (in smallest denomination)
sol_amountUInt64Amount of SOL involved in the migration (in lamports)
pool_migration_feeUInt64Fee charged by the protocol for the migration (in lamports)
bonding_curveStringBase58-encoded address of the bonding curve used in the migration
timestampUInt32Unix timestamp of when the migration occurred
poolStringBase58-encoded address of the destination pool for the migration

Data Ingestion Process

The PumpFunAmmMigrationDetector populates this table by:

  • Monitoring Solana transactions for PumpFun program calls
  • Detecting migration instructions via their discriminator (9beae792ec9ea21e)
  • Matching associated migration log events (e445a52e51cb9a1d)
  • Parsing and storing structured migration metadata

Sample Queries — PumpFun AMM Migrations

  1. Retrieve migrations for a particular wallet:
SELECT 
    block_time,
    mint,
    mint_amount,
    sol_amount,
    pool_migration_fee,
    pool
FROM pfamm_migrations
WHERE user = '5tGTGzHcqnEaFP7Vp3SxwAyTJJhLSKY9Yx5hUVEwuuQQ'
ORDER BY block_time DESC
LIMIT 100
  1. Aggregate daily migration statistics:
SELECT 
    block_date_utc,
    count() AS migration_count,
    sum(sol_amount) / 1000000000 AS total_sol_volume,
    sum(pool_migration_fee) / 1000000000 AS total_fees_sol
FROM pfamm_migrations
WHERE block_date_utc BETWEEN '2023-01-01' AND '2023-01-31'
GROUP BY block_date_utc
ORDER BY block_date_utc
  1. Identify tokens with the highest migration activity:
SELECT 
    mint,
    count() AS migration_count,
    sum(mint_amount) AS total_tokens_migrated,
    sum(sol_amount) / 1000000000 AS total_sol_volume
FROM pfamm_migrations
WHERE block_date_utc >= dateAdd(day, -30, today())
GROUP BY mint
ORDER BY migration_count DESC
LIMIT 20

Meteora Swaps

This table contains swap transaction records from the Meteora DLMM (Dynamic Liquidity Market Maker) exchange on Solana.

Field Reference

Column NameData TypeDescription
base_coinStringSolana address of the base token in the trading pair
quote_coinStringSolana address of the quote token in the trading pair
base_coin_amountUInt64Amount of base tokens involved in the swap (raw value)
quote_coin_amountUInt64Amount of quote tokens involved in the swap (raw value)
start_bin_idInt32ID of the starting liquidity bin used in the swap
end_bin_idInt32ID of the ending liquidity bin used in the swap
feeUInt64Total swap fee collected (raw value)
protocol_feeUInt64Portion of the fee paid to the protocol (raw value)
fee_bps_lowUInt64Lower bound of the fee rate in basis points
fee_bps_highUInt64Upper bound of the fee rate in basis points
host_feeUInt64Portion of the fee paid to the host/referrer (raw value)
lb_pairStringSolana address of the liquidity book pair contract
swap_for_yUInt8Boolean flag (0/1) indicating swap direction

Handling Raw Token Amounts

Monetary values are stored as raw integers without decimal normalization. To obtain human-readable amounts, divide by 10^(token_decimals). For instance, a token with 9 decimals requires division by 10^9.

Sample Queries — Meteora Swaps

  1. Daily trading volume grouped by token pair:
SELECT 
    block_date,
    base_coin,
    quote_coin,
    COUNT(*) AS num_swaps,
    SUM(base_coin_amount) AS total_base_volume_raw,
    SUM(quote_coin_amount) AS total_quote_volume_raw
FROM default.meteora_swaps
WHERE block_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY block_date, base_coin, quote_coin
ORDER BY block_date, num_swaps DESC
  1. Fee breakdown by pool:
SELECT 
    block_date,
    lb_pair,
    COUNT(*) AS num_swaps,
    SUM(fee) AS total_fees_raw,
    SUM(protocol_fee) AS total_protocol_fees_raw,
    SUM(host_fee) AS total_host_fees_raw
FROM default.meteora_swaps
WHERE block_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY block_date, lb_pair
ORDER BY total_fees_raw DESC
  1. Top traders by activity:
SELECT 
    signing_wallet,
    COUNT(*) AS num_swaps,
    COUNT(DISTINCT lb_pair) AS num_pools_used,
    COUNT(DISTINCT block_date) AS num_active_days
FROM default.meteora_swaps
WHERE block_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY signing_wallet
ORDER BY num_swaps DESC
LIMIT 100
  1. Liquidity bin utilization metrics:
SELECT 
    lb_pair,
    AVG(end_bin_id - start_bin_id) AS avg_bin_range,
    MAX(end_bin_id - start_bin_id) AS max_bin_range,
    MIN(end_bin_id - start_bin_id) AS min_bin_range
FROM default.meteora_swaps
WHERE block_date BETWEEN '2025-01-01' AND '2025-01-31'
GROUP BY lb_pair
ORDER BY avg_bin_range DESC

Protocol Background

Meteora implements a concentrated liquidity model on Solana, conceptually similar to Uniswap v3 but with distinct characteristics:

  • Liquidity is allocated across discrete price bins instead of continuous tick ranges
  • The start_bin_id and end_bin_id columns reveal which bins participated in the swap
  • The swap_for_y flag denotes swap direction (X→Y when true, Y→X when false)

CPMM Pool Migrations

Column nameData typeDescription
payerStringFee payer pubkey (base58) for the transaction.
base_mintStringMint pubkey (base58) of the base token.
quote_mintStringMint pubkey (base58) of the quote token.
platform_configStringPlatform configuration account pubkey (global launchpad settings).
cpswap_programStringcpSwap program id (pubkey) deployed on-chain.
cpswap_poolStringcpSwap pool account pubkey (pool instance).
cpswap_authorityStringAuthority pubkey (often a PDA) controlling the cpSwap pool.
cpswap_lp_mintStringLP token mint pubkey for the cpSwap pool.
cpswap_base_vaultStringToken account pubkey holding the pool's base token (cpSwap-specific).
cpswap_quote_vaultStringToken account pubkey holding the pool's quote token (cpSwap-specific).
cpswap_configStringcpSwap-specific config/account pubkey for pool parameters.
cpswap_create_pool_feeStringAccount pubkey that receives pool-creation fees (fee receiver).
cpswap_observationStringObservation/oracle account pubkey (e.g., TWAP or metrics) used by cpSwap.
lock_programStringLocking/vesting program id (pubkey) used to lock tokens/LPs.
lock_authorityStringAuthority pubkey (PDA or owner) of the lock program instance.
lock_lp_vaultStringToken account pubkey where LP tokens are locked under the lock program.
authorityStringGeneral administrative/owner pubkey for the pool or migration.
pool_stateStringAccount pubkey storing serialized pool state (account data) or a state identifier.
global_configStringGlobal configuration account pubkey for the launchpad platform.
base_vaultStringGeneral vault token account pubkey for base token (non-cpSwap/general pool).
quote_vaultStringGeneral vault token account pubkey for quote token (non-cpSwap/general pool).
pool_lp_tokenStringToken account pubkey that holds LP tokens for the liquidity pool (distinct from LP mint).

Raydium Launchpad Token Creation

creatorFixedString(48)Creator account pubkey (base58) that initiated the token creation.
nameFixedString(20)Token name (fixed-length string metadata).
symbolFixedString(16)Token symbol/ticker (fixed-length string metadata).
urlFixedString(256)Metadata URL associated with the token (fixed-length string).
mintFixedString(48)Mint account pubkey (base58) for the created token.
bundle_sizeUInt8Number of items/operations included in the creation bundle.
gas_usedUInt64Compute units consumed by the creation (implementation-specific runtime units).
amount_of_instructionsInt32Number of instructions executed in the transaction.
amount_of_lookup_readsInt32Number of address lookup table reads used by the transaction.
amount_of_lookup_writesInt32Number of address lookup table writes used by the transaction.
bundle_structureStringSerialized structure of the bundle (e.g., JSON describing bundled ops).
bundled_buysUInt64Total amount (smallest token units) purchased across bundled buy operations.
bundled_buys_countUInt64Count of individual buy operations included in the bundle.
dev_balanceUInt64Developer/team balance related to the creation (value in smallest token units).
creation_ix_indexInt32Index of the creation instruction within the transaction (instruction position).
pool_stateFixedString(48)Pool state account pubkey (base58) associated with the token/pool.
base_vaultFixedString(48)Token account pubkey holding base token reserves for the pool.
quote_vaultFixedString(48)Token account pubkey holding quote token reserves for the pool.
raydium_program_indexUInt8Identifier/index of the Raydium program variant used (implementation-specific).
direct_raydium_invocationBoolWhether Raydium program was invoked directly (true/false).
decimalsUInt8Token decimal places (number of fractional digits).
cpmm_typeUInt8CPMM pool type identifier/enum (implementation-specific).
supplyUInt64Token supply at creation (amount in smallest token units).
bonding_curve_sell_amountUInt64Amount (smallest units) designated for bonding-curve sell operations.
bonding_curve_raise_amountUInt64Amount (smallest units) expected/raised via the bonding curve.
migrate_typeUInt8Migration type identifier/enum (implementation-specific).

Query Optimization Guidelines

Follow these best practices when querying the indexed data:

  1. Include block_date or block_date_utc filters whenever possible to take advantage of partition pruning
  2. For extended time-series analysis, leverage the minmax index on block_time
  3. Queries filtering by token or wallet addresses benefit from bloom filter indexes
  4. When joining across tables, use signature as the join key since it uniquely identifies each transaction
  5. Remember to apply decimal conversion when working with raw token amounts

Data Lifecycle

All tables enforce a 1-year TTL policy; records are automatically purged once they exceed this retention period.