Cryptocurrency markets trade around the clock, every day of the year, and generate large volumes of data with high volatility and fast-changing liquidity. Compared with traditional markets, crypto offers favorable conditions for quantitative trading: programmatic access, no market-hours restrictions, and rich historical data. This three-part series will guide you from fundamental concepts to production-ready quantitative trading systems.
In Part 1, we’ll establish the foundational knowledge and tools you need to begin quantitative analysis of cryptocurrency markets.
What is Quantitative Trading in Crypto?
Quantitative trading (quant trading) applies mathematical and statistical models to identify trading opportunities and execute trades systematically. Cryptocurrency markets suit this approach particularly well, for the reasons outlined below.
Why Crypto is Ideal for Quant Trading
Traditional Markets vs Crypto Markets
Traditional Stock Markets:
- Trading hours: 9:30 AM - 4:00 PM (6.5 hours)
- 252 trading days per year
- Weekend gaps in data
- Settlement: T+1 for US equities (T+2 before May 2024)
- Limited API access
- High transaction costs
Cryptocurrency Markets:
- Trading hours: 24/7/365 (never closes)
- 365 trading days per year
- Continuous data streams
- Settlement: ~10 minutes per on-chain confirmation (Bitcoin); trades on an exchange settle internally almost instantly
- Free API access (most exchanges)
- Low transaction costs (0.1-0.5%)
Result: More data, more opportunities, faster iteration
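One practical consequence of the comparison above is the annualization factor. The sketch below is a minimal illustration, assuming hourly bars and the square-root-of-time rule; the 0.5% hourly volatility is a hypothetical input, not a measured value.

```python
import math

# Hourly bars per year implied by each market's trading calendar
STOCK_HOURLY_BARS = 6.5 * 252   # 6.5-hour sessions, 252 trading days
CRYPTO_HOURLY_BARS = 24 * 365   # never closes


def annualize_volatility(per_period_vol, periods_per_year):
    """Scale a per-period standard deviation to an annual figure (sqrt-of-time rule)."""
    return per_period_vol * math.sqrt(periods_per_year)


hourly_vol = 0.005  # hypothetical 0.5% hourly standard deviation

stock_annual = annualize_volatility(hourly_vol, STOCK_HOURLY_BARS)
crypto_annual = annualize_volatility(hourly_vol, CRYPTO_HOURLY_BARS)

print(f"Hourly bars per year: stocks {STOCK_HOURLY_BARS:.0f}, crypto {CRYPTO_HOURLY_BARS}")
print(f"Annualized volatility: stocks {stock_annual:.1%}, crypto {crypto_annual:.1%}")
```

The same hourly volatility annualizes to a larger number in crypto simply because there are more periods in the year, so a factor borrowed from equities (such as 252) would understate crypto risk.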
Key Advantages of Crypto Quant Trading
| Advantage | Description | Impact |
|---|---|---|
| High Volatility | Bitcoin can move 5-10% daily | More trading opportunities |
| Market Inefficiency | Young market with pricing errors | Easier to find alpha |
| 24/7 Trading | Continuous price action | No price gaps from market closes |
| Low Barriers | Start with $100, no broker required | Accessible to individuals |
| Rich Data | Tick-level data freely available | Detailed analysis possible |
| Fast Settlement | Trades settle in minutes | Quick iteration cycles |
Essential Concepts
1. Market Microstructure
Market microstructure describes how crypto markets work at the most basic level: how orders are queued, matched, and priced.
# Market Microstructure Components

Order Book Structure:
┌──────────────────────────────────────┐
│        ASK Side (Sell Orders)        │
├─────────────┬──────────────┬─────────┤
│ Price       │ Quantity     │ Total   │
├─────────────┼──────────────┼─────────┤
│ $43,170.00  │ 0.8 BTC      │ $34,536 │
│ $43,160.00  │ 1.2 BTC      │ $51,792 │
│ $43,150.00  │ 0.5 BTC      │ $21,575 │ ← Best Ask (Lowest Sell)
├─────────────┴──────────────┴─────────┤
│         Spread: $20 (0.046%)         │ ← Bid-Ask Spread
├─────────────┬──────────────┬─────────┤
│ $43,130.00  │ 1.5 BTC      │ $64,695 │ ← Best Bid (Highest Buy)
│ $43,120.00  │ 0.9 BTC      │ $38,808 │
│ $43,110.00  │ 2.1 BTC      │ $90,531 │
├─────────────┴──────────────┴─────────┤
│        BID Side (Buy Orders)         │
└──────────────────────────────────────┘

Key Metrics:
- Mid Price = (Best Bid + Best Ask) / 2 = $43,140
- Spread = Best Ask - Best Bid = $20
- Liquidity = Sum of order book depth
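The key metrics can be computed directly from a list of price levels. This is a minimal sketch using hypothetical levels in the `(price, quantity)` format; the function name `order_book_metrics` and the choice to report depth in BTC are assumptions made for illustration.

```python
def order_book_metrics(bids, asks):
    """Compute mid price, spread, and visible depth from (price, quantity) levels."""
    best_bid = max(price for price, _ in bids)   # highest buy price
    best_ask = min(price for price, _ in asks)   # lowest sell price
    mid_price = (best_bid + best_ask) / 2
    spread = best_ask - best_bid
    return {
        'best_bid': best_bid,
        'best_ask': best_ask,
        'mid_price': mid_price,
        'spread': spread,
        'spread_pct': spread / mid_price * 100,
        'bid_depth': sum(qty for _, qty in bids),  # BTC resting on the buy side
        'ask_depth': sum(qty for _, qty in asks),  # BTC resting on the sell side
    }


bids = [(43130, 1.5), (43120, 0.9), (43110, 2.1)]
asks = [(43150, 0.5), (43160, 1.2), (43170, 0.8)]

metrics = order_book_metrics(bids, asks)
print(f"Mid: ${metrics['mid_price']:,.2f}")
print(f"Spread: ${metrics['spread']:.2f} ({metrics['spread_pct']:.3f}%)")
print(f"Depth: {metrics['bid_depth']:.1f} BTC bid / {metrics['ask_depth']:.1f} BTC ask")
```

Depth here covers only the levels passed in, so the figure depends on how many levels the exchange returns.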
Critical Concepts:
# 1. Slippage: Price impact when executing large orders
def calculate_slippage(order_size, order_book):
    """
    Calculate expected slippage for a market buy order

    order_size: Size in BTC
    order_book: List of (price, quantity) ask levels, best price first
    """
    remaining = order_size
    total_cost = 0.0

    # Walk the book level by level until the order is filled
    for price, quantity in order_book:
        if remaining <= 0:
            break

        executed = min(remaining, quantity)
        total_cost += executed * price
        remaining -= executed

    filled = order_size - remaining
    if filled <= 0:
        raise ValueError("Nothing filled: empty order book or non-positive order size")

    # Average over the filled quantity, so a book that is too thin
    # to fill the order does not understate the average price
    avg_price = total_cost / filled
    best_price = order_book[0][0]
    slippage = (avg_price - best_price) / best_price

    return slippage

# Example: Execute 10 BTC market buy order
order_book = [
    (43150, 0.5),
    (43160, 1.2),
    (43170, 2.0),
    (43180, 3.5),
    (43190, 5.0),
]

slippage = calculate_slippage(10, order_book)
print(f"Slippage: {slippage:.2%}")  # Output: Slippage: 0.06%

# 2. Market Impact: How your order moves the market
def estimate_market_impact(order_size, avg_daily_volume):
    """
    Estimate market impact using square root law

    order_size: Size in USD
    avg_daily_volume: Average daily trading volume in USD
    """
    participation_rate = order_size / avg_daily_volume
    # 0.1 is an illustrative coefficient; calibrate it to your own fills
    market_impact = 0.1 * (participation_rate ** 0.5)

    return market_impact

# Example: $1M order in $5B daily volume market
impact = estimate_market_impact(1_000_000, 5_000_000_000)
print(f"Expected market impact: {impact:.2%}")  # Output: Expected market impact: 0.14%
2. Price Data Types
Different data types serve different purposes:
from datetime import datetime
import pandas as pd

# OHLCV (Open, High, Low, Close, Volume)
# Most common format for quantitative analysis

ohlcv_example = pd.DataFrame([
    {
        'timestamp': datetime(2026, 1, 24, 10, 0),
        'open': 43100.0,
        'high': 43250.0,
        'low': 43050.0,
        'close': 43200.0,
        'volume': 125.5,  # BTC
    },
    {
        'timestamp': datetime(2026, 1, 24, 10, 1),
        'open': 43200.0,
        'high': 43300.0,
        'low': 43180.0,
        'close': 43280.0,
        'volume': 98.3,
    },
])

# Trade Data (Tick Data)
# Individual executed trades

trade_example = pd.DataFrame([
    {
        'timestamp': datetime(2026, 1, 24, 10, 0, 5, 123456),
        'price': 43150.0,
        'quantity': 0.5,
        'side': 'buy',  # Taker side
    },
    {
        'timestamp': datetime(2026, 1, 24, 10, 0, 5, 234567),
        'price': 43151.0,
        'quantity': 1.2,
        'side': 'sell',
    },
])

# Order Book Snapshots
# Complete order book state at specific times

orderbook_example = {
    'timestamp': datetime(2026, 1, 24, 10, 0),
    'bids': [
        (43130, 1.5),
        (43120, 0.9),
        (43110, 2.1),
    ],
    'asks': [
        (43150, 0.5),
        (43160, 1.2),
        (43170, 0.8),
    ],
}
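These data types are related: OHLCV bars are simply trades aggregated over fixed intervals. The sketch below shows the mechanics in plain Python for one-minute bars. The helper name `trades_to_minute_bars` and the sample ticks are hypothetical; with pandas you would more likely reach for `resample`.

```python
from datetime import datetime


def trades_to_minute_bars(trades):
    """
    Aggregate (timestamp, price, quantity) ticks into one-minute OHLCV bars.

    Returns a dict keyed by bar start time, in chronological order.
    """
    bars = {}
    for ts, price, qty in sorted(trades, key=lambda t: t[0]):
        # Floor the timestamp to the start of its minute
        bar_start = ts.replace(second=0, microsecond=0)
        bar = bars.get(bar_start)
        if bar is None:
            # First trade in this minute sets all four prices
            bars[bar_start] = {
                'open': price, 'high': price, 'low': price,
                'close': price, 'volume': qty,
            }
        else:
            bar['high'] = max(bar['high'], price)
            bar['low'] = min(bar['low'], price)
            bar['close'] = price       # last trade seen so far
            bar['volume'] += qty
    return bars


trades = [
    (datetime(2026, 1, 24, 10, 0, 5, 123456), 43150.0, 0.5),
    (datetime(2026, 1, 24, 10, 0, 5, 234567), 43151.0, 1.2),
    (datetime(2026, 1, 24, 10, 0, 40), 43140.0, 0.3),
    (datetime(2026, 1, 24, 10, 1, 10), 43160.0, 0.4),
]

bars = trades_to_minute_bars(trades)
for start, bar in bars.items():
    print(start, bar)
```

Minutes with no trades produce no bar at all in this sketch, which is one reason gaps can appear in bar data for illiquid pairs.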
3. Market Regimes
Crypto markets exhibit distinct behavioral regimes:
import numpy as np
import pandas as pd

def identify_market_regime(returns, window=20, periods_per_year=365 * 24):
    """
    Identify current market regime based on volatility and trend

    returns: Series of percentage returns
    window: Lookback period for calculations
    periods_per_year: Bars per year (default assumes hourly data)
    """
    # Calculate metrics
    volatility = returns.rolling(window).std() * np.sqrt(periods_per_year)  # Annualized
    trend = returns.rolling(window).mean() * periods_per_year  # Annualized

    # Define regimes
    regimes = []
    for vol, tr in zip(volatility, trend):
        if pd.isna(vol) or pd.isna(tr):
            regimes.append('unknown')
        elif abs(tr) < 0.1 and vol < 0.5:
            regimes.append('low_volatility_ranging')
        elif abs(tr) < 0.1:
            regimes.append('high_volatility_ranging')
        elif tr >= 0.1 and vol < 0.8:
            regimes.append('steady_bull')
        elif tr >= 0.1:
            regimes.append('volatile_bull')
        elif vol < 0.8:
            # Only tr <= -0.1 can reach this branch
            regimes.append('steady_bear')
        else:
            regimes.append('volatile_bear')

    return pd.Series(regimes, index=returns.index)

# Market Regime Characteristics
regime_stats = {
    'low_volatility_ranging': {
        'frequency': '15%',
        'best_strategy': 'Mean reversion',
        'risk': 'Low',
    },
    'high_volatility_ranging': {
        'frequency': '20%',
        'best_strategy': 'Breakout trading',
        'risk': 'High',
    },
    'steady_bull': {
        'frequency': '25%',
        'best_strategy': 'Trend following',
        'risk': 'Medium',
    },
    'volatile_bull': {
        'frequency': '15%',
        'best_strategy': 'Momentum',
        'risk': 'High',
    },
    'steady_bear': {
        'frequency': '15%',
        'best_strategy': 'Short selling',
        'risk': 'Medium',
    },
    'volatile_bear': {
        'frequency': '10%',
        'best_strategy': 'Stay in cash',
        'risk': 'Extreme',
    },
}
Data Collection and Infrastructure
Setting Up Your Development Environment
# Create virtual environment
python -m venv crypto_quant_env
source crypto_quant_env/bin/activate  # Linux/Mac
# crypto_quant_env\Scripts\activate   # Windows

# Install essential packages
pip install pandas numpy scipy matplotlib seaborn
pip install ccxt               # Cryptocurrency exchange library
pip install ta-lib             # Technical analysis library (may need the TA-Lib C library installed first)
pip install yfinance           # For comparison with traditional assets
pip install python-binance     # Binance-specific client
pip install websocket-client   # For real-time data
pip install plotly             # Interactive charts
pip install jupyter            # For analysis notebooks
Data Sources
# 1. Exchange APIs (CCXT - Unified API for 100+ exchanges)

import ccxt
import pandas as pd
from datetime import datetime, timedelta, timezone


def _to_utc_ms(dt):
    """Convert a datetime to Unix milliseconds (naive datetimes are treated as UTC)"""
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp() * 1000)


class CryptoDataCollector:
    """
    Unified cryptocurrency data collector
    """

    def __init__(self, exchange_name='binance'):
        """
        Initialize exchange connection

        exchange_name: 'binance', 'coinbase', 'kraken', etc.
        """
        self.exchange = getattr(ccxt, exchange_name)({
            'enableRateLimit': True,  # Respect rate limits
            'options': {
                'defaultType': 'spot',  # spot, future, swap
            }
        })

    def fetch_ohlcv(self, symbol='BTC/USDT', timeframe='1h',
                    since=None, limit=1000):
        """
        Fetch OHLCV data

        symbol: Trading pair (e.g., 'BTC/USDT', 'ETH/USDT')
        timeframe: '1m', '5m', '15m', '1h', '4h', '1d', '1w'
        since: Unix timestamp in milliseconds (None = most recent candles)
        limit: Number of candles (max varies by exchange)

        Returns: DataFrame with OHLCV data, indexed by UTC timestamp
        """
        # With since=None the exchange returns the latest `limit` candles.
        # Forcing a start time here would instead return the first `limit`
        # candles after that time, which are not the most recent ones.
        ohlcv = self.exchange.fetch_ohlcv(
            symbol,
            timeframe,
            since=since,
            limit=limit
        )

        # Convert to DataFrame
        df = pd.DataFrame(
            ohlcv,
            columns=['timestamp', 'open', 'high', 'low', 'close', 'volume']
        )
        df['timestamp'] = pd.to_datetime(df['timestamp'], unit='ms')
        df.set_index('timestamp', inplace=True)

        return df

    def fetch_historical_data(self, symbol='BTC/USDT', timeframe='1h',
                              start_date=None, end_date=None):
        """
        Fetch complete historical data (handles pagination)

        start_date: datetime object or string 'YYYY-MM-DD' (interpreted as UTC)
        end_date: datetime object or string 'YYYY-MM-DD' (interpreted as UTC)
        """
        if isinstance(start_date, str):
            start_date = datetime.strptime(start_date, '%Y-%m-%d')
        if isinstance(end_date, str):
            end_date = datetime.strptime(end_date, '%Y-%m-%d')

        if start_date is None:
            start_date = datetime.now(timezone.utc) - timedelta(days=365)
        if end_date is None:
            end_date = datetime.now(timezone.utc)

        # Calculate timeframe in milliseconds
        timeframe_ms = {
            '1m': 60 * 1000,
            '5m': 5 * 60 * 1000,
            '15m': 15 * 60 * 1000,
            '1h': 60 * 60 * 1000,
            '4h': 4 * 60 * 60 * 1000,
            '1d': 24 * 60 * 60 * 1000,
            '1w': 7 * 24 * 60 * 60 * 1000,
        }[timeframe]

        all_data = []
        current_time = _to_utc_ms(start_date)
        end_time = _to_utc_ms(end_date)

        print(f"Fetching {symbol} data from {start_date} to {end_date}...")

        while current_time < end_time:
            try:
                data = self.fetch_ohlcv(
                    symbol=symbol,
                    timeframe=timeframe,
                    since=current_time,
                    limit=1000
                )

                if data.empty:
                    break

                all_data.append(data)
                next_time = int(data.index[-1].timestamp() * 1000) + timeframe_ms
                if next_time <= current_time:
                    break  # No progress; avoid an infinite loop
                current_time = next_time

                print(f"Fetched until {data.index[-1]}")

            except ccxt.BaseError as e:
                print(f"Error fetching data: {e}")
                break

        if not all_data:
            return pd.DataFrame()

        # Combine all data
        df = pd.concat(all_data)
        df = df[~df.index.duplicated(keep='first')]  # Remove duplicates
        df = df.sort_index()
        # The last page can run past end_date, so trim it
        df = df[df.index <= pd.to_datetime(end_time, unit='ms')]

        return df

    def fetch_orderbook(self, symbol='BTC/USDT', limit=20):
        """
        Fetch current order book

        limit: Depth of order book (number of price levels)
        """
        orderbook = self.exchange.fetch_order_book(symbol, limit=limit)

        # Some exchanges do not report an order book timestamp
        ts = orderbook.get('timestamp')

        return {
            'timestamp': (
                datetime.fromtimestamp(ts / 1000, tz=timezone.utc)
                if ts is not None else None
            ),
            'bids': orderbook['bids'],  # [[price, quantity], ...]
            'asks': orderbook['asks'],
        }

    def fetch_trades(self, symbol='BTC/USDT', limit=1000):
        """
        Fetch recent trades (tick data)
        """
        trades = self.exchange.fetch_trades(symbol, limit=limit)

        df = pd.DataFrame([{
            'timestamp': pd.to_datetime(t['timestamp'], unit='ms'),
            'price': t['price'],
            'quantity': t['amount'],
            'side': t['side'],
            'trade_id': t['id'],
        } for t in trades])

        return df

# Example usage
collector = CryptoDataCollector('binance')

# Fetch recent Bitcoin data
btc_1h = collector.fetch_ohlcv('BTC/USDT', '1h', limit=100)
print(btc_1h.head())

# Fetch historical data
btc_historical = collector.fetch_historical_data(
    'BTC/USDT',
    '1h',
    start_date='2023-01-01',
    end_date='2024-01-01'
)
print(f"Fetched {len(btc_historical)} candles")

# Fetch order book
orderbook = collector.fetch_orderbook('BTC/USDT')
print(f"Best bid: ${orderbook['bids'][0][0]:.2f}")
print(f"Best ask: ${orderbook['asks'][0][0]:.2f}")
Data Storage
import sqlite3
from datetime import datetime, timedelta
from pathlib import Path

import pandas as pd

class CryptoDataStorage:
    """
    Local storage for cryptocurrency data
    """

    def __init__(self, db_path='crypto_data.db'):
        """Initialize database connection"""
        self.db_path = Path(db_path)
        self.conn = sqlite3.connect(self.db_path)
        self._create_tables()

    def _create_tables(self):
        """Create database schema"""
        cursor = self.conn.cursor()

        # OHLCV data table. The primary key already provides an index on
        # (symbol, timeframe, timestamp), so no separate index is needed.
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS ohlcv (
                symbol TEXT,
                timeframe TEXT,
                timestamp INTEGER,
                open REAL,
                high REAL,
                low REAL,
                close REAL,
                volume REAL,
                PRIMARY KEY (symbol, timeframe, timestamp)
            )
        ''')

        self.conn.commit()

    def save_ohlcv(self, df, symbol, timeframe):
        """Save OHLCV data to database (existing candles are overwritten)"""
        index = df.index
        if index.tz is not None:
            index = index.tz_convert('UTC').tz_localize(None)

        # Seconds since the Unix epoch
        epoch_seconds = (index - pd.Timestamp('1970-01-01')) // pd.Timedelta(seconds=1)

        rows = [
            (symbol, timeframe, int(ts), float(o), float(h), float(l), float(c), float(v))
            for ts, o, h, l, c, v in zip(
                epoch_seconds, df['open'], df['high'],
                df['low'], df['close'], df['volume']
            )
        ]

        # A plain append raises IntegrityError as soon as a candle with the
        # same primary key is saved twice. INSERT OR REPLACE makes the write
        # idempotent, so no clean-up pass for duplicates is required.
        with self.conn:
            self.conn.executemany(
                'INSERT OR REPLACE INTO ohlcv '
                '(symbol, timeframe, timestamp, open, high, low, close, volume) '
                'VALUES (?, ?, ?, ?, ?, ?, ?, ?)',
                rows
            )

    def load_ohlcv(self, symbol, timeframe, start_date=None, end_date=None):
        """Load OHLCV data from database"""
        query = '''
            SELECT timestamp, open, high, low, close, volume
            FROM ohlcv
            WHERE symbol = ? AND timeframe = ?
        '''
        params = [symbol, timeframe]

        if start_date:
            query += ' AND timestamp >= ?'
            params.append(int(start_date.timestamp()))

        if end_date:
            query += ' AND timestamp <= ?'
            params.append(int(end_date.timestamp()))

        query += ' ORDER BY timestamp'

        df = pd.read_sql_query(query, self.conn, params=params)
        df['timestamp'] = pd.to_datetime(df['timestamp'], unit='s')
        df.set_index('timestamp', inplace=True)

        return df

    def close(self):
        """Close database connection"""
        self.conn.close()

# Example usage
storage = CryptoDataStorage('crypto_data.db')

# Save data
collector = CryptoDataCollector('binance')
btc_data = collector.fetch_ohlcv('BTC/USDT', '1h', limit=1000)
storage.save_ohlcv(btc_data, 'BTC/USDT', '1h')

# Load data
loaded_data = storage.load_ohlcv(
    'BTC/USDT',
    '1h',
    start_date=datetime.now() - timedelta(days=30)
)
print(f"Loaded {len(loaded_data)} candles from database")
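When candles are saved repeatedly, overlapping batches will hit the table's primary key. One way to keep writes idempotent is SQLite's `INSERT OR REPLACE`. The sketch below is a minimal illustration against an in-memory database, using a simplified, hypothetical table that stores only the close price.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('''
    CREATE TABLE ohlcv (
        symbol TEXT,
        timeframe TEXT,
        timestamp INTEGER,
        close REAL,
        PRIMARY KEY (symbol, timeframe, timestamp)
    )
''')

upsert = 'INSERT OR REPLACE INTO ohlcv VALUES (?, ?, ?, ?)'

# First batch: two hourly candles
first_batch = [
    ('BTC/USDT', '1h', 1700000000, 43100.0),
    ('BTC/USDT', '1h', 1700003600, 43200.0),
]

# Second batch overlaps the first: the 1700003600 candle is re-sent
# with an updated close, plus one new candle
second_batch = [
    ('BTC/USDT', '1h', 1700003600, 43250.0),
    ('BTC/USDT', '1h', 1700007200, 43300.0),
]

with conn:  # commits on success, rolls back on error
    conn.executemany(upsert, first_batch)
    conn.executemany(upsert, second_batch)

row_count = conn.execute('SELECT COUNT(*) FROM ohlcv').fetchone()[0]
updated_close = conn.execute(
    'SELECT close FROM ohlcv WHERE timestamp = ?', (1700003600,)
).fetchone()[0]

print(f"Rows stored: {row_count}")
print(f"Close for overlapping candle: {updated_close}")
```

Replacing rather than ignoring the duplicate matters for the most recent candle, which is often fetched while still forming and changes on the next fetch.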
Technical Indicators
Technical indicators transform raw price data into actionable signals.
Trend Indicators
import pandas as pd
import numpy as np

class TrendIndicators:
    """
    Trend-following technical indicators
    """

    @staticmethod
    def sma(prices, period):
        """Simple Moving Average"""
        return prices.rolling(window=period).mean()

    @staticmethod
    def ema(prices, period):
        """Exponential Moving Average"""
        return prices.ewm(span=period, adjust=False).mean()

    @staticmethod
    def macd(prices, fast=12, slow=26, signal=9):
        """
        Moving Average Convergence Divergence

        Returns: (macd_line, signal_line, histogram)
        """
        ema_fast = TrendIndicators.ema(prices, fast)
        ema_slow = TrendIndicators.ema(prices, slow)

        macd_line = ema_fast - ema_slow
        signal_line = macd_line.ewm(span=signal, adjust=False).mean()
        histogram = macd_line - signal_line

        return macd_line, signal_line, histogram

    @staticmethod
    def adx(high, low, close, period=14):
        """
        Average Directional Index
        Measures trend strength (0-100)
        > 25: Strong trend
        < 20: Weak/no trend
        """
        # Calculate True Range
        tr1 = high - low
        tr2 = abs(high - close.shift())
        tr3 = abs(low - close.shift())
        tr = pd.concat([tr1, tr2, tr3], axis=1).max(axis=1)

        # Calculate Directional Movement
        up_move = high - high.shift()
        down_move = low.shift() - low

        # np.where returns plain arrays; rebuild Series on the price index so
        # the divisions by ATR below align row by row instead of yielding NaN
        plus_dm = pd.Series(
            np.where((up_move > down_move) & (up_move > 0), up_move, 0),
            index=high.index
        )
        minus_dm = pd.Series(
            np.where((down_move > up_move) & (down_move > 0), down_move, 0),
            index=high.index
        )

        # Smooth with Wilder's smoothing
        atr = tr.ewm(alpha=1/period, adjust=False).mean()
        plus_di = 100 * plus_dm.ewm(alpha=1/period, adjust=False).mean() / atr
        minus_di = 100 * minus_dm.ewm(alpha=1/period, adjust=False).mean() / atr

        # Calculate ADX
        dx = 100 * abs(plus_di - minus_di) / (plus_di + minus_di)
        adx = dx.ewm(alpha=1/period, adjust=False).mean()

        return adx, plus_di, minus_di

# Example usage
df = collector.fetch_ohlcv('BTC/USDT', '1h', limit=500)

# Add indicators
df['sma_20'] = TrendIndicators.sma(df['close'], 20)
df['sma_50'] = TrendIndicators.sma(df['close'], 50)
df['ema_12'] = TrendIndicators.ema(df['close'], 12)
df['ema_26'] = TrendIndicators.ema(df['close'], 26)

macd_line, signal_line, histogram = TrendIndicators.macd(df['close'])
df['macd'] = macd_line
df['macd_signal'] = signal_line
df['macd_histogram'] = histogram

adx, plus_di, minus_di = TrendIndicators.adx(df['high'], df['low'], df['close'])
df['adx'] = adx
df['plus_di'] = plus_di
df['minus_di'] = minus_di

print(df[['close', 'sma_20', 'sma_50', 'macd', 'adx']].tail())
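The indicator code relies on pandas `ewm(..., adjust=False)` for both the EMA (`span=period`) and Wilder's smoothing (`alpha=1/period`). The recursion behind it is short enough to write by hand. This is a plain-Python sketch for illustration only; the function names are hypothetical.

```python
def ewm_recursive(values, alpha):
    """
    Exponentially weighted mean, recursive form:
        y[0] = x[0]
        y[t] = alpha * x[t] + (1 - alpha) * y[t-1]
    """
    out = []
    for x in values:
        if not out:
            out.append(x)  # seed with the first observation
        else:
            out.append(alpha * x + (1 - alpha) * out[-1])
    return out


def ema(values, span):
    """EMA with the span convention: alpha = 2 / (span + 1)."""
    return ewm_recursive(values, alpha=2 / (span + 1))


def wilder(values, period):
    """Wilder's smoothing: alpha = 1 / period."""
    return ewm_recursive(values, alpha=1 / period)


prices = [1.0, 2.0, 3.0]
ema_3 = ema(prices, span=3)          # alpha = 0.5
wilder_4 = wilder(prices, period=4)  # alpha = 0.25

print(ema_3)     # [1.0, 1.5, 2.25]
print(wilder_4)  # [1.0, 1.25, 1.6875]
```

For the same period, Wilder's smoothing uses a smaller alpha than the span-based EMA, so it reacts more slowly to new data.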
Momentum Indicators
class MomentumIndicators:
    """
    Momentum and oscillator indicators
    """

    @staticmethod
    def rsi(prices, period=14):
        """
        Relative Strength Index

        > 70: Overbought
        < 30: Oversold
        """
        delta = prices.diff()
        gain = delta.where(delta > 0, 0)
        loss = -delta.where(delta < 0, 0)

        avg_gain = gain.ewm(alpha=1/period, adjust=False).mean()
        avg_loss = loss.ewm(alpha=1/period, adjust=False).mean()

        rs = avg_gain / avg_loss
        rsi = 100 - (100 / (1 + rs))

        return rsi

    @staticmethod
    def stochastic(high, low, close, k_period=14, d_period=3):
        """
        Stochastic Oscillator

        Returns: (%K, %D)
        > 80: Overbought
        < 20: Oversold
        """
        lowest_low = low.rolling(window=k_period).min()
        highest_high = high.rolling(window=k_period).max()

        k = 100 * (close - lowest_low) / (highest_high - lowest_low)
        d = k.rolling(window=d_period).mean()

        return k, d

    @staticmethod
    def cci(high, low, close, period=20):
        """
        Commodity Channel Index

        > +100: Overbought
        < -100: Oversold
        """
        typical_price = (high + low + close) / 3
        sma = typical_price.rolling(window=period).mean()
        mean_deviation = typical_price.rolling(window=period).apply(
            lambda x: np.mean(np.abs(x - x.mean()))
        )

        cci = (typical_price - sma) / (0.015 * mean_deviation)

        return cci

    @staticmethod
    def roc(prices, period=12):
        """
        Rate of Change (Momentum)

        > 0: Upward momentum
        < 0: Downward momentum
        """
        roc = 100 * (prices - prices.shift(period)) / prices.shift(period)
        return roc

# Example usage
df['rsi'] = MomentumIndicators.rsi(df['close'], 14)
df['stoch_k'], df['stoch_d'] = MomentumIndicators.stochastic(
    df['high'], df['low'], df['close']
)
df['cci'] = MomentumIndicators.cci(df['high'], df['low'], df['close'])
df['roc'] = MomentumIndicators.roc(df['close'], 12)

print(df[['close', 'rsi', 'stoch_k', 'cci', 'roc']].tail())
Volatility Indicators
class VolatilityIndicators:
    """
    Volatility measurement indicators
    """

    @staticmethod
    def bollinger_bands(prices, period=20, std_dev=2):
        """
        Bollinger Bands

        Returns: (middle_band, upper_band, lower_band)
        """
        middle = prices.rolling(window=period).mean()
        std = prices.rolling(window=period).std()

        upper = middle + (std * std_dev)
        lower = middle - (std * std_dev)

        return middle, upper, lower

    @staticmethod
    def atr(high, low, close, period=14):
        """
        Average True Range
        Measures volatility
        """
        tr1 = high - low
        tr2 = abs(high - close.shift())
        tr3 = abs(low - close.shift())

        tr = pd.concat([tr1, tr2, tr3], axis=1).max(axis=1)
        atr = tr.ewm(alpha=1/period, adjust=False).mean()

        return atr

    @staticmethod
    def keltner_channel(high, low, close, period=20, atr_multiplier=2):
        """
        Keltner Channel

        Returns: (middle, upper, lower)
        """
        middle = close.ewm(span=period, adjust=False).mean()
        atr = VolatilityIndicators.atr(high, low, close, period)

        upper = middle + (atr * atr_multiplier)
        lower = middle - (atr * atr_multiplier)

        return middle, upper, lower

    @staticmethod
    def historical_volatility(returns, period=30, periods_per_year=365 * 24):
        """
        Historical Volatility (annualized)

        returns: Series of percentage returns
        periods_per_year: Bars per year (default assumes hourly data;
                          use 365 for daily bars)
        """
        volatility = returns.rolling(window=period).std() * np.sqrt(periods_per_year)
        return volatility

# Example usage
df['bb_middle'], df['bb_upper'], df['bb_lower'] = \
    VolatilityIndicators.bollinger_bands(df['close'])

df['atr'] = VolatilityIndicators.atr(df['high'], df['low'], df['close'])

df['kc_middle'], df['kc_upper'], df['kc_lower'] = \
    VolatilityIndicators.keltner_channel(df['high'], df['low'], df['close'])

# Calculate returns for volatility
df['returns'] = df['close'].pct_change()
df['volatility'] = VolatilityIndicators.historical_volatility(df['returns'])

print(df[['close', 'bb_upper', 'bb_lower', 'atr', 'volatility']].tail())
Volume Indicators
class VolumeIndicators:
    """
    Volume-based indicators
    """

    @staticmethod
    def obv(close, volume):
        """
        On-Balance Volume
        Cumulative volume based on price direction
        """
        direction = np.where(close > close.shift(), 1,
                             np.where(close < close.shift(), -1, 0))
        obv = (direction * volume).cumsum()
        return obv

    @staticmethod
    def vwap(high, low, close, volume):
        """
        Volume Weighted Average Price
        Average price weighted by volume

        Note: accumulates from the first row of the input and never resets,
        so the result depends on where the data starts.
        """
        typical_price = (high + low + close) / 3
        vwap = (typical_price * volume).cumsum() / volume.cumsum()
        return vwap

    @staticmethod
    def mfi(high, low, close, volume, period=14):
        """
        Money Flow Index
        Volume-weighted RSI

        > 80: Overbought
        < 20: Oversold
        """
        typical_price = (high + low + close) / 3
        money_flow = typical_price * volume

        # Classify each bar by the change in typical price. Unchanged bars
        # (and the first bar, which has no predecessor) count toward
        # neither side instead of being treated as negative flow.
        change = typical_price.diff()
        positive_flow = money_flow.where(change > 0, 0).rolling(period).sum()
        negative_flow = money_flow.where(change < 0, 0).rolling(period).sum()

        mfi = 100 - (100 / (1 + positive_flow / negative_flow))
        return mfi

    @staticmethod
    def accumulation_distribution(high, low, close, volume):
        """
        Accumulation/Distribution Line
        Measures buying/selling pressure
        """
        # A bar with high == low has no range; treat its contribution as zero
        # rather than dividing by zero
        price_range = (high - low).replace(0, np.nan)
        clv = (((close - low) - (high - close)) / price_range).fillna(0)
        ad = (clv * volume).cumsum()
        return ad

# Example usage
df['obv'] = VolumeIndicators.obv(df['close'], df['volume'])
df['vwap'] = VolumeIndicators.vwap(df['high'], df['low'], df['close'], df['volume'])
df['mfi'] = VolumeIndicators.mfi(df['high'], df['low'], df['close'], df['volume'])
df['ad'] = VolumeIndicators.accumulation_distribution(
    df['high'], df['low'], df['close'], df['volume']
)

print(df[['close', 'volume', 'obv', 'vwap', 'mfi']].tail())
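The `vwap` function accumulates over the whole input. A common variant restarts the calculation each day, which on a 24/7 market requires choosing a session boundary yourself. The sketch below is one possible take in plain Python. It assumes the calendar date of each timestamp as the boundary, and both the function name `session_vwap` and the sample bars are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def session_vwap(bars):
    """
    VWAP that resets at the start of each calendar day.

    bars: list of (timestamp, high, low, close, volume) tuples in time order
    Returns: list of VWAP values, one per bar
    """
    cum_pv = defaultdict(float)   # running sum of typical_price * volume, per day
    cum_vol = defaultdict(float)  # running sum of volume, per day
    result = []
    for ts, high, low, close, volume in bars:
        day = ts.date()
        typical_price = (high + low + close) / 3
        cum_pv[day] += typical_price * volume
        cum_vol[day] += volume
        # Guard against a day that has traded zero volume so far
        result.append(cum_pv[day] / cum_vol[day] if cum_vol[day] else float('nan'))
    return result


bars = [
    (datetime(2026, 1, 24, 22, 0), 102.0, 98.0, 100.0, 10.0),
    (datetime(2026, 1, 24, 23, 0), 112.0, 108.0, 110.0, 30.0),
    (datetime(2026, 1, 25, 0, 0), 122.0, 118.0, 120.0, 5.0),   # new day: VWAP restarts
]

vwaps = session_vwap(bars)
print(vwaps)
```

The third value equals that bar's own typical price because the running sums start over at midnight.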
Statistical Analysis
Basic Statistical Measures
class StatisticalAnalysis:
    """
    Statistical analysis tools for crypto data
    """

    @staticmethod
    def calculate_returns(prices, method='simple'):
        """
        Calculate returns

        method: 'simple' or 'log'
        """
        if method == 'simple':
            returns = prices.pct_change()
        elif method == 'log':
            returns = np.log(prices / prices.shift())
        else:
            raise ValueError("method must be 'simple' or 'log'")

        return returns

    @staticmethod
    def analyze_distribution(returns, periods_per_year=365 * 24):
        """
        Analyze return distribution

        periods_per_year: Bars per year (default assumes hourly data)
        """
        stats = {
            'mean': returns.mean(),
            'median': returns.median(),
            'std': returns.std(),
            'skewness': returns.skew(),
            'kurtosis': returns.kurtosis(),  # Excess kurtosis (normal = 0)
            'min': returns.min(),
            'max': returns.max(),
        }

        # Annualized metrics
        stats['annualized_return'] = stats['mean'] * periods_per_year
        stats['annualized_volatility'] = stats['std'] * np.sqrt(periods_per_year)
        # Sharpe ratio with the risk-free rate taken as zero
        stats['sharpe_ratio'] = stats['annualized_return'] / stats['annualized_volatility']

        return stats

    @staticmethod
    def correlation_analysis(df, columns):
        """
        Calculate correlation matrix
        """
        correlation = df[columns].corr()
        return correlation

    @staticmethod
    def rolling_statistics(returns, window=24):
        """
        Calculate rolling statistics
        """
        rolling_stats = pd.DataFrame({
            'rolling_mean': returns.rolling(window).mean(),
            'rolling_std': returns.rolling(window).std(),
            # Scaled by sqrt(window): a Sharpe ratio over the window's
            # horizon, not an annualized figure
            'rolling_sharpe': (
                returns.rolling(window).mean() /
                returns.rolling(window).std() *
                np.sqrt(window)
            ),
        })

        return rolling_stats

# Example usage
df['returns'] = StatisticalAnalysis.calculate_returns(df['close'], 'simple')

# Analyze distribution
stats = StatisticalAnalysis.analyze_distribution(df['returns'].dropna())
print("\nReturn Distribution Statistics:")
for key, value in stats.items():
    print(f"{key}: {value:.4f}")

# Rolling statistics
rolling_stats = StatisticalAnalysis.rolling_statistics(df['returns'], window=24)
df = df.join(rolling_stats)

print(df[['returns', 'rolling_mean', 'rolling_std', 'rolling_sharpe']].tail())
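The choice between `'simple'` and `'log'` returns matters once returns are aggregated over time. The short sketch below uses hypothetical prices to show that simple returns must be compounded, while log returns can be summed.

```python
import math

prices = [100.0, 110.0, 99.0]  # up 10%, then down 10%

simple_returns = [prices[i] / prices[i - 1] - 1 for i in range(1, len(prices))]
log_returns = [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]

# Actual return over the whole period
total_return = prices[-1] / prices[0] - 1

# Simple returns: summing is wrong, compounding is right
summed_simple = sum(simple_returns)
compounded_simple = math.prod(1 + r for r in simple_returns) - 1

# Log returns: summing is right, then convert back with exp
summed_log = sum(log_returns)
total_from_log = math.exp(summed_log) - 1

print(f"Total return:            {total_return:+.4f}")
print(f"Sum of simple returns:   {summed_simple:+.4f}")
print(f"Compounded simple:       {compounded_simple:+.4f}")
print(f"exp(sum of log) - 1:     {total_from_log:+.4f}")
```

A 10% gain followed by a 10% loss leaves the price 1% lower, which the plain sum of simple returns misses. This is why log returns are often preferred when aggregating across many periods.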
Practical Example: Complete Analysis Pipeline
Let’s put everything together into a complete analysis:
1import matplotlib.pyplot as plt
2import seaborn as sns
3
4class CryptoAnalysisPipeline:
5 """
6 Complete cryptocurrency analysis pipeline
7 """
8
9 def __init__(self, symbol='BTC/USDT', timeframe='1h'):
10 self.symbol = symbol
11 self.timeframe = timeframe
12 self.collector = CryptoDataCollector('binance')
13 self.df = None
14
15 def fetch_data(self, start_date, end_date=None):
16 """Fetch historical data"""
17 print(f"Fetching {self.symbol} data...")
18 self.df = self.collector.fetch_historical_data(
19 self.symbol,
20 self.timeframe,
21 start_date=start_date,
22 end_date=end_date
23 )
24 print(f"Fetched {len(self.df)} candles")
25 return self.df
26
27 def add_indicators(self):
28 """Add all technical indicators"""
29 print("Calculating indicators...")
30
31 df = self.df
32
33 # Trend indicators
34 df['sma_20'] = TrendIndicators.sma(df['close'], 20)
35 df['sma_50'] = TrendIndicators.sma(df['close'], 50)
36 df['sma_200'] = TrendIndicators.sma(df['close'], 200)
37 df['ema_12'] = TrendIndicators.ema(df['close'], 12)
38 df['ema_26'] = TrendIndicators.ema(df['close'], 26)
39
40 macd, signal, hist = TrendIndicators.macd(df['close'])
41 df['macd'] = macd
42 df['macd_signal'] = signal
43 df['macd_histogram'] = hist
44
45 # Momentum indicators
46 df['rsi'] = MomentumIndicators.rsi(df['close'])
47 df['stoch_k'], df['stoch_d'] = MomentumIndicators.stochastic(
48 df['high'], df['low'], df['close']
49 )
50
51 # Volatility indicators
52 df['bb_middle'], df['bb_upper'], df['bb_lower'] = \
53 VolatilityIndicators.bollinger_bands(df['close'])
54 df['atr'] = VolatilityIndicators.atr(df['high'], df['low'], df['close'])
55
56 # Volume indicators
57 df['obv'] = VolumeIndicators.obv(df['close'], df['volume'])
58 df['vwap'] = VolumeIndicators.vwap(
59 df['high'], df['low'], df['close'], df['volume']
60 )
61
62 # Statistical measures
63 df['returns'] = StatisticalAnalysis.calculate_returns(df['close'])
64 df['volatility'] = VolatilityIndicators.historical_volatility(
65 df['returns'], period=24
66 )
67
68 self.df = df
69 print("Indicators added")
70 return df
71
72 def generate_signals(self):
73 """Generate trading signals"""
74 print("Generating signals...")
75
76 df = self.df
77
78 # Trend following signals
79 df['trend_signal'] = np.where(
80 (df['sma_20'] > df['sma_50']) & (df['close'] > df['sma_20']),
81 1, # Bullish
82 np.where(
83 (df['sma_20'] < df['sma_50']) & (df['close'] < df['sma_20']),
84 -1, # Bearish
85 0 # Neutral
86 )
87 )
88
89 # Momentum signals
90 df['momentum_signal'] = np.where(
91 (df['rsi'] < 30) & (df['stoch_k'] < 20),
92 1, # Oversold - Buy
93 np.where(
94 (df['rsi'] > 70) & (df['stoch_k'] > 80),
95 -1, # Overbought - Sell
96 0
97 )
98 )
99
100 # Volatility breakout signals
101 df['breakout_signal'] = np.where(
102 df['close'] > df['bb_upper'],
103 1, # Breakout up
104 np.where(
105 df['close'] < df['bb_lower'],
106 -1, # Breakout down
107 0
108 )
109 )
110
111 # Combined signal (simple average)
112 df['combined_signal'] = (
113 df['trend_signal'] +
114 df['momentum_signal'] +
115 df['breakout_signal']
116 ) / 3
117
118 self.df = df
119 print("Signals generated")
120 return df
121
122 def analyze_statistics(self):
123 """Perform statistical analysis"""
124 print("\n" + "="*60)
125 print(f"Statistical Analysis: {self.symbol}")
126 print("="*60)
127
128 df = self.df
129
130 # Price statistics
131 print(f"\nPrice Range:")
132 print(f" Lowest: ${df['close'].min():,.2f}")
133 print(f" Highest: ${df['close'].max():,.2f}")
134 print(f" Current: ${df['close'].iloc[-1]:,.2f}")
135 print(f" Change: {((df['close'].iloc[-1] / df['close'].iloc[0]) - 1) * 100:.2f}%")
136
137 # Return statistics
138 stats = StatisticalAnalysis.analyze_distribution(df['returns'].dropna())
139 print(f"\nReturn Statistics:")
140 print(f" Mean Return: {stats['mean']:.4f} ({stats['annualized_return']:.2%} annualized)")
141 print(f" Volatility: {stats['std']:.4f} ({stats['annualized_volatility']:.2%} annualized)")
142 print(f" Sharpe Ratio: {stats['sharpe_ratio']:.2f}")
143 print(f" Skewness: {stats['skewness']:.2f}")
144 print(f" Kurtosis: {stats['kurtosis']:.2f}")
145
146 # Signal statistics
147 print(f"\nSignal Distribution:")
148 signal_counts = df['combined_signal'].value_counts().sort_index()
149 for signal, count in signal_counts.items():
150 pct = count / len(df) * 100
151 print(f" Signal {signal:+.1f}: {count:,} ({pct:.1f}%)")
152
153 return stats
154
    def plot_analysis(self, save_path=None):
        """Create comprehensive analysis plots"""
        print("Creating plots...")

        df = self.df.iloc[-500:]  # Last 500 candles

        fig, axes = plt.subplots(5, 1, figsize=(15, 20))

        # Plot 1: Price and Moving Averages
        ax1 = axes[0]
        ax1.plot(df.index, df['close'], label='Close Price', linewidth=1)
        ax1.plot(df.index, df['sma_20'], label='SMA 20', alpha=0.7)
        ax1.plot(df.index, df['sma_50'], label='SMA 50', alpha=0.7)
        ax1.fill_between(df.index, df['bb_upper'], df['bb_lower'], alpha=0.2)
        ax1.set_title(f'{self.symbol} Price and Moving Averages')
        ax1.set_ylabel('Price (USD)')
        ax1.legend()
        ax1.grid(True, alpha=0.3)

        # Plot 2: Volume
        ax2 = axes[1]
        # Green when the close rose versus the previous candle, red otherwise.
        # diff() leaves the first candle as NaN, so it falls through to red
        # instead of being compared with the last candle via iloc[-1].
        colors = ['g' if rising else 'r' for rising in df['close'].diff() > 0]
        ax2.bar(df.index, df['volume'], color=colors, alpha=0.5)
        ax2.plot(df.index, df['obv'] / df['obv'].max() * df['volume'].max(),
                 label='OBV (normalized)', color='blue', linewidth=1)
        ax2.set_title('Volume and OBV')
        ax2.set_ylabel('Volume')
        ax2.legend()
        ax2.grid(True, alpha=0.3)

        # Plot 3: RSI
        ax3 = axes[2]
        ax3.plot(df.index, df['rsi'], label='RSI', color='purple')
        ax3.axhline(y=70, color='r', linestyle='--', alpha=0.5)
        ax3.axhline(y=30, color='g', linestyle='--', alpha=0.5)
        ax3.fill_between(df.index, 30, 70, alpha=0.1)
        ax3.set_title('Relative Strength Index (RSI)')
        ax3.set_ylabel('RSI')
        ax3.set_ylim(0, 100)
        ax3.legend()
        ax3.grid(True, alpha=0.3)

        # Plot 4: MACD
        ax4 = axes[3]
        ax4.plot(df.index, df['macd'], label='MACD', linewidth=1)
        ax4.plot(df.index, df['macd_signal'], label='Signal', linewidth=1)
        ax4.bar(df.index, df['macd_histogram'], label='Histogram', alpha=0.3)
        ax4.axhline(y=0, color='black', linestyle='-', alpha=0.3)
        ax4.set_title('MACD')
        ax4.set_ylabel('MACD')
        ax4.legend()
        ax4.grid(True, alpha=0.3)

        # Plot 5: Trading Signals
        ax5 = axes[4]
        ax5.plot(df.index, df['combined_signal'], label='Combined Signal',
                 linewidth=2, color='orange')
        ax5.axhline(y=0, color='black', linestyle='-', alpha=0.3)
        ax5.fill_between(df.index, 0, df['combined_signal'],
                         where=df['combined_signal'] > 0, color='g', alpha=0.3)
        ax5.fill_between(df.index, 0, df['combined_signal'],
                         where=df['combined_signal'] < 0, color='r', alpha=0.3)
        ax5.set_title('Trading Signals')
        ax5.set_ylabel('Signal Strength')
        ax5.set_xlabel('Time')
        ax5.legend()
        ax5.grid(True, alpha=0.3)

        plt.tight_layout()

        if save_path:
            plt.savefig(save_path, dpi=300, bbox_inches='tight')
            plt.close(fig)  # Release the figure so repeated runs do not accumulate open figures
            print(f"Plot saved to {save_path}")
        else:
            plt.show()

    def run_complete_analysis(self, start_date, save_plot=True):
        """Run complete analysis pipeline"""
        # Fetch data
        self.fetch_data(start_date)

        # Add indicators
        self.add_indicators()

        # Generate signals
        self.generate_signals()

        # Analyze statistics
        self.analyze_statistics()

        # Plot results
        plot_path = f"{self.symbol.replace('/', '_')}_analysis.png" if save_plot else None
        self.plot_analysis(save_path=plot_path)

        return self.df

# Example usage
pipeline = CryptoAnalysisPipeline('BTC/USDT', '1h')
df = pipeline.run_complete_analysis(start_date='2023-01-01', save_plot=True)

# Compare Bitcoin and Ethereum
# Reuse the BTC/USDT result from above rather than fetching the same data twice
btc_df = df
eth_pipeline = CryptoAnalysisPipeline('ETH/USDT', '1h')
eth_df = eth_pipeline.run_complete_analysis('2023-01-01')

# Correlation analysis
# Building the DataFrame aligns both return series on their timestamps;
# corr() then ignores rows where either series is missing.
correlation = pd.DataFrame({
    'BTC_returns': btc_df['returns'],
    'ETH_returns': eth_df['returns']
}).corr()

print(f"\nBTC-ETH Correlation: {correlation.iloc[0, 1]:.3f}")
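The single correlation figure printed above averages over the whole sample, so it hides how the BTC-ETH relationship shifts over time. As a minimal sketch, the block below computes a rolling correlation instead. The helper name `rolling_correlation` and the synthetic return series are assumptions for illustration only. In practice you would pass `btc_df['returns']` and `eth_df['returns']` from the pipeline.

```python
import numpy as np
import pandas as pd


def rolling_correlation(returns_a: pd.Series, returns_b: pd.Series, window: int) -> pd.Series:
    """Rolling Pearson correlation of two return series on their shared timestamps."""
    # Inner-join on the index so both series cover exactly the same candles.
    aligned = pd.concat([returns_a, returns_b], axis=1, join="inner").dropna()
    aligned.columns = ["a", "b"]
    return aligned["a"].rolling(window).corr(aligned["b"])


# Synthetic hourly returns (hypothetical data) sharing a common market factor.
rng = np.random.default_rng(42)
index = pd.date_range("2023-01-01", periods=2000, freq=pd.Timedelta(hours=1))
common = rng.normal(0, 0.01, len(index))
btc_returns = pd.Series(common + rng.normal(0, 0.003, len(index)), index=index)
eth_returns = pd.Series(common + rng.normal(0, 0.005, len(index)), index=index)

# 30 days of hourly candles = 30 * 24 observations per window.
rolling_corr = rolling_correlation(btc_returns, eth_returns, window=30 * 24)
print(rolling_corr.dropna().describe())
```

The window is expressed in candles, not days, so it has to be rescaled whenever the timeframe changes (for example `window=30` on daily data).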
Conclusion and Next Steps
In this first part, we’ve established the foundational knowledge for cryptocurrency quantitative trading:
- ✅ Understanding market microstructure and data types
- ✅ Setting up data collection infrastructure
- ✅ Implementing technical indicators
- ✅ Performing statistical analysis
- ✅ Building a complete analysis pipeline
Key Takeaways
- 24/7 Markets: Crypto never sleeps, providing continuous data and opportunities
- Data Quality Matters: Use reliable sources and handle edge cases properly
- Technical Indicators: Tools for pattern recognition, not crystal balls
- Statistical Foundation: Understanding distributions and correlations is crucial
- Systematic Approach: Build pipelines, not one-off analyses
Coming Up in Part 2
In the next post, we’ll build on these foundations to develop actual trading strategies:
- Backtesting frameworks and methodology
- Strategy development (trend following, mean reversion, arbitrage)
- Risk management and position sizing
- Performance metrics and evaluation
- Multi-asset portfolio strategies
Homework Assignment
Before moving to Part 2, practice these exercises:
# Exercise 1: Collect data for multiple cryptocurrencies
symbols = ['BTC/USDT', 'ETH/USDT', 'BNB/USDT', 'SOL/USDT']
# Fetch 1-year of hourly data and store in database

# Exercise 2: Implement a custom indicator
# Create a "composite momentum" indicator combining RSI, MACD, and ROC

# Exercise 3: Correlation analysis
# Calculate rolling 30-day correlations between major cryptocurrencies

# Exercise 4: Regime detection
# Identify different market regimes in Bitcoin's history

# Exercise 5: Create alerts
# Build a system that alerts when RSI < 30 or RSI > 70
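As one possible starting point for Exercise 5, here is a minimal sketch that flags only the candles where RSI enters an extreme zone, so you are not alerted on every candle spent inside it. The function name `rsi_alerts` and the sample values are hypothetical. The input is assumed to be an RSI series like the pipeline's `df['rsi']` column.

```python
import pandas as pd


def rsi_alerts(rsi: pd.Series, lower: float = 30.0, upper: float = 70.0) -> pd.Series:
    """Label the candles where RSI first crosses into an extreme zone."""
    oversold = rsi < lower
    overbought = rsi > upper
    # Compare with the previous candle so only zone entries are flagged.
    entered_oversold = oversold & ~oversold.shift(1, fill_value=False)
    entered_overbought = overbought & ~overbought.shift(1, fill_value=False)

    alerts = pd.Series("", index=rsi.index, dtype="object")
    alerts[entered_oversold] = "oversold"
    alerts[entered_overbought] = "overbought"
    return alerts[alerts != ""]


# Hypothetical RSI readings on hourly candles.
rsi = pd.Series(
    [55.0, 42.0, 29.0, 25.0, 31.0, 50.0, 71.0, 75.0, 68.0, 28.0],
    index=pd.date_range("2023-01-01", periods=10, freq=pd.Timedelta(hours=1)),
)
for timestamp, label in rsi_alerts(rsi).items():
    print(f"{timestamp}: RSI {rsi[timestamp]:.0f} -> {label}")
```

Delivering the alert (email, chat webhook, and so on) is left out on purpose; the detection step is the part that depends on the indicator logic covered in this post.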
Continue to Part 2: Crypto Quantitative Trading Part 2: Advanced Strategies and Backtesting
Have questions about the fundamentals? Share them in the comments below! In Part 2, we’ll turn this analysis framework into actual trading strategies.
