Polars in Finance Cheat Sheet

Polars expressions and contexts applied to S&P 500 OHLCV data.
Data Science
Published: April 5, 2026
Modified: April 5, 2026

If you know pandas, the patterns here should click fast.

1. Loading Financial Data

import polars as pl
import polars.selectors as cs
from pathlib import Path
import os

# Set the root path to the project directory (two levels above the working directory).
root = Path(os.path.abspath("")).parent.parent

# Load ticker data from an Arrow (IPC) file.
# memory_map=False avoids the "Could not memory_map compressed IPC file" warning.
df = pl.read_ipc(root / "assets" / "data" / "ticker_data.arrow", memory_map=False)
# Sort by ticker and date; required for shifts and rolling windows.
df = df.sort(["ticker", "date"])

df.head(5)
shape: (5, 7)
date ticker open high low close volume
date str f32 f32 f32 f32 f32
2006-01-03 "A" 19.918699 20.025999 19.5728 19.9783 5.307088e6
2006-01-04 "A" 20.008101 20.1751 19.900801 20.032 4.195817e6
2006-01-05 "A" 19.9485 20.556801 19.9485 20.556801 4.835402e6
2006-01-06 "A" 20.574699 20.747601 20.3302 20.664101 6.146307e6
2006-01-09 "A" 20.664101 20.753599 20.527 20.6045 4.082859e6

2. Expressions: The Atomic Units of Logic

Expressions are reusable, composable chunks of logic. Define once, apply anywhere.

# Log returns: ln(P_t / P_{t-1}), computed within each ticker via .over().
log_return = (pl.col("close") / pl.col("close").shift(1)).log().over("ticker")

# Realized volatility: annualized rolling std (21-day window ≈ 1 trading month).
realized_vol = (log_return.rolling_std(window_size=21) * (252**0.5)).over(
    "ticker"
)


def min_max_scale(name: str) -> pl.Expr:
    """Min-max scale a column to [0, 1], e.g. for ML features."""
    col = pl.col(name)
    return (col - col.min()) / (col.max() - col.min())

3. Context: Selection & Feature Engineering

with_columns adds multiple indicators in a single pass.

df = df.with_columns(
    # Append log returns and realized volatility.
    ret=log_return,
    vol=realized_vol,
    # Typical price: a common input for the Money Flow Index.
    typical_price=(pl.col("high") + pl.col("low") + pl.col("close")) / 3,
)

df.tail(5)
shape: (5, 10)
date ticker open high low close volume ret vol typical_price
date str f32 f32 f32 f32 f32 f32 f32 f32
2026-04-22 "ZTS" 118.949997 119.910004 116.599998 117.519997 3.5146e6 -0.0056 0.272725 118.010002
2026-04-23 "ZTS" 117.190002 117.599998 114.949997 116.059998 4.5328e6 -0.012501 0.276069 116.203331
2026-04-24 "ZTS" 116.010002 117.050003 115.410004 116.870003 4.1851e6 0.006955 0.276144 116.443344
2026-04-27 "ZTS" 116.620003 119.68 116.599998 117.870003 3.1603e6 0.00852 0.277579 118.050003
2026-04-28 "ZTS" 117.230003 118.290001 116.080002 116.650002 2.9725e6 -0.010404 0.260076 117.006668

4. Context: Aggregation (group_by)

risk_summary = df.group_by("ticker").agg(
    annual_return=(pl.col("ret").mean() * 252),
    annual_vol=(pl.col("ret").std() * (252**0.5)),
    # Max drawdown via the cumulative maximum of close (rows must be in date order).
    max_drawdown=((pl.col("close") / pl.col("close").cum_max() - 1).min()),
    # Annualized Sharpe ratio (assuming a 0% risk-free rate).
    sharpe_ratio=(pl.col("ret").mean() / pl.col("ret").std()) * (252**0.5),
)

risk_summary.sort("sharpe_ratio", descending=True).head(5)
shape: (5, 5)
ticker annual_return annual_vol max_drawdown sharpe_ratio
str f32 f32 f32 f32
"SNDK" 2.785047 0.98521 -0.475009 2.826855
"GEV" 1.023233 0.533674 -0.382856 1.917337
"Q" 0.743194 0.545531 -0.271231 1.362331
"CEG" 0.475289 0.490132 -0.507023 0.969717
"AVGO" 0.351236 0.378535 -0.483 0.927882

5. Power Selectors (polars.selectors)

Target columns by properties instead of hardcoding names — handy for scenario shocks.

# Apply a 1% upward shock to all OHLC columns at once.
price_shock = cs.starts_with("open", "high", "low", "close") * 1.01

df.select(
    "date",
    "ticker",
    # Suffix the new column names to keep them distinct from the originals.
    price_shock.name.suffix("_shock"),
).head(3)
shape: (3, 6)
date ticker open_shock high_shock low_shock close_shock
date str f32 f32 f32 f32
2006-01-03 "A" 20.117886 20.226259 19.768528 20.178083
2006-01-04 "A" 20.208181 20.376852 20.099808 20.232319
2006-01-05 "A" 20.147984 20.762369 20.147984 20.762369

6. Window Functions (over)

Compute group-level stats and project them back onto individual rows — no merge needed.

df.with_columns(
    # Volume z-score per ticker; flags unusual trading activity.
    vol_z=(pl.col("volume") - pl.col("volume").mean().over("ticker"))
    / pl.col("volume").std().over("ticker"),
    # Cross-sectional relative strength: close vs. the average close across tickers that day.
    rel_strength=pl.col("close") / pl.col("close").mean().over("date"),
).head(5)
shape: (5, 12)
date ticker open high low close volume ret vol typical_price vol_z rel_strength
date str f32 f32 f32 f32 f32 f32 f32 f32 f32 f32
2006-01-03 "A" 19.918699 20.025999 19.5728 19.9783 5.307088e6 null null 19.859035 1.077447 0.749611
2006-01-04 "A" 20.008101 20.1751 19.900801 20.032 4.195817e6 0.002684 null 20.035969 0.542916 0.748959
2006-01-05 "A" 19.9485 20.556801 19.9485 20.556801 4.835402e6 0.025861 null 20.354034 0.850562 0.766769
2006-01-06 "A" 20.574699 20.747601 20.3302 20.664101 6.146307e6 0.005206 null 20.580635 1.481119 0.764833
2006-01-09 "A" 20.664101 20.753599 20.527 20.6045 4.082859e6 -0.002888 null 20.628368 0.488583 0.754984

7. Time Series: Rolling Metrics & Asof Joins

df.with_columns(
    # 20-day rolling VWAP (close-based), computed per ticker so a window never spans two tickers.
    vwap=(
        (pl.col("close") * pl.col("volume")).rolling_sum(20)
        / pl.col("volume").rolling_sum(20)
    ).over("ticker")
).head(5)
shape: (5, 11)
date ticker open high low close volume ret vol typical_price vwap
date str f32 f32 f32 f32 f32 f32 f32 f32 f32
2006-01-03 "A" 19.918699 20.025999 19.5728 19.9783 5.307088e6 null null 19.859035 null
2006-01-04 "A" 20.008101 20.1751 19.900801 20.032 4.195817e6 0.002684 null 20.035969 null
2006-01-05 "A" 19.9485 20.556801 19.9485 20.556801 4.835402e6 0.025861 null 20.354034 null
2006-01-06 "A" 20.574699 20.747601 20.3302 20.664101 6.146307e6 0.005206 null 20.580635 null
2006-01-09 "A" 20.664101 20.753599 20.527 20.6045 4.082859e6 -0.002888 null 20.628368 null

8. Reshaping for Correlation Analysis

Pivot from long to wide for correlation / covariance matrices.

# One column per ticker's returns. drop_nulls() keeps only dates on which
# every ticker has a return, so late listings shorten the sample.
wide_returns = df.pivot(
    index="date",
    on="ticker",
    values="ret",
).drop_nulls()

# Correlation matrix across all assets.
corr_matrix = wide_returns.select(cs.numeric()).corr()
corr_matrix.head(5)
shape: (5, 501)
A AAPL ABBV ABNB ABT ACGL ACN ADBE ADI ADM ADP ADSK AEE AEP AES AFL AIG AIZ AJG AKAM ALB ALGN ALL ALLE AMAT AMCR AMD AME AMGN AMP AMT AMZN ANET AON AOS APA APD VICI VLO VLTO VMC VRSK VRSN VRT VRTX VST VTR VTRS VZ WAB WAT WBD WDAY WDC WEC WELL WFC WM WMB WMT WRB WSM WST WTW WY WYNN XEL XOM XYL XYZ YUM ZBH ZBRA ZTS
f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64 f64
1.0 0.252005 0.159886 0.387158 0.291319 0.051918 0.314924 0.220464 0.365594 0.082219 0.250166 0.352096 0.01177 -0.145537 0.20628 0.149047 -0.048577 0.142817 0.096217 0.21591 0.184219 0.467583 0.129767 0.202938 0.322944 0.333373 0.084619 0.375688 0.321947 0.361348 0.011551 0.34124 0.173192 0.162829 0.229952 -0.147347 0.205694 0.016885 -0.049898 0.323352 0.333929 0.135326 0.065517 0.177153 0.315086 0.133985 -0.169327 0.253472 -0.07502 0.337305 0.62843 0.056035 0.257507 0.22573 -0.032459 -0.059922 0.214895 0.004414 -0.119114 0.066659 0.013125 0.385525 0.357492 0.103034 0.16268 0.289355 0.007724 -0.119016 0.23687 0.381205 0.216098 0.196699 0.274606 0.362176
0.252005 1.0 0.032607 0.353655 0.207605 0.10381 0.050583 0.132521 0.365957 -0.057716 0.06369 0.121071 -0.158196 -0.146113 0.018032 0.205258 0.216694 0.115361 0.003543 -0.085321 0.028355 0.376391 0.127262 0.008084 0.178174 0.273976 0.092632 0.342363 0.248612 0.294118 0.025718 0.231091 0.098731 -0.052414 0.18999 -0.127354 0.106166 0.109663 -0.125555 0.149918 0.116569 -0.076417 0.007996 0.242075 0.07343 -0.009934 0.081029 0.401407 -0.069197 0.336963 0.267525 0.150331 0.083502 0.115886 -0.190388 0.044669 0.36324 -0.072876 -0.057587 0.033092 0.117635 0.27519 0.175975 0.068427 0.108035 0.365579 -0.149933 -0.108486 0.194755 0.151375 0.146997 0.06224 0.188564 0.252188
0.159886 0.032607 1.0 0.025042 0.17823 0.1801 0.051987 0.065414 0.029073 0.030087 -0.049793 0.074998 0.27563 0.201132 0.01249 0.162784 0.032822 0.151542 0.089796 0.13993 0.098696 0.106788 0.131893 0.054494 0.021722 0.108147 0.003968 0.181843 0.389602 -0.032594 0.105185 -0.112818 -0.002974 0.035057 0.120331 -0.133418 0.137989 0.028821 -0.092267 0.206458 0.08881 0.081341 -0.101082 0.004792 0.433486 -0.154702 0.184427 0.14594 0.123576 0.107561 0.05267 0.036274 -0.130214 -0.060284 0.194187 0.229703 -0.040875 0.105738 0.055912 0.229642 0.057308 0.04018 0.160114 0.013678 0.040071 -0.121142 0.183901 -0.111634 0.084572 0.053288 0.195927 0.009359 -0.03683 -0.018306
0.387158 0.353655 0.025042 1.0 0.179758 0.069289 0.408465 0.469951 0.41908 -0.003487 0.454314 0.500417 -0.086092 -0.187378 -0.005041 0.155108 0.131687 0.26926 0.201723 0.092757 0.196378 0.559737 0.112407 0.127923 0.296559 0.276444 0.173123 0.297822 0.195929 0.433715 -0.006856 0.417421 0.322316 0.226075 0.217916 -0.153585 -0.079396 0.053546 -0.091131 0.448981 0.12535 0.24754 0.193942 0.155101 0.2933 0.108703 -0.09358 0.293449 -0.223877 0.354082 0.348563 0.159664 0.396202 0.156269 -0.177693 -0.075159 0.368768 0.012266 -0.014705 -0.084896 -0.146465 0.44977 0.284118 0.182924 0.037971 0.439027 -0.130453 -0.272673 0.284202 0.451018 0.021685 0.142714 0.384437 0.45244
0.291319 0.207605 0.17823 0.179758 1.0 0.135045 0.082257 -0.01935 0.121082 0.129791 0.053078 -0.057555 0.235583 0.179526 0.038046 0.225051 0.124796 0.100274 0.090429 0.032748 -0.07607 0.23032 0.184531 0.133566 0.092136 0.224374 -0.145982 0.167634 0.177653 0.073567 0.151422 -0.07323 -0.16457 0.005134 0.142195 -0.005853 0.199751 0.141295 -0.051328 0.31528 0.121736 0.047432 0.100863 0.08491 0.126726 0.014749 0.292106 0.13791 0.088639 0.223678 0.151275 0.095275 -0.110802 0.028947 0.160011 0.325483 -0.008272 0.135166 -0.005325 0.131658 0.12927 0.217428 0.32369 0.019537 0.198712 0.092924 0.155969 -0.001896 0.056342 -0.007221 0.300103 0.180629 -0.065155 0.275692

9. Performance: Categoricals & Streaming

Cast string columns to Categorical for memory savings. For datasets that exceed RAM, use LazyFrame + streaming to process in batches.

df = (
    pl.scan_ipc("large_dataset.arrow")
    .filter(pl.col("vol") > 0.3)
    .group_by("ticker")
    .agg(pl.col("ret").mean())
    # Tickers as Categorical: integers under the hood, much less memory.
    .with_columns(pl.col("ticker").cast(pl.Categorical))
    # Streaming execution: processes in batches instead of loading everything at once.
    # Newer Polars releases prefer collect(engine="streaming") over streaming=True.
    .collect(streaming=True)
)
Note

Download the companion script here.
