The Python Data Model

How to unlock the power of Python’s special methods to build intuitive and consistent classes.
Author: bwrob
Published: December 12, 2025

Introduction

If you have ever felt that Python’s built-in types possess a certain intuitive behavior that your custom classes lack, you are not alone. You expect len(obj) to get the size, obj['ticker'] to access an item, and for asset in obj to iterate through elements.

This consistency is a core principle of the Python Data Model. By implementing specific “special methods” (often called dunder methods, for double underscore), your custom classes can behave just like the built-ins. This greatly enhances expressiveness, readability, and compatibility.

In this post, we will build a Portfolio class that leverages the Python Data Model. We will focus on structuring a robust and well-behaved object.

The Goal: A Pythonic Portfolio

We want a Portfolio class that stores a list of financial positions. But we don’t want a clunky interface like portfolio.get_position_by_ticker("AAPL"). We want the market-standard syntax:

  • len(portfolio) to size up our exposure.
  • portfolio["AAPL"] to pull up a position instantly.
  • for position in portfolio to audit our holdings.

Let’s start by defining a simple Position container. To keep our balance sheet clean, we’ll track the total USD value rather than quantity and price.

from __future__ import annotations

from dataclasses import dataclass
from collections.abc import Iterator


@dataclass
class Position:
    symbol: str
    value_usd: float

The Initial Portfolio Class

We start with a standard class definition. To index our strategy efficiently, we’ll store the positions in a dictionary keyed by their symbol. We name it _contents to signal it’s an internal implementation detail—a trade secret, if you will—encouraging users to access data through the public interface.

class Portfolio:
    def __init__(self, name: str, managers: tuple[str, ...], contents: list[Position] | None = None):
        self.name = name
        self.managers = managers
        # Store positions in a dictionary keyed by symbol for fast lookup
        self._contents = {pos.symbol: pos for pos in contents} if contents else {}

    def __repr__(self):
        return f"Portfolio(name={self.name!r}, managers={self.managers!r}, contents={list(self._contents.values())!r})"

# Create some data
pos1 = Position("AAPL", 15000.0)
pos2 = Position("GOOG", 140000.0)
p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])
print(p)
Portfolio(name='Tech Fund', managers=('Alice', 'Bob'), contents=[Position(symbol='AAPL', value_usd=15000.0), Position(symbol='GOOG', value_usd=140000.0)])

We included __repr__ right away. This dunder method provides the “official” string representation, useful for debugging the object’s state.
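As a rough illustration of why this format is convenient, the sketch below rebuilds a portfolio from its own repr string. It repeats a trimmed copy of the classes above so it runs on its own. The eval() call is only there to demonstrate the round trip and should not be used on untrusted input.

```python
from dataclasses import dataclass


@dataclass
class Position:
    symbol: str
    value_usd: float


class Portfolio:
    def __init__(self, name, managers, contents=None):
        self.name = name
        self.managers = managers
        self._contents = {pos.symbol: pos for pos in contents} if contents else {}

    def __repr__(self):
        return (
            f"Portfolio(name={self.name!r}, managers={self.managers!r}, "
            f"contents={list(self._contents.values())!r})"
        )


p = Portfolio("Tech Fund", ("Alice", "Bob"), [Position("AAPL", 15000.0)])

# The repr is valid Python source, so evaluating it yields an equivalent object.
# The namespace passed to eval() is restricted to the two classes it needs.
clone = eval(repr(p), {"Portfolio": Portfolio, "Position": Position})

print(clone is p)              # False: a new object was built
print(repr(clone) == repr(p))  # True: same state
```

The round trip works here because every argument of __init__ appears in the repr under its own keyword name.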

Emulating a Collection

Right now, if we try to gauge the size of our portfolio with len(), Python raises a TypeError.

try:
    len(p)
except TypeError as e:
    print(f"Error: {e}")
Error: object of type 'Portfolio' has no len()

To enable the len() function, we implement __len__.

__len__

class Portfolio(Portfolio): # Inheriting to add methods incrementally for this post
    def __len__(self):
        return len(self._contents)

p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])
print(f"Portfolio size: {len(p)}")
Portfolio size: 2

Now len(p) delegates to the underlying self._contents dictionary. It’s a classic delegation strategy.

__getitem__

We don’t want to scan through a list to find a stock; that’s an O(N) search, while a dictionary lookup is O(1) on average. The __getitem__ method allows us to use the square bracket notation [] to access items directly.

class Portfolio(Portfolio):
    def __getitem__(self, symbol: str) -> Position | None:
        return self._contents.get(symbol)

p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])
print(f"AAPL position: {p['AAPL']}")
print(f"MSFT position: {p['MSFT']}") # Returns None
AAPL position: Position(symbol='AAPL', value_usd=15000.0)
MSFT position: None

With this, our Portfolio supports dictionary-style lookup by symbol, an efficient way to access our assets.
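One design point worth weighing: the version above returns None for an unknown symbol, whereas built-in mappings such as dict raise KeyError from square-bracket access and reserve the quiet behaviour for .get(). A minimal sketch of that stricter convention follows. StrictPortfolio and its get method are hypothetical names used only for illustration, not part of the post's Portfolio class.

```python
from dataclasses import dataclass


@dataclass
class Position:
    symbol: str
    value_usd: float


class StrictPortfolio:
    """Hypothetical variant that follows dict's missing-key convention."""

    def __init__(self, contents):
        self._contents = {pos.symbol: pos for pos in contents}

    def __getitem__(self, symbol):
        # Plain dict indexing raises KeyError for unknown symbols.
        return self._contents[symbol]

    def get(self, symbol, default=None):
        # Explicit opt-in for callers who accept a missing position.
        return self._contents.get(symbol, default)


sp = StrictPortfolio([Position("AAPL", 15000.0)])
print(sp["AAPL"])      # Position(symbol='AAPL', value_usd=15000.0)
print(sp.get("MSFT"))  # None

try:
    sp["MSFT"]
except KeyError as exc:
    print(f"KeyError: {exc}")  # KeyError: 'MSFT'
```

Which behaviour fits better depends on the callers: raising surfaces typos in ticker symbols early, while returning None keeps simple lookups terse.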

Enabling Iteration: __iter__

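Iterating through a collection is fundamental. Our class wraps a dictionary rather than inheriting from one, so it does not get dictionary iteration for free. Without __iter__, Python falls back to calling __getitem__ with 0, 1, 2, and so on until an IndexError is raised, which makes no sense for symbol keys (and, because our __getitem__ never raises, would loop forever yielding None). Even iterating the underlying dictionary directly would only give us the keys (symbols). But when we loop through a portfolio, we usually want the assets themselves. We can dictate this behavior by implementing __iter__.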

class Portfolio(Portfolio):
    def __iter__(self) -> Iterator[Position]:
        return iter(self._contents.values())


p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])

print("Iterating through portfolio:")
for pos in p:
    print(f" - {pos.symbol}: ${pos.value_usd:,.2f}")
Iterating through portfolio:
 - AAPL: $15,000.00
 - GOOG: $140,000.00

Now, for pos in p yields the positions directly. We have full control over the traversal.
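Implementing __iter__ pays off beyond the for loop, because any built-in that accepts an iterable now accepts a portfolio as well. The sketch below uses a trimmed copy of the classes (only the constructor and __iter__) to show a few such cases.

```python
from collections.abc import Iterator
from dataclasses import dataclass


@dataclass
class Position:
    symbol: str
    value_usd: float


class Portfolio:
    def __init__(self, contents):
        self._contents = {pos.symbol: pos for pos in contents}

    def __iter__(self) -> Iterator[Position]:
        return iter(self._contents.values())


p = Portfolio([Position("AAPL", 15000.0), Position("GOOG", 140000.0)])

total = sum(pos.value_usd for pos in p)           # generator expression
largest = max(p, key=lambda pos: pos.value_usd)   # max() over the iterable
symbols = [pos.symbol for pos in p]               # list comprehension
first, second = p                                 # tuple unpacking

print(total)           # 155000.0
print(largest.symbol)  # GOOG
print(symbols)         # ['AAPL', 'GOOG']
print(first.symbol)    # AAPL
```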

Membership and Truthiness: __contains__ and __bool__

Because we implemented __iter__, the in operator already works by scanning every position and comparing it with ==. But that’s like manually auditing every file to find one document: it’s O(N), and it can only match Position objects, never ticker strings. Since we have a hash map (dictionary) underneath, we can do an O(1) check using __contains__.

We also want to know if our portfolio is empty. When __bool__ is missing, Python falls back to __len__ and treats a length of zero as falsy, but implementing __bool__ allows us to be explicit about what constitutes a “truthy” portfolio.

class Portfolio(Portfolio):
    def __contains__(self, item: str | Position) -> bool:
        if isinstance(item, str):
            return item in self._contents
        if isinstance(item, Position):
            return item.symbol in self._contents
        return False

    def __bool__(self) -> bool:
        return bool(self._contents)


p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])

print(f"Is 'AAPL' in p? {'AAPL' in p}")
print(f"Is 'MSFT' in p? {'MSFT' in p}")
print(f"Is p truthy? {bool(p)}")

empty_p = Portfolio("Empty", ("Nobody",))
print(f"Is empty_p truthy? {bool(empty_p)}")
Is 'AAPL' in p? True
Is 'MSFT' in p? False
Is p truthy? True
Is empty_p truthy? False
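To make the fallback behaviour concrete, here is a small sketch with a hypothetical Bare container that defines neither __contains__ nor __bool__. Python still answers both questions, using __iter__ for membership and __len__ for truthiness.

```python
class Bare:
    """Hypothetical container defining only __len__ and __iter__."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


full = Bare(["AAPL", "GOOG"])
empty = Bare([])

# No __contains__: `in` iterates and compares each element with ==.
print("AAPL" in full)  # True
print("MSFT" in full)  # False

# No __bool__: truthiness is decided by whether __len__() returns 0.
print(bool(full))      # True
print(bool(empty))     # False
```

The explicit methods on Portfolio therefore add a faster membership test that also understands ticker strings, and a clearer statement of intent for truthiness.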

Operator Overloading: __add__ (Combining Portfolios)

In Python, objects can be combined using operators. If we have two portfolios, it makes intuitive sense to “add” them together using +.

We implement this via __add__. This isn’t just concatenation; it’s a consolidation. We need to combine managers and sum up the value of overlapping positions.

class Portfolio(Portfolio):
    def __add__(self, other: Portfolio) -> Portfolio:
        if not isinstance(other, Portfolio):
            return NotImplemented

        new_name = f"{self.name} + {other.name}"
        # Combine managers, removing duplicates
        new_managers = tuple(sorted(set(self.managers + other.managers)))
        # Merge contents
        all_positions = {}

        # Add positions from self
        for pos in self:
            all_positions[pos.symbol] = Position(pos.symbol, pos.value_usd)

        # Add positions from other
        for pos in other:
            if pos.symbol in all_positions:
                existing = all_positions[pos.symbol]
                existing.value_usd += pos.value_usd
            else:
                all_positions[pos.symbol] = Position(pos.symbol, pos.value_usd)
        return Portfolio(new_name, new_managers, list(all_positions.values()))


# Example Usage
p1 = Portfolio("Tech", ("Alice",), [Position("AAPL", 15000.0)])
p2 = Portfolio(
    "Growth", ("Bob",), [Position("AAPL", 8000.0), Position("MSFT", 30000.0)]
)

p3 = p1 + p2
print(f"Merged Portfolio: {p3}")
print(f"Merged AAPL: {p3['AAPL']}")
Merged Portfolio: Portfolio(name='Tech + Growth', managers=('Alice', 'Bob'), contents=[Position(symbol='AAPL', value_usd=23000.0), Position(symbol='MSFT', value_usd=30000.0)])
Merged AAPL: Position(symbol='AAPL', value_usd=23000.0)
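Two details of operator overloading are easy to miss. Returning NotImplemented (rather than raising) lets Python try the other operand and, failing that, raise a standard TypeError. And sum() starts from the integer 0, so it only works if the class also handles 0 + obj through __radd__. The sketch below shows both on a hypothetical, stripped-down Book class that maps symbols to USD totals. Neither Book nor the __radd__ method is part of the post's Portfolio.

```python
class Book:
    """Hypothetical stand-in for Portfolio: maps symbol -> USD total."""

    def __init__(self, totals=None):
        self.totals = dict(totals or {})

    def __add__(self, other):
        if not isinstance(other, Book):
            return NotImplemented  # let Python raise TypeError if nobody can add
        merged = dict(self.totals)  # copy, so neither operand is mutated
        for symbol, value in other.totals.items():
            merged[symbol] = merged.get(symbol, 0.0) + value
        return Book(merged)

    def __radd__(self, other):
        # sum() begins with 0 + first_item; treat 0 as an empty book.
        if other == 0:
            return Book(self.totals)
        return NotImplemented


a = Book({"AAPL": 15000.0})
b = Book({"AAPL": 8000.0, "MSFT": 30000.0})

print((a + b).totals)      # {'AAPL': 23000.0, 'MSFT': 30000.0}
print(sum([a, b]).totals)  # {'AAPL': 23000.0, 'MSFT': 30000.0}

try:
    a + 5
except TypeError as exc:
    print(f"TypeError: {exc}")
```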

String Representation: __repr__ vs __str__

Finally, we need to decide how our portfolio presents itself to the world.

  • __repr__ (The Private Ledger): Unambiguous, developer-focused. It should ideally allow you to recreate the object.
  • __str__ (The Public Face): Readable, user-focused. “What does this object represent to a human?”

If __str__ is not defined, Python falls back to __repr__. But we want a nice summary.

class Portfolio(Portfolio):
    def __str__(self):
        return f"Portfolio '{self.name}' with {len(self)} positions managed by {', '.join(self.managers)}"


p = Portfolio("Tech Fund", ("Alice", "Bob"), [pos1, pos2])
print(f"Str: {str(p)}")
print(f"Repr: {repr(p)}")
Str: Portfolio 'Tech Fund' with 2 positions managed by Alice, Bob
Repr: Portfolio(name='Tech Fund', managers=('Alice', 'Bob'), contents=[Position(symbol='AAPL', value_usd=15000.0), Position(symbol='GOOG', value_usd=140000.0)])
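It helps to know which of the two methods each context picks. The sketch below uses a hypothetical Ticker class with deliberately different __str__ and __repr__ output to make the choice visible.

```python
class Ticker:
    """Hypothetical class used only to show which method each context calls."""

    def __init__(self, symbol):
        self.symbol = symbol

    def __repr__(self):
        return f"Ticker({self.symbol!r})"

    def __str__(self):
        return self.symbol


t = Ticker("AAPL")

print(t)         # AAPL              (print uses __str__)
print(f"{t}")    # AAPL              (plain f-string formatting uses __str__)
print(f"{t!r}")  # Ticker('AAPL')    (!r forces __repr__)
print([t])       # [Ticker('AAPL')]  (containers show the __repr__ of their items)
```

This is why the positions inside the Portfolio repr above appear in their full Position(...) form: a list always renders its elements with repr.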

Why This Matters

By implementing these methods, we’ve turned a basic container into a robust and functional class.

  • Lookup by Symbol: Instant access with p['AAPL'].
  • Intuitive Iteration: Seamless traversal with for pos in p.
  • Arithmetic: Combining objects made easy with p1 + p2.
  • Expressiveness: The code reads clearly and naturally.

Summary

The Python Data Model allows your custom objects to integrate seamlessly with built-in types and Python’s core features.

  • __init__: Initialization.
  • __repr__: Detailed developer-focused representation.
  • __str__: User-friendly summary.
  • __len__: Getting the size.
  • __getitem__: Direct item access.
  • __iter__: Enabling iteration.
  • __contains__: Fast membership checking.
  • __bool__: Truthiness check.
  • __add__: Combining objects.

By embracing these patterns, you align with Python’s design philosophy and create clean, Pythonic code.

Note
The complete code for this example is available here: [portfolio.py](portfolio.py).