Strategies for Numerical Integration

Implementing the strategy design pattern in Python.
Pythonic Distractions
Design Patterns
Numerical Methods
Published

April 28, 2024

Modified

August 30, 2024

Numerical integration

Calculation of many financial methods or metrics relies on a mathematical tool called numerical integration. In simple terms, numerical integration takes a function that represents a continuous process (like the changing value of an investment over time) and approximates the area under its curve. This area can then be used to calculate important quantities, like the total return of the investment or price of a derivative instrument.

So we are tasked with a problem, and one that has many different ways of solving, or rather approximating the solution. You most likely (taking into account you’re still reading this) encountered rectangle rule, or Riemann summation in your Calculus 101 course/self-learning. But there are many other techniques, which we call schemes.

For quantitative analysts, the choice of the integration method matters. Different integration schemes offer varying levels of accuracy and efficiency. Unfortunately there’s no best technique. Same algorithm can be a perfect fit for problems with certain characteristics, but unusable for others. And then there is the old as time performance vs. accuracy trade-off.

As programmers working in finance, we need to be adaptable and leverage solutions that allow us to switch between these methods seamlessly.

Design patterns

This is where design patterns come in. Design patterns are reusable solutions to common programming problems. Their widespread adoption in software development is largely attributed to the publication of Design Patterns: Elements of Reusable Object-Oriented Software in 1994. Authored by E. Gamma, R. Helm, R. Johnson, and J. Vlissides (often referred to as the “Gang of Four” or GoF), this book cataloged 23 essential software design patterns. These patterns provided solutions to common design problems in object-oriented programming, promoting code reusability, maintainability, and flexibility.

Some design patterns can feel clunky or inelegant when implemented in Python. The language itself often has built-in features or idioms that achieve the same result in a more Pythonic way (meaning it follows Python’s style and conventions). Sometimes, design patterns can be seen as overcomplicating simple problems. On the other hand, usage of well-known and understood patterns may enhance your engineering skills and improve code readability.

Ultimately, the decision of whether or not to use design patterns in Python depends on the specific context of your project and your coding style. There’s no right or wrong answer. But first, you need to know the classics to diss the classics. We’ll hold on with the dissing for now, cause in the example below chosen design pattern makes for a very clean implementation. You’ll see for yourself.

Strategy pattern

We know the stage now — one problem statement, multiple strategies to tackle. Important observation here is that we don’t actually care which one is used. When you substitute the integral value to client code — a formula or further algorithm — it’s irrelevant how it was computed, as long its correct to required level of accuracy. This means that the problem should be decoupled from algorithms to solve it. We should target a implementation where you can state a problem Calculate the integral of \(\sin(x)\) from \(0\) to \(\pi\) and then just throw different algorithms at it to obtain a solution. So let’s get coding!

Note

I will show you this pattern through an example. If you prefer more generic setup see Refactoring Guru’s implementation. The customary ‘software engineering’ example used to present the SDP is sorting a list of integers using different sorting algorithms.

Abstract schema

Each scheme that we’d come up with, even the most complex ones, would have the same main purpose — ‘integrate’. To make the implementation for it, we create a template class that all concrete schemes will inherit from.

Show the code
from abc import ABC, abstractmethod
from typing import Callable

1class IntegrationScheme(ABC):
    """Abstract base class for integration schemas."""

2    @abstractmethod
    def integrate(
        self,
3        integrand: Callable[[float], float],
        *,
4        start: float,
        end: float,
    ) -> float:
        """Abstract method for integrating a function."""
1
ABC defines abstract classes, serving as a template. Concrete subclasses inherit from it; abstract classes cannot be instantiated.
2
The @abstractmethod decorator signifies this method must be overridden by all concrete subclasses.
3
Callable annotations define function-like objects that can be executed.
4
Type annotations improve readability and can be used by static type checkers like mypy to detect errors early.

Concrete schema implementations

Rectangle Rule

It’s the simplest way of estimating the area under a curve you can think of — cover it with smaller and smaller rectangles with the value of a function at the leftmost point as height constant width.

Implementing this idea is trivial when using numpy, but let’s add some syntactic sugar so the class is sweeter to work with.

Show the code
import numpy as np

1class RectangleScheme(IntegrationScheme):
    """Schema for rectangle integration."""

2    def __init__(
        self,
        steps: int,
    ) -> None:
        """Initializes the rectangle integration config."""
        if steps <= 0:
            raise ValueError("Steps must be greater than 0.")
        self._steps = steps

3    def __str__(self) -> str:
        """Returns the string representation of the schema."""
        return f"Rectangle schema with {self._steps} steps"

4    def integrate(
        self,
        integrand: Callable[[float], float],
        *,
        start: float,
        end: float,
    ) -> float:
        """Integrates a function using rectangle integration."""
        x_points = np.linspace(start, end, self._steps)
        values = integrand(x_points)
        dx = (end - start) / np.float64(self._steps)
        return np.sum(values) * dx
1
RectangleScheme inherits from IntegrationScheme, requiring implementation of integrate.
2
__init__ sets up the rectangle count, including validation for positive values.
3
__str__ customizes the string representation of the class instance.
4
integrate performs the calculation: linear spacing, function evaluation, and summing area elements.

Simple Monte Carlo

This guy sounds fancy with its luxurious Monaco vibes, but it’s just a peasant in a nice suit. Instead of looking at equaly-spaced points, we shuffle them from uniform distribution on the interval of integration. We calculate the integrand function values at those points and sum them up. Then multiply the sum by the average distance between points and through the magic of probability theory (and not opening actual probability textbook in 10 years) you get a good probabilistic estimator of the integral value. The implementation is analogous to the RectangleScheme.

Show the code
from typing import Optional

class MonteCarloScheme(IntegrationScheme):
    """Schema for Monte Carlo integration."""

1    def __init__(
        self,
        random_points: int,
        random_seed: Optional[int] = None,
    ) -> None:
        """Initializes the rectangle integration config."""
        if random_points <= 0:
            raise ValueError("Points must be greater than 0.")
        self.__random_points = random_points
        self.__random_seed = random_seed

    def __str__(self) -> str:
        """Returns the string representation of the schema."""
        points_msg = f"Monte Carlo schema with {self.__random_points} random points"
        seed_msg = f" and seed {self.__random_seed}" if self.__random_seed else ""
        return f"{points_msg}{seed_msg}"

    def integrate(
        self,
        integrand: Callable[[float], float],
        *,
        start: float,
        end: float,
    ) -> float:
        """Integrates a function using Monte Carlo integration."""
        np.random.seed(seed=self.__random_seed)
        x_points = np.random.uniform(start, end, self.__random_points)
        values = integrand(x_points)
        average_dx = (end - start) / np.float64(self.__random_points)
        return np.sum(values) * average_dx
1
Optional[int] allows random_seed to be an integer or None, defaulting to None for non-reproducible results. Note: int | None is preferred in Python 3.11+.
Note

The Optional stands for could be None as well, it doesn’t affect if the input is mandatory or not. In our case it’s not, but thats stated by the = None part. In Python 3.11 onwards it’s recommended to use int | None instead.

Integrator

What’s left is to have a way of defining the problem to solve and define how our schemes (strategies) interact with it.

Show the code
"""An integrator class that allows to perform integration using different schemas."""
from typing import Callable

class Integrator:
    """An integrator class that allows to perform integration using different
    schemas as strategies."""

1    def __init__(
        self,
        integrand: Callable[[float], float],
        interval_start: float,
        interval_end: float,
    ) -> None:
        """Initializes the integrator class."""
        if interval_start >= interval_end:
            raise ValueError("Start value must be less than end value.")
        self.__integrand = integrand
        self.__interval_start = interval_start
        self.__interval_end = interval_end

2    def __call__(
        self,
        schema: IntegrationScheme,
    ) -> float:
        """
        Calculates the definite integral value of a function.

        Args:
            schema: integration schema
        """
        print(f"Using {schema}.")
        return schema.integrate(
            self.__integrand,
            start=self.__interval_start,
            end=self.__interval_end,
        )
1
The __init__ takes in the obvious parameters — function to integrate, start and end of the interval. It also checks if it’s a proper integral.
2
We get to implement our own __call__ method now. It’s clear what Integrator class does. No need to have a method with a descriptive name like Integrator.integrate. To use it you pass through the integration scheme into the integrator — notice annotation of the abstract IntegrationScheme. It prints the info on strategy used (using the __str__ methods) and calls integrate method of the scheme. No care in the world on how the value is actually calculated.

Let’s integrate!

Ok, now to the integrating! Let’s set up the stage:

Show the code
start, end = 0, np.pi / 2.0

def f(x: float) -> float:
    return np.sin(x) + np.cos(x)

Excited? Don’t be… yet.

We should get some benchmark value first. As none of us would bother to integrate this by hand, we’ll use SciPy. Unexpectedly (SciPy uses C and Fortran underneath), we get the result in a breeze and it is very close to actual value of 2.0.

Show the code
from scipy.integrate import quad

scipy_quad, err = quad(f, start, end)
print(scipy_quad)
1.9999999999999998

Now let’s use our Integrator class and see.

Show the code
integrator = Integrator(
    f,
    interval_start=start,
    interval_end=end,
)

iterations = [2**i for i in range(0,21,5)]
rectangle_results = [integrator(RectangleScheme(steps=i)) for i in iterations]
mc_results = [integrator(MonteCarloScheme(random_points=i)) for i in iterations]

print(f"Rectangle schema results:\n{rectangle_results}.")
print(f"Monte Carlo schema results:\n{mc_results}.")
Using Rectangle schema with 1 steps.
Using Rectangle schema with 32 steps.
Using Rectangle schema with 1024 steps.
Using Rectangle schema with 32768 steps.
Using Rectangle schema with 1048576 steps.
Using Monte Carlo schema with 1 random points.
Using Monte Carlo schema with 32 random points.
Using Monte Carlo schema with 1024 random points.
Using Monte Carlo schema with 32768 random points.
Using Monte Carlo schema with 1048576 random points.
Rectangle schema results:
[np.float64(1.5707963267948966), np.float64(1.986172817555692), np.float64(1.9995804632216618), np.float64(1.9999869013603688), np.float64(1.9999995906791064)].
Monte Carlo schema results:
[np.float64(2.2214230318037416), np.float64(1.9990607056532828), np.float64(1.9899131764383533), np.float64(1.999842210451933), np.float64(2.000049750991304)].

The performance and convergence of those schemes is terrible. Like anything in Python, if you want robust and performing code, you need to implement it with C or use any/all of the enhancement frameworks that Python provides (see Numba). Additionally, the simple methods we implemented are very naive. The standard numerical packages use sophisticated algorithms honed for many decades.

But I was wrong! You should be excited! We just learned new approach for setting up extensible and readable code! Look how cleanly the problem statement is separated form different strategies to solve it.

If you are now wondering how much we could improve by using more advanced techniques (like stratified Monte Carlo or adaptive quadrature) you just need to implement new subclass of `IntegrationSchema’ and you’re done. No changes to the existing code are needed, just simple extension. And that’s the idea behind strategy pattern.

Note

Download the whole code here.

Back to top