Using Lookup Tables and Trig Functions to Optimize Non-Uniform Random Number Generation
Speeding Up Non-Uniform Random Number Generation
Generating non-uniform random numbers is a common task in statistical sampling, Monte Carlo simulations, and other applications. However, basic algorithms for generating non-uniform distributions are often inefficient. As the number of random values needed grows into the millions or billions, the computational costs can become prohibitive.
This article examines two optimization techniques to speed up non-uniform random number generation: lookup tables and trigonometric functions. We provide code examples in Python demonstrating the usage and performance benefits of each method. Finally, we give recommendations on when to utilize lookup tables versus trig functions or a hybrid approach for different use cases.
The Problem With Basic Approaches
A basic technique for generating non-uniform random numbers is inverse transform sampling. This involves evaluating the inverse cumulative distribution function (CDF) for each random number desired. For example, to generate random numbers with a Gaussian distribution, the inverse error function erfinv() can be used to map each uniform random number u to a Gaussian value via sqrt(2) * erfinv(2u - 1).
However, computing inverse CDFs can be computationally expensive, often requiring iterative methods or special functions. As more random numbers are needed, these calculations become the bottleneck. Generating just one million Gaussian random variables by evaluating erfinv() one value at a time in Python already requires seconds of computation.
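For reference, a minimal sketch of this baseline (assuming SciPy's erfinv is available) makes the per-value cost explicit; the one-at-a-time loop below is exactly what the optimizations in this article try to avoid.
import math
import numpy as np
from scipy.special import erfinv
# Naive inverse transform sampling: one inverse CDF evaluation per sample
def gaussian_inverse_transform(n):
    samples = []
    for u in np.random.rand(n):
        # Standard normal inverse CDF: sqrt(2) * erfinv(2u - 1)
        samples.append(math.sqrt(2) * erfinv(2 * u - 1))
    return samples
samples = gaussian_inverse_transform(1000000)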
Likewise, basic rejection sampling techniques, which repeatedly draw candidate values and discard those that fail an acceptance test, waste a large fraction of their draws and also become slow when generating large quantities of non-uniform random numbers. More optimized approaches are needed.
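As a rough illustration, a basic accept-reject loop for a standard normal (here truncated to [-4, 4] and using a flat proposal, assumptions made only for this sketch) rejects most of the uniform values it consumes:
import math
import random
# Basic rejection sampling for a (truncated) standard normal:
# propose uniformly on [-4, 4], accept with probability pdf(x) / pdf(0)
def gaussian_rejection(n):
    samples = []
    while len(samples) < n:
        x = random.uniform(-4.0, 4.0)
        if random.random() < math.exp(-0.5 * x * x):
            samples.append(x)
    return samples
samples = gaussian_rejection(1000)
Only about 30% of the candidate draws are accepted here, so most of the work is thrown away.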
Introducing Lookup Tables
Precalculating Values in Tables
An optimization technique that can accelerate non-uniform random generation is to utilize lookup tables. This involves precomputing the inverse CDF mapping and storing the results in an array or dict structure ahead of time.
For instance, to generate Gaussian-distributed random numbers, we could create a table with one million inverse CDF values computed in advance. Then, during random number generation, we simply scale each uniform random number to an integer index into the table and look up the corresponding precomputed Gaussian value in constant time. The runtime cost is a single array access per sample.
Code Example of Lookup Table Creation
import numpy as np
import math
from scipy.special import erfinv
# Create lookup table of one million inverse CDF values,
# indexed by equally spaced points of the uniform (0, 1) range
size = 1000000
gaussian_lookup = np.empty(size)
for i in range(size):
    uniform_val = (i + 0.5) / size
    # Standard normal inverse CDF: sqrt(2) * erfinv(2u - 1)
    gaussian_lookup[i] = math.sqrt(2) * erfinv(2 * uniform_val - 1)
# Generate random numbers by scaling each uniform value to a table index
uniform_rands = np.random.rand(10)
gauss_rands = [gaussian_lookup[int(u * size)] for u in uniform_rands]
As we can see, populating the lookup table requires substantial precomputation, but we amortize this cost by reusing the table across millions or billions of draws later. Random number generation is reduced to a fast array lookup per value.
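To make the amortization concrete, a rough timing sketch (reusing the gaussian_lookup table and size from the example above; absolute timings will vary by machine) could compare per-value erfinv calls against table lookups:
import time
n = 1000000
uniform_rands = np.random.rand(n)
start = time.perf_counter()
direct = [math.sqrt(2) * erfinv(2 * u - 1) for u in uniform_rands]
print("per-value erfinv:", time.perf_counter() - start, "seconds")
start = time.perf_counter()
via_table = [gaussian_lookup[int(u * size)] for u in uniform_rands]
print("table lookup:", time.perf_counter() - start, "seconds")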
Leveraging Trigonometric Functions
Mapping Uniform Distributions to Non-Uniform Ones With Sine and Cosine
In addition to lookup tables, we can also optimize non-uniform random number generation by exploiting properties of trigonometric functions. Specifically, sine and cosine can remap pairs of uniform random variables directly into other distributions.
The classic example is the Box-Muller transform: scale one uniform variable to an angle around the full circle, take its sine (or cosine), and multiply by a radius derived from the logarithm of a second uniform variable. The result is an exactly Gaussian random number, obtained without any inverse CDF evaluations or iterative methods, and exponentiating it yields log-normal values.
Code Example of the Box-Muller Transform
import numpy as np
# Map pairs of uniform random numbers to Gaussian random numbers
# using the Box-Muller transform (a sine, a logarithm, and a square root)
def box_muller_gaussian(u1, u2):
    radius = np.sqrt(-2.0 * np.log(u1))
    angle = 2.0 * np.pi * u2
    return radius * np.sin(angle)  # radius * np.cos(angle) gives a second independent sample
# Generate random numbers
u1 = np.random.rand(10)
u2 = np.random.rand(10)
gauss_rands = box_muller_gaussian(u1, u2)
With just a sine, a logarithm, and a square root per pair of uniform inputs, we map our uniform random numbers into an exactly Gaussian distribution, no inverse CDF calculations required!
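A quick sanity check (a sketch using the box_muller_gaussian function above; exact numbers vary from run to run) is to draw a large batch and confirm the sample mean is near 0 and the sample standard deviation is near 1:
u1 = np.random.rand(1000000)
u2 = np.random.rand(1000000)
z = box_muller_gaussian(u1, u2)
print(z.mean(), z.std())  # expect values close to 0.0 and 1.0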
Combining Lookup Tables and Trig Functions
Getting the Best of Both Approaches
For certain cases, the best performance can be achieved by combining both the lookup table and trigonometric function optimization approaches.
Specifically, we can utilize lookup tables for discrete distributions with small domain sizes, while trigonometric transforms handle the continuous distribution mappings. Bringing both techniques together lets us efficiently generate random samples from a wide range of distributions.
Code Example of a Hybrid Solution
import numpy as np
# Lookup table for a small discrete custom distribution
# (example outcome probabilities; any weights summing to 1 work)
weights = [0.1, 0.2, 0.05, 0.15, 0.3, 0.2]
cdf = np.cumsum(weights)
table_size = 1024
# Precompute which discrete outcome each uniform bin maps to
lookup = [int(np.searchsorted(cdf, (k + 0.5) / table_size)) for k in range(table_size)]
# Box-Muller-style transform for a continuous log-normal distribution
def trig_lognorm(u1, u2):
    # Standard normal via sine and logarithm, then exponentiate
    z = np.sqrt(-2.0 * np.log(u1)) * np.sin(2.0 * np.pi * u2)
    return np.exp(z)
# Generate mixed random number sequence:
# 20% discrete samples from the lookup table, 80% continuous log-normal
uniform_rands = np.random.rand(100)
rands = []
for r in uniform_rands:
    if r < 0.2:
        u = r / 0.2  # rescale the branch selector back to a uniform on [0, 1)
        rands.append(lookup[int(u * table_size)])  # lookup discrete distribution
    else:
        rands.append(trig_lognorm(np.random.rand(), np.random.rand()))  # continuous log-normal
Here we use a lookup table and a trigonometric transform together to efficiently generate a mixed distribution. The lookup table handles an arbitrary small discrete distribution, while the Box-Muller-style transform generates continuous log-normal random values.
When to Use Each Optimization Technique
Now that we have explored lookup tables and trigonometric functions for optimizing non-uniform random number generation, when should we utilize each approach? Here are some general guidelines:
Lookup Tables for Small Finite Domains
Lookup tables shine when we need to generate random numbers from a distribution with a relatively small, finite domain, such as a discrete custom distribution. By precomputing the entire distribution into a table, we can later draw samples extremely quickly. Lookup tables also work well for accelerating common named distributions like the Gaussian, exponential, and Poisson, as long as the discretization introduced by the finite table is acceptable for the application (interpolating between neighboring entries reduces this error).
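For instance, the same table-building pattern applies to the exponential distribution, whose inverse CDF is -ln(1 - u) / lambda; a minimal sketch (assuming rate lambda = 1) looks like:
import numpy as np
size = 1000000
grid = (np.arange(size) + 0.5) / size
exp_lookup = -np.log(1.0 - grid)  # inverse CDF of Exponential(rate=1)
uniform_rands = np.random.rand(10)
exp_rands = exp_lookup[(uniform_rands * size).astype(int)]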
Trig Functions for Continuous Distributions
For continuous distributions with unbounded support, such as the Gaussian, log-normal, or Cauchy, trigonometric transforms map uniform inputs directly into the target distribution without any lookup table. They require no precomputation and no table memory, introduce no discretization error, and avoid iterative methods or expensive special functions, typically costing only a few elementary function evaluations per sample.
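As a further example beyond Box-Muller, the standard Cauchy distribution has the closed-form inverse CDF tan(pi * (u - 1/2)), so a single tangent per uniform input is enough; a minimal sketch:
import numpy as np
def cauchy_from_uniform(uniform_rands):
    # Inverse CDF of the standard Cauchy distribution
    return np.tan(np.pi * (uniform_rands - 0.5))
cauchy_rands = cauchy_from_uniform(np.random.rand(10))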
Hybrid for Large or Complex Domains
In some cases, employing both lookup tables and trigonometric functions together as a hybrid approach is beneficial. Lookup tables can be used to generate random samples from arbitrary discrete distributions. The trig functions complement this by handling continuous distribution mappings. By combining approaches, we can support random number generation efficiently across a very wide range of distributions and domain sizes.
In summary, lookup tables and trigonometric function optimizations can provide substantial speedups for non-uniform random number generation. Applying these techniques makes it practical to generate the vast quantities of random values often required by modern statistical applications.