Usage Patterns

This interactive guide shows common patterns for using dags. Run each cell to see how the library works step by step.

Let’s start by importing the libraries:

import numpy as np

import dags

Pattern 1: Building Computational Pipelines

The core use case for dags is combining multiple interdependent functions into a single callable. Let’s build a simple data processing pipeline step by step.

First, we define three functions that depend on each other:

def cleaned_data(raw_data):
    """Filter out negative values."""
    return [x for x in raw_data if x > 0]


def statistics(cleaned_data):
    """Compute summary stats from cleaned data."""
    return {
        "mean": sum(cleaned_data) / len(cleaned_data),
        "count": len(cleaned_data),
    }


def report(statistics, cleaned_data):
    """Generate a text report."""
    return f"Processed {statistics['count']} items, mean: {statistics['mean']}"

Notice how the parameter names create a dependency graph:

  • cleaned_data depends on raw_data (external input)

  • statistics depends on cleaned_data

  • report depends on both statistics and cleaned_data
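
Under the hood, concatenate_functions wires these calls together in dependency order. A hand-written equivalent of the pipeline (shown only to illustrate what dags automates; the logic is inlined from the three functions above) looks like this:

```python
# Manual wiring of the same three steps -- concatenate_functions
# derives this call order from the parameter names:
raw_data = [1, -2, 3, 4, -5, 6]
cleaned = [x for x in raw_data if x > 0]  # cleaned_data
stats = {"mean": sum(cleaned) / len(cleaned), "count": len(cleaned)}  # statistics
text = f"Processed {stats['count']} items, mean: {stats['mean']}"  # report
text  # 'Processed 4 items, mean: 3.5'
```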

Now let’s combine them into a pipeline:

functions = {
    "cleaned_data": cleaned_data,
    "statistics": statistics,
    "report": report,
}

pipeline = dags.concatenate_functions(
    functions=functions,
    targets=["report"],
    return_type="dict",
)

Let’s run it with some test data:

result = pipeline(raw_data=[1, -2, 3, 4, -5, 6])
result

Try changing the input data to see different results:

pipeline(raw_data=[10, 20, -30, 40, 50])

Example: Economic Model with Utility Maximization

Now let’s build something more realistic—a consumer choice model. We’ll define the building blocks first:

def utility(consumption, risk_aversion):
    """CRRA utility function."""
    if risk_aversion == 1:
        return np.log(consumption)
    return (consumption ** (1 - risk_aversion)) / (1 - risk_aversion)


def budget_constraint(income, price):
    """Maximum affordable consumption."""
    return income / price


def feasible(consumption, budget_constraint):
    """Check if consumption is affordable."""
    return consumption <= budget_constraint


def optimal_utility(budget_constraint, risk_aversion):
    """Find maximum utility over a grid of consumption values."""
    consumption_grid = np.linspace(0.1, budget_constraint, 100)
    if risk_aversion == 1:
        utilities = np.log(consumption_grid)
    else:
        utilities = (consumption_grid ** (1 - risk_aversion)) / (1 - risk_aversion)
    return float(np.max(utilities))

Let’s collect all functions:

functions = {
    "utility": utility,
    "budget_constraint": budget_constraint,
    "feasible": feasible,
    "optimal_utility": optimal_utility,
}

Now the power of dags becomes clear: we can create different combined functions from the same building blocks depending on what we need.

Use case 1: Solve for optimal utility given income and prices:

solve_model = dags.concatenate_functions(
    functions=functions,
    targets=["optimal_utility"],
    return_type="dict",
)

solve_model(income=1000, price=10, risk_aversion=2)

Use case 2: Evaluate whether a specific consumption choice is feasible and what utility it gives:

evaluate_choice = dags.concatenate_functions(
    functions=functions,
    targets=["utility", "feasible"],
    return_type="dict",
)

evaluate_choice(income=1000, price=10, consumption=50, risk_aversion=2)

What if consumption is too high?

evaluate_choice(income=1000, price=10, consumption=150, risk_aversion=2)

Use case 3: Just compute the budget constraint (dags only runs what’s needed):

get_budget = dags.concatenate_functions(
    functions=functions,
    targets=["budget_constraint"],
    return_type="dict",
)

get_budget(income=1000, price=10)

This pattern is powerful when:

  • You have a complex model with many interrelated components

  • Different use cases require computing different subsets of outputs

  • You want to avoid code duplication by reusing the same function definitions

Pattern 2: Aggregating Multiple Functions

When you need to combine multiple functions into a single result (like checking if ALL constraints are satisfied), use an aggregator.

Let’s define three constraints for a consumption choice:

def positive_consumption(consumption):
    """Consumption must be positive."""
    return consumption > 0


def within_budget(consumption, budget_constraint):
    """Consumption must not exceed budget."""
    return consumption <= budget_constraint


def minimum_savings(consumption, income):
    """Must save at least 10% of income."""
    return consumption <= 0.9 * income

Now combine them with all as the aggregator:

all_feasible = dags.concatenate_functions(
    functions={
        "positive_consumption": positive_consumption,
        "within_budget": within_budget,
        "minimum_savings": minimum_savings,
    },
    targets=["positive_consumption", "within_budget", "minimum_savings"],
    aggregator=all,
)
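
Conceptually, the aggregator folds the target outputs into a single value instead of returning a dict. With all, the combined function behaves roughly like this hand-written version (an illustrative sketch with the constraint logic inlined, not the code dags generates):

```python
def all_feasible_sketch(consumption, budget_constraint, income):
    # Evaluate each constraint, then fold the results with all().
    results = [
        consumption > 0,                   # positive_consumption
        consumption <= budget_constraint,  # within_budget
        consumption <= 0.9 * income,       # minimum_savings
    ]
    return all(results)

all_feasible_sketch(80, 100, 100)  # True
```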

Test with a valid choice (consumption=80 with income=100 and budget=100):

all_feasible(consumption=80, budget_constraint=100, income=100)

Now test with consumption=95 (violates the 10% savings rule):

all_feasible(consumption=95, budget_constraint=100, income=100)

You can check individual constraints to see which one failed:

print(f"positive: {positive_consumption(95)}")
print(f"within_budget: {within_budget(95, 100)}")
print(f"minimum_savings: {minimum_savings(95, 100)}  <- this fails (95 > 90)")

Pattern 3: Generating Functions for Multiple Scenarios

In economic modeling, you often need similar functions for different time periods, regions, or agent types. You can generate them programmatically.

Here’s a factory function that creates tax calculators:

def create_income_tax(rate, threshold):
    """Create a tax function with given rate and threshold."""

    def income_tax(gross_income):
        taxable = max(0, gross_income - threshold)
        return taxable * rate

    return income_tax
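
Each call to the factory returns a closure that captures its own rate and threshold, so several tax regimes can coexist. A standalone demonstration (the factory is repeated in slightly condensed form so the cell runs on its own):

```python
def create_income_tax(rate, threshold):  # repeated from above, condensed
    def income_tax(gross_income):
        return max(0, gross_income - threshold) * rate
    return income_tax

tax_a = create_income_tax(0.25, 10_000)
tax_b = create_income_tax(0.30, 12_000)
# Each closure keeps its own rate and threshold:
tax_a(50_000), tax_b(50_000)
```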

Define tax rules that changed over time:

tax_rules = {
    2020: {"rate": 0.25, "threshold": 10000},
    2021: {"rate": 0.27, "threshold": 12000},
    2022: {"rate": 0.30, "threshold": 12000},
}
tax_rules

Generate year-specific tax functions using rename_arguments to give each function its own input:

functions = {}
for year, params in tax_rules.items():
    tax_func = create_income_tax(params["rate"], params["threshold"])
    functions[f"tax_{year}"] = dags.rename_arguments(
        tax_func, mapper={"gross_income": f"income_{year}"}
    )

list(functions.keys())
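
Conceptually, renaming arguments wraps the function and forwards keyword values under the old names. A minimal sketch of the idea (illustrative only; the real rename_arguments also rewrites the visible signature so dags can resolve dependencies from the new names):

```python
def rename_sketch(func, mapper):
    # mapper: {old_name: new_name}; the wrapper accepts the new names
    # and forwards each value to func under the corresponding old name.
    inverse = {new: old for old, new in mapper.items()}

    def wrapper(**kwargs):
        return func(**{inverse[k]: v for k, v in kwargs.items()})

    return wrapper

f = rename_sketch(lambda gross_income: 0.25 * gross_income,
                  mapper={"gross_income": "income_2020"})
f(income_2020=1000)  # 250.0
```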

Add a function that sums up all years:

def total_tax_burden(tax_2020, tax_2021, tax_2022):
    """Sum of taxes across all years."""
    return tax_2020 + tax_2021 + tax_2022


functions["total_tax_burden"] = total_tax_burden

Combine and compute:

combined = dags.concatenate_functions(
    functions=functions,
    targets=["total_tax_burden"],
    return_type="dict",
)

combined(income_2020=50000, income_2021=55000, income_2022=60000)

Let’s verify this manually:

tax_2020 = (50000 - 10000) * 0.25
tax_2021 = (55000 - 12000) * 0.27
tax_2022 = (60000 - 12000) * 0.30
print(f"2020: {tax_2020}, 2021: {tax_2021}, 2022: {tax_2022}")
print(f"Total: {tax_2020 + tax_2021 + tax_2022}")

Pattern 4: Selective Computation

When your function graph contains expensive computations, create different combined functions that compute only what’s needed. dags automatically prunes the graph.

Here’s a simulation example:

def simulated_data(parameters, n_simulations):
    """Monte Carlo simulation (the expensive part)."""
    rng = np.random.default_rng(42)
    return rng.normal(
        loc=parameters["mean"], scale=parameters["std"], size=n_simulations
    )


def summary_statistics(simulated_data):
    """Compute mean, std from simulations."""
    return {
        "mean": float(np.mean(simulated_data)),
        "std": float(np.std(simulated_data)),
    }


def full_distribution(simulated_data):
    """Compute empirical distribution (percentiles)."""
    return {
        "p10": float(np.percentile(simulated_data, 10)),
        "p50": float(np.percentile(simulated_data, 50)),
        "p90": float(np.percentile(simulated_data, 90)),
    }


def quick_check(parameters):
    """Fast sanity check (doesn't need simulation)."""
    return all(v > 0 for v in parameters.values())


functions = {
    "simulated_data": simulated_data,
    "summary_statistics": summary_statistics,
    "full_distribution": full_distribution,
    "quick_check": quick_check,
}

Create three different combined functions for different purposes:

Validator - only runs quick_check, skips simulation entirely:

validator = dags.concatenate_functions(
    functions=functions,
    targets=["quick_check"],
    return_type="dict",
)

validator(parameters={"mean": 10, "std": 2})

Summarizer - runs simulation + summary stats:

summarizer = dags.concatenate_functions(
    functions=functions,
    targets=["summary_statistics"],
    return_type="dict",
)

summarizer(parameters={"mean": 10, "std": 2}, n_simulations=1000)

Full analysis - runs everything:

full_analysis = dags.concatenate_functions(
    functions=functions,
    targets=["summary_statistics", "full_distribution"],
    return_type="dict",
)

full_analysis(parameters={"mean": 10, "std": 2}, n_simulations=1000)

Pattern 5: Dependency Analysis

Use get_ancestors to analyze which inputs affect specific outputs. This is useful for understanding model structure.
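
The idea behind get_ancestors is a backward walk over parameter names: start from the targets and repeatedly pull in each function's parameters. A minimal sketch of that traversal (illustrative only, not the library's implementation):

```python
import inspect

def ancestors_sketch(functions, target):
    seen = set()
    stack = [target]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        if name in functions:  # internal node: recurse into its parameters
            stack.extend(inspect.signature(functions[name]).parameters)
    return seen

funcs = {"b": lambda a: a + 1, "c": lambda b, x: b * x}
sorted(ancestors_sketch(funcs, "c"))  # ['a', 'b', 'c', 'x']
```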

Let’s build a small income model:

def wage(education, experience):
    return 20000 + 5000 * education + 1000 * experience


def capital_income(wealth, interest_rate):
    return wealth * interest_rate


def total_income(wage, capital_income):
    return wage + capital_income


def consumption(total_income, savings_rate):
    return total_income * (1 - savings_rate)


functions = {
    "wage": wage,
    "capital_income": capital_income,
    "total_income": total_income,
    "consumption": consumption,
}

What affects consumption? (includes both functions and their inputs)

ancestors = dags.get_ancestors(
    functions=functions,
    targets=["consumption"],
    include_targets=True,
)
ancestors

What are the external inputs (parameters the user must provide)?

all_args = set()
for func in functions.values():
    all_args.update(dags.get_free_arguments(func))

external_inputs = all_args - set(functions.keys())
external_inputs

Pattern 6: Working with Nested Structures

Use dags.tree for hierarchical function organization. This is useful when you have functions grouped by category.

Here’s a tax-transfer system organized hierarchically:

import dags.tree as dt

functions = {
    "income": {
        "wage": lambda hours, hourly_wage: hours * hourly_wage,
        "capital": lambda wealth, interest_rate: wealth * interest_rate,
    },
    "taxes": {
        "income_tax": lambda income__wage, income__capital: (
            0.3 * (income__wage + income__capital)
        ),
    },
    "transfers": {
        "basic_income": lambda: 500,
    },
    "net_income": (
        lambda income__wage, income__capital, taxes__income_tax, transfers__basic_income: (
            income__wage + income__capital - taxes__income_tax + transfers__basic_income
        )
    ),
}

Flatten the nested structure to qualified names (using __ as separator):

flat_functions = dt.flatten_to_qnames(functions)
list(flat_functions.keys())
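
The flattening step itself is simple recursion over the nested dict, joining keys with a double underscore. A minimal sketch of the idea (illustrative, not the dags.tree implementation):

```python
def flatten_sketch(tree, prefix=""):
    flat = {}
    for key, value in tree.items():
        qname = f"{prefix}__{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_sketch(value, qname))  # recurse into subtree
        else:
            flat[qname] = value
    return flat

flatten_sketch({"income": {"wage": 1, "capital": 2}, "net": 3})
# {'income__wage': 1, 'income__capital': 2, 'net': 3}
```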

Now combine and run:

combined = dags.concatenate_functions(
    functions=flat_functions,
    targets=["net_income"],
    return_type="dict",
)

combined(hours=40, hourly_wage=25, wealth=10000, interest_rate=0.05)

Let’s verify:

  • Wage: 40 × 25 = 1000

  • Capital: 10000 × 0.05 = 500

  • Tax: 0.3 × (1000 + 500) = 450

  • Net: 1000 + 500 - 450 + 500 = 1550 ✓

wage = 40 * 25
capital = 10000 * 0.05
tax = 0.3 * (wage + capital)
net = wage + capital - tax + 500
print(f"Wage: {wage}, Capital: {capital}, Tax: {tax}, Net: {net}")

See the Tree documentation for more details.

Pattern 7: Signature Inspection and Modification

Sometimes you need to inspect or modify function signatures. Here are the tools available:

def model(alpha, beta, gamma):
    return alpha + beta * gamma

Inspect a function’s arguments:

dags.get_free_arguments(model)

Rename arguments to match your naming convention:

renamed = dags.rename_arguments(
    model,
    mapper={
        "alpha": "intercept",
        "beta": "slope",
        "gamma": "x",
    },
)

dags.get_free_arguments(renamed)

Test the renamed function:

renamed(intercept=1, slope=2, x=3)

Get type annotations from a function:

def typed_func(x: float, y: int) -> float:
    return x + y


dags.get_annotations(typed_func)

Best Practices

  1. Use descriptive function names: Since dags uses names for dependency resolution, clear names make the DAG easier to understand and debug.

  2. Keep functions focused: Each function should do one thing well, making the DAG modular and testable.

  3. Document dependencies: Even though dags infers dependencies from parameter names, documenting expected inputs in docstrings helps maintainability.

  4. Use enforce_signature=False for dynamic cases:

combined = dags.concatenate_functions(
    functions={
        "report": report,
        "statistics": statistics,
        "cleaned_data": cleaned_data,
    },
    targets=["report"],
    enforce_signature=False,
)
  5. Set annotations for type checking:

combined = dags.concatenate_functions(
    functions={
        "report": report,
        "statistics": statistics,
        "cleaned_data": cleaned_data,
    },
    targets=["report"],
    set_annotations=True,
)