Using task returns

The tutorial Defining dependencies and products presented different ways to specify products. It may seem unintuitive at first that none of these approaches uses function returns, even though one would usually associate the return value of a function with its product.

This guide shows how to specify the products of tasks via function returns. Besides being a potentially more intuitive interface, it lets you turn any function, even a third-party function, into a task function. It also requires more knowledge of pytask’s internals, which is why it is not covered in the tutorials.

Return type annotations

One way to declare the returns of functions as products is annotating the return type. In the following example, the second value of typing.Annotated is a path that defines where the return of the function, a string, should be stored.

Python 3.10+

from pathlib import Path
from typing import Annotated


def task_create_file() -> Annotated[str, Path("file.txt")]:
    return "This is the content of the text file."

Python 3.8+

from pathlib import Path

from typing_extensions import Annotated

def task_create_file() -> Annotated[str, Path("file.txt")]:
    return "This is the content of the text file."

This works because pytask internally converts the path to a pytask.PathNode, which is able to store objects of type str and bytes.
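As a rough illustration of why a plain path in the annotation is enough, the sketch below reads the metadata of the return annotation with the standard library and writes the return value to that path. It is not pytask's implementation: `run_and_store` is a hypothetical helper, and the file is written to a temporary directory only to keep the example self-contained.

```python
import tempfile
from pathlib import Path
from typing import Annotated, get_type_hints


def create_file() -> Annotated[str, Path("file.txt")]:
    return "This is the content of the text file."


def run_and_store(func, root: Path) -> Path:
    """Hypothetical helper: call func and write its return to the annotated path."""
    # include_extras=True keeps the Annotated metadata instead of stripping it.
    hints = get_type_hints(func, include_extras=True)
    target = root / hints["return"].__metadata__[0]
    value = func()
    # Mirror the two supported content types: bytes and str.
    if isinstance(value, bytes):
        target.write_bytes(value)
    else:
        target.write_text(value)
    return target


with tempfile.TemporaryDirectory() as tmp:
    written = run_and_store(create_file, Path(tmp))
    stored = written.read_text()
```

The first argument of `Annotated` is the ordinary type hint, and everything after it is metadata that tools can inspect at runtime.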

Task decorator

If you cannot add a return type annotation to the task function, for example, because it is a lambda or a third-party function, you can use @pytask.task(produces=...) instead.

from pathlib import Path

from pytask import task

func = lambda *x: "This is the content of the text file."

task_create_file = task(produces=Path("file.txt"))(func)

Multiple returns

If a task function has multiple returns, you can use multiple nodes to store each of the returns in a different place. The following example shows how to accomplish it with both of the previous interfaces.

Python 3.10+

from pathlib import Path
from typing import Annotated


def task_create_files() -> Annotated[tuple[str, str], (Path("file1.txt"), Path("file2.txt"))]:
    return "This is the first content.", "This is the second content."

Python 3.8+

from pathlib import Path
from typing import Tuple

from typing_extensions import Annotated


def task_create_files() -> Annotated[Tuple[str, str], (Path("file1.txt"), Path("file2.txt"))]:
    return "This is the first content.", "This is the second content."

@pytask.task

from pathlib import Path

from pytask import task

# The parentheses are required. Without them, func would be the tuple (lambda, str).
func = lambda *x: ("This is the first content.", "This is the second content.")

task_create_file = task(produces=(Path("file1.txt"), Path("file2.txt")))(func)

Each return is mapped to the node at the same position in the tuple.
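The positional mapping can be pictured with a small standard-library sketch. It is only an illustration under simplifying assumptions, not what pytask does internally: `store_by_position` is a hypothetical helper, and the files are written to a temporary directory.

```python
import tempfile
from pathlib import Path
from typing import Annotated, get_type_hints


def create_files() -> Annotated[tuple, (Path("file1.txt"), Path("file2.txt"))]:
    return "This is the first content.", "This is the second content."


def store_by_position(func, root: Path) -> list:
    """Hypothetical helper: write the i-th return to the i-th annotated path."""
    # The tuple of paths is the single metadata entry of the annotation.
    paths = get_type_hints(func, include_extras=True)["return"].__metadata__[0]
    values = func()
    if len(paths) != len(values):
        raise ValueError("The number of returns and paths differs.")
    written = []
    for path, value in zip(paths, values):
        target = root / path
        target.write_text(value)
        written.append(target)
    return written


with tempfile.TemporaryDirectory() as tmp:
    files = store_by_position(create_files, Path(tmp))
    contents = {file.name: file.read_text() for file in files}
```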

In fact, any PyTree can be used. The only requirement is that the PyTree of nodes defined to capture the function returns has the same structure as the returns or is a shallower tree.

The following example shows how a task function with a complex structure of returns is mapped to the defined nodes.

Python 3.10+

from typing import Annotated
from typing import Any

from pytask import PythonNode

nodes = [
    {"first": PythonNode(name="dict1"), "second": PythonNode(name="dict2")},
    (PythonNode(name="tuple1"), PythonNode(name="tuple2")),
    PythonNode(name="int"),
]


def task_example() -> Annotated[Any, nodes]:
    return [{"first": "a", "second": {"b": 1, "c": 2}}, (3, 4), 5]

Python 3.8+

from typing import Any

from pytask import PythonNode
from typing_extensions import Annotated

nodes = [
    {"first": PythonNode(name="dict1"), "second": PythonNode(name="dict2")},
    (PythonNode(name="tuple1"), PythonNode(name="tuple2")),
    PythonNode(name="int"),
]


def task_example() -> Annotated[Any, nodes]:
    return [{"first": "a", "second": {"b": 1, "c": 2}}, (3, 4), 5]

@pytask.task

from pytask import PythonNode
from pytask import task

nodes = [
    {"first": PythonNode(name="dict1"), "second": PythonNode(name="dict2")},
    (PythonNode(name="tuple1"), PythonNode(name="tuple2")),
    PythonNode(name="int"),
]


func = lambda *x: [{"first": "a", "second": {"b": 1, "c": 2}}, (3, 4), 5]


task_example = task(produces=nodes)(func)

The returns are mapped to the nodes as follows.

PythonNode(name="dict1") <- "a"
PythonNode(name="dict2") <- {"b": 1, "c": 2}
PythonNode(name="tuple1") <- 3
PythonNode(name="tuple2") <- 4
PythonNode(name="int") <- 5
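The mapping above can be reproduced with a short recursive function. This is a simplified sketch rather than pytask's actual PyTree handling: plain strings stand in for the PythonNode objects, only dicts, lists, and tuples are treated as containers, and `map_returns_to_nodes` is a hypothetical name.

```python
def map_returns_to_nodes(nodes, returns, mapping=None):
    """Pair every leaf of ``nodes`` with the matching subtree of ``returns``."""
    if mapping is None:
        mapping = {}
    if isinstance(nodes, dict):
        if not isinstance(returns, dict) or nodes.keys() != returns.keys():
            raise ValueError("The dictionaries do not have the same keys.")
        for key in nodes:
            map_returns_to_nodes(nodes[key], returns[key], mapping)
    elif isinstance(nodes, (list, tuple)):
        if not isinstance(returns, type(nodes)) or len(nodes) != len(returns):
            raise ValueError("The sequences do not have the same type and length.")
        for node, value in zip(nodes, returns):
            map_returns_to_nodes(node, value, mapping)
    else:
        # A leaf in the tree of nodes captures the whole remaining subtree of
        # the returns, which is why the tree of nodes may be shallower.
        mapping[nodes] = returns
    return mapping


# Strings replace the PythonNode objects of the example above.
nodes = [{"first": "dict1", "second": "dict2"}, ("tuple1", "tuple2"), "int"]
returns = [{"first": "a", "second": {"b": 1, "c": 2}}, (3, 4), 5]

mapping = map_returns_to_nodes(nodes, returns)
```

Because the recursion stops at the leaves of the tree of nodes, the nested dictionary `{"b": 1, "c": 2}` is stored as a whole under `"dict2"`.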