Functional Interface
pytask offers a functional interface for users who need more flexibility than the command line interface provides. It also lets you run pytask from a Python interpreter or a Jupyter notebook, such as the one this article is written in.
Let’s see how it works!
[1]:
from pathlib import Path
from typing import Annotated
import pytask
from pytask import task
Here is a small workflow in which two tasks each create a text file and a third task merges both files into one.
One important detail is that the second task is created from a lambda function, so you can use dynamically defined functions to create tasks.
The same pattern also makes it easy to wrap any third-party function whose signature you do not control and turn it into a pytask task.
[2]:
def task_create_first_file() -> Annotated[str, Path("first.txt")]:
    return "Hello, "


task_create_second_file = task(
    name="task_create_second_file", produces=Path("second.txt")
)(lambda *x: "World!")


def task_merge_files(
    first: Path = Path("first.txt"), second: Path = Path("second.txt")
) -> Annotated[str, Path("hello_world.txt")]:
    return first.read_text() + second.read_text()
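To make the return annotation in the first task less opaque, here is a minimal sketch that uses only the standard library to show what information Annotated[str, Path("first.txt")] exposes at runtime. It illustrates the typing mechanism only; how pytask reads the annotation internally is not covered in this article and is not implied by the sketch.

```python
from pathlib import Path
from typing import Annotated, get_args, get_type_hints


def task_create_first_file() -> Annotated[str, Path("first.txt")]:
    return "Hello, "


# include_extras=True keeps the Annotated metadata instead of stripping it.
hints = get_type_hints(task_create_first_file, include_extras=True)

# get_args splits Annotated[str, Path(...)] into the type and its metadata.
return_type, product = get_args(hints["return"])

print(return_type)  # <class 'str'>
print(product)      # first.txt
```

The annotation therefore carries both the type of the returned value and the path where that value is meant to be stored, which is why the task body only needs to return the string.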
Now, let us execute this little workflow.
[3]:
session = pytask.build(
    tasks=[task_create_first_file, task_merge_files, task_create_second_file]
)
────────────────────────────────────────────── Start pytask session ───────────────────────────────────────────────
Platform: linux -- Python 3.11.5, pytask 0.4.0, pluggy 1.3.0
Root: /home/tobia/git/pytask
Collected 3 tasks.
╭─────────────────────────┬─────────╮
│ Task                    │ Outcome │
├─────────────────────────┼─────────┤
│ task_create_first_file  │ .       │
│ task_create_second_file │ .       │
│ task_merge_files        │ .       │
╰─────────────────────────┴─────────╯
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────
╭─────────── Summary ────────────╮
│ 3 Collected tasks              │
│ 3 Succeeded (100.0%)           │
╰────────────────────────────────╯
──────────────────────────────────────────── Succeeded in 0.19 seconds ────────────────────────────────────────────
The information on the executed workflow can be found in the session.
[4]:
session
[4]:
Session(config={'pm': <pluggy._manager.PluginManager object at 0x7f3c1b438090>, 'markers': {'depends_on': 'Add dependencies to a task. See this tutorial for more information: [link https://bit.ly/3JlxylS]https://bit.ly/3JlxylS[/].', 'filterwarnings': 'Add a filter for a warning to a task.', 'persist': 'Prevent execution of a task if all products exist and even if something has changed (dependencies, source file, products). This decorator might be useful for expensive tasks where only the formatting of the file has changed. The state of the files which have changed will also be remembered and another run will skip the task with success.', 'produces': 'Add products to a task. See this tutorial for more information: [link https://bit.ly/3JlxylS]https://bit.ly/3JlxylS[/].', 'skip': 'Skip a task and all its dependent tasks.', 'skip_ancestor_failed': 'Internal decorator applied to tasks if any of its preceding tasks failed.', 'skip_unchanged': 'Internal decorator applied to tasks which have already been executed and have not been changed.', 'skipif': 'Skip a task and all its dependent tasks if a condition is met.', 'task': 'Mark a function as a task regardless of its name. Or mark tasks which are repeated in a loop. 
See this tutorial for more information: [link https://bit.ly/3DWrXS3]https://bit.ly/3DWrXS3[/].', 'try_first': 'Try to execute a task a early as possible.', 'try_last': 'Try to execute a task a late as possible.'}, 'config': None, 'database_url': sqlite:////home/tobia/git/pytask/.pytask/.pytask.sqlite3, 'editor_url_scheme': 'file', 'export': <_ExportFormats.NO: 'no'>, 'ignore': ['.codecov.yml', '.gitignore', '.pre-commit-config.yaml', '.readthedocs.yml', '.readthedocs.yaml', 'readthedocs.yml', 'readthedocs.yaml', 'environment.yml', 'pyproject.toml', 'setup.cfg', 'tox.ini', '.git/*', '.venv/*', '*.egg-info/*', '.ipynb_checkpoints/*', '.mypy_cache/*', '.nox/*', '.tox/*', '_build/*', '__pycache__/*', 'build/*', 'dist/*', 'pytest_cache/*'], 'paths': [], 'layout': 'dot', 'output_path': 'dag.pdf', 'rank_direction': <_RankDirection.TB: 'TB'>, 'expression': '', 'marker_expression': '', 'nodes': False, 'strict_markers': False, 'directories': False, 'exclude': [None, '.git/*'], 'mode': <_CleanMode.DRY_RUN: 'dry-run'>, 'quiet': False, 'capture': <CaptureMethod.NO: 'no'>, 'debug_pytask': False, 'disable_warnings': False, 'dry_run': False, 'force': False, 'max_failures': inf, 'n_entries_in_table': 15, 'pdb': False, 'pdbcls': None, 's': False, 'show_capture': True, 'show_errors_immediately': False, 'show_locals': False, 'show_traceback': True, 'sort_table': True, 'trace': False, 'verbose': 1, 'stop_after_first_failure': False, 'check_casing_of_paths': True, 'pdb_cls': '', 'tasks': [<function task_create_first_file at 0x7f3c1b55f6a0>, <function task_merge_files at 0x7f3c1b407e20>, <function <lambda> at 0x7f3c1b407d80>], 'task_files': ['task_*.py'], 'command': 'build', 'root': PosixPath('/home/tobia/git/pytask'), 'filterwarnings': []}, hook=<pluggy._hooks.HookRelay object at 0x7f3c3c31bbc0>, collection_reports=[CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x7f3c1b55f6a0>, 
depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.0767577, 1696055304.077608)}), exc_info=None), CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x7f3c1b407e20>, depends_on={'first': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt')), 'second': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.123595, 1696055304.1244528)}), exc_info=None), CollectionReport(outcome=<CollectionOutcome.SUCCESS: 1>, node=TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x7f3c1b407d80>, depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.025182, 1696055304.0267167)}), exc_info=None)], tasks=[TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x7f3c1b55f6a0>, depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt'))}, 
markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.0767577, 1696055304.077608)}), TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x7f3c1b407e20>, depends_on={'first': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt')), 'second': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.123595, 1696055304.1244528)}), TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x7f3c1b407d80>, depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.025182, 1696055304.0267167)})], dag=<networkx.classes.digraph.DiGraph object at 0x7f3c1b440810>, resolving_dependencies_report=None, execution_reports=[ExecutionReport(task=TaskWithoutPath(name='task_create_second_file', function=<function <lambda> at 0x7f3c1b407d80>, depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.025182, 1696055304.0267167)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[]), 
ExecutionReport(task=TaskWithoutPath(name='task_create_first_file', function=<function task_create_first_file at 0x7f3c1b55f6a0>, depends_on={}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.0767577, 1696055304.077608)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[]), ExecutionReport(task=TaskWithoutPath(name='task_merge_files', function=<function task_merge_files at 0x7f3c1b407e20>, depends_on={'first': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/first.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/first.txt')), 'second': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/second.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/second.txt'))}, produces={'return': PathNode(name='/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt', path=PosixPath('/home/tobia/git/pytask/docs/source/how_to_guides/hello_world.txt'))}, markers=[Mark(name='task', args=(), kwargs={})], report_sections=[], attributes={'duration': (1696055304.123595, 1696055304.1244528)}), outcome=<TaskOutcome.SUCCESS: 1>, exc_info=None, sections=[])], exit_code=<ExitCode.OK: 0>, collection_start=1696055303.989013, collection_end=1696055303.9959698, execution_start=1696055304.0121965, execution_end=1696055304.207084, n_tasks_failed=0, scheduler=TopologicalSorter(dag=<networkx.classes.digraph.DiGraph object at 0x7f3c1b4a76d0>, priorities={'task_create_first_file': 0, 'task_merge_files': 0, 'task_create_second_file': 0}, _dag_backup=<networkx.classes.digraph.DiGraph object at 0x7f3c1b4a2e10>, _is_prepared=True, _nodes_out=set()), should_stop=False, warnings=[])
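The raw repr is hard to scan, so a small helper can pull out the parts that usually matter. The sketch below assumes only the attribute names visible in the repr above (exit_code, n_tasks_failed, execution_reports, and each report's task.name and outcome). The summarize function and the fake_session stand-in are hypothetical and exist only so that the example runs without pytask installed.

```python
from types import SimpleNamespace


def summarize(session):
    """Collect the exit code, failure count, and per-task outcomes."""
    outcomes = [
        (report.task.name, report.outcome)
        for report in session.execution_reports
    ]
    return {
        "exit_code": session.exit_code,
        "n_tasks_failed": session.n_tasks_failed,
        "outcomes": outcomes,
    }


# Hypothetical stand-in that mimics the attributes seen in the repr above.
# In the notebook, pass the object returned by pytask.build instead.
fake_session = SimpleNamespace(
    exit_code=0,
    n_tasks_failed=0,
    execution_reports=[
        SimpleNamespace(
            task=SimpleNamespace(name="task_create_first_file"),
            outcome="SUCCESS",
        ),
    ],
)

summary = summarize(fake_session)
print(summary)
```

In the notebook, calling summarize(session) on the real session should give one entry per executed task, in execution order.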
Configuring the build
To configure the build, pytask.build accepts many more options, which mirror the ones available on the command line.
[5]:
pytask.build?
Signature:
pytask.build(
*,
capture: "Literal['fd', 'no', 'sys', 'tee-sys'] | CaptureMethod" = <CaptureMethod.NO: 'no'>,
check_casing_of_paths: 'bool' = True,
config: 'Path | None' = None,
database_url: 'str' = '',
debug_pytask: 'bool' = False,
disable_warnings: 'bool' = False,
dry_run: 'bool' = False,
editor_url_scheme: "Literal['no_link', 'file', 'vscode', 'pycharm'] | str" = 'file',
expression: 'str' = '',
force: 'bool' = False,
ignore: 'Iterable[str]' = (),
marker_expression: 'str' = '',
max_failures: 'float' = inf,
n_entries_in_table: 'int' = 15,
paths: 'str | Path | Iterable[str | Path]' = (),
pdb: 'bool' = False,
pdb_cls: 'str' = '',
s: 'bool' = False,
show_capture: 'bool' = True,
show_errors_immediately: 'bool' = False,
show_locals: 'bool' = False,
show_traceback: 'bool' = True,
sort_table: 'bool' = True,
stop_after_first_failure: 'bool' = False,
strict_markers: 'bool' = False,
tasks: 'Callable[..., Any] | PTask | Iterable[Callable[..., Any] | PTask]' = (),
task_files: 'str | Iterable[str]' = 'task_*.py',
trace: 'bool' = False,
verbose: 'int' = 1,
**kwargs: 'Any',
) -> 'Session'
Docstring:
Run pytask.
This is the main command to run pytask which usually receives kwargs from the
command line interface. It can also be used to run pytask interactively. Pass
configuration in a dictionary.
Parameters
----------
capture
The capture method for stdout and stderr.
check_casing_of_paths
Whether errors should be raised when file names have different casings.
config
A path to the configuration file.
database_url
A URL to the database that tracks the status of tasks.
debug_pytask
Whether debug information should be shown.
disable_warnings
Whether warnings should be disabled and not displayed.
dry_run
Whether a dry-run should be performed that shows which tasks need to be rerun.
editor_url_scheme
A URL scheme that allows you to click on task names, node names and filenames and
jump right into your preferred editor at the right line.
expression
Same as ``-k`` on the command line. Select tasks via expressions on task ids.
force
Run tasks even though they would be skipped since nothing has changed.
ignore
A pattern to ignore files or directories. Refer to ``pathlib.Path.match``
for more info.
marker_expression
Same as ``-m`` on the command line. Select tasks via marker expressions.
max_failures
Stop after some failures.
n_entries_in_table
How many entries to display in the table during the execution. Tasks which are
running are always displayed.
paths
A path or collection of paths where pytask looks for the configuration and
tasks.
pdb
Start the interactive debugger on errors.
pdb_cls
Start a custom debugger on errors. For example:
``--pdbcls=IPython.terminal.debugger:TerminalPdb``
s
Shortcut for ``pytask.build(capture="no")``.
show_capture
Choose which captured output should be shown for failed tasks.
show_errors_immediately
Show errors with tracebacks as soon as the task fails.
show_locals
Show local variables in tracebacks.
show_traceback
Choose whether tracebacks should be displayed or not.
sort_table
Sort the table of tasks at the end of the execution.
stop_after_first_failure
Stop after the first failure.
strict_markers
Raise errors for unknown markers.
tasks
A task or a collection of tasks that is passed to ``pytask.build(tasks=...)``.
task_files
A pattern to describe modules that contain tasks.
trace
Enter the debugger at the beginning of each task.
verbose
Make pytask verbose (> 0) or quiet (= 0).
Returns
-------
session : pytask.Session
The session captures all the information of the current run.
File: ~/git/pytask/src/_pytask/build.py
Type: function
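As a rough illustration of how these options might be combined, the sketch below keeps a small set of default keyword arguments and merges per-call overrides into them before calling pytask.build. The option names (dry_run, stop_after_first_failure, verbose) come from the signature above; the values, DEFAULT_OPTIONS, make_options, and run are hypothetical and chosen only for this example.

```python
from typing import Any

# Keyword arguments taken from the signature above; the values are illustrative.
DEFAULT_OPTIONS: dict[str, Any] = {
    "dry_run": True,  # only report which tasks would be executed
    "stop_after_first_failure": True,
    "verbose": 0,
}


def make_options(**overrides: Any) -> dict[str, Any]:
    """Merge per-call overrides into the default options (hypothetical helper)."""
    return {**DEFAULT_OPTIONS, **overrides}


def run(tasks: list, **overrides: Any):
    """Call pytask.build with the merged options."""
    import pytask  # lazy import: only needed when a build is actually started

    return pytask.build(tasks=tasks, **make_options(**overrides))


options = make_options(dry_run=False)
print(options)
```

With the tasks defined earlier, run([task_create_first_file, task_merge_files, task_create_second_file], dry_run=False) would then start a real build using the remaining defaults.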
[6]:
# Cleanup
for name in ("first.txt", "second.txt", "hello_world.txt"):
    Path(name).unlink()