Commit

fixup! Move Sch.ThunkOptions into Options
jpsamaroo committed Dec 16, 2024
1 parent 4450769 commit 747dab3
Showing 4 changed files with 30 additions and 37 deletions.
1 change: 0 additions & 1 deletion docs/src/api-dagger/types.md
@@ -16,7 +16,6 @@ DTask
## Task Options Types
```@docs
Options
-Sch.ThunkOptions
Sch.SchedulerOptions
```

2 changes: 1 addition & 1 deletion docs/src/checkpointing.md
@@ -71,7 +71,7 @@ z = collect(Z)
```

Two changes were made: first, we `enumerate(X.chunks)` so that we can get a
-unique index to identify each `chunk`; second, we specify a `ThunkOptions` to
+unique index to identify each `chunk`; second, we specify options to
`delayed` with a `checkpoint` and `restore` function that is specialized to
write or read the given chunk to or from a file on disk, respectively. Notice
the usage of `collect` in the `checkpoint` function, and the use of
61 changes: 28 additions & 33 deletions docs/src/task-spawning.md
@@ -12,9 +12,9 @@ or `spawn` if it's more convenient:

`Dagger.spawn(f, Dagger.Options(options), args...; kwargs...)`

-When called, it creates an [`DTask`](@ref) (also known as a "thunk" or
-"task") object representing a call to function `f` with the arguments `args` and
-keyword arguments `kwargs`. If it is called with other thunks as args/kwargs,
+When called, it creates an [`DTask`](@ref) (also known as a "task" or
+"thunk") object representing a call to function `f` with the arguments `args` and
+keyword arguments `kwargs`. If it is called with other tasks as args/kwargs,
such as in `Dagger.@spawn f(Dagger.@spawn g())`, then, in this example, the
function `f` gets passed the results of executing `g()`, once that result is
available. If `g()` isn't yet finished executing, then the execution of `f`
@@ -29,22 +29,16 @@ it'll be passed as-is to the function `f` (with some exceptions).

!!! note "Task / thread occupancy"
By default, `Dagger` assumes that tasks saturate the thread they are running on and does not try to schedule other tasks on the thread.
-This default can be controlled by specifying [`Sch.ThunkOptions`](@ref) (more details can be found under [Scheduler and Thunk options](@ref)).
+This default can be controlled by specifying [`Options`](@ref) (more details can be found under [Task and Scheduler options](@ref)).
The section [Changing the thread occupancy](@ref) shows a runnable example of how to achieve this.

## Options

The [`Options`](@ref Dagger.Options) struct in the second argument position is
optional; if provided, it is passed to the scheduler to control its
behavior. [`Options`](@ref Dagger.Options) contains a `NamedTuple` of option
-key-value pairs, which can be any of:
-- Any field in [`Sch.ThunkOptions`](@ref) (see [Scheduler and Thunk options](@ref))
-- `meta::Bool` -- Pass the input [`Chunk`](@ref) objects themselves to `f` and
-not the value contained in them.
-
-There are also some extra options that can be passed, although they're considered advanced options to be used only by developers or library authors:
-- `get_result::Bool` -- return the actual result to the scheduler instead of [`Chunk`](@ref) objects. Used when `f` explicitly constructs a [`Chunk`](@ref) or when return value is small (e.g. in case of reduce)
-- `cache::Bool` -- cache the result of this Thunk such that if the thunk is evaluated again, one can just reuse the cached value. If it’s been removed from cache, recompute the value.
+key-value pairs, which can be any field in [`Options`](@ref)
+(see [Task and Scheduler options](@ref)).
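
For illustration only (this sketch is not part of the diff): the two ways of supplying task options described above might look roughly like the following. It assumes the `single` option used elsewhere in this file, that worker 1 is available, and that `Dagger.Options` accepts its options as keyword arguments; treat it as a sketch rather than the definitive API.

```julia
using Dagger

# Options written inline, before the call expression handed to `@spawn`.
# `single=1` asks the scheduler to run the task on worker 1.
a = Dagger.@spawn single=1 1+2

# The same option wrapped in an explicit `Options` struct, passed in the
# second argument position of `spawn` (keyword constructor is an assumption).
b = Dagger.spawn(+, Dagger.Options(; single=1), 1, 2)

result_a = fetch(a)
result_b = fetch(b)
```

Both forms should describe the same task; the macro form is usually the more convenient one when the options are known at the call site.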

## Simple example

@@ -65,7 +59,7 @@ s = Dagger.@spawn combine(p, q, r)
@assert fetch(s) == 16
```

-The thunks `p`, `q`, `r`, and `s` have the following structure:
+The tasks `p`, `q`, `r`, and `s` have the following structure:

![graph](https://user-images.githubusercontent.com/25916/26920104-7b9b5fa4-4c55-11e7-97fb-fe5b9e73cae6.png)

@@ -122,23 +116,24 @@ x::DTask
@assert fetch(x) == 3 # fetch the result of `@spawn`
```

-This is useful for nested execution, where an `@spawn`'d thunk calls `@spawn`. This is detailed further in [Dynamic Scheduler Control](@ref).
+This is useful for nested execution, where an `@spawn`'d task calls `@spawn`.
+This is detailed further in [Dynamic Scheduler Control](@ref).

## Errors

-If a thunk errors while running under the eager scheduler, it will be marked as
-having failed, all dependent (downstream) thunks will be marked as failed, and
-any future thunks that use a failed thunk as input will fail. Failure can be
+If a task errors while running under the eager scheduler, it will be marked as
+having failed, all dependent (downstream) tasks will be marked as failed, and
+any future tasks that use a failed task as input will fail. Failure can be
determined with `fetch`, which will re-throw the error that the
-originally-failing thunk threw. `wait` and `isready` will *not* check whether a
-thunk or its upstream failed; they only check if the thunk has completed, error
+originally-failing task threw. `wait` and `isready` will *not* check whether a
+task or its upstream failed; they only check if the task has completed, error
or not.

This failure behavior is not the default for lazy scheduling ([Lazy API](@ref)),
-but can be enabled by setting the scheduler/thunk option ([Scheduler and Thunk options](@ref))
+but can be enabled by setting the scheduler/task option ([Task and Scheduler options](@ref))
`allow_error` to `true`. However, this option isn't terribly useful for
-non-dynamic usecases, since any thunk failure will propagate down to the output
-thunk regardless of where it occurs.
+non-dynamic usecases, since any task failure will propagate down to the output
+task regardless of where it occurs.
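
For illustration only (this sketch is not part of the diff): the eager-scheduler failure behavior described above could be observed roughly as follows. The helper `fail` is hypothetical, and the sketch deliberately does not assume which exception type Dagger wraps the original error in.

```julia
using Dagger

# Hypothetical helper that always throws.
fail() = error("boom")

# A task that fails, and a downstream task that takes it as input.
upstream = Dagger.@spawn fail()
downstream = Dagger.@spawn identity(upstream)

# Returns true if `fetch` throws for the given task, false otherwise.
function fetch_throws(t)
    try
        fetch(t)
        return false
    catch
        return true
    end
end

threw_upstream = fetch_throws(upstream)      # the original failure
threw_downstream = fetch_throws(downstream)  # failure propagated from the input
```

Only `fetch` is used to detect the failure here, since the text above states that `wait` and `isready` report completion regardless of whether the task errored.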

## Cancellation

@@ -197,7 +192,7 @@ end
```

Alternatively, if you want to compute but not fetch the result of a lazy
-operation, you can call `compute` on the thunk. This will return a `Chunk`
+operation, you can call `compute` on the task. This will return a `Chunk`
object which references the result (see [Chunks](@ref) for more details):

```julia
@@ -214,16 +209,15 @@ Note that, as a legacy API, usage of the lazy API is generally discouraged for m
- Distinct schedulers don't share runtime metrics or learned parameters, thus causing the scheduler to act less intelligently
- Distinct schedulers can't share work or data directly

-## Scheduler and Thunk options
+## Task and Scheduler options

While Dagger generally "just works", sometimes one needs to exert some more
fine-grained control over how the scheduler allocates work. There are two
-parallel mechanisms to achieve this: Scheduler options (from
-[`Sch.SchedulerOptions`](@ref)) and Thunk options (from
-[`Sch.ThunkOptions`](@ref)). These two options structs contain many shared
-options, with the difference being that Scheduler options operate
-globally across an entire DAG, and Thunk options operate on a thunk-by-thunk
-basis.
+parallel mechanisms to achieve this: Task options (from [`Options`](@ref)) and
+Scheduler options (from [`Sch.SchedulerOptions`](@ref)). These two options
+structs contain many shared options, with the difference being that Scheduler
+options operate globally across an entire DAG, and Task options operate on a
+task-by-task basis.

Scheduler options can be constructed and passed to `collect()` or `compute()`
as the keyword argument `options` for lazy API usage:
@@ -237,7 +231,7 @@ compute(t; options=opts)
collect(t; options=opts)
```

-Thunk options can be passed to `@spawn/spawn`, `@par`, and `delayed` similarly:
+Task options can be passed to `@spawn/spawn`, `@par`, and `delayed` similarly:

```julia
# Execute on worker 1
@@ -250,8 +244,9 @@ delayed(+; single=1)(1, 2)

## Changing the thread occupancy

-One of the supported [`Sch.ThunkOptions`](@ref) is the `occupancy` keyword.
-This keyword can be used to communicate that a task is not expected to fully saturate a CPU core (e.g. due to being IO-bound).
+One of the supported [`Options`](@ref) is the `occupancy` keyword.
+This keyword can be used to communicate that a task is not expected to fully
+saturate a CPU core (e.g. due to being IO-bound).
The basic usage looks like this:

```julia
3 changes: 1 addition & 2 deletions test/options.jl
@@ -29,9 +29,8 @@ end
# Special handling
(:scope, AnyScope(), ProcessScope(first_wid), ProcessScope(last_wid)),
(:processor, OSProc(), Dagger.ThreadProc(first_wid, 1), Dagger.ThreadProc(last_wid, 1)),
-# ThunkOptions field
+# Options field
(:single, 0, first_wid, last_wid),
-# Thunk field
(:meta, false, true, false)
]
# Test local and remote default values
