Process unification #310 #348
271042c to 9b4e914
9b4e914 to 541e9f9
(initial review)
openapi.yaml
Outdated
For ease of use, it is NOT RECOMMENDED to use long randomly generated
identifiers. More readable user identifiers like `john_doe` support a
better user-experience as the user identifier is used in URIs for shared
processes, e.g. `https://example.org/api/v1.0/processes/@john_doe/my_ndvi`.
I agree that `john_doe` is nicer, but this recommendation does not play well with what OIDC gives us. For example, Google returns user IDs like `us-a44d63b6-e090-6059-81d7-cbe7afeff6ce` and Microsoft returns `nIrHDS4rhk4ri738TRhtLHXdoUQ6OxZo9Ob0AS3vTig`.
There is a piece of the puzzle missing here.
If there's no user name / id to derive such a slug from, what to use instead?
Indeed, I'm not even sure the returned user ID is stable, e.g. if a user registers a new client ID to work with, maybe an OIDC provider could bump/rehash the user ID?
In general, I would expect that a back-end assigns (or lets the user choose) a separate user ID in addition to the external user ID. That is how I've seen it implemented, as in almost all cases you need to store additional user data anyway.
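A minimal sketch of that approach, assuming the back-end keeps its own user table: the opaque OIDC subject only serves as a lookup key, while a separate readable identifier is used in URIs. The `UserRegistry` class, its method names and the slug rule are hypothetical and not part of the openEO API.

```python
import re


class UserRegistry:
    """Hypothetical in-memory mapping from external OIDC identities to internal user IDs."""

    def __init__(self):
        # (issuer, subject) -> internal, readable user ID
        self._by_external = {}

    def register(self, issuer, subject, preferred_id):
        key = (issuer, subject)
        if key in self._by_external:
            # A known external identity always resolves to the same internal ID.
            return self._by_external[key]
        # Assumed slug rule: lower-case, replace anything that is not URL-friendly.
        base = re.sub(r"[^a-z0-9_\-.~]", "_", preferred_id.lower()) or "user"
        taken = set(self._by_external.values())
        candidate, n = base, 1
        while candidate in taken:
            # Resolve collisions by appending a counter.
            n += 1
            candidate = f"{base}_{n}"
        self._by_external[key] = candidate
        return candidate

    def process_url(self, api_root, issuer, subject, process_id):
        user_id = self._by_external[(issuer, subject)]
        return f"{api_root}/processes/@{user_id}/{process_id}"
```

Keying on the pair (issuer, subject) rather than the subject alone is a design choice of this sketch; it does not address the open question of whether a provider keeps the subject stable over time.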
description: |-
Lists all user-defined processes (process graphs) of the
Redirects to all user-defined processes of the
Also add some kind of deprecation notice to the description of these endpoints?
At least it should probably describe what to use/do instead. But I'm still not sure whether to use this redirect behavior (breaking) or leave /process_graphs as it is (basically as an alias, but non-breaking).
What do you mean by the redirect being breaking? That (some) clients don't properly handle a redirect on POST/PUT/PATCH/DELETE?
Yes, I (have to) assume that not all clients would handle it properly. But if we test JS, R and Python and all work flawlessly, it's probably okay to call it non-breaking ;-)
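To make the concern concrete, here is a rough model of how HTTP clients may rewrite the request method when following a redirect. The status-code semantics follow the HTTP specification (307 and 308 preserve the method, 303 switches to GET, and 301/302 allow clients to turn a POST into a GET), but the function is only an illustration: it is not the behaviour of any specific library, and some clients rewrite more methods than this model does, which is why they would still need to be tested.

```python
def method_after_redirect(status, method):
    """Return the method a client uses for the follow-up request (simplified model)."""
    method = method.upper()
    if status in (307, 308):
        # Method and body are preserved.
        return method
    if status == 303:
        # "See Other": the follow-up is a GET (a HEAD stays a HEAD).
        return "HEAD" if method == "HEAD" else "GET"
    if status in (301, 302):
        # Clients are allowed to turn POST into GET here, and many do.
        return "GET" if method == "POST" else method
    raise ValueError(f"status {status} is not handled by this sketch")
```

In this model only 307 and 308 guarantee that a PUT or DELETE sent to `/process_graphs` arrives unchanged at the new location.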
The namespace `backend` is an alias for predefined processes.

Back-end implementations MAY implement other namespaces that don't
conflict with any of the namespaces mentioned above.
See discussion at #310 (comment).
I think the namespace "format" should be more flexible/generic than "`@` is for per-user namespaces".
And I disagree in #310 (comment) ;-)
In the meantime I'm fine with more flexibility, but on the other hand we should likely restrict the allowed characters:
#478
Seems not stable enough for 1.1, so moving to 1.2. Would be good to have an implementation first, too.
@@ -1617,6 +1632,187 @@ paths:
          $ref: '#/components/responses/client_error_auth'
        5XX:
          $ref: '#/components/responses/server_error'
  '/processes/{namespace}':
    get:
      summary: List all user-defined processes in a namespace
The documentation of `/processes/{namespace}` and `/processes/{namespace}/{process_id}` talks about "user-defined processes in a namespace", but listing "predefined" processes should also work, right?
E.g. `GET /processes/backend` returns the same as `GET /processes`, and `GET /processes/backend/filter_temporal` would return the metadata of the `filter_temporal` process?
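The behaviour asked about here could look roughly like the sketch below. It assumes that `backend` really acts as the default namespace, which this thread has not settled; the in-memory `PROCESSES` store and both function names are hypothetical.

```python
# Hypothetical in-memory store; a real back-end would look these up elsewhere.
PROCESSES = {
    "backend": {"filter_temporal": {"id": "filter_temporal"}},
    "@john_doe": {"my_ndvi": {"id": "my_ndvi", "process_graph": {}}},
}


def list_processes(namespace="backend"):
    """GET /processes (default namespace) or GET /processes/{namespace}."""
    return {"processes": list(PROCESSES[namespace].values())}


def describe_process(namespace, process_id):
    """GET /processes/{namespace}/{process_id}."""
    return PROCESSES[namespace][process_id]
```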
I think the thinking was that there would be duplication as you've pointed out. You usually get all details from /processes for pre-defined processes so that this "extension" is mostly for user-defined processes (and was mostly copied from /process_graphs). So we can discuss whether we should remove the "user-defined" here. The backend namespace is somewhat different though as it is read-only...
I don't think that the duplication between `GET /processes/backend` and `GET /processes` is that much of an issue, especially because "backend" can be considered to be a default namespace. So yes, I would argue that the "user-defined" can be dropped here.
Also, a back-end is free to define custom namespaces, and these could also contain pre-defined, non-user-defined, "read only", possibly proprietary processes.
Yes, I'm fine with that. We can add additional wording that clarifies any specifics for user-defined processes. Edit: It's not that simple, see below...
Hmm, looking into the OpenAPI file made me remember that user-defined processes and pre-defined processes have a different schema. So indeed the endpoints were only meant to support user-defined processes so far.
User-defined processes for example require a process_graph, but that's not possible for predefined processes.
On the other hand, pre-defined processes require e.g. parameters and return values while this is optional for user-defined processes.
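The schema difference can be illustrated with a small check. It only covers the fields named in this comment; the field names `parameters` and `returns` are assumed to be the ones used for parameters and return values, and the function itself is hypothetical.

```python
def missing_fields(process, user_defined):
    """Return the required fields that are absent from a process description."""
    if user_defined:
        # User-defined processes must carry a process graph.
        required = ["process_graph"]
    else:
        # Pre-defined processes must declare parameters and return values.
        required = ["parameters", "returns"]
    return [name for name in required if name not in process]
```

A process description that satisfies one schema can therefore fail the other, which is why a single endpoint cannot simply apply one of them to every namespace.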
we need to split up into two separate endpoints to make sure the schemas can be applied correctly
Or we don't allow exposing this endpoint as it's already exposed via /processes.
Both options are equally bad for us, VITO, as we are already using a non-"backend" namespace containing predefined proprietary processes (without a "process_graph"), which would be invalid according to both of these options.
Hmm... maybe we can add a discriminator to `GET /processes/{namespace}`. If it contains `type: user: true` (or something similar) in the response, it applies the user schema to the processes array, otherwise the predefined schema. That would also help clients to know whether they can do non-GET requests on that namespace.
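A client-side reading of such a discriminator might look like the following sketch. The flag name `user`, the schema labels and the list of allowed methods are all placeholders, since the comment leaves the exact shape open.

```python
def namespace_capabilities(response):
    """Derive the schema and the allowed HTTP methods from a namespace listing."""
    # Hypothetical discriminator; a missing flag is treated as "predefined".
    is_user = bool(response.get("user", False))
    if is_user:
        return {"schema": "user_defined", "methods": ["GET", "PUT", "DELETE"]}
    return {"schema": "predefined", "methods": ["GET"]}
```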
I think part of the current problem with "predefined" vs "user-defined", is that we are lumping together a couple of process concepts, for example:
| predefined | user-defined |
|---|---|
| live in default namespace | live in "user" namespace |
| implemented "natively" by backend | implemented through openEO process graph |
| has no "process_graph" field in metadata | has a "process_graph" field in metadata |
| public (by default?) | private (by default?) |
| no public API to add/update/remove | created and managed through openEO endpoints |
| parameters and return values must be declared | parameters and return values are optional |
By sticking to this binary division, each with own "schema", we probably make it hard for ourselves to create a clean API in the long term.
For example, in the VITO backend we already have processes that mix properties from both columns, e.g. private, natively implemented processes that live in a namespace that is neither default nor per-user. Another example is defining predefined processes through a "process_graph" instead of "natively" (e.g. "ndvi" or "evi").
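One way to picture the alternative is a record in which each property from the table varies independently instead of being implied by a single predefined/user-defined switch. The `ProcessRecord` class and its field names are purely illustrative and do not correspond to any schema in the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessRecord:
    """Hypothetical record that keeps each property from the table separate."""
    id: str
    namespace: str
    public: bool
    editable_via_api: bool
    process_graph: Optional[dict] = None  # None stands for a "native" implementation

    @property
    def native(self):
        return self.process_graph is None


# A private, native process in a custom namespace (names are made up).
mixed = ProcessRecord(id="proprietary_index", namespace="custom",
                      public=False, editable_via_api=False)
```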
Please remember that we are just moving things around here. This is to move the endpoints to /processes/... for unification and to prepare for v2, but we have to stay compliant with the API v1.x line for now. We can't change a lot wrt the schemas for example as that usually is a breaking change. So until we go for API v2 we may need to live with some compromises.
sure, I understand, I was just reflecting on the tension between the constraints of the v1.x API and what we are doing in VITO backend (or want to do) with custom namespaces
# Conflicts: # openapi.yaml
Some additional thoughts on namespaces:
Lastly, I'm thinking to not merge this into the "core" API, but instead, make this a separate extension. Then this would be a pure addition and the
I'm afraid I have to agree 😄.
If multiple processes with the same identifier exist, Clients SHOULD
inform the user that it's recommended to select a namespace.
process_namespace:
  type: string
  pattern: ^@?[\w\-\.~]+$
For VITO we'd need to add a colon (`:`) here (i.e. for the `u:asd` replacement for @GreatEmerald).
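A quick check of the quoted pattern shows why `u:asd` is rejected. `WITH_COLON` is a hypothetical variant written only for this illustration and is not something proposed in the PR.

```python
import re

# Pattern as quoted in the diff.
CURRENT = re.compile(r"^@?[\w\-\.~]+$")
# Hypothetical variant that additionally allows a colon.
WITH_COLON = re.compile(r"^@?[\w\-\.~:]+$")


def is_valid(pattern, namespace):
    """Return True if the whole namespace string matches the pattern."""
    return pattern.fullmatch(namespace) is not None


for candidate in ["@john_doe", "backend", "u:asd"]:
    print(candidate, is_valid(CURRENT, candidate), is_valid(WITH_COLON, candidate))
```

Note that Python's `\w` also matches non-ASCII word characters, whereas the ECMAScript dialect usually assumed for JSON Schema patterns restricts it to ASCII, so results for such input may differ.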
This is a first draft for #310 that tries to unify /processes and /process_graphs.
Sharing might be a separate PR.