Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow URLs as process namespace #515

Closed
m-mohr opened this issue Oct 4, 2023 · 12 comments · Fixed by #538
Closed

Allow URLs as process namespace #515

m-mohr opened this issue Oct 4, 2023 · 12 comments · Fixed by #538

Comments

@m-mohr
Copy link
Member

m-mohr commented Oct 4, 2023

To be able to load external processes (UDPs) in process graphs easily, it would be valuable to have the namespace being a URL for read-only access. The process ID stays as it is.

@soxofaan
Copy link
Member

soxofaan commented Oct 4, 2023

FYI: the VITO backend already supports this (because this turned out a valuable solution in a lot of practical use cases).
It is also already documented in python client: https://open-eo.github.io/openeo-python-client/cookbook/udp_sharing.html#using-a-public-udp-through-url-based-namespace

@m-mohr
Copy link
Member Author

m-mohr commented Oct 4, 2023

I was also wondering whether the namespace should not include the process ID, but then I realized that the URL may happen to have (or not have) a .json file extension and that makes it ambiguous. So specifying the full url in the namespace (for URLs) makes sense.

@soxofaan
Copy link
Member

soxofaan commented Oct 4, 2023

FYI: in VITO implementation we currently support it as follows

if user provides process_id and namespace="https://.... we try the following URLs and take the first hit:

  • {namespace}{process_id}
  • {namespace}{process_id}.json
  • {namespace}

e.g. with process_id="ndvi" and namespace="https://example.com/udp/" these two URLs are first attempted:

  • https://example.com/udp/ndvi
  • https://example.com/udp/ndvi.json

The last option is to allow specifying the UDP URL fully through the namespace (ignoring process_id)

@m-mohr
Copy link
Member Author

m-mohr commented Oct 4, 2023

Yeah, I think for namespace URLs we should simplify that the URL is the exact URL to the file and you don't need to do any try&error. The only thing that you need to check is the response. If it's a single process, then just use it. If it is a process list response, take the process from the list with the given process ID.

If not a URL, resolve as usual.

@jdries
Copy link

jdries commented May 21, 2024

This is becoming a key issue for federated setups and for ESA APEx.

In APEx, the use case is to have the UDP's stored as json in github rather than being managed in a specific backend.
This should then make UDP invocation less backend dependant, and simplify the management.

@soxofaan @m-mohr We'll probably want to discuss what is still to be done here.

@m-mohr
Copy link
Member Author

m-mohr commented Jun 5, 2024

Community meeting: No objections, next step is creating a PR for review.

@soxofaan
Copy link
Member

soxofaan commented Jun 5, 2024

Yeah, I think for namespace URLs we should simplify that the URL is the exact URL to the file and you don't need to do any try&error.

yes that's fine for me

The only thing that you need to check is the response. If it's a single process, then just use it. If it is a process list response, take the process from the list with the given process ID.

I'm not sure that process listing support is even necessary.
I would work from the assumption that users prefer a 1-on-1 mapping between UDPs and files. For example, the python client has some helpers to store/load a UDP from/to a file. If you want to manage multiple UDPs in the same file, you have to do a lot more cumbersome housekeeping on your own.

So I would keep the API extension very basic for now:

  • if namespace starts with https?://: assume this is a URL of JSON file containing a UDP representation and load the UDP directly form this URI
  • else: proceed as usual

Also note that even with such a simple API spec, there is still room at the level of clients to provide a more rich UI (e.g. user passes a URL as process id -> client automatically converts that to actual process id + URL namespace)

@m-mohr
Copy link
Member Author

m-mohr commented Jun 11, 2024

The reasoning behind allowing lists of processes (as in GET /processes) is that it's the only official public endpoint we have to load processes from. Everything else loads from non openEO API contexts.

@soxofaan
Copy link
Member

Ah ok, makes sense. So with a process listing document you mean a listing in the style of GET /processes and GET /process_graphs:

{
  "processes": [
     {"id": "..", "parameters": [...], "process_graph": {...}},
     {"id": "..", "parameters": [...], "process_graph": {...}},
  ],
  "links": [..]
}

Problem is that the spec for process listing currently allows and even recommends to not include the "process_graph" fields in the process listing, while this is actually the vital thing for UDPs

@m-mohr
Copy link
Member Author

m-mohr commented Jun 13, 2024

Yes, that's waht I meant. The process listing might be more interesting outside of the /processes context, but more for a public variant of /process_graphs. Maybe we need to think about #348 again...

@m-mohr
Copy link
Member Author

m-mohr commented Jun 20, 2024

@jdries @soxofaan PR is up at #538

m-mohr added a commit that referenced this issue Jul 1, 2024
Co-authored-by: Stefaan Lippens <[email protected]>
@soxofaan
Copy link
Member

soxofaan commented Jul 1, 2024

With #538 being merged now, I think this ticket #515 can be closed for now I think

@m-mohr m-mohr closed this as completed Jul 1, 2024
@m-mohr m-mohr linked a pull request Jul 1, 2024 that will close this issue
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants