-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use PyPI JSON and "Simple" API instead of deprecated XML-RPC API #201
Conversation
The XML-RPC API is deprecated. Tests using it were failing due to what seems like a malformed url. I have no idea if this API has already been dropped or if there's some usage issue inside stdeb but I was not motivated to investigate since the entire API is deprecated. Instead I replaced the functionality with a new Requests HTTP session client that uses the [JSON representation][1] of the [Simple API][2] and the [JSON API][3] where best appropriate. It could all be done with the JSON API but parts of it are also deprecated... There is one regression in behavior, which is that packages with a release on PyPI but no download url on PyPI will not be checked for an external download url as this also deprecated. [1]: https://peps.python.org/pep-0691/ [2]: https://warehouse.pypa.io/api-reference/legacy.html [3]: https://warehouse.pypa.io/api-reference/json.html
083e630
to
be6c994
Compare
if download_url is None: | ||
# PyPI doesn't have package. Is download URL provided? | ||
result = _call(pypi.release_data, package_name, version) | ||
if result['download_url'] != 'UNKNOWN': | ||
download_url = result['download_url'] | ||
# no download URL provided, see if PyPI itself has download | ||
urls = _call(pypi.release_urls, result['name'], result['version']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the behavior for which I could find no sensible replacement in the current PyPI APIs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think the concept carries over to the new API.
if verbose >= 2: | ||
myprint( | ||
'found default release: %s' % (', '.join(default_releases),)) | ||
'found default release: %s' % (', '.join(default_release),)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is actually a long-lived typo and the intent was always to log the single default release found.
It's not complete because I'm juggling a number of other changes to fix the test scripts but with this and a few other changes (mostly related to |
stdeb/pypi_simple.py
Outdated
USER_AGENT = "pypi-install/%s ( https://github.com/astraw/stdeb )" % \ | ||
stdeb.__version__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the only current use case passes in an explicit USER_AGENT
anyway, it might be good to remove the duplicated string here and make the default behavior not specify a user agent at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to make this change but I agree with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In 0787a84 I ended up importing the USER_AGENT constant from the downloader. I think adding a download method to the pypi client would also be another way to push this down. But this cuts the duplication although it keeps the default user agent.
stdeb/pypi_simple.py
Outdated
|
||
def release_versions(self, package_name, current_only=False): | ||
package_name = normalize_package_name(package_name) | ||
if current_only: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two code paths determined by current_only
share only the package name normalization which is already in a separate function. I'd recommend introducing a new function for the True
path here, especially since the return type seems to change based on that, which is pretty confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point! I was culting forward a pretty straight port of the interface from the old XML-RPC transport, but even that was a carry-over from an interface that was not actually being used for what it is for. I've split the functions in 2d0a089 and the result is much better. Thanks for spotting it.
stdeb/pypi_simple.py
Outdated
if current_only: | ||
response = self._http_session.get("%s/pypi/%s/json" % (self.pypi_url, package_name)) | ||
data = response.json() | ||
return data['info']['version'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that all of the other string quotes in this file are double quotes. It would be nice to be consistent.
That said, the other file you're changing here seems to use single quotes. Your call.
return data['info']['version'] | |
return data["info"]["version"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I think the ruff
style which I'm exploring for this project uses double quotes ubiquitously and this project currently uses a mix of quote characters.
Fixed in 2d0a089
if url["packagetype"] == "sdist" and \ | ||
url["python_version"] == "source" and \ | ||
url["url"].endswith((".tar.gz", ".zip")): | ||
download_url = url["url"] | ||
md5_digest = url["digests"]["md5"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original algorithm has a break
here, meaning that the first encountered match is used. This continues to loop, meaning that the last match is used. Is that an intentional deviation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! No it isn't. I don't know that there would be a situation with multiple sdist payloads for a given version so I think the break
should just be a minor optimization, but it is worth preserving the behavior. In the event that there are multiple sdist urls I'm not sure that the sorting from the server is reliable so we just have to trust that the first one we get is good enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored the break
in 86d8edc
if download_url is None: | ||
# PyPI doesn't have package. Is download URL provided? | ||
result = _call(pypi.release_data, package_name, version) | ||
if result['download_url'] != 'UNKNOWN': | ||
download_url = result['download_url'] | ||
# no download URL provided, see if PyPI itself has download | ||
urls = _call(pypi.release_urls, result['name'], result['version']) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I don't think the concept carries over to the new API.
I was tracking quite closely the XML-RPC API interface that was used for this information previously which repurposed a "show_hidden" parameter to return either the current release or all releases in the response. Moving these to separate client methods is a much clearer internal and external organization.
The XML-RPC API is deprecated. Tests using it were failing due to what
seems like a malformed url. I have no idea if this API has already been
dropped or if there's some usage issue inside stdeb but I was not
motivated to investigate since the entire API is deprecated.
Instead I replaced the functionality with a new Requests HTTP session
client that uses the JSON representation of the Simple API and
the JSON API where best appropriate. It could all be done with the
JSON API but parts of it are also deprecated...
There is one regression in behavior, which is that packages with a
release on PyPI but no download url on PyPI will not be checked for an
external download url as this also deprecated, so I find this acceptable.