-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add collection redirect page and consolidate htaccess configuration files #261
base: main
Are you sure you want to change the base?
Conversation
templates/collections.html
Outdated
<ul class="fa-ul"> | ||
<li> | ||
<span class="fa-li"><i class="fas fa-book"></i></span> | ||
<a href="{{ index.quicklinks.collection_index.link }}" target="_blank">{{ index.quicklinks.collection_index.label }}</a> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
builtin modules are by far the most popular. Can we add a quick link here for https://docs.ansible.com/ansible/latest/collections/ansible/builtin/index.html#plugins-in-ansible-builtin
I copied the .httaccess to https://htaccess.madewithlove.com/ and typed in a bunch of urls to see what it would redirect to:
So ansible (precollections) 2.3 through 2.6 redirect today to /latest/ and we republished those guides to an archive site so those who do still need them can fine them. We didn't do this with 2.7 and 2.8 because I couldn't republish them (jenkins updates made it too difficult so I gave up). Just putting that here for history on why these proposed new redirects don't go beyond 2.6. |
Some problems (running same test as prior comment):
|
My general thoughts - though it does 'lower' the user experience ( aka I used to get to the correct module page from an ancient stack overflow page that predated collections), it is acceptable since collections are now 4+ years old. I agree with @oraNod that this is necessary both from a maintenance point of view (see k8s module currently 404s because we can't test/update 1k redirects to keep them all accurate), and from the overall stratey to move to RTD (which cannot handle 1k redirects anyway). |
oh last thoughts - once we get enough approvers, we should blast out a warning in matrix before we merge (and don't merge on a Friday lol) in case it all blows up... |
@samccann I plan to create a forum post on this but would like to get some thoughts from folks on the review list first. I'll also announce this on the bullhorn for at least two editions. |
thanks for the initial test findings! I'm going to create a separate PR for can you be more specific about the redirects you were testing in this comment, please?
Are you referring to the Line 22 in bee311b
Or did you observe something weird with the specific guide redirects here? Line 24 in bee311b
There aren't any rules configured for |
also @samccann I think that catch all rule at the end might be causing a problem: Line 37 in bee311b
I haven't set this up on the test server yet (will do that tomorrow morning b/c vpn...) but I'm getting a server 500 error now. this rule might be causing an issue. specifically the that catch all redirect came from the original config file that Toshio set up: Line 4051 in 2ee83dc
I bet we could just nuke that. it's not even in all the old 2.x config files and likely more hassle than it's worth. I think we really only need to be concerned with the pre-collections plugins and modules stuff. |
wrt user experience I think it will be possible to embed a straightforward enough search implementation with Javascript - or better ReadTheDocs query with ElasticSearch indexing - so that you can search through at the same time I don't think we should go overkill with it and create another maintenance nightmare for ourselves, especially as collections are 4+ years old at this point. tbh avoiding 404s and degrading SEO authority from thousands of broken links is more of a concern than users not being taken directly to the corresponding collection. now that I think about it we should also point users to the forum to ask for help as well as the collection index and other doc links... |
So yes, based on that tool I was using, all the guides will go someplace wonky. So anything that is NOT a plugin or NOT that vault page, seems to end up in a strange place. just tried https://docs.ansible.com/ansible/2.3/reference_appendices/YAMLSyntax.html and in that tool, it triggers this redirect: So you are correct, it's caused by the combined redirect that should be pushing older EOL docs to latest. |
The catchall redirect came from me, not toshio. It's the redirect that takes all links to 2.6 and redirects to latest. We need to keep this category of redirects as that is what ensures those top google hits that used to point to 2.4-2.6 docs now end up on latest. |
The vault redirects aren't working and I wonder if they ever did. I'll try to test this more tomorrow, but the original vault location for 2.3, 2.4 for example was https://docs.ansible.com/archive/ansible/2.4/vault.html but that doesn't match what I put in the EOL redirects in the 2.4 directory.... |
Thanks for the extra details @samccann - I'll have to come back to that. I was thinking that catchall redirect might have been causing a conflict with other redirect rules that was borking the test server.
I think the good news here is that we can add these to Read The Docs like this:
See https://docs.readthedocs.io/en/stable/user-defined-redirects.html#redirecting-a-directory I know we're limited to 100 redirects per RTD project but if all this consolidation works as expected then we'll be well below that limit for the top-level project. |
7905896
to
a4b3ab2
Compare
de740da
to
d1dd82c
Compare
d1dd82c
to
f64b482
Compare
@samccann I've rebased and updated everything. If you want to do any more validation, I've put the same I also pushed a commit to tidy up somethings by adding more comments and removing some old lines that were commented out. Something else we can do is to get rid of this configuration: https://github.com/ansible/docsite/blob/main/ansible/.htaccess
Docs on the multiviews option are in the content negotiation page and options directive page.
I guess the idea is to make it seem like users remain in the package docs, which is good. Another approach might be to improve the 404 in the landing pages and just use that but it might be frustrating for users to have to navigate back to the package docs. However my thinking would be to provide some useful resources / links in the main 404 page that could help folks with finding what they need more easily. Since we have a custom 404 page we should make the most of it. Finally, I'd suggest the next steps before merging this are to finalize that new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My directive syntax is sooo rusty, I don't want to "approve" as in "I've verified this for correctness", but I'm a big +1 for the overall approach.
As I previously shared with @oraNod privately, I think this approach is really the only reasonable way to go- while I hate breaking links, we just don't have the resources to permalink everything we've ever done while both accounting for all past structural/content/ownership changes and avoiding paralysis for future changes.
Thanks!
Testing the changes in this PRI've been testing the consolidated redirect rules in the Test methodologyI created a couple of scripts to iterate over URLs and return the http status code for pages in the Ansible package docs. You can find the scripts and all the test details here: https://github.com/oraNod/url-status-checker (I plan to add that Python script to this repo but I want to make a few tweaks to it first.) I used Complete details about how everything works is available in the README here: https://github.com/oraNod/url-status-checker/blob/main/README.md Test resultsI've attached txt and csv files generated from the url checker script in this tarball:
Target page 404sApproximately 870 target pages (the page to which you are redirected) return a 404 status. A large number of these pages get created by the catch all redirect rules when a page from an older version gets redirected to a non-existent latest version. For example, this page in 2.10 docs: https://docs.ansible.com/ansible/2.10/collections/amazon/aws/aws_az_facts_module.html The catch all rule constructs a redirect to Another example are pages that get renamed, such as Many of these are 404 pages that already exist outside of this PR. For example:
Some of the 404 pages are caused by the catch all rules in this PR though. For example:
I think it's reasonable to accept the fact that older pages will break and 404s are going to happen. The target page 404s are all for Ansible 2.x versions and there are less than 900 of them out of 19,646 URLs tested. That is less than 5% of Ansible 2.x redirects leading to a 404 page. |
8a6da4a
to
a10f424
Compare
link: "https://docs.ansible.com/ansible/latest/collections/all_plugins.html" | ||
builtin_index: | ||
label: Index of all modules and plugins contained in ansible-core | ||
link: "https://docs.ansible.com/ansible/latest/collections/ansible/builtin/index.html" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nicer if this link goes to https://docs.ansible.com/ansible-core/...
, but unfortunately there's no latest
equivalent for ansible-core...
.htaccess
Outdated
##################################################################### | ||
|
||
# Redirect plugin and module pages for devel and latest | ||
RedirectMatch permanent "^/ansible/(devel|latest)/(plugins|modules)/(.+)\.html$" "/collections.html" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One big downside of this redirect is that it breaks a lot of existing links on the web (in blog posts, stack overflow answers, mailing list answers, ...) from Ansible 2.9 and before that right now still work.
I have no idea how frequently these URLs are still used though. Finding the module/plugin in question isn't too hard with the new collections.html page, though, so I guess it will be OK...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look @felixfontein I can also see this is going to hose the links to these pages: https://docs.ansible.com/ansible/latest/plugins/
I overlooked those plugin pages, but glad I caught them. I'll raise this at the next DaWGs meeting but maybe we should remove this redirect rule and avoid globbing any latest or devel urls to the collections.html
page.
Either that or we add a negative lookahead for the plugin pages which do exist, such as:
RedirectMatch permanent "^/ansible/(devel|latest)/modules/(.+)\.html$" "/collections.html"
RedirectMatch permanent "^/ansible/(devel|latest)/plugins/(?!action\.html|become\.html|cache\.html|callback\.html|cliconf\.html|connection\.html|docs_fragment\.html|filter\.html|httpapi\.html|inventory\.html|lookup\.html|module\.html|module_util\.html|netconf\.html|plugins\.html|shell\.html|strategy\.html|terminal\.html|test\.html|vars\.html)/(.+)\.html$" "/collections.html"
An alternative is to take all the latest source pages and redirect them to collections using the https://pypi.org/project/sphinx-reredirects/ extension.
It doesn't remove the maintenance overhead and maybe adds to the build time. We could probably add some regex to reduce the number of redirects that we'd need and point to specific collections, for example:
redirects = {
"modules/azure_*": "collections/azure/index.html",
}
I guess we can see but we need to rethink the devel and latest rules here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that works like a champ. I tried an experimental commit: oraNod/ansible-documentation@ec0790a
And deployed it to the test site on pages. All the latest redirects are there and it doesn't seem to add to the build time.
@samccann (and all) Heads up that I've pushed a final commit: 467753a This restores the redirects for any "latest" urls so that the changes in this PR are limited in scope to only the 2.x versions. We've got a couple of options for the "latest" urls and can do them separately. I'll send related PRs and update the forum post to explain everything. |
This PR makes the following changes:
As a result of this change users that access plugin and module pages will be redirected to collections.html. For example, this link is available from an
ansible.com
blog post: https://docs.ansible.com/ansible/latest/modules/k8s_module.htmlA rule exists to redirect this page to the corresponding collection page, which has been moved and results in a 404 when the redirect is followed:
docsite/ansible/10/.htaccess
Line 2466 in 2ee83dc
Instead of having to maintain all the rules in the htaccess configuration files between releases, we can redirect users to a single page. While true that this will require users to search for the appropriate documentation, it does avoid 404s and any resulting SEO degradation.
For any reviewers, here's a link to the
collections.html
page on the RTD preview build: https://ansible--261.org.readthedocs.build/collections.htmlTo evaluate these changes on the test server, use this branch in my fork because it tests the
collections.html
page on the test server. Note to say that we can't actually evaluate these changes with the test server because the rsync command that the jenkins job uses doesn't include the--delete
flag so the old.htaccess
configuration files aren't getting removed. They aren't playing nicely with the redirects here. Plan B is to stand up a test server somewhere. I don't want to prune old files off the actual test server just to validate these redirects.