Thursday, September 22, 2022

Tangled issues with permanent HTTP redirects

We have a general purpose web server, which includes user home pages. Historically, every so often people moved on but wanted their home pages to redirect elsewhere, and we generally obliged, using various Apache mechanisms to set up HTTP redirections (most recently with Apache's RewriteMap). However, we haven't had any new requests of this sort for years and years, which means that by now all of our existing redirections are very old (and, naturally, not all of them still go to working destinations).

When we set up an HTTP redirection, we have historically tended to initially make it a 'temporary' redirection (ie, HTTP status 302). Partly this is because it's usually the Apache default, and partly this is because we're concerned that we may have made a mistake (either in configuration or intentions) and historically permanent redirects could be cached in browsers, although I'm not sure how much that happens today. Our most recent version of redirections for people's old home pages was set up this way, and that's how they stayed for four years.
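To make the temporary versus permanent distinction concrete, here is a minimal sketch of how the standard HTTP redirect status codes divide up. The function name `classify_redirect` is made up for illustration; the status codes themselves come from Python's standard `http.HTTPStatus`.

```python
from http import HTTPStatus

# Redirect statuses that tell clients the move is permanent (301, 308).
PERMANENT = {HTTPStatus.MOVED_PERMANENTLY, HTTPStatus.PERMANENT_REDIRECT}
# Redirect statuses that tell clients to keep using the original URL (302, 303, 307).
TEMPORARY = {HTTPStatus.FOUND, HTTPStatus.SEE_OTHER, HTTPStatus.TEMPORARY_REDIRECT}


def classify_redirect(status: int) -> str:
    """Return 'permanent', 'temporary', or 'other' for an HTTP status code."""
    if status in PERMANENT:
        return "permanent"
    if status in TEMPORARY:
        return "temporary"
    return "other"
```

For example, `classify_redirect(302)` reports the kind of redirection we started with, and `classify_redirect(301)` the kind we later switched to.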

Recently we had cause to look at how frequently these old redirections were still being used. To my surprise, a fair number of them were being used fairly often, and not just by search engines crawling them. Some of these uses may be from old URLs embedded in various places, but some of them seem to come from people following search engine links. I don't know for sure that search engines would stop providing these links if we'd been using permanent HTTP redirections, but it probably wouldn't hurt. So, more than four years after we set things up as temporary redirections just in case, we got around to making them permanent redirections. Quite possibly we should have left ourselves a note to do it sooner than that, once everything was proven and working.
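As an illustration of this sort of check (not the method we actually used), here is a rough sketch that counts redirected requests per URL path and separates likely crawlers from everything else. It assumes Apache's 'combined' log format, and the list of crawler hints and the name `count_redirect_hits` are hypothetical; real crawler detection is much messier than a substring match on the User-Agent.

```python
import re
from collections import Counter

# Assumes Apache 'combined' log format:
# host ident user [time] "METHOD path proto" status bytes "referer" "user-agent"
LOG_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical, deliberately crude hints that a User-Agent is a crawler.
CRAWLER_HINTS = ("bot", "crawler", "spider")


def count_redirect_hits(lines):
    """Count 301/302 responses per path, split into (others, crawlers)."""
    others, crawlers = Counter(), Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m or m["status"] not in ("301", "302"):
            continue  # unparseable line, or not one of our redirections
        agent = m["agent"].lower()
        is_crawler = any(hint in agent for hint in CRAWLER_HINTS)
        (crawlers if is_crawler else others)[m["path"]] += 1
    return others, crawlers
```

Feeding it the lines of an access log gives two counters; paths with many hits in the first one are redirections that people (or at least non-crawlers) are still using. The captured referer field could be used in the same way to see which hits arrive from search engine result pages.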

Except, of course, there is a catch. Every so often we want to remove such a redirection (for example, because it's broken, or no longer desired), and then perhaps later the login name and thus the home page URL will be reused for another person. When that happens, we definitely don't want search engines (or browsers) to be convinced that '<us>/~user/' is permanently redirected elsewhere, and to refuse to index or use the new, real, non-redirected version. If permanent HTTP redirections make it less likely that search engines and browsers will notice the change, we should probably keep our redirections as temporary ones, even if this has other effects.

In part this is a conflict between the needs of the old and the new users of these URLs (or of any URLs). Permanent redirects may help the old users but hurt the new users, while temporary redirects may be the reverse. In theory this means that we should prioritize the needs of new users (who will be our current users) and use temporary redirects, but on the other hand the new users are generally only a theoretical future thing while the redirections for the old users exist now. I don't think I have any simple answers here.

(Let's take it as a given that the redirections will eventually go away and the URLs will eventually be reused. In some ideal worlds, URLs would be permanently claimed by and for their first use, but in practice this is not the world we live in.)


