25

We have a database of resources, such as products or blog posts, and we need to design a URL scheme to address them on the public website.

Here are two examples that are database ID bound:

Here's an example that's friendly:

(A little glimpse into my browsing life there)

I like the friendly URLs because you get an idea of what's at the other end of the URL when you hover over it or see it in an email or document. It's also better for SEO, or at least it used to be.

What happens when the document or product is renamed, either because it changed (Wiki may not change, but our resources could) or because of a typo? Our resources are very technical, with long words, and are error-prone.

Also, we have a database ID, which is a number. Let's look at an idea for an address of a video using a pretend rental store:

The ID is obvious and is used in the DB look-up. Fine.

The sliding-doors bit is non-unique and is just generated from the video title. It could be verified on GET, so that if gliding-doors was entered and doesn't match what's really in doc 287171, the server responds with 404.
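To make the two behaviours concrete, here is a minimal sketch of generating the friendly part from a title and optionally verifying it on lookup. The `VIDEOS` dictionary, `slugify` and `get_video` are hypothetical names invented for illustration, and the slug rules (lowercase ASCII, hyphens) are an assumption rather than anything the site mandates.

```python
import re
import unicodedata
from typing import Optional, Tuple

# Hypothetical in-memory stand-in for the database table.
VIDEOS = {287171: "Sliding Doors"}


def slugify(title: str) -> str:
    """Turn a title into a lowercase, hyphen-separated slug."""
    # Decompose accented characters, then drop anything that is not ASCII.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def get_video(video_id: int, slug: str, verify: bool = True) -> Tuple[int, Optional[str]]:
    """Return (status, title). The ID drives the lookup; the slug is only checked."""
    title = VIDEOS.get(video_id)
    if title is None:
        return 404, None
    if verify and slug != slugify(title):
        # Strict mode: a slug that does not match the stored title is rejected.
        return 404, None
    return 200, title
```

With `verify=True`, a request for ID 287171 with the slug gliding-doors is rejected; with `verify=False`, the slug is purely decorative and any text resolves to the same record.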

Or maybe it could be ignored, allowing humans to stick whatever they like in there, if someone ever cared to. So this URL would also work:

The issue with verifying the friendly part is, as mentioned, the problem of renaming or typo correction. If the name changed, and in our domain that does happen, we don't want to break the URLs that are out there, so should we:

  • Just not verify the friendly part.

  • Verify, but add a 'history' of friendly parts to the database record so any previous friendly IDs still work!

Your thoughts and ideas are welcome.

Luke

Luke Puplett
  • 11
    even this very site uses a combination: `http://programmers.stackexchange.com/questions/255684/providing-friendly-urls-for-a-website-vs-realities-of-database-ids` (using a non-verified version in light of title changes). Also, the shorter "share" link is just the id: `http://programmers.stackexchange.com/q/255684/25768` (plus the user id for badge tracking) – ratchet freak Sep 08 '14 at 09:56
  • 11
    If you have a unique id in your URL I don't see why you would want to verify the slug part at all. Use it for the looks and ignore it for the lookups. – thorsten müller Sep 08 '14 at 09:58
  • If either of you want to give a proper answer, I'll vote up so you get the points. I'll let the votes come in and award the answer to the most-voted in a couple of days. – Luke Puplett Sep 08 '14 at 10:19
  • 1
    http://programmers.stackexchange.com/questions/231483/slugify-via-helper-or-store-slug-on-database – gnat Sep 08 '14 at 11:10
  • 3
    Never knew the term slug before. I must have been under a rock. Geddit? – Luke Puplett Sep 08 '14 at 12:31

5 Answers

7

Keeping the ID in the URL is the most future-proof method and, as you demonstrated, the URLs can still look relatively good.

Another option used by multiple projects is to keep a history of previously used slugs. When the title changes, you update the slug, and if someone requests an obsolete slug, you search the list of old slugs. That way old slugs can be reused for new content (or not, depending on your implementation).

WordPress did that, and so did the friendly_id gem, which is probably the most used gem for managing friendly IDs in Rails.
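As a rough illustration of the slug-history idea (not how WordPress or friendly_id actually implement it), the sketch below keeps live slugs and retired slugs in two maps. `SlugStore` and its methods are hypothetical names, the dictionaries stand in for database tables, and answering retired slugs with a 301 is an assumption.

```python
from typing import Dict, Optional, Tuple


class SlugStore:
    """Hypothetical in-memory store; a real site would use database tables."""

    def __init__(self) -> None:
        self.current: Dict[str, int] = {}  # live slug -> resource ID
        self.history: Dict[str, int] = {}  # retired slug -> resource ID

    def set_slug(self, resource_id: int, new_slug: str) -> None:
        """Assign a slug, retiring the resource's previous one into history."""
        owner = self.current.get(new_slug)
        if owner is not None and owner != resource_id:
            raise ValueError(f"slug {new_slug!r} is already live for resource {owner}")
        for slug, rid in list(self.current.items()):
            if rid == resource_id:
                del self.current[slug]
                self.history[slug] = resource_id
        # A retired slug that goes live again (for any resource) leaves history.
        self.history.pop(new_slug, None)
        self.current[new_slug] = resource_id

    def resolve(self, slug: str) -> Tuple[int, Optional[int]]:
        """Return (status, resource_id): 200 if live, 301 if retired, else 404."""
        if slug in self.current:
            return 200, self.current[slug]
        if slug in self.history:
            return 301, self.history[slug]
        return 404, None
```

In this sketch a retired slug may be claimed by new content, at which point it stops redirecting to the old resource. Whether to allow that reuse is the implementation choice mentioned above.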

Also, while I like good-looking URLs, I think it's important to remember that this is most likely a feature used by more tech-savvy users. Some browsers are even starting to hide URLs (or parts of them).

mbillard
  • 2
    This slug history is what I was considering. Since posting the question, I've noticed many big-name sites that have a slug that is not checked; you can alter it to say anything. http://www.amazon.co.uk/Blah-Blah-Blah/dp/B004R276L8 works. StackExchange is clever since it 'corrects' the slug and redirects the browser to ensure the right link is shown and shared. – Luke Puplett Sep 09 '14 at 12:12
  • A "slug" is less useful for people, and more useful for Search Engine Optimization, as a "slug" or "friendly URL" should have keywords relating to the page's content. Advanced users aren't the reason to include friendly URLs in your site. Search engine rankings tend to be the main reason. – Greg Burghardt Sep 09 '14 at 16:36
  • I disagree. URLs with just IDs are hard to work with; it's hard to remember from a list of them which one you might want to return to, or whether there's going to be something inappropriate at the other end of the link. Chrome's address bar suggests on any part of the URL, too, which is useful. – Luke Puplett Sep 10 '14 at 13:10
  • 1
    @LukePuplett yes I believe SE's way of dealing with URLs is the easiest when it comes to slugs. – mbillard Sep 10 '14 at 20:23
  • @GregBurghardt the only difference is in the clickthrough rate, users tend to click slightly more on friendly URLs: http://stackoverflow.com/questions/505793/do-seo-friendly-urls-really-affect-a-pages-ranking – mbillard Sep 10 '14 at 20:25
3

I have used two different scenarios in the past.

  1. /id/some-slug, where the id is used for the lookup and the slug is not. The slug can therefore be anything, but when it does not match the actual slug, the user is redirected to the current version.

  2. /permalink for cases where we didn't want an id in the url or where the url should never change, even though there is an id available (see [1] and [2]). Of course, in this case the permalink is used for the lookup. Both the current slug and the permalink (the first slug) are stored in the database.

Neither of these approaches requires you to keep a history of slugs in your database, which would get problematic very soon.
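The first scenario can be sketched as follows. This is only an illustration: `SLUGS`, `PATH_RE` and `route` are made-up names, the dictionary stands in for the database, and using a 301 (rather than a 302) for the redirect is an assumption.

```python
import re
from http import HTTPStatus
from typing import Dict, Optional, Tuple

# Hypothetical data: resource ID -> current slug.
SLUGS: Dict[int, str] = {287171: "sliding-doors"}

# Matches '/<id>' optionally followed by '/<anything-without-slashes>'.
PATH_RE = re.compile(r"/([0-9]+)(?:/[^/]*)?/?")


def route(path: str) -> Tuple[HTTPStatus, Optional[str]]:
    """Resolve '/<id>/<slug>' and return (status, redirect target or None)."""
    match = PATH_RE.fullmatch(path)
    if match is None:
        return HTTPStatus.NOT_FOUND, None
    resource_id = int(match.group(1))
    current = SLUGS.get(resource_id)
    if current is None:
        return HTTPStatus.NOT_FOUND, None
    canonical = f"/{resource_id}/{current}"
    if path != canonical:
        # Wrong, stale, or missing slug: send the client to the current URL.
        return HTTPStatus.MOVED_PERMANENTLY, canonical
    return HTTPStatus.OK, None
```

Only the ID is ever looked up, so renaming a resource means updating one slug column; old links keep working because they redirect to whatever the slug is today.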


ps: In the second case you'll need some very specific routing to keep social credits:

  • if you want, redirect users to the current (non permalink) url
  • have the permalink used as the url in the social buttons
  • always redirect the facebook crawler to the permalink

See [1] and [2] again.

Lode
  • Why will it be problematic? If I keep an ID and the slug can be anything, the visitor will still go to the actual page. Will it be harmful for SEO? – Jnanaranjan Nov 23 '19 at 11:21
  • You mean keeping a history of slugs? What do you do when someone wants to re-use such a slug, for the same or another id? How do you design the database and/or code to prevent multiple redirects? Do you want to hide existence after deletion, and are redirects exposing previous existence? None of this is impossible, but it raises all kinds of questions which I'd rather just prevent by design. – Lode Nov 23 '19 at 16:03
  • What I wanted to say is that if the ID is present in the URL, then no matter what the slug is, it will be redirected to the requested page. Then the slug history doesn't matter. I agree that it is problematic for android though. – Jnanaranjan Nov 25 '19 at 04:51
  • 1
    Ah okay. That is what I described as scenario 1, right? Or do you mean something else? – Lode Nov 26 '19 at 16:16
  • Yes. That is correct. – Jnanaranjan Nov 27 '19 at 06:35
2

What happens when the document or product is renamed?

HTTP response 301 (Moved Permanently) was designed for this purpose. If any client requests the old URI, you simply send it the new URI and it can redirect to that.

The sliding-doors bit is non-unique and just generated from the video title, it could be verified on GET, so if gliding-doors was entered and doesn't match what's really in doc 287171 it responds 404.

If I follow correctly, this is duplicating work: you have both a name identifier for the resource and an id in the same URI. That doesn't serve any purpose.

If you are worried about multiple movies having the same name, you can add extra information about the film to the URL:

http://vidsyeah.com/video/2000/sliding_doors
http://vidsyeah.com/video/1932/sliding_doors

or

http://vidsyeah.com/video/studios/paramount/sliding_doors
http://vidsyeah.com/video/studios/warnerbros/sliding_doors

Having said that, there is nothing wrong with using IDs if that makes sense for your data model, particularly if the only thing you are grouping by is that they are videos.

http://vidsyeah.com/video/210232
http://vidsyeah.com/video/2342

The client, whether a computer or a human user, shouldn't be too reliant on the URI structure in the first place; it should be looking at the content you have returned to figure out which resource to find.

There is nothing wrong with having a sensible URI system that makes it easy for someone to guess the location of a resource or to navigate up and down the structure based on shared properties (i.e. all movies in 2004), but your system should not rely on that, and no client should break if you change your URIs.

Or to put it another way, you should be able to change overnight from

http://vidsyeah.com/video/studios/paramount/sliding_doors

to

http://vidsyeah.com/video/12323

and no client should break because the clients should be looking at content not URLs.

Cormac Mulhall
  • Like Jon's answer, I think you're not wearing your UX hat when thinking about this. I want to increase the usability of the address. See my comment in the question: "I like the friendly URLs since you have an idea about what's on the end of the URL when you hover or see it in an email or document. It's better for SEO, or it used to be." – Luke Puplett Sep 09 '14 at 12:09
  • 2
    In order to throw a 301, I'd need to be able to lookup the correct resource, thus I'd need a history. – Luke Puplett Sep 09 '14 at 12:13
  • 1
    You would need a history, but if you have a site with resources that change that is a good idea anyway. – Cormac Mulhall Sep 09 '14 at 12:28
  • There is no problem with friendly URIs. I wouldn't use a scheme where the URI can be anything but still works as long as it has an ID at the end. That doesn't really solve any issue (the user still has to remember the ID) and introduces a confusing URI scheme (a user might legitimately ask why two different URIs, one with a spelling mistake, go to the same resource). – Cormac Mulhall Sep 09 '14 at 12:31
  • 1
    If you are concerned about spelling mistakes in URIs, a common way to deal with this is to suggest URIs on the 404 error page for the incorrectly spelt URL. You can do a word pattern search and give back what you think the user might be looking for. – Cormac Mulhall Sep 09 '14 at 12:33
  • Love the last idea. – Luke Puplett Sep 10 '14 at 13:02
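The "suggested URIs on the 404 page" idea from the comments above could be sketched with Python's standard `difflib` module. `KNOWN_SLUGS` and `suggest` are hypothetical names, the slug list is made up, and the 0.6 similarity cutoff is simply the library default rather than a tuned value.

```python
import difflib
from typing import List

# Hypothetical list of slugs that currently exist on the site.
KNOWN_SLUGS = ["sliding-doors", "swinging-doors", "the-doors"]


def suggest(slug: str, limit: int = 3) -> List[str]:
    """Return existing slugs that look similar to the one that was not found."""
    return difflib.get_close_matches(slug, KNOWN_SLUGS, n=limit, cutoff=0.6)
```

A 404 handler could call `suggest("gliding-doors")` and render the results as "Did you mean...?" links. For a large catalogue a database-side fuzzy search would probably be a better fit than scanning a list in memory.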
1

The BBC use slugs that are:

  • alpha-numeric (for compactness)
  • unique (for lookups)
  • non-sequential (so that the order things are added to the db isn't exposed)

e.g. http://www.bbc.co.uk/programmes/b006mk7h

Each public programme has both an ID and a slug. IDs can then be auto-incrementing integers as usual, and gaps aren't exposed.
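A minimal sketch of generating a slug with those three properties follows. It is not the BBC's actual scheme; `ALPHABET`, `new_public_id`, the length of 8 and the in-memory `taken` set are all assumptions for illustration (a real site would rely on a unique index in the database).

```python
import secrets
import string
from typing import Set

# Lowercase letters and digits keep the public ID compact and URL-safe.
ALPHABET = string.ascii_lowercase + string.digits


def new_public_id(taken: Set[str], length: int = 8) -> str:
    """Draw random IDs until one is not already in use, then reserve it."""
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if candidate not in taken:
            taken.add(candidate)
            return candidate
```

The internal auto-incrementing integer stays private, and the random public ID is stored next to it and used for lookups from the URL.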

Nicholas Shanks
0

From a RESTful standpoint, URIs should follow a predictable and perhaps hierarchical structure to enhance usability.

This will make them easier to use by consumers. If your data has relationships, then some sort of hierarchy would be necessary.

It looks like the scheme is: /video/[name]/[id]

If the name isn't being used for any further classification, it could be dropped in favor of /video/[id].

However, if you wish to classify the videos, then maybe the name is useful.

Examples:

  • /video/SwingingDoors/123
  • /video/SwingingDoors/124
  • /video/SlidingDoors/125
  • /video/SlidingDoors/126

It's really a design decision about how the access is modeled.

Jon Raynor
  • I think you're thinking about this from an API/site information architecture PoV. I was looking to introduce a generated friendly URL part to help humans and SEO. Apparently this is a common thing and goes by the name of 'slug'. The name is not being used for classification and is added (not dropped) to make a better UX with the URL and our site/brand. – Luke Puplett Sep 09 '14 at 12:07