Jekyll with Human Friendly URLs
I’ve been using Jekyll to generate static web sites and hosting them on Rackspace Cloud Files, which uses Akamai’s content delivery network (CDN).
I have a CDN-enabled Cloud Files container that is configured to serve static web site files. This means Cloud Files and the Akamai CDN serve all my static web sites, and I do not need to run or manage my own web servers. I simply use Cloud Files to store and serve my site. Two bonuses of this setup: pages are served to visitors very fast, and the site can easily handle a very large number of visitors. Basically, my static site can handle web-scale traffic.
What are Clean URLs?
I’m an advocate of using clean URLs, or human-readable URLs, in my sites. Clean URLs have many benefits:
- Search engine optimization
- Improved usability
- Improved accessibility
- Simpler URLs
- URLs that are easier to remember
- No implementation details of your site in the URL (for example, no .php, .html, or .asp extensions)
Here’s an example of an un-clean URL:
http://www.domain.com/category/post-name-here.html
And here’s an example of a clean URL:
http://www.domain.com/category/post-name-here
Notice there is no .html and the URL looks better. Cleaner.
What is Jekyll?
Jekyll is a simple, blog-aware static site generator written in Ruby. It lets you create text-based posts and pages, plus a default layout that is used across all of your posts. You can change the look and feel of your site by modifying the default layout and regenerating the site, and the changes are applied to all of your blog posts. Jekyll generates static files that you can serve from a CDN or host yourself on your own server.
Jekyll does not create clean URLs by default, however. It will append .html to the file name and reference URLs with the .html suffix. Not ideal for a clean URL.
How To Get Clean URLs with Jekyll
I’m using a Jekyll plugin which rewrites the file name and URL reference so that the .html suffix is not included. It turns your blog-post.html file name into “blog-post”, without the .html extension.
To use clean URLs with Jekyll, you’ll need to set your permalink format in your Jekyll _config.yml and use a Jekyll plugin to generate your web site files without the .html extension.
Here is the _config.yml permalink structure I use for my site:
permalink: /:categories/:title
This will create a friendly URL in the form of: http://www.domain.com/articles/my-awesome-article
If you don’t want to display the category in the URL, you can change the permalink to:
permalink: /:title
And this will create a URL in the format of: http://www.domain.com/my-awesome-article
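As a rough illustration of how the two permalink settings above map to URL paths, here is a small Python model of the placeholder substitution. It is only a sketch: the function name expand_permalink is made up for this example, and it is not how Jekyll implements permalinks internally.

```python
from typing import Sequence


def expand_permalink(template: str, categories: Sequence[str], title: str) -> str:
    """Fill in :categories and :title in a Jekyll-style permalink template.

    Hypothetical helper for illustration only; Jekyll's real logic
    handles many more placeholders and edge cases.
    """
    url = template.replace(":categories", "/".join(categories))
    url = url.replace(":title", title)
    # A post without categories would leave "//" behind; collapse it.
    while "//" in url:
        url = url.replace("//", "/")
    return url


print(expand_permalink("/:categories/:title", ["articles"], "my-awesome-article"))
print(expand_permalink("/:title", ["articles"], "my-awesome-article"))
```

The first call prints the path with the category, the second the path without it, matching the two URL forms shown above.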
Check out the Jekyll plugin I’m using on my GitHub here: jekyll-rackspace-cloudfiles-clean-urls
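The plugin itself is written in Ruby and runs inside Jekyll. To make the idea concrete, the same end result (generated files without the .html extension) can be approximated as a post-build step in Python. This is a minimal sketch under a few assumptions of mine, not the plugin's code: the function name is hypothetical, index.html files are left alone so a directory index page still works, and a file is skipped if something with the extensionless name already exists.

```python
from pathlib import Path


def strip_html_extensions(site_dir: str) -> list:
    """Rename every *.html file under site_dir to drop its extension.

    Returns the list of new paths. Hypothetical helper, not part of Jekyll.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(Path(site_dir).rglob("*.html")):
        if not path.is_file():
            continue
        if path.name == "index.html":
            continue  # assumption: index pages keep their name
        target = path.with_suffix("")
        if target.exists():
            continue  # avoid clobbering a file or directory of the same name
        path.rename(target)
        renamed.append(target)
    return renamed
```

Running strip_html_extensions("_site") after a build would turn _site/articles/blog-post.html into _site/articles/blog-post, assuming the site is generated into Jekyll's usual _site output directory.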
Rackspace Cloud Files with Jekyll and Clean URLs
I came across another problem: Rackspace Cloud Files does not know what type of file “blog-post” is, because the name has no file extension. When you browse to a clean URL on my CDN-hosted site, your browser tries to download the file instead of rendering it as HTML. The reason is that Cloud Files can’t peer inside the file, see that it’s all HTML, and apply the correct content type. I needed to set the content type myself and tell Rackspace Cloud Files that “blog-post” is type “text/html” so that a web browser can display it properly.
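The same behaviour is easy to reproduce with any tool that guesses a content type from the file name alone. The snippet below uses Python's standard mimetypes module as an analogy; it does not show how Cloud Files itself detects types, only why an extensionless name gives a guesser nothing to work with.

```python
import mimetypes

for name in ("blog-post.html", "blog-post"):
    # guess_type() looks only at the name, never at the file contents.
    content_type, _encoding = mimetypes.guess_type(name)
    print(name, "->", content_type)

# blog-post.html -> text/html
# blog-post -> None
```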
To solve this problem, I wrote a Python helper script that applies the “text/html” content type automatically for my Jekyll-generated sites. The script uploads my site to Rackspace Cloud Files and then checks each uploaded file to see whether it is an HTML file. If it is, the script tells Cloud Files the file is type “text/html”, so that Cloud Files serves it in a way a browser can render.
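The detection half of that idea can be sketched in a few lines of Python. This is not the helper script itself: the function names, the HTML markers it looks for, and the fallback type are all assumptions made for illustration, and the actual upload to Cloud Files is left out because it depends on the client library you use.

```python
import mimetypes
from pathlib import Path

# Assumption: a generated page starts with a doctype or an <html> tag.
HTML_MARKERS = (b"<!doctype html", b"<html")


def looks_like_html(path: Path) -> bool:
    """Peek at the start of a file and check for an HTML marker."""
    head = path.read_bytes()[:1024].lstrip().lower()
    return head.startswith(HTML_MARKERS)


def content_type_for(path: Path) -> str:
    """Pick a content type: by extension first, then by sniffing for HTML."""
    guessed, _encoding = mimetypes.guess_type(path.name)
    if guessed:
        return guessed
    if looks_like_html(path):
        return "text/html"
    return "application/octet-stream"  # generic fallback for unknown files


def plan_upload(site_dir: str) -> dict:
    """Map each file's object name (relative path) to its content type."""
    root = Path(site_dir)
    return {
        p.relative_to(root).as_posix(): content_type_for(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

The dictionary returned by plan_upload could then drive whatever upload call your Cloud Files client provides, passing the content type along with each object.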
Download my Cloud Files / Jekyll helper script from my GitHub: jekyll-rackspace-cloudfiles-clean-urls