SEO-friendly URLs and browser history with async page loading

In modern web sites, we load pages asynchronously to reduce the amount of data requested from the server and to improve the user's browsing experience: basically, a SPA (single-page application).

You will probably face two problems: the browser history not updating, and URLs that are not very SEO friendly.

In this article, I will show you a way to solve these two issues.

Some background

One way of loading a page asynchronously is to use the hash (#) in the URL, so the client knows what to load.
Something like this:

<a href="#article1">Go to new URL</a>
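
Client-side, a bit of JavaScript watches the hash and fetches the matching content. A minimal sketch, assuming jQuery and a made-up URL scheme where #article1 maps to /blogs/article1:

// Minimal sketch: react to hash changes and fetch the matching content.
// The /blogs/ URL scheme is an assumption for illustration.
window.addEventListener('hashchange', function () {
    var page = location.hash.substring(1); // "#article1" -> "article1"
    if (page) {
        $.ajax({ url: '/blogs/' + page, success: function (data) {
            /* inject the content into the page */
        }});
    }
});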

The problems with this approach are evident:

  • No full URL (bye bye crawler)
  • With no JavaScript enabled, it will not work
  • History buttons do not work

Without JavaScript, the URL does not work, because the hash is used only on the client. When a user clicks the link, nothing to the right of the hash symbol is passed to the server.

To solve the crawling problem, Google published an article explaining how to create crawlable URLs.

Looking at the Specification section, they suggest using the hashbang (#!) so the crawler knows how to handle the part to the right of the hash. Whatever follows the #! symbol is converted into “_escaped_fragment_=”.

In practice, when the crawler stumbles upon a URL like

/blogs#!article1

it translates it into

/blogs?_escaped_fragment_=article1

This allows us to get all the information server-side.
In C#, it is enough to read the value from the Request and then proceed with our logic.

var myParams = Request["_escaped_fragment_"];
//Do what we need with the value in myParams

In PHP, it would be something like this:

$myParams = $_GET['_escaped_fragment_'];
//Do what we need with the value in $myParams

This works, but one problem persists: no JavaScript, no links.

Progressive enhancement

This term was coined by Steve Champeon at SXSW in 2003. At that time, everyone was attached to graceful degradation: you would develop your application for the latest browser versions, then fall back on older versions by removing JavaScript features. Not a great solution in the long run.
To explain progressive enhancement, Aaron Gustafson used the example of a peanut M&M: the nut inside is the content, the chocolate coating is the CSS, and the hard candy shell is the fancy JavaScript.


We can apply this concept to our URLs by using full URLs and then adding some JavaScript on top of them.

<a href="/blogs/article1">article 1</a>

And the JavaScript:

//We enhance all the local URLs and not the ones that link outside our website
$("body").on('click', 'a:not([href^="http"], [href^="#"], [class^="tel"], [class^="mailto"])', function (e) {
    e.preventDefault();
    if ($(e.currentTarget).attr('href')) {
        $.ajax({ url: $(this).attr('href'), success: function (data) {
            /* process the data we get back from the server */
        }});
    }
});
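
The success callback is where we render what comes back. A minimal sketch, assuming the server returns a JSON payload with hypothetical title and html fields and that the page has a #content container:

// Minimal sketch of a render helper; the payload shape and the
// #content container are assumptions, not a fixed contract.
function renderPage(data) {
    document.title = data.title;   // update the tab title
    $('#content').html(data.html); // inject the new markup
}

Whatever shape you choose, reusing the same render function from both the click handler and the history handling shown later keeps forward and back navigation on a single code path.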

Server-side, we simply check whether the call is an Ajax one and act accordingly.

C#

if (Request.IsAjaxRequest())
    return Json(model, JsonRequestBehavior.AllowGet); //We return our model serialized in JSON
return View(model); //We return the full page.

PHP

if (isset($_SERVER['HTTP_X_REQUESTED_WITH']) &&
    strtolower($_SERVER['HTTP_X_REQUESTED_WITH']) === 'xmlhttprequest') { 
  // Handle Ajax request 
} else { 
  // Handle non Ajax request 
}

With this we solve almost all the problems:

  • We have full URLs for the crawler
  • Everything works with JavaScript disabled

We still need to solve one last problem: browser history navigation when JavaScript kicks in.

history.pushState

With history.pushState we can solve this last problem: updating the browser's history while loading pages with Ajax.

The HTML5 history object, besides giving us access to the user's session history, allows us to manipulate it. You can read a good explanation on the Mozilla Developer Network.

An implementation would look like this:

$.ajax({ url: $(this).attr('href'), success: function (data) {
        // Process the data we get back from the server
        // Then we manipulate the history
        history.pushState(data, "Article1's title", "blogs/article1");
    }
});

$(document).ready(function () {
    window.addEventListener('popstate', function (event) { /* handle the event */ }, false);
});

The popstate event allows us to react when the user presses the back or forward button of the browser and load the correct page.
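
A minimal sketch of that handler, assuming the state object is the same data we passed to pushState (and falling back to a fresh Ajax call when it is missing), reusing the hypothetical renderPage helper from above:

// Minimal sketch: restore the page when the user navigates the history.
window.addEventListener('popstate', function (event) {
    if (event.state) {
        renderPage(event.state); // reuse the data saved with pushState
    } else {
        // No state saved (e.g. the initial page): reload it via Ajax
        $.ajax({ url: location.pathname, success: renderPage });
    }
}, false);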

To see which browsers and versions support this feature, look here.

Using this technique, you now have full URLs, async page loading and full search engine support.

The price to pay on older browsers that do not support the history API is to disable the JavaScript enhancement. Everything will still work, except for the fancy animations you may have created as transitions from one page to another.
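
Detecting support is straightforward. A minimal sketch that wires up the Ajax enhancement only where the history API is available; enableAjaxNavigation is a hypothetical function wrapping the click handler shown earlier:

// Enable the Ajax enhancement only where pushState exists.
if (window.history && typeof history.pushState === 'function') {
    enableAjaxNavigation(); // hypothetical wrapper around the code above
}
// Otherwise the plain links keep working, just with full page loads.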

In the next article, I will show you a problem we encountered with history.pushState and Chrome, and how we solved it.

Update: the problem was fixed in version 34 of Chrome, so I will not write about it anymore. If you want to read the full story, look here and here.