
I'm building a SPA (single-page application), so when a browser requests a page from my server, it only receives a small HTML document and a big JavaScript app that then requests the appropriate data from the server, renders the HTML locally, and generally drives the whole app. Think of apps such as Gmail or Google Maps that never reload the page.

This makes apps feel very snappy, but it means that if the user agent doesn't execute JavaScript, there's no content. That's a problem when the site is being indexed by search engines, or when another app needs to extract content from a page (think of posting links to Facebook, Twitter, or LinkedIn, which fetch a snippet of the content).

To work around that problem, I'm pre-rendering pages by running the JavaScript part on the server. It works, but it's rather slow. Since most of the time this server-side execution of JavaScript won't be needed, I'm thinking of whitelisting certain browsers, like Chrome, Safari, Firefox, and even recent versions of IE, so that they skip the pre-rendering.
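Since this decision has to happen before anything is sent back, it can only be based on the request headers. Here is a minimal sketch of the idea, assuming a TypeScript/Express-style server; the browser patterns and the prerender() helper are purely illustrative, not an authoritative list or API.

```typescript
import express, { Request, Response } from "express";

const app = express();

// User-agent substrings of browsers we trust to run the JavaScript themselves.
// Illustrative only; extend or tighten as needed.
const BROWSER_PATTERNS: RegExp[] = [/Chrome\//, /Safari\//, /Firefox\//, /Trident\/7/];

function looksLikeBrowser(userAgent: string): boolean {
  return BROWSER_PATTERNS.some((pattern) => pattern.test(userAgent));
}

// Hypothetical stand-in for the slow server-side JavaScript render.
async function prerender(url: string): Promise<string> {
  return `<html><body>Pre-rendered content for ${url}</body></html>`;
}

app.get("*", async (req: Request, res: Response) => {
  const ua = req.headers["user-agent"] ?? "";
  if (looksLikeBrowser(ua)) {
    // Whitelisted browser: serve the normal SPA shell (small HTML + big JS app).
    res.sendFile("/var/www/spa/index.html");
  } else {
    // Unknown agent (possibly a bot): pay the cost of pre-rendering.
    res.send(await prerender(req.originalUrl));
  }
});

app.listen(3000);
```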

Would this work, or do most useful bots out there identify themselves as browsers? How can I gather this information? Is there any source on the user-agent strings of good and bad bots?

Pablo Fernandez
  • If you're supporting non-JavaScript clients, why aren't you detecting presence of JavaScript, rather than relying on user-agent? Suppose I turn JavaScript off in my user agent. Then your app should just show me the cached pre-rendered version. – Brandin Sep 08 '15 at 15:42
  • @Brandin pre-rendering has a negative performance impact that I'm trying to avoid when it's not needed, which is the case for most browsers. By the time I can verify the presence of JavaScript, it's too late to decide whether to pre-render or not. – Pablo Fernandez Sep 09 '15 at 16:48
  • As an alternative, why not just whitelist the bots that you care about (e.g. Google's bots) and add others on an as-needed basis? E.g. only do the expensive pre-rendering for whitelisted bots where you know it's needed. If a bot "lies" about what it is, then you don't have to care about it! – Brandin Sep 09 '15 at 17:52
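A minimal sketch of the whitelist approach from that last comment; the bot patterns below are just examples of crawlers you might care about (Google, Bing, and the social-preview fetchers), not an exhaustive list.

```typescript
// Pre-render only for known, wanted bots; everything else gets the plain SPA shell,
// so a bot that lies about being a browser costs nothing extra.
const KNOWN_BOTS: RegExp[] = [
  /Googlebot/i,           // Google search
  /bingbot/i,             // Bing search
  /facebookexternalhit/i, // Facebook link previews
  /Twitterbot/i,          // Twitter cards
  /LinkedInBot/i,         // LinkedIn link previews
];

export function shouldPrerender(userAgent: string | undefined): boolean {
  return userAgent !== undefined && KNOWN_BOTS.some((bot) => bot.test(userAgent));
}
```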

1 Answer


I would suggest you take a look at things like AJAX crawling to guide your approach here.

Pre-rendering every request does sound like overkill, but if you need JavaScript to compute the page content, it's better to run the JavaScript pre-render only when a bot (such as Googlebot) requests a page like http://mySPA/example.net/SPA#That-one-page-with-the-stuff, and to return the normal page (lite HTML with JavaScript doing the heavy lifting) when an actual user requests http://mySPA/example.net/SPA.
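For reference, the AJAX crawling scheme works on "#!" URLs: the crawler rewrites a URL like /SPA#!some-page as /SPA?_escaped_fragment_=some-page, so the server can detect the crawler's request without any user-agent sniffing. A rough TypeScript/Express sketch of that convention; the paths and the renderSnapshot() helper are hypothetical.

```typescript
import express, { Request, Response } from "express";

const app = express();

// Hypothetical stand-in for the expensive server-side render of one SPA state.
async function renderSnapshot(path: string): Promise<string> {
  return `<html><body>Snapshot of ${path}</body></html>`;
}

app.get("/SPA", async (req: Request, res: Response) => {
  const fragment = req.query._escaped_fragment_;
  if (typeof fragment === "string") {
    // A crawler asked for /SPA#!<fragment>, rewritten as
    // /SPA?_escaped_fragment_=<fragment>: serve the HTML snapshot.
    res.send(await renderSnapshot(`/SPA#!${fragment}`));
  } else {
    // A real user: serve the lite HTML shell and let the JavaScript app take over.
    res.sendFile("/var/www/spa/index.html");
  }
});

app.listen(3000);
```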

Google calls this an "HTML snapshot" and will honor it in search results. I'm not sure whether Facebook plays nice too.

Tersosauros
  • When you serve the pre-rendered HTML snapshot, make sure you include a script hook back into the main application. That way you avoid deep links stalling the app and leaving the user stuck on a static (bot-friendly) page. – Tersosauros Sep 26 '15 at 08:51
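To illustrate the script hook from that comment, here is a small sketch (in TypeScript, with hypothetical paths) of wrapping the pre-rendered body so the snapshot still loads the SPA bundle and the client-side app can take over after the static content is shown.

```typescript
// Wrap a pre-rendered body in a full page that also loads the SPA bundle,
// so a deep-linked user first sees static content and then the app boots.
function wrapSnapshot(renderedBody: string): string {
  return `<!DOCTYPE html>
<html>
  <head><title>My SPA</title></head>
  <body>
    <div id="app">${renderedBody}</div>
    <!-- Script hook back into the client-side app so the page doesn't stay static. -->
    <script src="/static/spa-bundle.js"></script>
  </body>
</html>`;
}
```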