Using Push State in a Marionette Application for SEO - javascript

I did my homework and read through this mini-series about pushState:
http://lostechies.com/derickbailey/2011/09/26/seo-and-accessibility-with-html5-pushstate-part-2-progressive-enhancement-with-backbone-js/
From what I understand, the hard part of implementing pushState is making sure that my server side serves the actual pages for the corresponding URLs.
I feel like this is going to be a HUGE task. Previously I was just sending a Jade page as simple as:
body
  header
  section
    div#main
  footer.site-footer
    div.footer-icons.footer-element
    div.footer-element
      span.footer-link Contact Us
      span.footer-link Terms of Service
  script(src='/javascripts/lib/require.js', data-main='/javascripts/application.js')
and I was doing all the rendering with my Marionette Layouts and Composite Views, and to be honest it was a bit complicated.
So from what I understand I need to replicate all that complicated nesting/rendering using jade on the server side for pushState to work properly?
I used Underscore templates on the client side; what is an easy way to re-use them on the server side?

It depends on what you want to do...
To "just" use pushState, the only requirement is that your server returns a valid page for each URL that can be reached by your app. However, the content returned by the server does NOT have to match what will get rendered client side. In other words, you could use a "catch all" route on the server side that always returns the page you have above, and then let Backbone/Marionette trigger its route to handle the rendering and display.
That said, if you want to use pushState for SEO, you likely want to have the static HTML sent by the server on the first call, then have the Marionette app start to enhance the interactivity. In this case, it is much more complex and you might want to experiment with using options to trigger the proper behavior (e.g. using attachView when enhancing existing HTML, showing views normally after that initial case).
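A rough sketch of that second case, assuming a Marionette region whose markup was rendered by the server (the view, selectors, and events here are made up for illustration):
// A view over pre-rendered HTML: template: false tells Marionette not to render.
var ProductsView = Marionette.ItemView.extend({
  template: false,
  events: { 'click li': 'onItemClick' },
  onItemClick: function () { /* enhance behaviour client side */ }
});

var app = new Marionette.Application();
app.addRegions({ main: '#main' });

// Construct the view over the existing element; Backbone delegates its events here.
var productsView = new ProductsView({ el: '#main .products' });

// attachView registers the view with the region without calling render(),
// so the server-rendered HTML stays exactly as it was delivered.
app.main.attachView(productsView);

// Subsequent navigation can use the normal path, which renders on the client:
// app.main.show(new SomeOtherView());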

Push state can work properly WITHOUT your server actually serving your application in the way that is suggested.
Push state is merely an alternative to hashbang URLs, and it is supported in modern browsers. Check out the History API docs here; you will see there is no mention of having your site serve your application statically at your application's URLs (but bear in mind it is opt-in).
What the article you reference is saying is that, for good SEO, you should do this. That's because you cannot guarantee that when a search engine crawls your site it will execute your JavaScript and pick up your routes, etc. So serving the site statically simply gives the search engine a way to get your content without executing any JavaScript.
Like you say, by doing this you are essentially building two sites in parallel, and it does literally double the amount of work you need to do. This may be ok if you're building a relatively simple site filled with static content, but if you are creating a complicated application, then it is probably too much in most situations.
Although I would add: if you are building an application, then SEO doesn't really matter, so it's a moot point.

Related

PHP template system vs javascript AJAX template

My PHP template looks like this:
$html = file_get_contents("/path/to/file.html");
$replace = array(
    "{title}" => "Title of my webpage",
    "{other}" => "Other information",
    ...
);
foreach ($replace as $search => $value) {
    $html = str_replace($search, $value, $html);
}
echo $html;
I am considering switching to a javascript/ajax template system. The AJAX will fetch the $replace array in JSON format and then I'll use javascript to replace the HTML.
The page would then be a plain .html file and a loading screen would be shown until the ajax was complete.
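A minimal sketch of that approach, assuming a hypothetical /page-data.json endpoint that returns the same placeholder map the PHP version uses, and a #loading element for the loading screen:
// Fetch the replacement map as JSON and swap {placeholders} in the static HTML.
// The endpoint name and the #loading element are assumptions for illustration.
fetch('/page-data.json')
  .then(function (response) { return response.json(); })
  .then(function (data) {
    // data looks like { "{title}": "Title of my webpage", "{other}": "Other information" }
    var html = document.body.innerHTML;
    Object.keys(data).forEach(function (placeholder) {
      html = html.split(placeholder).join(data[placeholder]);
    });
    document.body.innerHTML = html;
    document.getElementById('loading').hidden = true;   // hide the loading screen
  })
  .catch(function () {
    // If the AJAX fails, handle it with an error message instead of the content.
    document.getElementById('loading').textContent = 'Could not load page data.';
  });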
Are there any real advantages to this, or is the transition a waste of time?
A few of the reasons I think this will be beneficial:
Page will still load even if the Mysql or PHP services are down. If the ajax fails I can handle it with an error message.
Bot traffic (and anything else that doesn't run JS) will cause very little load on my server since the ajax will never be sent.
Please let me know what your thoughts are.
My two cents: it is better to do the logic on the template side (JavaScript). If you have a high-traffic site you can offload some of the processing to each computer calling the site. Maybe fewer servers.
With JavaScript frameworks like AngularJS the template stuff is pretty simple and efficient. And the framework will do caching for you.
Yes, SEO can be an issue with certain sites. There are proxy tools you can put in place that will render the site and return static HTML to the bot. Plus I think some bots render JavaScript these days.
Lastly, I like to template on the front-end because I like the backend to be a generic data provider (RESTful API). This way I can build a generic backend that drives web / mobile and other platforms in a generic way. The UI logic can be its separate thing in javascript.
But it comes down to the design needs of your application. I build lots of Software as a service applications so a single page application works well for me.
I've worked with a similar design pattern in other projects. There are several ways to do this, and the task would involve managing multiple project or application modules. I assume you are working with a team of developers and not using either a PHP or JavaScript MVC framework.
PHP Template
For many reasons, I'm against using the “search and replace” method, especially using a server-side scripting language to parse HTML documents as a templating kit.
Why?
As you maintain business rules and the project becomes larger, you will find yourself reading through a long list of regular expressions, parsing HTML into a DOM, and/or complicated algorithms for searching nodes to replace with the correct text(s).
If you had a placeholder, such as {title}, that would help the script to have fewer search-and-replace expressions, but the design pattern could lead to messy sharing with multiple developers.
It is OK to parse one or two HTML files to manage the output, but not the entire template. The network response could be slower with multiple and repetitive trips to the server, and that's just for the template. There could be other scripts that are also making trips to the server for reasons unrelated to the template.
AJAX/JavaScript
Initially, AJAX with JavaScript might sound like a neat idea but I'm still not convinced.
Why?
You can't assume the web browser is JavaScript-enabled on every mobile or desktop device. You might need to structure the HTML template in a few ways to manage the output for non-JavaScript browsers. You might need to include <noscript> and/or <iframe> tags on every page. And managing an alternative template for non-JavaScript browsers can be tedious.
Every web browser interprets JavaScript differently. Most developers should know a few of the differences between IE, Firefox, Chrome, and Safari, to name a few. You might need to create multiple JavaScript files to detect and then load the JavaScript for that specific web browser. You update one feature, and you have to update the script for all web browsers.
JavaScript is visible in the page source. I wouldn't want to expose confidential JavaScript functions that might include credentials, leak sensitive data about web services, and/or SQL queries. The idea is to secure your page as much as possible.
I'm not saying both are impossible. You could still do either PHP or JavaScript for templating.
However, my “ideal” web structure would consist of a reliable MVC framework like Zend, Spring, or Magnolia. Those MVC frameworks include many useful features such as web services, data mapping, and templating kits. Granted, it's difficult for beginners to integrate an MVC framework, with its configuration requirements, into a project. But in the end, you could delegate tasks in configuration, MVC concepts, custom SQL queries, and test cases to developers. That's my two cents.
I think the most important aspects you forgot are:
SEO: What about search engine bots? They won't be able to index your content if it is set by JavaScript only.
Execution and network latency: When your service is working, the browser will wait until the page is loaded (let's say 800ms) before making the extra AJAX calls to get your values. This might add an extra 500ms (depending on network speed and geographic location...). If you had sent all the generated data from your server, you would have spent only ~1ms more preparing the complete response. You would have a lot of waiting on a blank page.
Caching: You could cache the generated pages in your web app. That way your load will be minimized as well. And if you still want to deliver content while your backend services (MySQL/PHP...) are down, you could even use Apache or Nginx caching.
But I guess it really depends on what you want to do.
For fast and simple pages, which seems to be your case, stick with backend enhancements.
For a dynamic/interactive app which can afford loading times, and doesn't care about SEO, you can delegate most things to the front-end. But then use an advanced framework like Angular, to handle templating, caching, etc...

How do you use React.js for SEO?

Articles on React.js like to point out, that React.js is great for SEO purposes. Unfortunately, I've never read, how you actually do it.
Do you simply implement _escaped_fragment_ as in https://developers.google.com/webmasters/ajax-crawling/docs/getting-started and let React render the page on the server, when the url contains _escaped_fragment_, or is there more to it?
Not having to rely on _escaped_fragment_ would be great, as probably not all potentially crawling sites (e.g. for sharing functionality) implement _escaped_fragment_.
I'm pretty sure anything you've seen promoting React as being good for SEO has to do with being able to render the requested page on the server, before sending it to the client. So it will be indexed just like any other static page, as far as search engines are concerned.
Server rendering is made possible via ReactDOMServer.renderToString. The visitor will receive the already-rendered markup, which the React application will detect once it has downloaded and run. Instead of replacing the content when ReactDOM.render is called, it will just add the event bindings. For the rest of the visit, the React application takes over and further pages are rendered on the client.
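A minimal sketch of that setup with Express (the App component, bundle path, and port are placeholders, not something the question provides):
var express = require('express');
var React = require('react');
var ReactDOMServer = require('react-dom/server');
var App = require('./App');   // your root component (hypothetical)

var app = express();

app.get('*', function (req, res) {
  // Render the component tree to an HTML string on the server so crawlers
  // (and users on the first request) receive real markup.
  var markup = ReactDOMServer.renderToString(React.createElement(App));
  res.send(
    '<!doctype html><html><body>' +
    '<div id="root">' + markup + '</div>' +
    // The client bundle detects the server-rendered markup and only attaches
    // event bindings instead of re-rendering from scratch.
    '<script src="/bundle.js"></script>' +
    '</body></html>'
  );
});

app.listen(3000);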
If you are interested in learning more about this, I suggest searching for "Universal JavaScript" or "Universal React" (formerly known as "isomorphic react"), as this is becoming the term for JavaScript applications that use a single code base to render on both the server and client.
As the other responder said, what you are looking for is an isomorphic approach. This allows the page to come from the server with the rendered content that will be parsed by search engines. As another commenter mentioned, this might make it seem like you are stuck using node.js as your server-side language. While it is true that having JavaScript run on the server is needed to make this work, you do not have to do everything in Node. For example, this article discusses how to achieve an isomorphic page using Scala and React:
Isomorphic Web Design with React and Scala
That article also outlines the UX and SEO benefits of this sort of isomorphic approach.
Two nice example implementations:
https://github.com/erikras/react-redux-universal-hot-example: Uses Redux, my favorite app state management framework
https://github.com/webpack/react-starter: Uses Flux, and has a very elaborate webpack setup.
Try visiting https://react-redux.herokuapp.com/ with javascript turned on and off, and watch the network in the browser dev tools to see the difference…
Going to have to disagree with a lot of the answers here since I managed to get my client-side React App working with googlebot with absolutely no SSR.
Have a look at the SO answer here. I only managed to get it working recently but I can confirm that there are no problems so far and googlebot can actually perform the API calls and index the returned content.
It is also possible via ReactDOMServer.renderToStaticMarkup:
Similar to renderToString, except this doesn't create extra DOM attributes such as data-react-id, that React uses internally. This is useful if you want to use React as a simple static page generator, as stripping away the extra attributes can save lots of bytes.
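For instance, a one-off static page could be generated like this (a sketch; the Page component and output file are made up):
var fs = require('fs');
var React = require('react');
var ReactDOMServer = require('react-dom/server');
var Page = require('./Page');   // hypothetical component

// No data-react-id (or similar) attributes are emitted, so the output is plain
// HTML suitable for writing straight to a file or an HTTP response.
var html = ReactDOMServer.renderToStaticMarkup(React.createElement(Page, { title: 'Hello' }));
fs.writeFileSync('out.html', '<!doctype html>' + html);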
There is nothing you need to do if you care about your site's rank on Google, because Google's crawler can handle JavaScript very well! You can check your site's SEO result by searching site:your-site-url.
If you also care about your site's rank on search engines such as Baidu, and your server side is implemented in PHP, maybe you need this: react-php-v8js.

Using node.js to serve content from a Backbone.js app to search crawlers for SEO

Either my google-fu has failed me or there really aren't too many people doing this yet. As you know, Backbone.js has an Achilles heel--it cannot serve the HTML it renders to page crawlers such as googlebot, because they do not run JavaScript (although given that it's Google, with their resources, the V8 engine, and the sobering fact that JavaScript applications are on the rise, I expect this to happen someday). I'm aware that Google has a hashbang workaround policy, but it's simply a bad idea. Plus, I'm using pushState. This is an extremely important issue for me and I would expect it to be for others as well. SEO is something that cannot be ignored, and thus this approach cannot be considered for many applications out there that require or depend on it.
Enter node.js. I'm only just starting to get into this craze but it seems possible to have the same Backbone.js app that exists on the client be on the server holding hands with node.js. node.js would then be able to serve html rendered from the Backbone.js app to page crawlers. It seems feasible but I'm looking for someone who is more experienced with node.js or even better, someone who has actually done this, to advise me on this.
What steps do I need to take to allow me to use node.js to serve my Backbone.js app to web crawlers? Also, my Backbone app consumes an API that is written in Rails which I think would make this less of a headache.
EDIT: I failed to mention that I already have a production app written in Backbone.js. I'm looking to apply this technique to that app.
First of all, let me add a disclaimer that I think this use of node.js is a bad idea. Second disclaimer: I've done similar hacks, but just for the purpose of automated testing, not crawlers.
With that out of the way, let's go. If you intend to run your client-side app on the server, you'll need to recreate the browser environment on your server:
Most obviously, you're missing the DOM (Document Object Model) - basically the AST on top of your parsed HTML document. The node.js solution for this is jsdom.
That however will not suffice. Your browser also exposes BOM (Browser Object Model) - access to browser features like, for example, history.pushState. This is where it gets tricky. There are two options: you can try to bend phantomjs or casperjs to run your app and then scrape the HTML off it. It's fragile since you're running a huge full WebKit browser with the UI parts sawed off.
The other option is Zombie - a lightweight re-implementation of browser features in JavaScript. According to its page it supports pushState, but my experience is that the browser emulation is far from complete - however, give it a try and see how far you get.
I'm going to leave it to you to decide whether pushing your rendering engine to the server side is a sound decision.
Because Node.js is built on V8 (Chrome's engine) it will run JavaScript, like Backbone.js. Creating your models and so forth would be done in exactly the same way.
The Node.js environment of course lacks a DOM. So this is the part you need to recreate. I believe the most popular module is:
https://github.com/tmpvar/jsdom
Once you have an accessible DOM api in Nodejs, you simply build its nodes as you would for a typical browser client (maybe using jQuery) and respond to server requests with rendered HTML (via $("myDOM").html() or similar).
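A rough sketch of that idea with the current jsdom API and Express (the shell file, selector, and rendering step are placeholders; a real setup would run the Backbone views against dom.window instead of the stand-in jQuery call shown here):
const fs = require('fs');
const express = require('express');
const { JSDOM } = require('jsdom');   // current API; older jsdom versions used jsdom.env

const app = express();

app.get('*', function (req, res) {
  // Build a DOM from the same shell page the client would receive.
  const dom = new JSDOM(fs.readFileSync('views/shell.html', 'utf8'));
  const $ = require('jquery')(dom.window);

  // Here a real app would run its Backbone router/views against dom.window;
  // this line is just a stand-in for that rendering step.
  $('#main').html('<h1>Rendered on the server for crawlers</h1>');

  res.send(dom.serialize());
});

app.listen(3000);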
I believe you can take a fallback strategy type approach. Consider what would happen with javascript turned off and a link clicked vs js on. Anything you do on your page that can be crawled should have some reasonable fallback procedure when javascript is turned off. Your links should always have the link to the server as the href, and the default action happening should be prevented with javascript.
I wouldn't say this is backbone's responsibility necessarily. I mean the only thing backbone can help you with here is modifying your URL when the page changes and for your models/collections to be both client and server side. The views and routers I believe would be strictly client side.
What you can do, though, is make your Jade pages and partials renderable from the client side or the server side, with or without content injected. In this way the same page can be rendered either way. That is, if you replace a big chunk of your page and change the URL, then the HTML you are grabbing can come from the same template as if someone had gone directly to that page.
When your server receives a request it should take you directly to that page, rather than go through the main entry point, load Backbone, and have it manipulate the page and set it up in the way the user intended with that URL.
I think you should be able to achieve this just by rearranging things in your app a bit. No real rewriting, just a good amount of moving things around. You may need to write a controller that will serve you HTML files with content injected or not injected. This will serve to give your Backbone app the HTML it needs to couple with the data from the models. Like I said, those same templates can be used when you directly hit those links through the routers defined in Express/node.js.
This is on my todo list of things to do with our app: have Node.js parse the Backbone routes (stored in memory when the app starts) and at the very least serve the main page's template as straight HTML—anything more would probably be too much overhead/processing for the back end when you consider thousands of users hitting your site.
I believe Backbone apps like Airbnb do it this way as well, but only for robots like the Google crawler. You also need this for things like Facebook likes, as Facebook sends out a crawler to read your og: tags.
A working solution is to use Backbone everywhere:
https://github.com/Morriz/backbone-everywhere but it forces you to use Node as your backend.
Another alternative is to use the same templates on the server and front-end.
Front-end loads Mustache templates using require.js text plugin and the server also renders the page using the same Mustache templates.
Another addition is to also render bootstrapped module data in a script tag as JSON, to be used immediately by Backbone to populate models and collections.
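A rough sketch of that combination, assuming an Express server, a shared Mustache template file, and a made-up products collection:
var fs = require('fs');
var express = require('express');
var Mustache = require('mustache');

var app = express();
// The same template file the front-end loads with the require.js text plugin.
var template = fs.readFileSync('templates/products.mustache', 'utf8');

app.get('/products', function (req, res) {
  var products = [{ name: 'Widget' }, { name: 'Gadget' }];   // placeholder data
  var body = Mustache.render(template, { products: products });

  res.send(
    '<!doctype html><html><body>' +
    '<div id="main">' + body + '</div>' +
    // Bootstrap the same data as JSON so Backbone can populate its collection
    // immediately, without a second fetch on first load.
    '<script>window.bootstrap = { products: ' + JSON.stringify(products) + ' };</script>' +
    '<script data-main="/javascripts/application.js" src="/javascripts/lib/require.js"></script>' +
    '</body></html>'
  );
});

app.listen(3000);
On the client, the collection can then be seeded from window.bootstrap.products instead of doing an extra fetch on first load.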
Basically you need to decide what it is that you're serving: is it a true app (i.e. something that could stand in as a replacement for a dedicated desktop application), or is it a presentation of content (i.e. classical "web page")? If you're concerned about SEO, it's likely that it's actually the latter ("content site") and in that case the "single-page app" model isn't appropriate; you really want the "progressively enhanced website" model instead (look up such phrases as "unobtrusive JavaScript", "progressive enhancement" and "adaptive Web design").
To amplify a little, "server sends only serialized data and client does all rendering" is only appropriate in the "true app" scenario. For the "content site" scenario, the appropriate model is "server does main rendering, client makes it look better and does some small-scale rendering to avoid disruptive page transitions when possible".
And, by the way, the objection that progressive enhancement means "making sure that a user who can see doesn't get anything better than a blind user who uses text-to-speech" is an expression of political resentment, not reality. Progressively enhanced sites can be as fancy as you want them to be from the perspective of a user with a high-end rendering system.

Advantages / Disadvantages to websites generated with Javascript

Two good examples would be Google and Facebook.
I have lately pondered the motivation for this approach. My best guess would be it almost completely separates the logic between your back-end language and the markup. Building up an array to send over in JSON format seems like a tidy way to maintain code, but what other elements am I missing here?
What are the advantages / disadvantages to this approach, and why are such large scale companies doing it?
The main disadvantage is that you will have some pain with content indexing on your site.
For Google you can somewhat solve the problem by using the AJAX crawling scheme. Google supports a crawling scheme that allows dynamically generated content (content generated without a page reload) to be indexed.
To do this, your virtual links must be addresses like so: http://yoursite.com/#!/register/. In this case Google requests http://yoursite.com/?_escaped_fragment_=/register/ to index the content of that address.
When clicking on virtual link there is no page reload. You can provide this by using onclick:
<a href='http://yoursite.com/#!/register/' onclick='showRegister()'>Register</a>
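On the server side the crawler's request arrives with an _escaped_fragment_ query parameter, which you can detect and answer with pre-rendered HTML. A sketch in Express (renderSnapshot is a hypothetical function that produces the static HTML):
var express = require('express');
var app = express();

app.use(function (req, res, next) {
  var fragment = req.query._escaped_fragment_;
  if (fragment === undefined) {
    return next();   // normal visitors get the JavaScript application
  }
  // A request for http://yoursite.com/#!/register/ arrives here as
  // http://yoursite.com/?_escaped_fragment_=/register/
  renderSnapshot(fragment, function (err, html) {
    if (err) return next(err);
    res.send(html);
  });
});

app.listen(3000);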
The advantage is that the content of a page changes without reloading the page. In my own practice I do not use JavaScript generation to do this, because I build my interface in fixed positions. When the page reloads the user does not notice anything, because the elements of the interface appear in the expected places.
So, my opinion is that using dynamic page generation is a big pain. I think Google did it not to separate markup and backend (that's not a real problem; you can use a complex backend-frontend structure to do that) but to take advantage of a convenient and nice presentation for users.
Advantages
View state is kept on the client (removing load from the server)
Partial refreshes of pages
Server does not need to know about HTML which leads to a Service Oriented Architecture
Disadvantages
Bookmarking (state in the URL) is harder to implement
Making it searchable is still a work in progress
Need a separate scheme to support non-JS users
I don't 100% understand your question, but I'll try my best here...
Google and Facebook both extensively use JavaScript across all of their websites and products. Every major website on the web uses it.
JavaScript is the technology used to modify the behavior of websites.
HTML => defines structure and elements
CSS => styling the elements
Scripting languages => dynamically generating elements and filling them with data
JavaScript => modifies all of the above by interacting with the DOM, responding to events, and styling elements on the fly
This is the 'approach' as you call it to every website on the web today. There are no alternatives to JavaScript/HTML/CSS. You can change the database or scripting language used, but JavaScript/HTML/CSS is a constant.
Consider the example of a simple form validation.
The client sends a request to the server; the server executes the server-side code containing the validation logic and, in a response, sends the result back to the client.
If the client has the capability to execute/process the form itself (i.e. perform the validation on the client side), the client won't need to send a request to the server and wait for the server to respond to that request.
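For instance, a minimal sketch of that client-side check (the form and field names are made up):
// Validate the form on the client so the obvious failure cases never hit the server.
document.getElementById('signup').addEventListener('submit', function (event) {
  var email = document.getElementById('email').value;
  if (!/^[^@\s]+@[^@\s]+$/.test(email)) {
    event.preventDefault();   // stop the request; no server round trip needed
    alert('Please enter a valid email address.');
  }
  // When the check passes the form is submitted normally, and the server
  // still validates it authoritatively.
});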
I suggest you take a look at the Google Page Speed best practices http://code.google.com/intl/it-IT/speed/page-speed/ to see which factors make a good page. Generating a page with JavaScript seems cool because of the separation of UI and logic, but it is totally inefficient in practice.

Render partial view on the server or send json data and render template on the client

I was wondering what would be a good approach (or recommended approach) to render partial views in a web application.
I have a requirement where I need to load data into an already rendered page using AJAX, sort of like a "Load more..." link at the end of the page which grabs more information from the server and renders it to the bottom of the page.
The two options I am playing with at the moment for the AJAX response are
Return a JSON representation of the data, and use a client-side template library (e.g. jQuery templates) or just plain JavaScript to turn the JSON into HTML and append it to the bottom of the page
Render the partial view on the server (in my case using grails' render template:'tmplt_name') and send it across the wire and just append the result to the bottom of the page
Are there other ways to do this? If not, given the above options which would be better in terms of maintenance, performance and testability? One thing I am certain about is that the JSON route will (in most cases) use up less bandwidth than sending html across the wire.
This is actually a really interesting question, because it exposes some interesting design decisions.
I prefer rendering partial templates because it gives my applications the ability to change with time. If I need to change from a <table> to a <div> with a chart, it's easy to encapsulate that in a template. With that in mind, I view almost every page as an aggregate of many small templates, which could change. Grails 2.0 default scaffolding has moved towards this type of approach, and it's a good idea.
The question as to whether they should be client-side templates or server-side is the crux of the issue.
Server-side templates keep your markup cleaner on initial page load. With client-side templates, even if you use something like Mustache or ICanHazJS, you kinda sorta need to have an empty element in the page (along with your template), style it appropriately, and work with it in JavaScript in order to update your info.
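A rough sketch of that client-side variant for the "Load more..." case (the /items endpoint, template, and markup are made up; it uses jQuery and Mustache):
// Markup assumed to be in the page already:
// <ul id="items"></ul>
// <script id="item-tmpl" type="x-tmpl-mustache"><li>{{name}}</li></script>
// <a href="#" id="load-more">Load more...</a>

$('#load-more').on('click', function (event) {
  event.preventDefault();
  // Ask the server for the next batch as JSON and render it client side.
  $.getJSON('/items', { offset: $('#items li').length }, function (items) {
    var template = $('#item-tmpl').html();
    items.forEach(function (item) {
      $('#items').append(Mustache.render(template, item));
    });
  });
});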
Drawbacks
"chatty" applications
larger envelopes across the wire (responses include HTML, which may be considered static)
slower UI response time
Benefits
Good for people with lots of server side language experience
Maybe easier to manipulate or modify in a server-side environment (e.g. return a page embedded with multiple templates, which are programmatically loaded and included)
Keeps most of the application stuff "in one place"
Client-side templates can really reduce server load, however. They make an application "less chatty", in that you could potentially minimize the number of calls back to the server by sending larger collections of JSON back (in the same number of bytes, or fewer, than would be taken up by HTML in a server-side template scheme). They also make the UI experience really fast for users, as clicking an "update" link doesn't have to do an AJAX round trip. Some have said:
Anthony Eden (@aeden), 10 Dec:
the future of web apps: requests are handled by functions, logic is always asynchronous, and HTML is never generated on the server.
Drawbacks
Not great for SEO (un-semantic extraneous UI elements on initial page load)
Requires some Javascript foo (to manipulate elements)
Benefits
Responsive
Smaller envelopes
Trends seem to be moving towards client side templates, especially with the power exposed by HTML5 additions (like <canvas>)...but if leveraging them requires you to rely on technologies you're not extremely familiar with, and you feel more comfortable with Grails partials, it may be worthwhile to start with those and investigate refactoring towards client side templates based on performance and other concerns later.
In my view the second option, rendering the partial, is better. If you get JSON data, you have to manipulate it (set styles, create elements, set values, and similar things as needed) right after you get the AJAX response in JavaScript. But if you render a partial, you can design your view in that partial beforehand, keep it ready, and just import it with the AJAX call, so there is no responsibility for handling the response data.
I would say it depends on how much data you're sending across the wire, as you put it. If you're implementing a "load more" feature, then it seems like you wouldn't want to take 5 seconds to load something. Google Images is a good example of how quick that feature should be. If you know you'll never have that much data, then rendering the partial on the server might be cleaner, but if your requirements change then it's a hassle to go back to the first approach. So in short, I'd say the first approach allows for more disaster control as far as how long a "load more" takes to load a large amount of data. I'd say it's also generally good practice to take load off the server when you can, even if it's a little inconvenient on the client side for the developer.
