Articles on React.js like to point out that React.js is great for SEO purposes. Unfortunately, I've never read how you actually do it.
Do you simply implement _escaped_fragment_ as in https://developers.google.com/webmasters/ajax-crawling/docs/getting-started and let React render the page on the server when the URL contains _escaped_fragment_, or is there more to it?
Not having to rely on _escaped_fragment_ would be great, as probably not every site that might crawl the page (e.g. for sharing features) implements _escaped_fragment_.
I'm pretty sure anything you've seen promoting React as being good for SEO has to do with being able to render the requested page on the server, before sending it to the client. So it will be indexed just like any other static page, as far as search engines are concerned.
Server rendering is made possible via ReactDOMServer.renderToString. The visitor receives the already rendered markup, which the React application detects once it has downloaded and run. Instead of replacing the content when ReactDOM.render is called, React just adds the event bindings. For the rest of the visit, the React application takes over and further pages are rendered on the client.
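A minimal sketch of the server half of that flow. To keep the snippet free of dependencies, it assumes ReactDOMServer.renderToString is passed in as an argument; createPageRenderer, the "root" mount id and /bundle.js are hypothetical names used for illustration, not part of React.

```javascript
// Hypothetical helper: wraps server-rendered markup in a full HTML document.
// `renderToString` is expected to be ReactDOMServer.renderToString; it is
// injected here so the sketch has no hard dependency on React.
function createPageRenderer(renderToString) {
  return function renderPage(element, title) {
    // Markup that crawlers and first-time visitors receive.
    const appHtml = renderToString(element);
    return [
      '<!doctype html>',
      '<html>',
      '<head><title>' + escapeHtml(title) + '</title></head>',
      '<body>',
      // The client bundle later calls ReactDOM.render on this same node.
      '<div id="root">' + appHtml + '</div>',
      '<script src="/bundle.js"></script>', // assumed bundle path
      '</body>',
      '</html>'
    ].join('\n');
  };
}

// Escapes text that ends up inside the <title> element.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}
```

On a real server you would pass in require('react-dom/server').renderToString and send the returned string as the response body. The client bundle then calls ReactDOM.render with the same element and document.getElementById('root').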
If you are interested in learning more about this, I suggest searching for "Universal JavaScript" or "Universal React" (formerly known as "isomorphic react"), as this is becoming the term for JavaScript applications that use a single code base to render on both the server and client.
As the other responder said, what you are looking for is an isomorphic approach. This allows the page to come from the server with the rendered content that will be parsed by search engines. As another commenter mentioned, this might make it seem like you are stuck using Node.js as your server-side language. While it is true that having JavaScript run on the server is needed to make this work, you do not have to do everything in Node. For example, this article discusses how to achieve an isomorphic page using Scala and React:
Isomorphic Web Design with React and Scala
That article also outlines the UX and SEO benefits of this sort of isomorphic approach.
Two nice example implementations:
https://github.com/erikras/react-redux-universal-hot-example: Uses Redux, my favorite app state management framework
https://github.com/webpack/react-starter: Uses Flux, and has a very elaborate webpack setup.
Try visiting https://react-redux.herokuapp.com/ with javascript turned on and off, and watch the network in the browser dev tools to see the difference…
Going to have to disagree with a lot of the answers here since I managed to get my client-side React App working with googlebot with absolutely no SSR.
Have a look at the SO answer here. I only managed to get it working recently but I can confirm that there are no problems so far and googlebot can actually perform the API calls and index the returned content.
It is also possible via ReactDOMServer.renderToStaticMarkup:
Similar to renderToString, except this doesn't create extra DOM
attributes such as data-react-id, that React uses internally. This is
useful if you want to use React as a simple static page generator, as
stripping away the extra attributes can save lots of bytes.
There is nothing you need to do if you only care about your site's rank on Google, because Google's crawler can handle JavaScript very well! You can check how your site is indexed by searching for site:your-site-url.
If you also care about your site's rank on other engines such as Baidu, and your server side is implemented in PHP, you may need this: react-php-v8js.
I have a back-end web application that provides me with custom API endpoints (Java, Spring). I like to keep everything separate: one API application that serves everything else remotely. My question is: what is the best practice for starting a new front-end project that connects to my API?
Requirements:
The Front-end project should be on a different server
The Front-end project should support routing, meaning I will have full control over the /paths, so no .extensions at the end.
SEO is very important in this specific case.
My preference is to go with React.js but I have doubts regarding SEO because the project I want to migrate from is WordPress (up and running with a good SEO performance).
I'd like to find a simple solution with pure HTML, CSS and some kind of JavaScript.
Thank you.
React isn't actually bad for SEO, so long as you take the proper steps to keep the page load time down. If the site that you're migrating is massive, make sure you're lazy loading.
If you have doubts that Google or other search engines will render the JS, then I suggest going with Next.js like Rakesh K mentioned.
There's also nothing wrong with recreating the site with a templating language like Handlebars, then rendering it on an Express server, or whatever suits you. Just including this option in case you don't know React, and don't want to have to learn it.
I have a large, globalised web site (not a web app), with 50k+ pages of content which is rendered on a cluster of servers using quite straightforward NodeJS + Nunjucks to generate HTML. For 90% of the site, this is perfectly acceptable, and necessary to achieve SEO visibility, particularly in non-Google search engines which don't index JS well (Yandex, Baidu, etc)
The site is a bit clunky as complexity has increased over time, and I'd like to re-architect some of the functional components that are built mostly using progressively enhanced jQuery as they are quite clunky. I've been looking at React for this with the Redux implementation of the Flux pattern.
Now my question is simply around the following: nearly 100% of the tutorials assume I'm building some sort of SPA, which I'm not. I just want to build a set of containerised reusable components that I can plug in to replace the jQuery components. Oh, and they have to be WCAG AA/508 accessible as well.
Does React play well with being retrofitted into websites and are there any specific considerations around SEO, bootstrapping, accessibility? Examples of implementations or tutorials would be appreciated.
You can mount a React component on any DOM node on your page, which makes it easy to insert components into statically generated content.
Most search engines, like Google, wait for JS files to load before indexing the page, so a page with a React component will be indexed perfectly fine. However, if you want to be 100% sure that your page is rendered correctly for all crawling bots, you have to take a look at React server rendering. If you already use Node.js for the back end, it should not be a big problem.
I have never run into this kind of problem myself, but my best guess would be to use ReactDOMServer.renderToString to render the component on the server and then replace a node in your static HTML layout. The implementation would depend on the template language you use. You could use something like Handlebars to dynamically create helpers from React components, so that in your static HTML page you would be able to use them as {{my-component}}. But this is only speculation on my part; there may be a more elegant solution.
Here is the article that could help.
You'll be happy to know that this is all possible through something called isomorphic javascript. Basically you'll just use React and jsx to render HTML on the server which is then sent to the browser as a fully built web page. This does not assume your app is an SPA, rather that you'll have multiple endpoints for rendering different pages, much like you already have probably.
The benefit here is that you can use the React/Redux architecture but still allow your site to be indexable by crawlers, as requests to your app will yield static pages, not a single page with lots of JS to make it work. You're also free to refactor gradually by converting your Nunjucks-rendered endpoints to React one at a time, instead of making a big jump to SPA land.
Here's a good tutorial I found on making isomorphic React apps with node:
https://strongloop.com/strongblog/node-js-react-isomorphic-javascript-why-it-matters/
EDIT: I may have misread your actual desire, which is to inject React components into your existing web pages. This is also possible: you'll probably want to use ReactDOMServer to render your components to markup on the server, and then inject that markup string into your Nunjucks templates.
As you can see in the React manual (ReactDOMServer):
If you call ReactDOM.render() on a node that already has this
server-rendered markup, React will preserve it and only attach event
handlers, allowing you to have a very performant first-load
experience.
So does it mean that if I use a static index.html in which I just include my React app JS file, I don't have to use server-side rendering?
Btw, which React app architecture is better for SEO?
Thanks for your answers!
In theory, it's true that you can use static index.html. React will try to render the page on the client side and update your html. This has become much easier to do with React 15 as you no longer need to maintain data-reactid attributes.
Nonetheless, I'd recommend using SSR (server side rendering) because it makes life easier. Granted, it takes effort to set up but it's beneficial. You also get to make use of server side routing, critical path css, and more.
If you want SEO, universal apps are the way to go. Two excellent architectures are:
React redux universal hot example
React starter kit
Good luck!
So does it mean that if I use static index.html in which I just
include my react app js file I don't have to use server-side
rendering?
Of course. You can certainly use React purely on the client without any need for server rendering. However server side rendering can be beneficial for graceful degradation. It also helps from a usability perspective as your user won't have to wait for javascript to be downloaded and executed before any content can be shown.
Btw, which React app architecture is better for SEO?
Search engines have matured significantly in their ability to crawl dynamic pages. However, support for JavaScript-generated content is still a work in progress in most engines. As the Google Webmasters blog explains:
Sometimes the JavaScript may be too complex or arcane for us to execute, in which case we can’t render the page fully and accurately.
Some JavaScript removes content from the page rather than adding, which prevents us from indexing the content.
So from SEO perspective it is still better if you opt for server side rendering.
I did my homework and read through this mini series about pushstate:
http://lostechies.com/derickbailey/2011/09/26/seo-and-accessibility-with-html5-pushstate-part-2-progressive-enhancement-with-backbone-js/
From what I understand, the hard part of implementing pushState is making sure that my server side is going to serve the actual pages for the corresponding URLs.
I feel like this is going to be a HUGE task, previously I was just sending a simple jade page as simple as:
body
  header
  section
    div#main
  footer.site-footer
    div.footer-icons.footer-element
    div.footer-element
      span.footer-link Contact Us
      span.footer-link Terms of Service
  script(src='/javascripts/lib/require.js', data-main='/javascripts/application.js')
and I was doing all the rendering with my Marionette Layouts and Composite Views, and to be honest it was a bit complicated.
So from what I understand I need to replicate all that complicated nesting/rendering using jade on the server side for pushState to work properly?
I used underscore templates in the client-side, what is an easy way to re-use them on the server side?
It depends on what you want to do...
To "just" use pushState, the only requirement is that your server returns a valid page for each URL that can be reached by your app. However, the content returned by the server does NOT have to match what will get rendered client side. In other words, you could use a "catch all" route on the server side that always returns the page you have above, and then let Backbone/Marionette trigger its route to handle the rendering and display.
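A rough sketch of such a "catch all" route, using only Node's built-in http module. The shell markup, the asset check and the 404 for unknown assets are assumptions made for illustration; a real setup would serve static files properly and use whatever shell page the app already has.

```javascript
const http = require('http');

// The single page served for every application URL; the client-side
// router then reads the current path and renders the matching view.
// (Assumed markup, loosely modelled on the template in the question.)
const SHELL_HTML = [
  '<!doctype html>',
  '<html><body>',
  '<div id="main"></div>',
  '<script src="/javascripts/application.js"></script>',
  '</body></html>'
].join('\n');

// Simplification: a path ending in ".something" is treated as an asset.
function isAssetRequest(pathname) {
  return /\.[a-z0-9]+$/i.test(pathname);
}

function handleRequest(req, res) {
  const pathname = new URL(req.url, 'http://localhost').pathname;
  if (isAssetRequest(pathname)) {
    // A real server would hand this to a static file handler instead.
    res.writeHead(404, { 'Content-Type': 'text/plain' });
    res.end('Not found');
    return;
  }
  // Catch-all: every other URL gets the same shell.
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  res.end(SHELL_HTML);
}

// Call server.listen(3000) to start serving.
const server = http.createServer(handleRequest);
```

With this in place, a direct visit to /posts/42 returns the shell, and the router on the client decides what to render for that path.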
That said, if you want to use pushState for SEO, you likely want to have the static HTML sent by the server on the first call, then have the Marionette app start to enhance the interactivity. In this case, it is much more complex and you might want to experiment with using options to trigger the proper behavior (e.g. using attachView when enhancing existing HTML, showing views normally after that initial case).
Push state can work properly WITHOUT your server actually serving your application in the way that is suggested.
Push state is merely an alternative to hashbang URLs, and it is supported in modern browsers. Check out the history docs here; you will see there is no mention of having your server serve your application statically at the URLs for your application (but bear in mind it is opt-in).
What the article you reference is saying is that, for good SEO, you should do this. That's because you cannot guarantee that a search engine crawling your site will execute your JavaScript, pick up your routes, etc. So serving the site statically is simply a way to give the search engine your content without it having to execute any JavaScript.
Like you say, by doing this you are essentially building two sites in parallel, and it does literally double the amount of work you need to do. This may be ok if you're building a relatively simple site filled with static content, but if you are creating a complicated application, then it is probably too much in most situations.
Although I would add, if you are building an application, then SEO doesn't really matter, so it's a moot point.
Either my google-fu has failed me or there really aren't too many people doing this yet. As you know, Backbone.js has an Achilles heel: it cannot serve the HTML it renders to page crawlers such as Googlebot, because they do not run JavaScript (although given that it's Google, with their resources, the V8 engine, and the sobering fact that JavaScript applications are on the rise, I expect this to happen someday). I'm aware that Google has a hashbang workaround policy, but it's simply a bad idea. Plus, I'm using pushState. This is an extremely important issue for me and I would expect it to be for others as well. SEO is something that cannot be ignored, and thus Backbone.js cannot be considered for many applications out there that require or depend on it.
Enter node.js. I'm only just starting to get into this craze but it seems possible to have the same Backbone.js app that exists on the client be on the server holding hands with node.js. node.js would then be able to serve html rendered from the Backbone.js app to page crawlers. It seems feasible but I'm looking for someone who is more experienced with node.js or even better, someone who has actually done this, to advise me on this.
What steps do I need to take to allow me to use node.js to serve my Backbone.js app to web crawlers? Also, my Backbone app consumes an API that is written in Rails which I think would make this less of a headache.
EDIT: I failed to mention that I already have a production app written in Backbone.js. I'm looking to apply this technique to that app.
First of all, let me add a disclaimer that I think this use of node.js is a bad idea. Second disclaimer: I've done similar hacks, but just for the purpose of automated testing, not crawlers.
With that out of the way, let's go. If you intend to run your client-side app on server, you'll need to recreate the browser environment on your server:
Most obviously, you're missing the DOM (Document Object Model) - basically the AST on top of your parsed HTML document. The node.js solution for this is jsdom.
That however will not suffice. Your browser also exposes BOM (Browser Object Model) - access to browser features like, for example, history.pushState. This is where it gets tricky. There are two options: you can try to bend phantomjs or casperjs to run your app and then scrape the HTML off it. It's fragile since you're running a huge full WebKit browser with the UI parts sawed off.
The other option is Zombie, which is a lightweight re-implementation of browser features in JavaScript. According to its page it supports pushState, but my experience is that the browser emulation is far from complete. Still, give it a try and see how far you get.
I'm going to leave it to you to decide whether pushing your rendering engine to the server side is a sound decision.
Because Nodejs is built on V8 (Chrome's engine) it will run javascript, like Backbone.js. Creating your models and so forth would be done in exactly the same way.
The Nodejs environment of course lacks a DOM. So this is the part you need to recreate. I believe the most popular module is:
https://github.com/tmpvar/jsdom
Once you have an accessible DOM api in Nodejs, you simply build its nodes as you would for a typical browser client (maybe using jQuery) and respond to server requests with rendered HTML (via $("myDOM").html() or similar).
I believe you can take a fallback strategy type approach. Consider what would happen with javascript turned off and a link clicked vs js on. Anything you do on your page that can be crawled should have some reasonable fallback procedure when javascript is turned off. Your links should always have the link to the server as the href, and the default action happening should be prevented with javascript.
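A small sketch of that pattern: links keep a real server URL in their href, and JavaScript only intercepts the click when it is running. createLinkHandler is a hypothetical helper, and navigate stands for whatever your client-side router exposes (with Backbone, something like a function that calls router.navigate(path, { trigger: true })).

```javascript
// Hypothetical helper: builds a click handler for ordinary
// <a href="/some/path"> links. `navigate` is supplied by the caller.
function createLinkHandler(navigate) {
  return function onLinkClick(event) {
    // Let the browser handle modified clicks (new tab, new window, etc.).
    if (event.metaKey || event.ctrlKey || event.shiftKey || event.button > 0) {
      return false;
    }
    const href = event.currentTarget.getAttribute('href');
    // Only same-site, root-relative links are handled on the client.
    if (!href || href.charAt(0) !== '/' || href.charAt(1) === '/') {
      return false;
    }
    event.preventDefault(); // stop the full page load
    navigate(href);         // let the client-side router render the route
    return true;
  };
}
```

In the browser you would attach the returned handler to each link with addEventListener('click', handler). With JavaScript turned off nothing is attached, so the same href simply loads the page from the server, which is also what a crawler sees.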
I wouldn't say this is backbone's responsibility necessarily. I mean the only thing backbone can help you with here is modifying your URL when the page changes and for your models/collections to be both client and server side. The views and routers I believe would be strictly client side.
What you can do, though, is make your Jade pages and partials renderable from either the client side or the server side, with or without content injected. That way the same page can be rendered either way: if you replace a big chunk of your page and change the URL, the HTML you grab can come from the same template as if someone had gone directly to that page.
When your server receives a request, it should take you directly to that page, rather than going through the main entry point, loading Backbone and having it manipulate the page into the state the user intended with the URL.
I think you should be able to achieve this just by rearranging things in your app a bit. No real rewriting, just a good amount of moving things around. You may need to write a controller that will serve your HTML files with content injected or not injected. This gives your Backbone app the HTML it needs to couple with the data from the models. Like I said, those same templates can be used when you hit those links directly through the routes defined in Express/Node.js.
This is on my to-do list for our app: have Node.js parse the Backbone routes (stored in memory when the app starts) and at the very least serve the main page's template as straight HTML. Anything more would probably be too much overhead/processing for the back end when you consider thousands of users hitting your site.
I believe Backbone apps like Airbnb do it this way as well, but only for robots like the Google crawler. You also need this for things like Facebook likes, as Facebook sends out a crawler to read your og: tags.
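Matching request paths against the Backbone routes on the server, as described above, could look roughly like this. It is a simplified sketch, not Backbone's own implementation: it handles only ":param" and "*splat" parts, and routeToRegExp, matchRoute and the shape of the routes table (route string to template name) are assumptions made for illustration.

```javascript
// Converts a Backbone-style route string into a regular expression.
// Supports ":param" (one path segment) and "*splat" (rest of the path).
// Optional "(...)" parts are not supported in this sketch.
function routeToRegExp(route) {
  const pattern = route
    .replace(/[\-{}\[\]+?.,\\\^$|#\s]/g, '\\$&') // escape regex metacharacters
    .replace(/:\w+/g, '([^/]+)')                   // named parameter
    .replace(/\*\w+/g, '(.*)');                    // splat
  return new RegExp('^' + pattern + '$');
}

// `routes` maps route strings to template names (assumed shape),
// e.g. { 'posts/:id': 'post' }. Returns the first match or null.
function matchRoute(routes, pathname) {
  // Backbone route strings have no leading slash, so strip slashes first.
  const fragment = pathname.replace(/^\/+|\/+$/g, '');
  for (const route of Object.keys(routes)) {
    const match = routeToRegExp(route).exec(fragment);
    if (match) {
      return { template: routes[route], params: match.slice(1) };
    }
  }
  return null;
}
```

The server can then render the named template as plain HTML for the matched path and fall back to the normal entry page when matchRoute returns null.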
A working solution is to use Backbone everywhere (https://github.com/Morriz/backbone-everywhere), but it forces you to use Node as your backend.
Another alternative is to use the same templates on the server and front-end.
Front-end loads Mustache templates using require.js text plugin and the server also renders the page using the same Mustache templates.
Another addition is to render the bootstrapped module data as JSON inside a script tag, to be used immediately by Backbone to populate models and collections.
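A minimal sketch of rendering that bootstrapped data on the server. The function names and the global variable name are hypothetical; the one thing worth copying is the escaping, since raw JSON placed inside a script tag can be broken by a "</script>" in the data.

```javascript
// Serialises data for embedding inside an inline <script> tag.
// JSON.stringify alone is not enough: a "</script>" inside a string value
// would end the tag early, so "<" is emitted as the escape \u003c.
// U+2028/U+2029 are escaped too, since older JavaScript engines reject
// them inside string literals.
function serializeForScript(data) {
  return JSON.stringify(data)
    .replace(/</g, '\\u003c')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

// `globalName` is an assumed convention, e.g. '__BOOTSTRAP__'.
function renderBootstrapScript(globalName, data) {
  return '<script>window.' + globalName + ' = ' +
    serializeForScript(data) + ';</script>';
}
```

The server writes the returned string into the page next to the rendered template. On the client, the app can then pass window.__BOOTSTRAP__.posts (or whatever the data contains) to a collection's reset method instead of making an extra request on load.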
Basically you need to decide what it is that you're serving: is it a true app (i.e. something that could stand in as a replacement for a dedicated desktop application), or is it a presentation of content (i.e. classical "web page")? If you're concerned about SEO, it's likely that it's actually the latter ("content site") and in that case the "single-page app" model isn't appropriate; you really want the "progressively enhanced website" model instead (look up such phrases as "unobtrusive JavaScript", "progressive enhancement" and "adaptive Web design").
To amplify a little, "server sends only serialized data and client does all rendering" is only appropriate in the "true app" scenario. For the "content site" scenario, the appropriate model is "server does main rendering, client makes it look better and does some small-scale rendering to avoid disruptive page transitions when possible".
And, by the way, the objection that progressive enhancement means "making sure that a user who can see doesn't get anything better than a blind user who uses text-to-speech" is an expression of political resentment, not reality. Progressively enhanced sites can be as fancy as you want them to be from the perspective of a user with a high-end rendering system.