Angular.js and SEO

I'd like to create a site with Angular (I'm new), but I also want different "views" to be cacheable by search engines and to have their own URL routes. How would I achieve this with Angular, or is it best not to use it?

Enable pushState in Angular with $locationProvider.html5Mode(true); so that you have real URLs and make sure that, when the URL is requested by the client, you deliver the complete page for that URL from the server (and not a set of empty templates that you populate with JS).
When a link is followed, you'll go through an Angular view and update the existing DOM (while changing the URL with pushState) but the initial load should be a complete page.
This does mean duplicating effort (you need client and server side versions of the code for building each page). Isomorphic JS is popular for dealing with that issue.
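For reference, the client half of that setup is mostly one config switch. A minimal sketch, assuming ngRoute; the route path and template name are placeholders, not something from the question:

    // A minimal sketch: real URLs (/books/42) instead of hashbang
    // URLs (#/books/42). Route path and template name are placeholders.
    angular.module('app', ['ngRoute'])
      .config(function ($locationProvider, $routeProvider) {
        $locationProvider.html5Mode(true); // use pushState-based URLs

        $routeProvider.when('/books/:bookId', {
          templateUrl: 'book.html',
          controller: 'BookCtrl'
        });
      });

Note that HTML5 mode also expects a <base href="/"> tag in the page head, and, as described above, the server must answer each of these URLs with the fully rendered page rather than a 404.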

If you want to expose Angular views to search engines and other bots, I suggest using an open source framework that we developed at Say Media. It uses node.js to render the pages on the server when it detects a bot vs a real user. You can find it here:
https://github.com/saymedia/angularjs-server
I would suggest not using different routes, however, as most search engines will penalize you for having duplicate content on multiple URLs. And while you might think they would just hit the bot version of your site, they are getting more sophisticated about crawling single-page-app-like sites. I would be cautious about duplicate routes for the same content.
Good Luck!

Related

Dynamic URL routes for Static site in Vue

I have a static site; there is no way to add rewrites via .htaccess or similar, which is how I would normally approach this functionality. We're running the site with Vue, on top of static .html templates, e.g.
    /example/index.html
So I can visit www.mywebsite.com/example/ and it'll load the page and run Vue. When I want a subpage based on this layout, I currently have to create
    /example/subpage/index.html
Again this works great at www.mywebsite.com/example/subpage/, but what I want is to pull data in via an API feed and have dynamic URLs like
    /example/subpage/any-page-name-here
The closest I've found is to use a #, so
    /example/subpage#any-page-name-here
Which allows Vue to pick up the data after the # and query the API with that.
Any help would be greatly appreciated; there's no workaround for the limitations of the hosting, so I need a Vue/JS/HTML-only solution.
Thanks!
As you cannot change the web server configuration, the only possibilities are the hash option or the query string, e.g.
example.com/site/?dynamic-data
The reason is the web server decides what to do with the request in the first instance, and without any configuration it will simply load a page if it exists or show a 404. This happens before your Vue app is invoked.
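For example, one way to structure the hash option with vue-router (in its default hash mode) looks roughly like this; the route path and the API URL are placeholders, not from the question:

    // A minimal sketch using vue-router's default hash mode, so the part
    // after '#' never reaches the static web server.
    const router = new VueRouter({
      routes: [
        {
          // Matches e.g. /example/subpage/#/any-page-name-here
          path: '/:slug',
          component: {
            template: '<div>{{ content }}</div>',
            data() {
              return { content: 'Loading...' };
            },
            created() {
              // Query the API with the dynamic part of the URL.
              // '/api/pages/' is a placeholder endpoint.
              fetch('/api/pages/' + this.$route.params.slug)
                .then(r => r.json())
                .then(data => { this.content = data.content; });
            }
          }
        }
      ]
    });

    new Vue({ router }).$mount('#app');

Everything after the # stays on the client, so the static host only ever serves the .html file it already knows about.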

How to simply add pages without bogging down routes? (express.js)

I am working on a site with Express.js, and like it very much. I have it operating stably, but would like users to be able to add pages to the site (via a form, where it will have set fields, or via uploading a jade file). Preferably I would also like a moderation queue. Failing this, how can I add pages without having to add an entry to index.js for the route every time? If I add lots of pages, won't this make it slow?
Sorry for the wall of questions, and thanks in advance for any help!
EDIT: It's been requested that I narrow the query, so here goes:
I would like to add a web interface to Express.js that allows users to fill in a form and add a page to the website under a certain path. I would like a sort of "moderation queue" where I approve pages before they go live. I cannot find any sort of information on this use case. How do I do it? Thanks.
First and foremost, you will need to get yourself a database where the moderation queue can sit and wait to be processed. The specific methodology of how to structure this database, and how to integrate this data into pages that can be delivered will depend on your choice of database and view engine.
After you have set up this system, you can use express's route parameters so that you do not have to write out all the possible routes into your scripts. Your express app can take the route parameters, look up the relevant data in your database, integrate this into a page using your view engine, and have express deliver this page to your client.
I would recommend giving express's guide on routing a thorough read as well as doing some more research into databases, and view engines.
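As a rough sketch of the route-parameter part, assuming a hypothetical getPageBySlug() database lookup and a view engine already configured (both are placeholders, not a specific library's API):

    // A minimal sketch: one parameterised route serves every
    // user-created page. getPageBySlug() is a hypothetical helper that
    // looks the slug up in your database and returns the page record,
    // or null if it is unknown or still in the moderation queue.
    const express = require('express');
    const app = express();

    app.get('/pages/:slug', async (req, res, next) => {
      try {
        const page = await getPageBySlug(req.params.slug);
        if (!page || !page.approved) {
          // Unapproved pages stay invisible until you approve them.
          return res.status(404).send('Not found');
        }
        // Render the stored fields with whichever view engine you chose.
        res.render('userPage', { title: page.title, body: page.body });
      } catch (err) {
        next(err);
      }
    });

    app.listen(3000);

This also addresses the speed worry: adding pages only adds rows to the database, not routes, so Express still matches a single route pattern no matter how many pages exist.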

Advantage and Implementation of Angular Universal?

Ancient website:
User navigates to a URL via the address bar or an href; a server call is made for that particular page
The page is returned (either static HTML, or HTML rendered on the server by ASP.NET MVC, etc.)
EVERY page reloads everything: slow, and the reason to go to a SPA
Angular 2 SPA:
User navigates to a URL via the address bar or the router
A server call is made for the component's html/javascript
ONLY the stuff within the router outlet is loaded, not the navbar, etc. (the main advantage of SPAs)
HOWEVER, plain HTML is not actually received from the server as is; Angular 2 code/markup is. This markup is then processed on the CLIENT before it can be displayed as plain HTML the browser can understand. SLOW? Enter Angular Universal?
Angular Universal:
First time users of your application will instantly see a server rendered view which greatly improves perceived performance and the overall user experience.
So, in short:
User navigates to a URL via the address bar or the router
Instead of returning Angular components, Angular Universal actually turns those components into HTML AND then sends it to the client. This is the ONLY advantage.
TLDR:
Is my understanding of what Angular Universal does correct? (last bullet point above).
And most importantly, assuming I understand what it does, how does it achieve this? My understanding is that IIS or whatever just returns requested resources, so how does Angular Universal pre-process them (edit: would it basically be running something akin to an API that returns processed HTML)?
This HAS to mean that the server makes all the initial API calls needed to display the initial view, for example from route resolves...correct?
Edit: Let's focus on this approach to narrow down the question:
The second approach is to dynamically re-render your application on a web server for each request. There are still several caching options available with this approach to improve scalability and performance, but you would be running your application code within the context of Angular Universal for each request.
The approach here:
The first option is to pre-render your application which means that you would use one of the Universal build tools (i.e. gulp, grunt, broccoli, webpack, etc.) to generate static HTML for all your routes at build time. Then you could deploy that static HTML to a CDN.
is beyond me, seeing as there is a bunch of dynamic content there, yet we pre-render static HTML at build time.
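For what it's worth, the dynamic (second) approach typically runs the Angular app inside a Node/Express server. A minimal sketch, assuming the @nguniversal/express-engine package and the build layout of the common Universal starters (the paths and the module name are assumptions, not from the question):

    // A minimal sketch of per-request server-side rendering with
    // Angular Universal's Express engine. Build paths and the server
    // module name follow the usual Universal starter layout.
    const express = require('express');
    const { ngExpressEngine } = require('@nguniversal/express-engine');
    const { AppServerModule } = require('./dist/server/main');

    const app = express();
    app.engine('html', ngExpressEngine({ bootstrap: AppServerModule }));
    app.set('view engine', 'html');
    app.set('views', 'dist/browser');

    // Static assets (bundles, images) are served as-is.
    app.get('*.*', express.static('dist/browser'));

    // Every other URL is rendered on the server: the app runs there,
    // route resolves and initial HTTP calls included, and the finished
    // HTML is sent back before the client-side app bootstraps.
    app.get('*', (req, res) => res.render('index', { req }));

    app.listen(4000);

So the server is not just returning requested resources; it is running your application per request, which is what the quoted "re-render your application on a web server for each request" option refers to.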

Angular app spanning subdomains

So I am building an Angular app that allows people to create books and share them (digital books, mind you) with a subdomain link.
So something like mycoolbook.theappsite.com would be a sharable link.
I (perhaps stupidly) built the routes so that editing books would be at the url "mycoolbook.theappsite.com/settings".
This being an Angular app, I am having to do hard redirects between those pages and so miss out on much of the SPA-y goodness. Is there a way to keep the app instance running between those pages?
If not, I might move all the admin pages back behind a URL like "theappsite.com/book/mycoolbook/settings" instead.
Is this at all possible?
I've already done all the hard work of getting sessions and ajax request working across the domains, it's just the state linking that becomes bothersome.
Short answer: no, not while having the URL change to reflect it. You cannot move from book.domain.com to domain.com within the app, because Angular manipulates only part of the URL: the fragment section in hash mode, or the path, search string, and hash in HTML5 mode. It cannot touch the other parts of the URL, such as the host. If your application is using HTML5 mode, your server must be able to map URLs properly so they return the correct page (i.e. index.html) as you change the URL; that would mean both DNS locations would have to send back the same page.
Now you can send AJAX requests between the two domains, provided you understand how to deal with cross-domain issues (JSONP, CORS, etc.).
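On the AngularJS side, the cross-domain AJAX part is mostly one flag; a minimal sketch, assuming the server on the other domain already sends the matching CORS headers (the module name is a placeholder):

    // A minimal sketch: let AngularJS send cookies with cross-domain
    // requests. The other domain must respond with
    // Access-Control-Allow-Credentials: true and an explicit
    // Access-Control-Allow-Origin (not '*') for this to work.
    angular.module('theApp', []).config(function ($httpProvider) {
      $httpProvider.defaults.withCredentials = true;
    });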

How can I make an indexable website that uses Javascript router?

I have been working on a project that uses the Backbone.js router, and all data is loaded by JavaScript via RESTful requests. I know that there is no way to detect on the server side whether JavaScript is enabled, but here are the scenarios I thought of to make this website indexable:
I can append a query string to each link in sitemap.xml and put a <script> tag on the page to detect whether JavaScript is enabled. The server renders this page with indexable data, and when a real user visits it I can manually initialize the Backbone.js router. However, the problem is that I need to execute an SQL query to render the indexable data on the server side, which causes extra load when the visitor is not a bot. Also, when users share a URL of the website somewhere, it won't be an indexable page, and web crawlers may not identify the content of that URL. And an extra string in a web crawler's results page may be annoying for users.
I can detect popular web crawlers like Google, Yahoo, Bing, and Facebook on the server side from their user agents, but I suspect there will be some web crawlers that I miss.
Which way seems more convenient, or do you have any ideas or experience on making this kind of website indexable?
As elias94xx suggested in his comment, one solid solution to this dilemma is to take advantage of Google's "AJAX crawling". In short Google told the web community "look we're not going to actually render your JS code for you, but if you want to render it server-side for us, we'll do our best to make it easy on you." They do that with two basic concepts: pretty URL => ugly URL translation and HTML snapshots.
1) Google implemented a syntax web developers could use to specify client-side URLs that could still be crawled. The syntax for these "pretty URLs", as Google calls them, is: www.example.com?myquery#!key1=value1&key2=value2.
When you use a URL with that format, Google won't try to crawl that exact URL. Instead, it will crawl the "ugly URL" equivalent: www.example.com?myquery&_escaped_fragment_=key1=value1%26key2=value2. Since that URL has a ? instead of a #, it will of course result in a call to your server. Your server can then use the "HTML snapshot" technique.
2) The basic idea of that technique is that you have your web server run a headless JS runner. When Google requests an "ugly URL" from your server, the server loads up your Backbone router code in the headless runner, and it generates (and then returns to Google) the same HTML that code would have generated had it been run client-side.
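In practice, the server half of this is often a small middleware that spots the _escaped_fragment_ parameter. A sketch, assuming Express and a hypothetical renderSnapshot() helper that loads the pretty URL in the headless runner and resolves with the finished HTML:

    // A minimal sketch of serving HTML snapshots to crawlers.
    // renderSnapshot() is a hypothetical helper (e.g. wrapping a
    // headless browser) that runs the client-side code for the pretty
    // URL and returns the resulting HTML.
    const express = require('express');
    const app = express();

    app.use((req, res, next) => {
      const fragment = req.query._escaped_fragment_;
      if (fragment === undefined) return next(); // real user: serve the SPA
      // A crawler requested the "ugly URL": rebuild the pretty URL and
      // return the pre-rendered snapshot for it.
      const prettyUrl = req.path + '#!' + fragment;
      renderSnapshot(prettyUrl)
        .then(html => res.send(html))
        .catch(next);
    });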
A full explanation of pretty=>ugly URLs can be found here:
https://developers.google.com/webmasters/ajax-crawling/docs/specification
A full explanation of HTML snapshots can be found here:
https://developers.google.com/webmasters/ajax-crawling/docs/html-snapshot
Oh, and while everything so far has been based on Google, Bing/Yahoo also adopted this syntax, as indicated by Squidoo here:
http://www.squidoo.com/ajax-crawling
