Pretty URL for static website - javascript

I'm creating a "static" website, and I would like to have URLs like this:
http://www.my-website/my-page-1 or http://www.my-website/my-page-2
I use the jQuery method load() to load the content of my web pages.
Here is my index.html:
<body>
    <nav>
        <ul>
            <li id="my-page-1-menu-item" class="menu-item" onclick="changeState('my-page-1')">
                <span><a>Link 1</a></span>
            </li>
            <li id="my-page-2-menu-item" class="menu-item" onclick="changeState('my-page-2')">
                <span><a>Link 2</a></span>
            </li>
        </ul>
    </nav>
    <!-- Content, loaded with $.load() -->
    <div id="content"></div>
</body>
I know how to change my URL without reloading the page, using the History API.
Here is my JavaScript:
function changeState(url) {
    window.history.pushState(url, 'Title', url);
    loadStateData(url);
}

function loadStateData(state) {
    var content = '#content';
    $(content).load('views/' + state + '.html');
    // Manage menu style
    $('.menu-item').removeClass('active');
    $('#' + state + '-menu-item').addClass('active');
}
With this code I'm able to reach http://www.my-website/my-page-2 by clicking on Link 2, but if I refresh the page or type http://www.my-website/my-page-2 directly in the browser address bar, I get a 404 error.
I'm not very comfortable with .htaccess URL rewriting, but I think it should solve my problem.
And I don't want to use AngularJS or another framework like that (except if it's a really lightweight framework).

You need to ensure that when you use JavaScript to:
- modify the DOM of the current page so it is effectively a different page, and
- change the URL to identify the different page,
…you also make sure that the server can generate that page itself as well.
That way, if the JavaScript fails for any reason, everything will still work. This is a basic principle of best practice.
This will probably involve duplicating your logic server-side, and that will probably be a lot of work. Robustness and client-side performance hacks do not come together cheaply. Consider isomorphic JS, or taking page snapshots with a headless browser, as techniques to speed this up.
I'm not very comfortable with .htaccess URL rewriting but I think it should solve my problem.
It certainly won't solve it well. You could use it to divert every request back to your homepage (so you have duplicate content on every URL) and then use client-side JS to read location and work out what content to load, but then you might as well use hashbangs, since you will have thrown out every advantage of using pushState while adding duplicate-content URLs to your site (and investing a pile of work into creating them).
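For reference, the kind of catch-all divert described here usually looks something like this in .htaccess (a sketch only, assuming Apache with mod_rewrite enabled):
RewriteEngine On
# Serve real files and directories as-is
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# Everything else falls back to the single index.html, and client-side JS decides what to show
RewriteRule ^ index.html [L]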

Doing that without any server-side rewrites sounds impossible. When you refresh the page, the HTTP server needs to give you the contents of the proper file, based on the URL.
But if you really want to do the URL rewriting client-side, you can set up one universal rewrite, as is often done for PHP-based websites that do rewriting internally (not with .htaccess or nginx rules, but in a PHP script).
When there is no such file as "/my-page-2" on the web server, just redirect the client to /index.html?url=/my-page-2 and dispatch to the proper subpage using JavaScript.
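A minimal sketch of that client-side dispatch, assuming the server (or a custom 404 handler) redirects unknown paths to /index.html?url=/my-page-2, and reusing the loadStateData() helper from the question:
// On page load, check whether we arrived via the ?url= redirect
$(function () {
    var match = /[?&]url=([^&]+)/.exec(window.location.search);
    if (match) {
        var page = decodeURIComponent(match[1]).replace(/^\//, ''); // e.g. "my-page-2"
        // Put the pretty URL back in the address bar and load the matching view
        window.history.replaceState(page, 'Title', '/' + page);
        loadStateData(page);
    }
});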

Related

AngularJS - SEO - S3 Static Pages

My application uses AngularJS for the frontend and .NET for the backend.
In my application I have a list view. On clicking each list item, it will fetch a pre-rendered HTML page from S3.
I am using Angular UI-Router states.
app.js
...
state('staticpage', {
    url: "/staticpage",
    templateUrl: function () {
        return 'http://xxxxxxx.cloudfront.net/staticpage/staticpage1.html';
    },
    controller: 'StaticPageCtrl',
    title: 'Static Page'
})
StaticPage1.html
<div>
    Hello static world 1!
</div>
How do I do SEO here?
Do I really need to do HTML snapshots using PhantomJS or something similar?
Yes, PhantomJS would do the trick, or you can use prerender.io; with that service you can also just use their open-source renderer and run your own server.
Another way is to use the _escaped_fragment_ meta tag.
I hope this helps; if you have any questions, add comments and I will update my answer.
Do you know that Google renders HTML pages and executes the JavaScript code in the page, and does not need any pre-rendering anymore?
https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html
And take a look at these:
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
http://wijmo.com/blog/how-to-improve-seo-in-angularjs-applications/
My project's front-end is also built on top of Angular, and I decided to solve the SEO issue like this:
I've created an endpoint for all search engines (SE), where all the requests with the _escaped_fragment_ parameter go;
I parse the HTTP request for the _escaped_fragment_ GET parameter;
I make a cURL request with the parsed category and article parameters and get the article content;
Then I render the simplest (and SEO-friendly) template for the SE with the article content, or throw a 404 Not Found exception if the article does not exist.
In total: I do not need to prerender any HTML pages or use prerender.io, I have a nice user interface for my users, and search engines index my pages very well.
P.S. Do not forget to generate a sitemap.xml and include in it all the URLs (with _escaped_fragment_) which you want to be indexed.
P.P.S. Unfortunately my project's back-end is built on top of PHP, so I cannot show you a suitable example. But if you want more explanation, do not hesitate to ask.
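For illustration only, such an endpoint might be sketched in Node/Express along these lines (the original back-end is PHP; fetchArticle() is a hypothetical helper standing in for the cURL call described above):
var express = require('express');
var app = express();

// Crawler requests arrive with the _escaped_fragment_ GET parameter
app.use(function (req, res, next) {
    var fragment = req.query._escaped_fragment_;
    if (fragment === undefined) return next(); // normal users get the Angular app

    // fetchArticle() is a hypothetical helper that loads the article content
    fetchArticle(fragment, function (err, article) {
        if (err || !article) return res.status(404).send('Not Found');
        // Render the simplest possible, SEO-friendly markup
        res.send('<html><body><h1>' + article.title + '</h1>' + article.body + '</body></html>');
    });
});

app.listen(3000);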
Firstly, you cannot assume anything.
Google does say that its bots can understand JavaScript applications very well, but that is not true for all scenarios.
Start by using the "Fetch as Google" feature in Webmaster Tools for your link and see if the page is rendered properly. If yes, then you need not read further.
In case you see just your skeleton HTML, this is because the Google bot assumes the page load is complete before it actually completes. To fix this, you need an environment where you can recognize that a request is from a bot and return it a prerendered page.
To create such an environment, you need to make some changes in your code.
Follow the instructions in Setting up SEO with Angularjs and Phantomjs,
or alternatively just write code in any server-side language like PHP to generate prerendered HTML pages of your application.
(PhantomJS is not mandatory.)
Then create a redirect rule in your server config which detects the bot and redirects it to the prerendered plain HTML files (the only thing you need to make sure is that the content of the page you return matches the actual page content, or else bots might not consider the content authentic).
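Roughly, such a rule might look like this for Apache (a sketch only; it assumes the crawler uses the ?_escaped_fragment_= convention and that your snapshots live in a /snapshots/ folder, so adjust both to your setup):
RewriteEngine On
# Requests carrying the _escaped_fragment_ parameter come from a crawler
RewriteCond %{QUERY_STRING} ^_escaped_fragment_=(.*)$
# Serve the matching prerendered snapshot instead of the JavaScript application
RewriteRule ^(.*)$ /snapshots/%1.html [L]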
Note that you also need to consider how you will make entries in sitemap.xml dynamically when you add pages to your application in the future.
If you are not looking for such overhead and you are short on time, you can simply use a managed service like Prerender.
Eventually bots will mature, they will understand your application, and you will be able to say goodbye to your SEO proxy infrastructure. This is just for the time being.
At this point in time, the question really becomes somewhat subjective, at least with Google -- it really depends on your specific site, like how quickly your pages render, how much content renders after the DOM loads, etc. Certainly (as @birju-shaw mentions) if Google can't read your page at all, you know you need to do something else.
Google has officially deprecated the _escaped_fragment_ approach as of October 14, 2015, but that doesn't mean you might not want to still pre-render.
YMMV on trusting Google (and other crawlers) for reasons stated here, so the only definitive way to find out which is best in your scenario would be to test it out. There could be other reasons you may want to pre-render, but since you mentioned SEO specifically, I'll leave it at that.
If you have a server-side templating system (PHP, Python, etc.), you can implement a solution like prerender.io.
If you only have AngularJS files hosted on a static server (e.g. Amazon S3), have a look at the answer in the following post: AngularJS SEO for static webpages (S3 CDN).
Yes, you need to prerender the page for the bots. prerender.io can be used, and your page must have the meta tag:
<meta name="fragment" content="!">

take content from WordPress page and deliver it to HTML via ajax

I have the following problem:
HTML blank page on server 1.
WordPress site on server 2.
What I need is to pull the content from www.wordpress.site/sample-page/ into the HTML page on server 1, but not the entire page, only the part that I can edit from wp-admin; so without header and footer.
Also, I don't know if there is any other method, but I need it to be done via JavaScript/jQuery or Ajax.
I've used Google, but it is hard to find a tutorial for this; I've tried a lot of tutorials, but none is what I need, and I don't know enough JavaScript to make it work.
So, can someone help me please?
BIG Thanks!
Andrei
Later edit:
I've found this working: http://jsfiddle.net/mdawaffe/hLWdH/
It works as written, but if I try to change the domain to mine, it will not work.
What script do I have to implement on the server from which the content is called (taken)?
For more information, as you asked:
I have an HTML + CSS + JS template that I will use with PhoneGap (if you don't know about it, try it, it's very useful) to create a mobile app for Android, iOS, and BlackBerry.
Now, I have this site: m.trafficvoice.ro (I hope I can post links here).
In the 'live stream' page (it's called services.html), I have an HTML5 audio tag/player.
What I need is to get the content from www.trafficvoice.ro/whatever-the-name-page, but only the part that I can edit in WordPress (so without header and footer).
Why? Because in the future there will be more streams to add, and maybe some of them will be down for unknown reasons, so I need to be able to update that page without making an update for the entire app, uploading it to the store, waiting for approval, waiting for the client to download it, etc.
Big thanks!
Andrei
Could you just use an iframe instead? You could modify a template in your theme to not display header/footer and then use that in the iframe.
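A minimal sketch of that iframe approach (the ?template=bare query parameter is purely hypothetical; it stands in for whatever mechanism your theme uses to pick the header/footer-less template):
<!-- Embeds only the editable WordPress content, rendered with a stripped-down template -->
<iframe src="http://www.wordpress.site/sample-page/?template=bare"
        style="width:100%; border:none;"
        title="Sample page content"></iframe>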

History.js - sharing link of a AJAX loaded page

I have the following function that activates when I click on some links:
function showPage(page) {
    var History = window.History;
    History.pushState(null, null, page);
    $("#post-content").load(page + ".php");
}
The content of the page updates and the URL changes. However, I know I'm surely doing something wrong. For example, when I refresh the page, it gives me a Page Not Found error, and the link to the new page can't be shared, for the same reason.
Is there any way to resolve this?
It sounds like you're not routing your dynamic URLs to your main app. Unless page refers to a physical file on your server, you need to be doing some URL rewriting server-side if you want those URLs to work for anything other than simply being placeholders in your browser history. If you don't want to mess with the server side, you'll need to use another strategy, like hacking the URL with hashes. That way the server is still always serving your main app page, and then the app page reads the URL add-on stuff to decide what needs to be rendered dynamically.
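As a rough sketch of the hash-based fallback (no server changes needed, since the server always serves the same app page; the 'home' default below is an assumption):
// Render whatever view the hash names, e.g. http://example.com/#about -> about.php
function renderFromHash() {
    var page = window.location.hash.replace(/^#\/?/, '') || 'home'; // assumed default view
    $("#post-content").load(page + ".php");
}

// Re-render when the hash changes, and once on initial load
$(window).on("hashchange", renderFromHash);
$(renderFromHash);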
You need to stop depending on JavaScript to build the pages.
The server has to be able to construct them itself.
You can then progressively enhance with JavaScript (pushState + Ajax) to transform the previous page into the destination page without reloading all the shared content.
Your problem is that you've done the "enhance" bit before building the foundations.
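A sketch of that progressive enhancement, assuming the server can already render each URL as a full page and that the shared container is #post-content (the data-enhance attribute is just an illustrative way to mark the links that should be enhanced):
// Links still work without JavaScript; with it, we swap only the content area
$(document).on("click", "a[data-enhance]", function (e) {
    e.preventDefault();
    var url = this.getAttribute("href");
    history.pushState(null, "", url);
    // Fetch the full server-rendered page, keep only the content region
    $("#post-content").load(url + " #post-content > *");
});

// Handle back/forward buttons the same way
window.addEventListener("popstate", function () {
    $("#post-content").load(location.pathname + " #post-content > *");
});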

URL masking in JavaScript

I currently have the following JavaScript function that takes the current URL and concatenates it to another site's URL to route it to the appropriate feedback group:
function sendFeedback() {
    url = window.location.href;
    newwin = window.open('http://www.anothersite.com/home/feedback/?s=' + url, 'Feedback');
}
Not sure if this is the proper terminology, but I want to mask the URL in the window.open statement to use the URL from the current window.
How would I be able to mask the window.open URL with the original in JavaScript?
Things you could do:
1. Mask the external site in an HTML frame inside a document from your site
(for example www.mysite.com/shortUrl/)
2. Send a Location HTTP header (the real URL will eventually be displayed)
Keep in mind that browsers do their best to show the real address due to phishing concerns.
I wouldn't use JavaScript if I wanted to mask a URL, even though it would work with JavaScript. You wouldn't get much benefit in that scenario.
The reason is simple:
JavaScript/jQuery = functions belong to the client side (browser/your PC/DOM)
links, URLs, HTTP, and headers = functions belong to Apache.
Apache always sits above the client side. Whenever a link is fired for SampleLink.html, Apache wakes up and reads the file, but links/URLs are already owned before JavaScript could claim them. So it is kind of pointless to try to manipulate links in your JavaScript scripts; even though it works, it is weak.
I'd point you to this awesome approach: .htaccess. You will be surprised how powerful it is. If an .htaccess is present in the parent folder of SampleLink.html, Apache keeps the DOM engine (your browser) from reading files until Apache has finished reading .htaccess.
In your scenario, .htaccess can do some work for you by rewriting links and sending "decoy" links to the DOM engine while keeping the original links/URLs behind the curtain; visitors would reach a 404 page if they tried to break the app or whatever you are concerned about.
This is a bit complicated, but it has never failed me. I use this as my "bible": http://corz.org/serv/tricks/htaccess2.php.

Ajax page part load and Google

I have a div on the page whose content is loaded from the server by Ajax, but in this scenario Google and other search engines don't index the content of this div. The only solution I see is to recognize when the page is requested by a search robot and return the complete page without Ajax.
1) Is there more simple way?
2) How distinguish humans and robots?
You could also provide a link to the non-Ajax version in your sitemap, and when you serve that file (to the robot), make sure to have included a canonical link element pointing to the "real" page you want users to see:
<html>
    <head>
        [...]
        <link rel="canonical" href="YOUR_CANONICAL_URL_HERE" />
        [...]
    </head>
    <body>
        [...]
        YOUR NON_AJAX_CONTENT_HERE
    </body>
</html>
Edit: if this solution is not appropriate (some comments below point out that this solution is non-standard and only supported by the "big three"), you might have to rethink whether you should make the non-Ajax version the standard solution and use JavaScript to hide/show the information instead of fetching it via Ajax. If it is business-critical information that is fetched, you have to realize that not all users have JavaScript enabled, and thus they won't be able to see this information. A progressive enhancement approach might be more appropriate in this case.
Google gets antsy if you are trying to show different things to your users than to crawlers. I suggest simply caching your query, or whatever it is that needs Ajax, and then using Ajax to replace only what you need to change. You still haven't really explained what's in this div that only Ajax can provide. If you can do it without Ajax then you should, not just for SEO but for braille readers, mobile devices, and people without JavaScript.
You can specify a sitemap in your robots.txt. That sitemap should be a list of your static pages. You should not be giving Google a different page at the same URL, so you should have different URLs for static and dynamic content. Typically, the static URL is something like .../blog/03/09/i-bought-a-puppy and the dynamic URL is something like .../search/puppy.
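The robots.txt entry itself is a single directive (the URL is a placeholder):
# robots.txt
Sitemap: http://www.example.com/sitemap.xml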
