I've seen websites like YouTube and Gmail load fast. I know that Gmail is a single-page app but YouTube is not. Is there a way to make a website that is not a SPA load this fast?
NOTE: I am using a static site
Here prefetching is done like this:
<link rel="prefetch" href="/your/webpage/link.html">
Place a tag like the one above in the <head> of your page for each page you want to prefetch.
Your question is not clear. You need to focus on what exactly your problem or concern is, because when talking about faster page loads, it does not mean a SPA is faster than any other technology. It depends on your technique, system, application, network, database, and so on; there are tons of factors that affect page performance. Security measures can impact performance, the server's hard disk can be too slow to serve requests, framework overhead adds up, etc.
This post is pretty old now. As I understand it, you need to use hash URLs instead of direct ones; the direct URL can still be shown in the address bar by using history.pushState.
MDN:
https://developer.mozilla.org/en-US/docs/Web/API/History/pushState
https://developer.mozilla.org/en-US/docs/Glossary/SPA
Example of a SPA App
<!DOCTYPE html>
<html lang="en">
<head>
<script>
document.addEventListener('DOMContentLoaded', function () {
  // If the visitor arrived on the hash URL, show the clean URL in the address bar
  // and render the matching content (placeholder below).
  if (document.location.href === 'https://example.com/#page') {
    history.pushState('', '', 'https://example.com/page');
    document.querySelector('body').textContent = 'Page content goes here';
  }
});
// Use 301 redirects to make sure the /page URL (without the hash) goes to the one with the hash.
</script>
</head>
<body>
</body>
</html>
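If you rewrite URLs with pushState like this, you would typically also handle the browser's back/forward buttons via the popstate event. A minimal sketch, assuming a hypothetical renderPage() helper that swaps in the content for a given path:
<script>
// Fired when the user navigates with the back/forward buttons after pushState was used.
window.addEventListener('popstate', function () {
  // renderPage() is a hypothetical helper that renders the content for the current URL.
  renderPage(document.location.pathname);
});
</script>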
My application uses AngularJS for frontend and .NET for the backend.
In my application I have a list view. On clicking each list item, it will fetch a pre-rendered HTML page from S3.
I am using Angular UI-Router states.
app.js
...
.state('staticpage', {
    url: "/staticpage",
    templateUrl: function () {
        return 'http://xxxxxxx.cloudfront.net/staticpage/staticpage1.html';
    },
    controller: 'StaticPageCtrl',
    title: 'Static Page'
})
StaticPage1.html
<div>
    Hello static world 1!
</div>
How do I do SEO here?
Do I really need to do an HTML snapshot using PhantomJS or something similar?
Yes, PhantomJS would do the trick, or you can use prerender.io; with that service you can just use their open-source renderer and run your own server.
Another way is to use the _escaped_fragment_ meta tag, shown below.
I hope this helps, if you have any questions add comments and I will update my answer.
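For reference, opting a page into that crawling scheme looks like this (the URLs are only illustrative):
<!-- For pages whose URLs do not use #!, add this tag to the <head>;
     the crawler will then request the page as /staticpage?_escaped_fragment_= -->
<meta name="fragment" content="!">

<!-- For hash-bang URLs no meta tag is needed: a crawler that sees
     https://example.com/#!/staticpage will instead request
     https://example.com/?_escaped_fragment_=/staticpage,
     and your server should answer that URL with the pre-rendered snapshot. -->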
Did you know that Google renders HTML pages and executes the JavaScript code in the page, and does not need any pre-rendering anymore?
https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html
And take a look at these :
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
http://wijmo.com/blog/how-to-improve-seo-in-angularjs-applications/
My project's front end is also built on top of Angular, and I decided to solve the SEO issue like this:
I've created an endpoint for all search engines (SE) where all the requests with the _escaped_fragment_ parameter go;
I parse the HTTP request for the _escaped_fragment_ GET parameter;
I make a cURL request with the parsed category and article parameters and get the article content;
Then I render the simplest (and SEO-friendly) template for the SE with the article content, or throw a 404 Not Found exception if the article does not exist;
In total: I do not need to pre-render any HTML pages or use prerender.io, I have a nice user interface for my users, and search engines index my pages very well.
P.S. Do not forget to generate sitemap.xml and include there all the URLs (with _escaped_fragment_) which you want to be indexed.
P.P.S. Unfortunately my project's back end is built on top of PHP, so I cannot show you a suitable example. But if you want more explanation, do not hesitate to ask.
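For illustration only, here is a rough sketch of the same flow written as Express-style JavaScript rather than PHP (the route, parameter names and content API are hypothetical, not the author's actual code):

const express = require('express');
const fetch = require('node-fetch'); // any HTTP client would do
const app = express();

// Search engines request the site root with ?_escaped_fragment_=/category/article-slug
app.get('/', async function (req, res) {
  const fragment = req.query._escaped_fragment_;
  if (fragment === undefined) {
    // Normal visitors get the regular Angular application.
    return res.sendFile('index.html', { root: __dirname });
  }
  // Fetch the article content from a (hypothetical) content API.
  const apiRes = await fetch('https://api.example.com/articles?path=' + encodeURIComponent(fragment));
  if (!apiRes.ok) {
    return res.status(404).send('Not Found');
  }
  const article = await apiRes.json();
  // Render the simplest possible SEO-friendly template for the crawler.
  res.send('<html><head><title>' + article.title + '</title></head><body>' + article.body + '</body></html>');
});

app.listen(3000);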
First, you cannot assume anything.
Google does say that their bots can understand JavaScript applications very well, but that is not true for all scenarios.
Start by using the "Fetch as Google" feature in Webmaster Tools for your link and see if the page is rendered properly. If yes, then you need not read further.
In case you see just your skeleton HTML, this is because the Google bot assumes the page load is complete before it actually completes. To fix this, you need an environment where you can recognize that a request comes from a bot and return a prerendered page to it.
To create such an environment, you need to make some changes to your code.
Follow the instructions in Setting up SEO with AngularJS and PhantomJS,
or alternatively just write code in any server-side language like PHP to generate prerendered HTML pages of your application
(PhantomJS is not mandatory).
Create a redirect rule in your server config that detects the bot and redirects it to prerendered plain HTML files. (The only thing you need to make sure of is that the content of the page you return matches the actual page content; otherwise bots might not consider the content authentic.)
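A minimal sketch of such a rule, written here as Express middleware purely for illustration (the user-agent list and snapshots folder are assumptions, and a real setup would need a more complete bot list):

const express = require('express');
const path = require('path');
const app = express();

// Very rough list of crawler user agents; extend as needed.
const BOT_UA = /googlebot|bingbot|yandex|baiduspider|facebookexternalhit/i;

app.use(function (req, res, next) {
  if (BOT_UA.test(req.headers['user-agent'] || '')) {
    // Serve the prerendered snapshot matching the requested path,
    // e.g. /staticpage -> ./snapshots/staticpage.html
    const file = (req.path === '/' ? 'index' : req.path.replace(/^\//, '')) + '.html';
    return res.sendFile(path.join(__dirname, 'snapshots', file));
  }
  next(); // regular users get the normal Angular application
});

app.use(express.static(path.join(__dirname, 'public')));
app.listen(3000);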
It should also be noted that you need to consider how you will make entries in sitemap.xml dynamically when you add pages to your application in the future.
In case you are not looking for such overhead and you are lacking time, you can simply use a managed service like Prerender.
Eventually bots will mature, they will understand your application, and you will be able to say goodbye to your SEO proxy infrastructure. But this is just for the time being.
At this point in time, the question really becomes somewhat subjective, at least with Google -- it really depends on your specific site, like how quickly your pages render, how much content renders after the DOM loads, etc. Certainly (as @birju-shaw mentions) if Google can't read your page at all, you know you need to do something else.
Google has officially deprecated the _escaped_fragment_ approach as of October 14, 2015, but that doesn't mean you might not want to still pre-render.
YMMV on trusting Google (and other crawlers) for reasons stated here, so the only definitive way to find out which is best in your scenario would be to test it out. There could be other reasons you may want to pre-render, but since you mentioned SEO specifically, I'll leave it at that.
If you have a server-side templating system (PHP, Python, etc.) you can implement a solution like prerender.io.
If you only have AngularJS files hosted on a static server (e.g. Amazon S3), have a look at the answer in the following post: AngularJS SEO for static webpages (S3 CDN).
Yes, you need to prerender the page for the bots. prerender.io can be used for that, and your page must have this meta tag:
<meta name="fragment" content="!">
I have a simple website with some basic scripts just like this:
<html>
<head>
<title>Welcome to my website but a user can view page source --oops</title>
<script>
//some basic javascript codes i used to build the website
</script>
</head>
<body>
<p>More contents on the actual implementation of the website.</p>
</body>
</html>
Is there a way I can use a server-side processing technique to obscure the contents shown in "view page source"? I have tried using JavaScript but with no substantial outcome. Please assist!
You can't hide JavaScript, HTML, or CSS from users. You can process some code on the server (in PHP, for example), but you still need to return HTML. The only way to make your code harder for users to read is to minify your JavaScript/CSS/HTML. The YUI Compressor can help you:
http://refresh-sf.com
This makes your code more difficult to read, but the behaviour is the same.
Good luck.
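As a rough illustration of what minification does (hand-written here, not actual YUI Compressor output):

// Before: easy to read
function calculateTotalPrice(itemPrice, quantity) {
    var totalPrice = itemPrice * quantity;
    return totalPrice;
}

// After: same behaviour, much harder to read
function calculateTotalPrice(a,b){return a*b;}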
I've found many explanations about caching, and some of them even have examples, but it is still kind of foggy to me how it works and how to use it. I've tried to use it many times, but I've failed (I want to improve speed, and I want only what is necessary to be loaded from the server). Can you help me make the page below be saved in the browser's cache? If possible, give me an explanation or a different way to do it (it can be JS too!).
P.S.: It can be Appcache if you give me a suitable example for this page ;).
Thanks in advance.
My Appcache file's name: offline.appcache.
CACHE MANIFEST
/style.css
http://sistema.agrosys.com.br/sistema/labs/CSS_HTML/html1.html
<!DOCTYPE html>
<html lang="en" manifest="/offline.appcache">
<head>
<meta name="viewport" content="width=device-width" />
<title>page1</title>
<link rel="stylesheet" type="text/css" href="style.css">
</head>
<body>
<div class="testing_class">Test</div>
<div class="testing_clas">Test</div>
<div class="testing_cla">Test</div>
<div class="testing_cl">Test</div>
<div class="testing_c">Test</div>
<div class="testing_">Test</div>
</body>
</html>
Reconsider using AppCache. Using it doesn't necessarily imply that your site will work offline. Basically, here are the steps that AppCache takes, regardless of the browser connection status:
1. It asks the server for the manifest file.
2. If the manifest file hasn't changed, it serves the local files.
3. If the manifest file has changed, it downloads the new files, saves them, and then serves them.
Since you mention that
I want to improve speed, I want only the necessary to be loaded from the server
AppCache is a perfectly valid solution.
EDIT: A quick example of using AppCache:
In the beginning of your original HTML:
<!DOCTYPE html>
<html manifest="example.appcache">
<head>
You just need the "manifest" attribute in the <html> tag. Then, the example.appcache file would be:
CACHE MANIFEST
CACHE:
http://code.jquery.com/ui/1.11.4/themes/redmond/jquery-ui.css
http://code.jquery.com/jquery-1.10.2.js
http://code.jquery.com/ui/1.11.4/jquery-ui.js
NETWORK:
*
http://*
https://*
Just include in the CACHE section whatever static content your site uses.
You can also put a version number or date in the manifest file to make sure the browser gets the new content when needed.
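For example, the manifest could start with a comment line that you change whenever the cached files change (the version string itself is arbitrary):

CACHE MANIFEST
# version 2015-06-01 - bump this line to force browsers to re-download the cached files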
Caching is used to avoid re-downloading files that are reused very often (across several pages or several sessions), but it mainly targets files that fall under the category of "assets" (CSS, JavaScript, images, etc.), which are expected to remain frozen. However, the content of a webpage (the HTML) is NOT expected to remain frozen (e.g. search results), and is usually reasonable in size, so there's no real reason to bother caching it (who still has a 56k connection, really?).
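As an illustration of that (not part of the original answer), assets are usually cached by sending long-lived Cache-Control headers. A minimal sketch with Express, assuming the assets live in a public/ folder:

const express = require('express');
const app = express();

// Serve CSS/JS/images with a long cache lifetime; browsers reuse their
// local copy instead of re-downloading the files on every page view.
app.use('/assets', express.static('public', { maxAge: '30d' }));

// HTML responses are revalidated so visitors always get fresh content.
app.get('/', function (req, res) {
  res.set('Cache-Control', 'no-cache');
  res.send('<html><body>fresh content here</body></html>');
});

app.listen(3000);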
Then there is the case of HTML "static pages", but usually those pages contain only text, and text is very light compared to other media (unless you have a full book), so most people don't bother about it.
Now if you really want to "cache" the HTML, well, it's exactly the same as keeping an offline version, so why not AppCache?
Previously, Google's Friend Connect required users to upload a couple of files to their websites to enable cross-domain communication, and Facebook Connect still requires you to upload a single file to enable it.
Now, Friend Connect doesn't require any file upload... I was wondering how they were able to accomplish this.
Reference:
http://www.techcrunch.com/2009/10/02/easy-does-it-google-friend-connect-one-ups-facebook-connects-install-wizard/
There are multiple methods of communicating between documents on different domains, among them HTML5 postMessage, NIX, FIM (hash/fragment), frameElement, and the window.name property.
These are available on different browsers and in different versions, but collectively they allow you to do reliable XDM (cross domain messaging).
One project that did this early on is Apache Shindig, which probably pioneered quite a few of these techniques, and more recently the easyXDM project has come along, unifying all of these approaches with a common API and making it easy to create complex applications using XDM and RPC.
You can read in depth about the various methods of transporting the data in this article at Script Junkie.
Now, to answer your question directly: earlier on it was quite common to believe that only postMessage and FIM (Fragment Identifier Messaging) were available, and for the latter to work efficiently, one often had to upload a special file to your domain. As more methods have been discovered, this technique has largely been deprecated, and hence: no more need for the file.
Just for the record; I'm the author of both the Script Junkie article, and the easyXDM library (that is what Twitter, Disqus and quite a few more are using by the way).
<edit>It's difficult to remember/verify now, but I believe my answer here was probably incorrect. Sean Kinsey's answer above should be the definitive answer to this question. If you're reading this, please upvote his answer and ignore mine.</edit>
The Google Friend Connect widget works like most ads/gadgets do, using a copy/pasted snippet of HTML to reference a JavaScript include on the host's server which then creates an iframe containing the desired content. By opening the iframe with your site ID in the URL, Google's server is able to generate the appropriate HTML document to represent a Friend Connect gadget for your particular site/settings.
There isn't any cross-site communication happening beyond that initial step of creating an iframe with the appropriate URL target. Everything inside the gadget's dynamically generated iframe is more like the user visited a separate page on Google's server, but what would have been displayed is then embedded/isolated in a block on your page instead.
I'm not sure how it works in this particular instance but cross-domain messaging can be accomplished either by the postMessage() API or by changing the hash part of the URL and monitoring that.
The hash change method works because both the enclosing and the enclosed pages have access to the enclosed page's URL.
Of course, hopefully the postMessage() API call becomes more standard over time.
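A minimal sketch of the postMessage() approach (the domains and element id are made up for illustration):

<!-- Enclosing page on https://host.example.com -->
<iframe id="widgetFrame" src="https://widget.example.net/widget.html"></iframe>
<script>
var frame = document.getElementById('widgetFrame');
frame.addEventListener('load', function () {
  // Send a message to the embedded document on the other domain.
  frame.contentWindow.postMessage('hello from the host page', 'https://widget.example.net');
});
</script>

<!-- Embedded page on https://widget.example.net/widget.html -->
<script>
window.addEventListener('message', function (event) {
  if (event.origin !== 'https://host.example.com') return; // ignore unexpected senders
  console.log('host said: ' + event.data);
});
</script>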
JSONP allows cross-domain JavaScript.
Due to browser security restrictions, most "Ajax" requests are subject to the same origin policy; the request can not successfully retrieve data from a different domain, subdomain, or protocol. Script and JSONP requests are not subject to the same origin policy restrictions.
There is no other method than using somewindow.postMessage() for communication between cross-domain iframes.
Before somewindow.postMessage() you had to upload a file in order to ensure that you could establish communication between iframes.
example:
<html>
<!-- this is the main domain, www.example.com -->
<head>
</head>
<body>
<iframe src="http://www.exampleotherdomain.com/"></iframe>
<div id="ifr"></div>
<script>
// Sends a value to www.example.com by loading a receiver page whose URL hash carries it.
function sendMsg(a) {
    var f = document.createElement('iframe'),
        k = document.getElementById('ifr');
    f.setAttribute('src', 'http://www.example.com/xdreciver.html#' + encodeURIComponent(a));
    k.appendChild(f);
}
</script>
</body>
</html>
Now the http://www.example.com/xdreciver.html content:
<html>
<!-- this is http://www.example.com/xdreciver.html -->
<head>
<script>
function getMsg() {
return window.location.hash;
}
</script>
</head>
<body onload="var msg = getMsg(); alert(msg);">
</body>
</html>
As for using .postMessage(), it's enough to use top.postMessage('my message to the other domain document, which is also the main document', 'http://www.theotherdomain.com');
I have a div on the page that is loaded from the server by AJAX, but in this scenario Google and other search engines don't index the content of that div. The only solution I see is to recognize when the page is requested by a search robot and return the complete page without AJAX.
1) Is there a simpler way?
2) How do I distinguish humans from robots?
You could also provide a link to the non-ajax version in your sitemap, and when you serve that file (to the robot), you make sure to have included a canonical link-element to the "real" page you want users to see:
<html>
<head>
[...]
<link rel="canonical" href="YOUR_CANONICAL_URL_HERE" />
[...]
</head>
<body>
[...]
YOUR NON_AJAX_CONTENT_HERE
</body>
</html>
Edit: if this solution is not appropriate (some comments below point out that this solution is non-standard and only supported by the "big three"), you might have to re-think whether you should make the non-AJAX version the standard solution, and use JavaScript to hide/show the information instead of fetching it via AJAX. If it is business-critical information that is fetched, you have to realize that not all users have JavaScript enabled, and thus they won't be able to see this information. A progressive enhancement approach might be more appropriate in this case.
Google gets antsy if you are trying to show different things to your users than to crawlers. I suggest simply caching your query or whatever it is that needs AJAX, and then using AJAX to replace only what you need to change. You still haven't really explained what's in this div that only AJAX can provide. If you can do it without AJAX then you should, not just for SEO but also for braille readers, mobile devices, and people without JavaScript.
You can specify a sitemap in your robots.txt. That sitemap should be a list of your static pages. You should not be giving Google a different page at the same URL, so you should have different URLs for static and dynamic content. Typically, the static URL is something like .../blog/03/09/i-bought-a-puppy and the dynamic URL is something like .../search/puppy.
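For example, the robots.txt entry is a single line pointing at the sitemap (URLs are illustrative), and the sitemap then lists the static pages:

# robots.txt
Sitemap: https://www.example.com/sitemap.xml

<!-- sitemap.xml -->
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/blog/03/09/i-bought-a-puppy</loc></url>
</urlset>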