I have a div on my page whose content is loaded from the server via AJAX, but in this scenario Google and other search engines don't index the content of that div. The only solution I see is to recognize when the page is requested by a search robot and return the complete page without AJAX.
1) Is there a simpler way?
2) How can I distinguish humans from robots?
You could also provide a link to the non-AJAX version in your sitemap. When you serve that file to the robot, make sure it includes a canonical link element pointing to the "real" page you want users to see:
<html>
<head>
[...]
<link rel="canonical" href="YOUR_CANONICAL_URL_HERE" />
[...]
</head>
<body>
[...]
YOUR NON_AJAX_CONTENT_HERE
</body>
</html>
Edit: if this solution is not appropriate (some comments below point out that it is non-standard and only supported by the "big three"), you might have to reconsider whether you should make the non-AJAX version the standard solution, and use JavaScript to hide/show the information instead of fetching it via AJAX. If the information being fetched is business critical, you have to realize that not all users have JavaScript enabled, and those users won't be able to see it. A progressive enhancement approach might be more appropriate in this case.
Google gets antsy if you try to show different things to your users than to crawlers. I suggest simply caching your query, or whatever it is that needs AJAX, and then using AJAX to replace only what needs to change. You still haven't really explained what's in this div that only AJAX can provide. If you can do it without AJAX then you should, not just for SEO but for braille readers, mobile devices and people without JavaScript.
You can specify a sitemap in your robots.txt. That sitemap should be a list of your static pages. You should not give Google a different page at the same URL, so static and dynamic content should live at different URLs. Typically, the static URL is .../blog/03/09/i-bought-a-puppy and the dynamic URL is something like .../search/puppy.
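As a rough sketch of the sitemap idea above, the following Python builds a minimal sitemap.xml that lists only static URLs. The example.com address and the build_sitemap helper are hypothetical and not part of the answer; the namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a sitemap.xml document (as a string) listing the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical static pages; dynamic URLs such as /search/puppy are left out.
static_pages = [
    "https://example.com/blog/03/09/i-bought-a-puppy",
]
sitemap_xml = build_sitemap(static_pages)
```

The generated file would then be referenced from robots.txt with a line such as "Sitemap: https://example.com/sitemap.xml" (again, a placeholder URL).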
I have zero experience in native apps, which might have helped with this question.
Since the service worker caches everything so nicely, I don't see any reason why I should render the entire webpage again when the page gets switched (a link gets clicked). So I will switch only the content, use history.pushState to change the URL, and change the title. I have that part figured out.
Problem is, I cannot find any resources that would support either of the two content load ideas I have:
1) Load the center content via AJAX as HTML.
2) Load the center content as data only and render the HTML on the fly in JS.
The first method would be fairly straightforward, but would mean that the payload is bigger.
The second seems much more advanced, and would mean that the HTML templates have to be in the JS somehow already. I also have a feeling that there is a method somewhere in here that would allow opening the heavily cached page (let's say the article page) and replacing its (text) contents. But as I said, I cannot find any resources that weigh the pros and cons or give any reliable information on PWA AJAX page switching.
Any credible information on this matter would be much appreciated.
EDIT
I have kept reading and researching this matter, but sadly there is no clear indication of how to handle dynamic content over AJAX: whether I should turn the JSON data from AJAX into HTML in JS, or send it already as HTML from the backend.
In favour of the second option, I have found that my theory had some weight to it: I can use pure.js to pull an HTML template from a hidden template tag and generate the HTML on the fly from JSON fetched over AJAX.
You are making this very complicated. Can we take a look at your code, please?
If you mean retrieving data from a database via AJAX, then all you need is jQuery:
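$(document).ready(function () {
    // Read the value to send; this assumes #contentData1 is a form field.
    var contentData1 = document.getElementById('contentData1').value;
    // POST it to the server and inject the returned HTML into #container.
    $.post("pathToPHP.php", { contentData1: contentData1 }, function (data) {
        $("#container").html(data);
    });
});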
and the pathToPHP.php file should retrieve the data you want
echo "";
My application uses AngularJS for frontend and .NET for the backend.
In my application I have a list view. Clicking a list item fetches a pre-rendered HTML page from S3.
I am using angular state.
app.js
...
state('staticpage', {
url: "/staticpage",
templateUrl: function (){
return 'http://xxxxxxx.cloudfront.net/staticpage/staticpage1.html';
},
controller: 'StaticPageCtrl',
title: 'Static Page'
})
StaticPage1.html
<div>
Hello static world 1!
</div>
How do I do SEO here?
Do I really need to create HTML snapshots using PhantomJS or something similar?
Yes, PhantomJS would do the trick, or you can use prerender.io. With that service you can also just use their open source renderer and run it on your own server.
Another way is to use the fragment meta tag, which tells crawlers to request the page with the _escaped_fragment_ parameter.
I hope this helps, if you have any questions add comments and I will update my answer.
Do you know that Google renders HTML pages and executes the JavaScript code in the page, and no longer needs any pre-rendering?
https://webmasters.googleblog.com/2014/05/understanding-web-pages-better.html
And take a look at these :
http://searchengineland.com/tested-googlebot-crawls-javascript-heres-learned-220157
http://wijmo.com/blog/how-to-improve-seo-in-angularjs-applications/
My project's front-end is also built on top of Angular, and I decided to solve the SEO issue like this:
I created an endpoint for all search engines (SE) that receives every request carrying the _escaped_fragment_ parameter;
I parse the HTTP request for the _escaped_fragment_ GET parameter;
I make a cURL request with the parsed category and article parameters and get the article content;
Then I render the simplest (and SEO-friendly) template for the SE with the article content, or throw a 404 Not Found exception if the article does not exist.
In total: I do not need to prerender any HTML pages or use prerender.io, my users have a nice user interface, and search engines index my pages very well.
P.S. Do not forget to generate a sitemap.xml and include in it all the URLs (with _escaped_fragment_) which you want to be indexed.
P.P.S. Unfortunately my project's back-end is built on top of PHP, so I cannot show you an example that suits you. But if you want more explanation, do not hesitate to ask.
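The parsing step of this approach can be sketched in Python. This is only an illustration under stated assumptions: the "/category/article" layout of the fragment and the parse_escaped_fragment helper are hypothetical, since the answer does not show its actual URL scheme.

```python
from urllib.parse import urlsplit, parse_qs


def parse_escaped_fragment(url):
    """Return (category, article) from the _escaped_fragment_ parameter.

    Returns None when the parameter is absent (a normal visitor) and raises
    ValueError when it is present but malformed (the caller would answer 404).
    """
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if "_escaped_fragment_" not in query:
        return None
    fragment = query["_escaped_fragment_"][0]
    # Hypothetical layout: "/<category>/<article>"
    parts = [part for part in fragment.split("/") if part]
    if len(parts) != 2:
        raise ValueError("unexpected fragment: %r" % fragment)
    return parts[0], parts[1]
```

The returned pair would then be used to look up the article and render the simple crawler-facing template described above.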
Firstly, you cannot assume anything.
Google does say that their bots can understand JavaScript applications very well, but that is not true for all scenarios.
Start by using the "Fetch as Google" feature in Google Webmaster Tools on your link and see whether the page is rendered properly. If it is, you need not read further.
If you see just your skeleton HTML, it is because Googlebot assumes the page has finished loading before it actually has. To fix this you need an environment that can recognize that a request comes from a bot and return a prerendered page to it.
To create such an environment, you need to make some changes in your code.
Follow the instructions in "Setting up SEO with Angularjs and Phantomjs",
or alternatively just write code in any server-side language, such as PHP, to generate prerendered HTML pages of your application.
(PhantomJS is not mandatory.)
Create a redirect rule in your server config which detects the bot and redirects it to the prerendered plain HTML files. The only thing you need to make sure of is that the content of the page you return matches the actual page content; otherwise bots might not consider the content authentic.
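The bot-detection rule can be sketched in Python as a plain User-Agent check. The token list is illustrative only, and the /snapshots layout and function names are assumptions rather than anything prescribed by the answer; a real deployment would more likely express this in the web server's own rewrite configuration.

```python
import re

# Illustrative tokens only; real crawler lists are longer and change over time.
BOT_PATTERN = re.compile(r"googlebot|bingbot|facebookexternalhit", re.IGNORECASE)


def is_crawler(user_agent):
    """Return True when the User-Agent header looks like a known crawler."""
    return bool(user_agent) and BOT_PATTERN.search(user_agent) is not None


def choose_file(user_agent, path):
    """Pick the prerendered snapshot for crawlers, the normal app shell otherwise."""
    if is_crawler(user_agent):
        return "/snapshots" + path + ".html"  # hypothetical snapshot layout
    return "/index.html"
```

User-Agent strings can be spoofed, so this only decides which rendering of the same content to serve; it should not be relied on for anything security-related.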
Note that you also need to consider how you will add entries to sitemap.xml dynamically when you add pages to your application in the future.
If you do not want that overhead and are short on time, you can certainly use a managed service like prerender.
Eventually bots will mature and understand your application, and you will be able to say goodbye to your SEO proxy infrastructure. This is just for the time being.
At this point in time, the question really becomes somewhat subjective, at least with Google -- it really depends on your specific site, like how quickly your pages render, how much content renders after the DOM loads, etc. Certainly (as #birju-shaw mentions) if Google can't read your page at all, you know you need to do something else.
Google has officially deprecated the _escaped_fragment_ approach as of October 14, 2015, but that doesn't mean you might not want to still pre-render.
YMMV on trusting Google (and other crawlers) for reasons stated here, so the only definitive way to find out which is best in your scenario would be to test it out. There could be other reasons you may want to pre-render, but since you mentioned SEO specifically, I'll leave it at that.
If you have a server-side templating system (php, python, etc.) you can implement a solution like prerender.io
If you only have AngularJS-only files hosted on a static server (e.g. amazon s3) => Have a look at the answer in the following post : AngularJS SEO for static webpages (S3 CDN)
Yes, you need to prerender the page for the bots. prerender.io can be used, and your page must have the following meta tag:
<meta name="fragment" content="!">
I am currently using javascript and XMLHttpRequest on a static html page to create a view of a record in Zotero. This works nicely except for one thing: The page html title.
I can of course also change the <title>...</title> tag, but if someone wants to post the view to for example facebook the static title on the web page will be shown there.
I can't think of any way to fix this with just a static page with javascript. I believe I need a dynamically created page from a server that does something similar to XMLHttpRequest.
For PHP there is HTTPRequest. Now to the problem. In the javascript version I can use asynchronous calls. With PHP I think I need synchronous calls. Is that something to worry about?
Is there perhaps some other way to handle this that I am not aware of?
UPDATE: It looks like those trying to answer are not at all familiar with Zotero. I should have been more clear. Zotero is a reference db located at http://zotero.org/. It has an API that can be used through XMLHttpRequest (which is what I said above).
Now I can not use that in my scenario which I described above. So I want to call the Zotero server from my server instead. (Through PHP or something else.)
(If you are not familiar with the concepts it might be hard to understand and answer the question. Of course.)
UPDATE 2: For those interested in how Facebook scrapes a URL you post there, please test here: https://developers.facebook.com/tools/debug
As you can see by testing there, no JavaScript is run.
Sorry, I'm not sure I understand what you are asking. Do you just want to change the page's title?
Why not use JavaScript?
document.title = newTitle
Facebook expects the title (or Open Graph og:title tags) to be present when it fetches the page. It won't execute any JavaScript for you to fill in the blanks.
A cool workaround would be to detect the Facebook scraper with PHP by parsing the User Agent string, and serving a version of the page with the information already filled in by PHP instead of JavaScript.
As far as I know, the Facebook scraper uses this header for User Agent: "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
You can check to see if part of that string is present in the header and load the page accordingly.
if (strpos($_SERVER['HTTP_USER_AGENT'], 'facebookexternalhit') !== false)
{
//synchronously load the title and opengraph tags here.
}
else
{
//load the page normally
}
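To illustrate what "filling in" the title and Open Graph tags on the server might look like, here is a minimal Python sketch (the answer itself uses PHP). The render_head helper is hypothetical, and fetching the record title from Zotero is deliberately left out; the sketch assumes the title is already available as a string.

```python
from html import escape


def render_head(title, url):
    """Build <title> and Open Graph tags, escaping text that comes from the record."""
    safe_title = escape(title, quote=True)
    safe_url = escape(url, quote=True)
    return (
        f"<title>{safe_title}</title>\n"
        f'<meta property="og:title" content="{safe_title}" />\n'
        f'<meta property="og:url" content="{safe_url}" />'
    )
```

Escaping matters here because the title comes from an external record and is placed inside an HTML attribute.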
I have the following problem:
HTML blank page on server 1.
WordPress site on server 2.
What I need is to call the content from www.wordpress.site/sample-page/ to HTML page on server 1, but not the entire page, only the part that I can edit from wp-admin; so without header and footer.
Also, I don't know if there is any other method, but I need it to be done via JavaScript/jQuery or Ajax.
I've used Google, but it is hard to find a tutorial for this. I've tried a lot of tutorials, but none is what I need, and I don't know enough JavaScript to make it work.
So, can someone help me, please?
BIG Thanks!
Andrei
Later edit:
I've found this working example: http://jsfiddle.net/mdawaffe/hLWdH/
It works as written, but if I replace the domain with mine, it no longer works.
What script do I have to implement on the server from which the content is fetched?
For more information, as you asked:
I have a HTML + CSS + JS template that I will use with phonegap (if you don't know about it, try it, it's very useful) to create a mobile app for Android, iOS, and BlackBerry.
Now, I have this site: m.trafficvoice.ro (I hope I can post links here).
In the 'live stream' page (it's called services.html), I have a HTML5 audio tag/player.
What I need is to get the content from www.trafficvoice.ro/whatever-the-name-page, but only the part that I can edit in WordPress (so without header and footer).
Why? Because in the future there will be more streams to add, and some of them may go down for unknown reasons, so I need to be able to update that page without releasing an update for the entire app, uploading it to the store, waiting for approval, having the client download it, etc.
Big thanks!
Andrei
Could you just use an iframe instead? You could modify a template in your theme to not display header/footer and then use that in the iframe.
I have a new site that I am putting together, and part of it is statistics for the site's users. I would like to create a widget that others can use on another website by invoking JavaScript that reads data from my server and shows the statistics for a given user, but I am having a hard time finding specific tutorials that cover this in Django.
I have seen the article on Alex Marandon's site [0], but it looks to me like that passes HTML back to the widget, and I am having a hard time figuring out how to do this using something like XML.
Are there any django apps for doing this or does anyone know of good how-tos?
[0] http://alexmarandon.com/articles/web_widget_jquery/
This is not a matter of Django; you can solve it with the most common solution: JavaScript.
Give your users this to put on their websites.
<script type="text/javascript" src="http://mysite.com/widget/user/124546465"></script>
On a django view, render the next template:
(function(){
    document.write('<div class="mysite-userprofile">');
    document.write('My visits are {{total_visits}}<br />');
    document.write('</div>');
})();
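So in your view you may have something like this; the content type (mimetype) is important:
from django.contrib.auth.models import User  # assuming the default user model
from django.shortcuts import get_object_or_404, render

def total_visits(request, user_id):
    user = get_object_or_404(User, id=user_id)
    # Visits and total_visits() are placeholders; you may have to write your own counting logic.
    total_visits = Visits.objects.filter(user=user).total_visits()
    context = {'total_visits': total_visits}
    # content_type replaces the older mimetype argument in current Django versions.
    return render(request, 'widget_total_visits.html', context, content_type='text/javascript')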
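As a variation on the template above, the widget script can also be built as a string in Python, which makes it easier to see how values are escaped before they reach the host page. This is a sketch only: widget_script is a hypothetical helper, not part of the answer, and the markup simply mirrors the template shown earlier.

```python
import json


def widget_script(total_visits):
    """Return JavaScript that writes the widget markup into the host page."""
    markup = (
        '<div class="mysite-userprofile">My visits are %d<br /></div>'
        % total_visits
    )
    # json.dumps yields a valid JavaScript string literal, so the quotes
    # inside the markup are escaped for us.
    return "(function(){document.write(%s);})();" % json.dumps(markup)
```

The returned string would be sent as the response body with a JavaScript content type, just like the rendered template.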
What can you do next?
User settings, like this.
<script type="text/javascript">
mysite_options = {
'just_friends': true,
'theme': 'bluemarine',
'realtime': true
};
</script>
<script type="text/javascript" src="http://mysite.com/widget/user/124546465"></script>
In your template you can then use the variables that were set before the script was included on your user's website. Simple stuff.
Later, you can use the POST method to gather information from your users' clients, for stats.
And of course make it Ajax!
I hope this gives you a path to follow.
Obey the one rule: Keep It Simple Silly!
I know it may be very web 1.0 but an iframe really is your best friend in this situation. A simple piece of code such as <script>document.write("<iframe src='yoursite.com/userwidget/" + username + "' height='30' width='150' />");</script> to inject an iframe at load time will save you a crapload of time writing jsonp async code and dom manipulators and making sure that all the elements you inject onto their page will be styled correctly on every different website and worrying about origin policies.
If your code plugs in an iframe pointed at your page, then:
The origin is you! You don't have to worry about it.
You can use Django templates instead of JS to construct the widget being shown.
Their CSS won't mess with your presentation.
Their JS can't easily manipulate your stats ;)
This is exactly what iframes were designed to do.