Will JavaScript URLs hurt SEO?

I'm making a word cloud with the jQuery plugin jQCloud, in which each word in the cloud is associated with a URL. I want each of those URLs crawled and indexed by Google/Bing.
jQCloud takes a hash specifying each word, its rank, and a URL. So if the bots read the JavaScript they will see the URLs, but without the JavaScript being executed there is no href in the markup.
Based on Google's SEO documentation, I presume the bots won't index those URLs. Is this right? If so, what would be the most SEO-friendly approach to this word cloud?
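For context, the kind of word list being described looks roughly like this; the option names (text, weight, link) follow recent jQCloud versions and may differ in yours, so treat it as a sketch:

    // Word list for jQCloud: each entry carries the text, a weight (rank),
    // and the URL the rendered word should link to. None of these hrefs
    // exist in the page's HTML until the plugin runs.
    var words = [
      { text: "widgets", weight: 15, link: "https://example.com/widgets" },
      { text: "gadgets", weight: 9,  link: "https://example.com/gadgets" },
      { text: "gizmos",  weight: 6,  link: "https://example.com/gizmos" }
    ];

    $(function () {
      // The cloud (and its anchor tags) only exists after this call.
      $("#word-cloud").jQCloud(words);
    });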

In short, yes. Search bots are not going to bother parsing your JS if you couldn't be bothered to provide static, accessible content.

Don't use "javascript URLs", they are an anti-accessibility feature. Some reading:
Broken Links
Hash, Bang, Wallop.
Breaking the Web with hash-bangs
Going Postel

That's why some browsers already implement the HTML5 pushState (History) API, which keeps the original URL while the page loads content via Ajax, and still supports the browser's navigation buttons (back/forward).
Have a look at the History.js project, a wrapper that makes the API easier to use.
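Rough usage of History.js, based on the pattern in its README (the loadContentFor helper and the selector are invented for illustration):

    (function (window) {
      var History = window.History;      // capital H: the History.js wrapper, not window.history
      if (!History.enabled) { return; }  // very old browser with neither pushState nor hash support

      // Fires on pushState/replaceState calls and on back/forward navigation.
      History.Adapter.bind(window, "statechange", function () {
        var state = History.getState();
        loadContentFor(state.url);       // hypothetical function that Ajax-loads the matching content
      });

      // In a click handler: change the URL without reloading the page.
      $(document).on("click", "a.ajax-nav", function (e) {
        e.preventDefault();
        History.pushState({ via: "ajax" }, document.title, this.href);
      });
    })(window);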

One possibility is to have your cloud degrade gracefully. For example, you could have a static list of your links rendered server-side with the page; if JavaScript is enabled, you could replace this list with your prettier cloud.
This has a benefit beyond being more transparent to search engines: people with JavaScript turned off will still be able to see your links, which also improves accessibility.
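A rough sketch of that graceful degradation, with illustrative markup and selectors: the server always renders a plain, crawlable list of links, and the script only builds the cloud from it when JavaScript is available:

    // Server-rendered markup (always present, crawlable without JavaScript):
    //
    //   <ul id="word-cloud-fallback">
    //     <li><a href="/widgets" data-weight="15">widgets</a></li>
    //     <li><a href="/gadgets" data-weight="9">gadgets</a></li>
    //   </ul>
    //
    // If JavaScript runs, turn that list into the prettier cloud.
    $(function () {
      var $list = $("#word-cloud-fallback");

      var words = $list.find("a").map(function () {
        return {
          text: $(this).text(),
          weight: Number($(this).data("weight")) || 1,
          link: this.href
        };
      }).get();

      // Hide the plain list and render the cloud in its place.
      $list.hide();
      $("<div id='word-cloud'></div>").insertAfter($list).jQCloud(words);
    });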

Related

Should you use a hashbang on AJAX content sites, or just use normal URLs?

So I am creating just a fairly normal content-based website that is going to load new pages/content in via AJAX for a smoother user experience and some simple transitions. I see a ton of people using a hashbang for similar implementations.
My question is: why not just use regular URLs and have the server decide whether to serve the regular page or JSON/XML/etc. based on whether the request is an XMLHttpRequest? At first glance it seems to make more sense to have one URL, and I'm curious why I see no mention of this idea/approach in my initial searches. Perhaps I'm just missing something...
ANSWER: I missed that updating a full URL without a page load was not possible before HTML5 history. That is where my confusion arose.
If you only have one URL, users can't bookmark anything on your site, so you need to do something.
You can either use hashbangs (i.e. fragment/anchor URLs) or, if you are targeting reasonably new browsers, the HTML5 History API.
You can't rely on regular URLs alone, because changing the URL normally causes the entire page to reload.
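For what it's worth, the server-side half of the idea in the question (one URL, two representations) is straightforward. Here is a hedged sketch using Express and the X-Requested-With header that jQuery adds to same-origin Ajax requests; the route and the loadArticle helper are invented:

    // One URL per page: Ajax requests get JSON, everything else gets full HTML.
    // jQuery sets "X-Requested-With: XMLHttpRequest" on same-origin Ajax calls.
    var express = require("express");
    var app = express();

    app.get("/articles/:slug", function (req, res) {
      var article = loadArticle(req.params.slug);   // hypothetical data access

      if (req.get("X-Requested-With") === "XMLHttpRequest") {
        res.json(article);                          // partial payload for the in-page load
      } else {
        res.render("article", article);             // full, crawlable HTML page
      }
    });

    app.listen(3000);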

SEO and the use of !# in a url

I read somewhere about how you can create a website that loads each section of a page with AJAX while still providing SEO. It had something to do with the use of !# in a url. Similar to what twitter does. I can't seem to find anything about it anywhere. Does anyone know what I'm talking about?
Is this what you are looking for:
Quoting:
Slightly modify the URL fragments for stateful AJAX pages
Stateful AJAX pages display the same content whenever accessed directly. These are pages that could be referred to in search results. Instead of a URL like http://example.com/page?query#state we would like to propose adding a token to make it possible to recognize these URLs: http://example.com/page?query#[FRAGMENTTOKEN]state. Based on a review of current URLs on the web, we propose using "!" (an exclamation point) as the token for this. The proposed URL that could be shown in search results would then be: http://example.com/page?query#!state.
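For context, this proposal became Google's AJAX crawling scheme (since retired). The crawler rewrote the #! fragment into an _escaped_fragment_ query parameter and fetched that URL, so the server could answer with a static snapshot. A hedged Express sketch, with renderStaticSnapshot as a hypothetical helper:

    //   URL users see and share:   http://example.com/page?query#!state
    //   URL the crawler fetches:   http://example.com/page?query&_escaped_fragment_=state
    var express = require("express");
    var app = express();

    app.get("/page", function (req, res) {
      var fragment = req.query._escaped_fragment_;
      if (fragment !== undefined) {
        res.send(renderStaticSnapshot(fragment));   // pre-rendered HTML for the crawler
      } else {
        res.render("spa-shell");                    // the normal JavaScript-driven page
      }
    });

    app.listen(3000);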
#! is called a "hashbang" and they are the root of all that is evil in web development.
Basically, weak web developers decided to use #anchor names as a kludgy hack to get "web 2.0" things to work on their pages, then complained to Google that their page rank suffered. Google made a workaround for their kludge by enabling the hashbang.
Weak web developers took this workaround as gospel. Don't use it. It is a crutch.
Web development that depends on hashbangs is web-development done wrong.
This article is far better worded than I could ever be, and deals with the Gawker Media fiasco that followed their migration to a (failed) hashbang-centric website. It tells you WHAT is happening and why it's bad.
http://isolani.co.uk/blog/javascript/BreakingTheWebWithHashBangs
Take a look at this article:
http://eliperelman.com/blog/2011/10/06/handling-googles-ajax-crawling-hashbang-number-navigation-in-asp-dot-net-mvc-3/
It explains the implementation of hashbang navigation, allowing google to index your site.
Check "Modify Address Bar URL in AJAX App to Match Current State"; this can also be done from Flash and Ajax with SWFAddress: http://www.asual.com/swfaddress .
I believe you are looking for this forum question here: http://www.google.com/support/forum/p/Webmasters/thread?tid=55f82c8e722ecbf2&hl=en

Another "how to change the URL without leaving the webpage"

I am creating a website which uses jQuery scrolling as its method of navigation and never leaves a single HTML page.
I have noticed that some websites are able to change the URL, and I have looked at posts/answers (such as "How does GitHub change the URL without reloading a page?" and "Attaching hashtag to URL with javascript") which refer to these changes being done with pushState, AJAX scripts or the History API (none of which I am too savvy in).
Currently I am looking into which method is best for my website and have been looking at some examples which I like.
My question is: why do the websites below use /#/ in the path for the changing URL? The only reason I ask is because I am seeing this more and more often on jQuery-heavy websites.
http://na.square-enix.com/ffxiii-2/
http://www.airwalk.com
If anyone could simply shed some light on what these guys are using to do this, it would be much appreciated so I can possibly create my own script.
My question is why do the websites below use /#/ in the path for the changing URL
If we discount the possibility of ignorance of the alternatives, then: because they are willing to accept the horrible drawbacks in exchange for making it work in Internet Explorer (which doesn't support the History API).
GitHub takes the sensible approach of using the History API if it is available and falling back to the server if it isn't, rather than generating links that will break without JavaScript.
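In sketch form, that pattern looks something like this (selectors and markup are illustrative); the key point is that every link carries a real href, so browsers without the History API simply navigate and the server renders the page:

    $(document).on("click", "a[data-ajax]", function (e) {
      if (!(window.history && window.history.pushState)) {
        return;   // no History API: fall back to a normal, server-rendered navigation
      }
      e.preventDefault();
      history.pushState(null, "", this.href);        // real URL in the address bar, no reload
      $("#main").load(this.href + " #main > *");     // fetch the same URL, swap in its content
    });

    // Re-render when the user presses back/forward.
    window.addEventListener("popstate", function () {
      $("#main").load(location.href + " #main > *");
    });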
http://probablyinteractive.com/url-hunter
This has a nice example of how to change the URL with JavaScript.
I've not tried it myself, but read many reviews/opinions about History.js
It's supposed to offer the "# in the path" option you mention (as a fallback for older, incompatible browsers) as well as Facebook-style direct changing of the URL. Plus, when you hit the back button, you get back to the previous AJAX-loaded page with no problem.
I've implemented such a feature (AJAX tabs with URL changing), but if the content you load dynamically contains its own JavaScript, I wouldn't recommend AJAX-loading it, because JavaScript inside AJAX-loaded content won't be executed.
So I vote for either History.js or writing your own module.
Well, they're using the "#" anchor because they need to differentiate between multiple bookmarkable/browser-navigable places in the site while still keeping everything on the same page. By adding browser history entries of the form /mySamePage.html#page1, /mySamePage.html#page2 whenever the user does something that Ajax-loads content into the current HTML page, you (obviously) stay on the current page, but at the same time the user can bookmark that specific content, and pressing back/forward in the browser will switch between the different Ajax-loaded states.
It's not bad as a trick; the only issue is SEO. Google has a nice page explaining this: http://googlewebmastercentral.blogspot.com/2009/10/proposal-for-making-ajax-crawlable.html
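A minimal sketch of that # technique (paths and IDs are made up): each Ajax state gets its own fragment, and the hashchange event covers both bookmarks and the back/forward buttons:

    function loadSection(name) {
      $("#content").load("/sections/" + name + ".html");   // illustrative content endpoint
    }

    // Fires when the fragment changes, including via back/forward navigation.
    $(window).on("hashchange", function () {
      var section = location.hash.replace(/^#\/?/, "") || "page1";
      loadSection(section);
    });

    // Render the right state on first load (e.g. a bookmarked mySamePage.html#page2).
    $(window).trigger("hashchange");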

Versioning and routing for single-page-app?

Background
I'm working on an educational JavaScript application/site (SPA) that will eventually have thousands of dynamic URLs that I'd like to make crawlable.
I'm now researching how to implement versioning, routing and SEO (and i18n).
My general idea is to use hashbangs and have resources like:
example.com/#!/v1?page=story1&country=denmark&year=1950
The "page" parameter here decides which controllers/views that need to be loaded and the subsequent parameters could instruct controllers to load corresponding content.
Versioning of parameters could then be handled by just replacing the "v1" part of the url - and have a specific route handler mapping deprecated parameters for each version.
SEO would be improved by having node.js or other backend delivering an "escaped fragment" version of the content.
i18n should probably be handled by node.js as well? This way, what gets delivered to the crawler is already translated?
Is this a viable approach to making a Single-page-application versioned and crawlable?
I'm using Backbone.js now, what would you add to the mix to help out with the above?
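For concreteness, the client-side parsing such a scheme implies might start out like the sketch below (the parameter names come from the example URL above; this only illustrates the question, it is not a recommendation):

    // Parse a hashbang URL like example.com/#!/v1?page=story1&country=denmark&year=1950
    function parseHashbang(hash) {
      var match = /^#!\/(v\d+)\?(.*)$/.exec(hash);
      if (!match) { return null; }

      var params = {};
      match[2].split("&").forEach(function (pair) {
        var kv = pair.split("=");
        params[decodeURIComponent(kv[0])] = decodeURIComponent(kv[1] || "");
      });

      return { version: match[1], params: params };
    }

    // parseHashbang(window.location.hash)
    //   -> { version: "v1", params: { page: "story1", country: "denmark", year: "1950" } }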
1) Hell no. (Well, it could work, but designing your application from the ground up around hashbangs is a bad idea.)
2) node.js and Backbone are a good combination. Personally I like Express for routing/templating on the server.
The argument against hashbangs: there is so much good information on the web that I will defer to it.
here: http://isolani.co.uk/blog/javascript/BreakingTheWebWithHashBangs
here: http://www.pixelflips.com/blog/do-clean-urls-still-matter/
and this wonderful library: https://github.com/browserstate/History.js/
and this wiki page from that library: https://github.com/browserstate/history.js/wiki/Intelligent-State-Handling
Then check out this Chrome extension, which will AJAX-ify Stack Overflow (or any other site with normal URLs) using that library: https://chrome.google.com/webstore/detail/oikegcanmmpmcmbkdopcfdlbiepmcebg
Are 15 parameters absolutely necessary? Put the content parameters (page, country) in the URL and the presentational ones (e.g. sortby=author) in the query string.
in response to "You are still stuck with hash tag serialization" I give this:
every route should point to a valid resource location. ie: /v1/page/denmark/some-slug-for-post should be a resource location and when you change it to a new post/page, that should be a resource location too. What I'm saying is that if you can't use the url to bookmark the page, than the implementation is broken.
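Since Backbone is already in the mix, routes like that map naturally onto a Backbone.Router with pushState enabled. A rough sketch, with invented route and handler names:

    var AppRouter = Backbone.Router.extend({
      routes: {
        "":                    "home",
        "page/:country/:slug": "showPage"   // e.g. /page/denmark/some-slug-for-post
      },
      home: function () {
        // render the landing view ...
      },
      showPage: function (country, slug) {
        // load and render the requested story ...
      }
    });

    new AppRouter();
    // With pushState enabled, these are real paths the server can also render on a direct hit.
    Backbone.history.start({ pushState: true, root: "/" });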
Also, are you planning on breaking all your links with every version? I'm not sure why you're including the version in the url.
I hope this helps.
In answer to number 1, the requirement is that all "pages" have a unique URL and can be found and viewed without JavaScript.
You MUST have a sitemap somewhere (it can be referenced from robots.txt) that lists all your unique URLs, so the crawlers can find them all.
I'm not sure exactly what you mean by SEO in this context. It seems like you are suggesting you will give different content to the crawlers than to the browsers. That's typically not a great idea unless your site is so dynamic there is no other way.

URL Redirects for SEO (in Flash)?

I am creating a Flash site and am trying to make it SEO-friendly. I'm thinking a possible solution would be to render HTML to any search-engine bot, or to anyone who needs accessibility, and render the Flash site for the rest of the users.
First question: is this acceptable to Google, and for SEO in general?
This would mean redirecting Flash users from site.com/home.html to site.com/#/home, but only if they weren't a bot of some sort.
Second question: is it possible to do this in JavaScript or Rails?
I would do this by capturing the URL and checking who the user is (is it Google, or is it a human?); I'm just not sure how to do that with JavaScript/Rails, whichever is needed. Then once I found "hey, this is Google", I would return the HTML page; if it was a user, I'd return Flash.
Would that work? Is there something better?
It'd be worth reading up on Google's policies toward cloaking, sneaky Javascript redirects, and doorway pages.
Personally, I'd build the site in HTML and use the Flash for progressive enhancement where appropriate.
It's not doable in JavaScript, because JavaScript is executed after the page is sent, so the damage is already done.
Your web server would have to recognize the Google user agent when the page request is made and serve a different page accordingly. Then you can avoid the whole redirect nonsense entirely. I know you can configure most web servers to do that; however, I don't know the required steps, and it depends on which web server you are using.
I'm not going to comment on the merits/demerits of flash based websites.
This is a form of SEO called cloaking that's widely considered unscrupulous (though your intended use doesn't sound malicious to me). It can get you banned by Google.
Have you looked into using SWFAddress?
The Flash framework Gaia uses separate XHTML pages for its SEO solution. From its site:
"The Search Engine Optimization Scaffolding engine in Gaia creates an XHTML file for every page you specify in the site.xml, as well as a sitemap.xml file that follows sitemaps.org protocols.
The purpose of SEO Scaffolding is to provide search engines and non-Flash users with easy access to the content on your site, as well a convenient single data source for the copy on your site, organized by page.
This technique is white hat compliant, and is discussed on the Gaia forums."
More information here: http://www.gaiaflashframework.com/wiki/index.php?title=SEO
