Twitter Cards using Backbone's HTML5 History - javascript

I'm working on a web app which uses Backbone's HTML5 History option. In order to avoid having to code everything on the client and on the server, I'm using this method to route every request to index.html
I was wondering if there is a way to get Twitter Cards to work with this setup, as currently the Twitter crawler can't read the page, since everything is loaded in dynamically with JavaScript.
I was thinking about using user agents to detect whether it's the Twitterbot, and if it is, serving a static version of the page with the required meta tags. Would this work?
Thanks.

Yes.
At one job we did this for all the SEO/search/Facebook stuff, etc.
We would sniff the user agent, and if it was one of the following crawlers:
Facebook Open Graph
Google
Bing
Twitter
Yandex
(a few others I can't remember)
we would redirect to a special page that dumped all the relevant data about the requested page, for SEO purposes, into a nicely formatted (but completely unstyled) page.
This allowed us to retain our Google index position and proper Facebook sharing even though our site was a total single-page app in Backbone.
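A minimal sketch of that approach, assuming a Node/Express front server; the bot list and the renderSnapshot() helper are illustrative placeholders, not what that site actually used:

// Sketch: sniff the user agent and serve an unstyled, meta-tag-only snapshot
// to known crawlers; every other request gets the single-page app shell.
const express = require('express');
const path = require('path');
const app = express();

const BOT_PATTERN = /facebookexternalhit|twitterbot|googlebot|bingbot|yandex/i;

function renderSnapshot(requestPath) {
  // Placeholder: a real implementation would look up the title, description
  // and social meta tags for the requested path in your data store.
  return '<!doctype html><html><head>' +
         '<title>Snapshot for ' + requestPath + '</title>' +
         '<meta name="description" content="Relevant description here" />' +
         '</head><body><h1>Snapshot for ' + requestPath + '</h1></body></html>';
}

app.get('*', (req, res) => {
  const userAgent = req.headers['user-agent'] || '';
  if (BOT_PATTERN.test(userAgent)) {
    return res.send(renderSnapshot(req.path));
  }
  res.sendFile(path.join(__dirname, 'index.html'));
});

app.listen(3000);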

Yes, serving a specific page to Twitterbot with the right metadata markup will work.
You can test your results while developing using Twitter's card preview tool:
https://dev.twitter.com/docs/cards/preview (with your static URL or just the tags).
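For reference, the static version served to Twitterbot only needs the card meta tags in its head; a minimal example (all values are placeholders):
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="Example page title" />
<meta name="twitter:description" content="A short description of the page." />
<meta name="twitter:image" content="https://example.com/preview.png" />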

Related

HTML5 History API not working after refreshing the URL

I'm building a very simple SPA 'wannabe' site (for a friend and for exercise). The idea is simple - 3 static pages: home, portfolio, contact. I've made the links from the portfolio and contact change the URL, and I just need to make some functions to change the content. So far so good, but when I refresh the page while I'm on the /contact or /portfolio "page" I get the error "Cannot GET /portfolio". The same happens when I try to copy and paste the link into another browser. The purpose of the site is to be able to send links and open them. Could this be achieved without server-side?
No.
You must have a server side, because this mode requires URL rewriting: you have to rewrite all your requests to the index of your app.
You can disable HTML5 location mode (and use hash instead).
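As a rough sketch of that rewrite, assuming a Node/Express server (any server with equivalent rewrite rules works the same way):

// Sketch: serve static assets, and rewrite every other GET to index.html
// so that /contact and /portfolio can be refreshed or opened directly.
const express = require('express');
const path = require('path');
const app = express();

app.use(express.static(path.join(__dirname, 'public'))); // css, js, images

app.get('*', (req, res) => {
  res.sendFile(path.join(__dirname, 'public', 'index.html'));
});

app.listen(8080);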

Google Analytics: Do I need URL address?

I am a complete Google Analytics beginner and would appreciate a help with a basic question.
I am developing HTML, CSS and JavaScript based applications which are then uploaded into an iOS application that presents them in a fancy way. My application is therefore a hybrid (half JS web site, half mobile app).
I would love to see users' activity in my app while they browse through it, and I thought GA might work well for this - but the problem is that the outer app doesn't provide any URL for my inner JS app (the inner web site's URL is file:///).
At this page (link), I found that the URL is not really important and that it is the tracking code that matters. So I used a dummy URL, added the GA snippet to my application and uploaded it into iPresent. I can't see any live activity though... :/ It also says that measurement is not installed (not used on the home page).
So I am wondering - is the URL really important?
Any ideas?
Thanks!
URL (or page path) is only important if you want to report on data based on which URLs your visitors went to.
If your app doesn't use URLs at all, perhaps it fits better with the "app" model, where you send screen name data instead of page data. You can read more about the differences between web and app views here:
https://support.google.com/analytics/answer/2649553
I found out that the URL is not needed. This type of problem can be solved by using the GA Measurement Protocol:
https://developers.google.com/analytics/devguides/collection/protocol/v1/
Validate your hit here:
https://ga-dev-tools.appspot.com/hit-builder/
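A minimal sketch of sending a screen-view hit over the Measurement Protocol from inside the embedded JS app; the tracking ID, app name and client ID handling are placeholders:

// Sketch: send a Measurement Protocol (v1) screenview hit directly,
// so no real page URL is needed.
function getClientId() {
  // Placeholder: generate this once and persist it (e.g. in localStorage).
  return '35009a79-1a05-49d7-b876-2b884d0f825b';
}

function sendScreenView(screenName) {
  var params = [
    'v=1',                                       // protocol version
    'tid=UA-XXXXXXXX-Y',                         // GA tracking ID (placeholder)
    'cid=' + encodeURIComponent(getClientId()),  // anonymous client ID
    't=screenview',                              // hit type: screen view, not pageview
    'an=' + encodeURIComponent('MyHybridApp'),   // application name (placeholder)
    'cd=' + encodeURIComponent(screenName)       // screen name
  ].join('&');

  // An image beacon works even when the page is loaded from file:///
  new Image().src = 'https://www.google-analytics.com/collect?' + params;
}

sendScreenView('Home');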

Does html5mode(true) affect google search crawlers

I'm reading this specification which is an agreement between web servers and search engine crawlers that allows for dynamically created content to be visible to crawlers.
It's stated there that in order for a crawler to index an HTML5 application, one must implement routing using #! in URLs. With Angular's html5Mode(true) we get rid of this hashed part of the URL. I'm wondering whether this is going to prevent crawlers from indexing my website.
Short answer - No, html5mode will not mess up your indexing, but read on.
Important note: Both Google and Bing can crawl AJAX-based content without HTML snapshots
I know the documentation you link to says otherwise, but a year or two ago they officially announced that they handle AJAX content without the need for HTML snapshots, as long as you use pushstates. A lot of the documentation is old and unfortunately not updated.
SEO using pushstates
The requirement for AJAX crawling to work out of the box is that you are changing your URL using pushstates. This is just what html5mode in Angular does (and also what a lot of other frameworks do). When pushstates is on, the crawlers will wait for AJAX calls to finish and for JavaScript to update the page before they index it. You can even update things like the page title or meta tags in your router and they will index properly. In essence you don't need to do anything; there is no difference between server-side and client-side rendered sites in this case.
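As a sketch of what that looks like in Angular 1.x with ngRoute (route paths and titles are placeholders; html5Mode also needs a <base href> tag and server-side URL rewriting):

// Sketch: enable html5Mode (pushState URLs) and update the document title per route.
angular.module('app', ['ngRoute'])
  .config(function ($locationProvider, $routeProvider) {
    $locationProvider.html5Mode(true); // pushState-based URLs, no # fragment

    $routeProvider
      .when('/', { templateUrl: 'views/home.html', title: 'Home' })
      .when('/about', { templateUrl: 'views/about.html', title: 'About us' });
  })
  .run(function ($rootScope, $window) {
    // Crawlers that execute JS will pick up the updated title when indexing.
    $rootScope.$on('$routeChangeSuccess', function (event, current) {
      if (current && current.title) {
        $window.document.title = current.title;
      }
    });
  });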
To be clear, a lot of SEO-analysis tools (such as Moz) will spit out warnings on pages using pushstates. That's because those tools (and their reps if you talk to them) are at the time of writing not up to date, so ignore them.
Finally, make sure you are not using the fragment meta tag described below when doing this. If you have that tag, the crawlers will think that you want to use the non-pushstates method and things might get messed up.
SEO without pushstates
There is very little reason not to use pushstates with Angular, but if you don't, you need to follow the guidelines linked to in the question. In short, you create snapshots of the HTML on your server and then use the fragment meta tag to change your URL fragment to "#!" instead of "#".
<meta name="fragment" content="!" />
When a crawler finds a page like this, it removes the fragment part of the URL and instead requests the URL with the parameter _escaped_fragment_, and you can serve your snapshotted page in response, giving the crawler a normal static page to index.
Note that the fragment meta-tag should only be used if you want to trigger this behaviour. If you are using pushstates and want the page to index that way, don't use this tag.
Also, when using snapshots in Angular you can have html5mode on. In html5mode the fragment is hidden, but it still technically exists and will still trigger the same behaviour, assuming the fragment meta tag is set.
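A rough sketch of the server side of that snapshot scheme, assuming Express and a made-up snapshot lookup:

// Sketch: serve a pre-rendered snapshot when the crawler sends the
// _escaped_fragment_ query parameter; otherwise serve the normal SPA shell.
const express = require('express');
const path = require('path');
const app = express();

function snapshotFileFor(requestPath) {
  // Placeholder: map e.g. "/recipes/bread" to "snapshots/recipes_bread.html".
  const name = requestPath.replace(/^\//, '').replace(/\//g, '_') || 'index';
  return path.join(__dirname, 'snapshots', name + '.html');
}

app.get('*', (req, res) => {
  if (req.query._escaped_fragment_ !== undefined) {
    return res.sendFile(snapshotFileFor(req.path));
  }
  res.sendFile(path.join(__dirname, 'index.html'));
});

app.listen(3000);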
A warning - Facebook crawler
While both Google and Bing will crawl your AJAX pages without problem (if you are using pushstates), Facebook will not. Facebook does not understand AJAX content and still requires special solutions, like HTML snapshots served specifically to the Facebook bot (user agent facebookexternalhit/1.1).
Edit - I should probably mention that I have deployed sites with all of these variants: with html5mode, the fragment meta tag and snapshots, and without any snapshots, relying purely on pushstate crawling. It all works fine, except for pushstates and Facebook as noted above.
To allow indexing of your AJAX application, you have to add a special meta tag in the head section of your document:
<meta name="fragment" content="!" />
Source:
https://docs.angularjs.org/guide/$location#crawling-your-app
Towards the bottom, look for "Crawling your app".

#! hashtag and exclamation mark in links as folder?

How can I make my pages show like Grooveshark's pages:
http://grooveshark.com/#!/popular
Is there a tutorial or something that explains how to show pages this way with jQuery or JavaScript?
The hash and exclamation mark in a URL are called a hashbang, and are usually used in web applications where JavaScript is responsible for actually loading the page. Content after the hash is never sent to the server. So, for example, if you have the URL example.com/#!recipes/bread, the page at example.com would be fetched from the server, and that page could contain a piece of JavaScript. This script can then read location.hash and load the page at /recipes/bread.
Google also recognizes this URL scheme as an AJAX URL and will try to fetch the content from the server as it would be rendered by your JavaScript. If you're planning to make a site using this technique, take a look at Google's AJAX crawling documentation for webmasters. Also keep in mind that you should not rely on JavaScript being enabled, as Gawker learned the hard way.
The hashbang is going out of use on a lot of sites, even if JavaScript does the routing. This is possible because all major browsers support the History API. To do this, they make every path on the site return the same JavaScript, which then looks at the actual URL to load the content. When the user clicks a link, JavaScript intercepts the click event, uses the History API to push a new entry onto the browser history, and then loads the new content.
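A minimal sketch of that pattern in plain JavaScript for modern browsers; loadContent() and the fragment URL convention are placeholders:

// Sketch: intercept in-site link clicks, push a History entry,
// and load the new content with JavaScript instead of a full page load.
document.addEventListener('click', function (event) {
  var link = event.target.closest('a');
  if (!link || link.origin !== window.location.origin) return; // let external links through

  event.preventDefault();
  history.pushState({}, '', link.pathname); // change the URL without reloading
  loadContent(link.pathname);
});

// Handle back/forward navigation.
window.addEventListener('popstate', function () {
  loadContent(window.location.pathname);
});

function loadContent(path) {
  // Placeholder: fetch an HTML fragment for the path and inject it.
  fetch(path + '.fragment.html')
    .then(function (res) { return res.text(); })
    .then(function (html) { document.getElementById('content').innerHTML = html; });
}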

SEO and AJAX (Twitter-style)

Okay, so I'm trying to figure something out. I am in the planning stages of a site, and I want to implement "fetch data on scroll" via jQuery, much like Facebook and Twitter, so that I don't pull all the data from the DB at once.
But I have some problems regarding SEO: how will Google be able to see all the data? Because the page will fetch more data automatically when the user scrolls, I can't include any links in the style of "go to page 2"; I want Google to just index that one page.
Any ideas for a simple and clever solution?
Put links to page 2 in place.
Use JavaScript to remove them if you detect that your autoloading code is going to work.
Progressive enhancement is simply good practice.
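A minimal sketch of that approach with jQuery; the selectors and the partial format returned by the server are placeholders:

// Sketch: real "page 2" links exist in the HTML for crawlers and no-JS visitors;
// when JavaScript runs, remove them and autoload the next page on scroll instead.
$(function () {
  var nextUrl = $('.pagination a.next').attr('href');
  $('.pagination').remove(); // JS is running, so endless scroll takes over

  var loading = false;
  $(window).on('scroll', function () {
    var nearBottom = $(window).scrollTop() + $(window).height() > $(document).height() - 200;
    if (!nearBottom || loading || !nextUrl) return;

    loading = true;
    $.get(nextUrl, function (html) {
      var $page = $('<div>').html(html);         // assume the server returns a partial
      $('#items').append($page.find('.item'));   // append the newly loaded items
      nextUrl = $page.find('.pagination a.next').attr('href'); // next page, if any
      loading = false;
    });
  });
});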
You could use PHP (or another server-side script) to detect the user agent of the webcrawlers you specifically want to target, such as Googlebot.
In the case of a webcrawler, you would have to use non-JavaScript-based techniques to pull down the database content and lay out the page. I would recommend not paginating the search-engine-targeted content - assuming that you are not paginating the "human" version. The URLs discovered by the webcrawler should be the same as those your (human) visitors will visit. In my opinion, the page should only deviate from the "human" version by having more content pulled from the DB in one go.
A list of webcrawlers and their user agents (including Google's) is here:
http://www.useragentstring.com/pages/Crawlerlist/
And yes, as stated by others, don't rely on JavaScript for content you want to be seen by search engines. In fact, it is quite frequently used where a developer doesn't want something to appear in search engines.
All of this comes with the rider that it assumes you are not paginating at all. If you are, then you should use a server-side script to paginate your pages so that they are picked up by search engines. Also, remember to put sensible limits on the amount of your DB that you pull for the search engine. You don't want it to time out before it gets the page.
Create a Google Webmaster Tools account, generate a sitemap for your site (manually, automatically or with a cron job - whatever suits) and tell Google Webmaster Tools about it. Update the sitemap as your site gets new content. Google will crawl this and index your site.
The sitemap will ensure that all your content is discoverable, not just the stuff that happens to be on the homepage when the googlebot visits.
Given that your question is primarily about SEO, I'd urge you to read this post from Jeff Atwood about the importance of sitemaps for Stackoverflow and the effect it had on traffic from Google.
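A minimal sketch of generating such a sitemap with a script that could run from a cron job; the URLs and the item source are placeholders:

// Sketch: write a sitemap.xml from a list of item URLs.
const fs = require('fs');

function getAllItemUrls() {
  // Placeholder: in a real script, query your database here.
  return ['https://example.com/', 'https://example.com/items/1', 'https://example.com/items/2'];
}

const entries = getAllItemUrls()
  .map(url => '  <url><loc>' + url + '</loc></url>')
  .join('\n');

const sitemap =
  '<?xml version="1.0" encoding="UTF-8"?>\n' +
  '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n' +
  entries + '\n' +
  '</urlset>\n';

fs.writeFileSync('sitemap.xml', sitemap);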
You should also add paginated links that get hidden by your stylesheet and act as a fallback for when your endless scroll is disabled by someone not using JavaScript. If you're building the site right, these will just be partials that your endless scroll loads anyway, so it's a no-brainer to make sure they're on the page.
