Loading external content so that it is searchable by Google for SEO purposes

I'm working on a project where we'd like to load external content onto a customer's site. The main requirements are that the customer should have as simple an include as possible (a one-line include similar to DoubleClick) and should preferably not have to touch any server-side language. The two proposed ways of doing this were an iframe or loading a JavaScript file that writes out the content with document.write.
We looked more at the latter since it seemed to produce more reliable legibility and simplicity for the end user: a single line of JavaScript. We have been hit with the reality that this will be indexed unpredictably by Google. I have read most of the posts on this topic regarding JavaScript and indexing (for example http://www.seroundtable.com/google-ajax-execute-15169.html, https://twitter.com/mattcutts/status/131425949597179904). Currently we have (for example):
<html>
<body>
<div class='main-container'>
<script src='http://www.other.com/page.js'></script>
</div>
</body>
</html>
and
// at http://www.other.com/page.js
document.write('blue fish and green grass');
but it looks like Google indexes this type of content only sometimes, based on 'Fetch as Google' in Google's Webmaster Tools. Since it does sometimes work, I know it's possible for this content to be indexed. More specifically, if we isolate our content to something like the above and remove extraneous content, it gets indexed each time (as opposed to the EXACT SAME JavaScript in a regular customer HTML page). If we have our content in a customer's HTML file, it doesn't seem to get indexed.
What would be a better option to ensure that Google indexes the content (remote isn't any better)? One idea I have tried / come across is to load the remote file server-side, for example in PHP, with something like:
echo file_get_contents('http://www.other.com/page');
This is obviously blocking but possibly not a deal-breaker.
Given the above requirements, would there be any other solution?
thx

This is a common problem, and I've created a JS plugin that you can use to solve it.
URL: https://github.com/kubrickology/Logical-escaped_fragment
Make sure to use the __init() function instead of the standard DOM ready functions, and you can be sure that Google is able to index the content.

Related

Get data from another HTML page

I am making an on-line shop for selling magazines, and I need to show the image of the magazine. For that, I would like to show the same image that is shown in the website of the company that distributes the magazines.
For that, it would be easy with an absolute path, like this:
<img src="http://www.remotewebsite.com/image.jpg" />
But it is not possible in my case, because the name of the image changes every time there is a new magazine.
In Javascript, it is possible to get the path of an image with this code:
var strImage = document.getElementById('Image').src;
But, is it possible to use something similar to get the path of an image if it is in another HTML page?

Assuming that you know how to find the correct image in the magazine website's DOM (otherwise, forget it):
the magazine website must explicitly allow clients showing your website to fetch their content by enabling CORS
you fetch their HTML -> gets you a stream of text
parse it with DOMParser -> gets you a Document
using your knowledge of their layout (or good heuristics, if you're feeling lucky), use regular DOM navigation to find the image and get its src attribute
I'm not going to detail any of those steps (there are already lots of SO answers around), especially since you haven't described a specific issue you may have with the technical part.
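For readers who want a starting point, the steps above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: the function names and the selector are made up for illustration, it assumes a browser that provides fetch and DOMParser, and it only works if the magazine site really does send CORS headers for your origin. Because a document produced by DOMParser does not have the remote page as its base URL, the raw src attribute is resolved against the page URL by hand.

```javascript
// Resolve a possibly relative src against the URL of the page it was found on.
function resolveImageUrl(src, pageUrl) {
  return new URL(src, pageUrl).href;
}

// Browser-only: relies on fetch and DOMParser, and fails unless the remote
// site sends CORS headers that allow your origin.
// `selector` is an assumption about the remote page's layout.
async function fetchRemoteImageSrc(pageUrl, selector) {
  const response = await fetch(pageUrl);
  if (!response.ok) {
    throw new Error('Request failed with status ' + response.status);
  }
  const html = await response.text();                               // stream of text
  const doc = new DOMParser().parseFromString(html, 'text/html');   // Document
  const img = doc.querySelector(selector);                          // DOM navigation
  const src = img ? img.getAttribute('src') : null;
  return src ? resolveImageUrl(src, pageUrl) : null;
}
```

Usage might look like fetchRemoteImageSrc('http://www.remotewebsite.com/', '#Image').then(function (url) { /* set your own img src */ }), where '#Image' assumes the remote page keeps the id used in the question.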

You can, but it is inefficient. You would have to do a request to load all the HTML of that other page and then in that HTML find the image you are looking for.
It can be achieved (using XMLHttpRequest or fetch), but I would maybe try to find a more efficient way.

What you are asking for is technically possible, and other answers have already gone into the details about how you could accomplish this.
What I'd like to go over in this answer is how you probably should architect this given the requirements that you described. Keep in mind that what I am describing is one way to do this; there are certainly other correct methods as well.
Create a database on the server where your app will live. A simple MySQL DB will work, but you could use anything. Create a table called magazine, with a column url. Your code would pull the url from this DB. Whenever the magazine URL changes, just update the DB and the code itself won't need to be changed.
Your front-end code needs some sort of way to access the DB. One possible solution is a REST API. This code would query the DB for the latest values (in your case magazine URLs) and make them accessible to your web page. This could be done in a myriad of different languages/frameworks; here's a good tutorial on doing something like this in Node.js and express (which is what I'd personally use).
Finally, your front-end code needs to call your REST API to get the updated URLs. This needs to be done with some kind of JavaScript based language. jQuery would make this really easy, something like this:
$(document).ready(function() {
  // The REST API is expected to return JSON with a url property
  $.get("http://uri_to_your_rest_api", function(data) {
    $("#myImage").attr("src", data.url);
  });
});
Assuming you had HTML like this:
<img id="myImage" src="">
And there you go - You have a webpage that pulls the image sources dynamically from your database.
Now if you're just dipping your toes into web development, this may seem a bit overwhelming. But I promise you, in the long run it'll be easier than trying to parse code from an HTML page :)
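To make the REST API step a little more concrete, here is a minimal sketch of what the server side might look like. It uses only Node's built-in http module rather than express, an in-memory array stands in for the magazine table, and the /magazine/latest path and all function names are invented for illustration; a real app would query its database instead.

```javascript
// Hypothetical stand-in for the `magazine` table with its `url` column.
const magazineTable = [{ url: 'http://www.remotewebsite.com/image.jpg' }];

// Return the most recently added row.
function latestMagazine() {
  return magazineTable[magazineTable.length - 1];
}

// Answer GET /magazine/latest with JSON such as {"url": "..."}.
function handleRequest(req, res) {
  if (req.method === 'GET' && req.url === '/magazine/latest') {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify(latestMagazine()));
  } else {
    res.writeHead(404, { 'Content-Type': 'application/json' });
    res.end(JSON.stringify({ error: 'not found' }));
  }
}

// Start the server; the port is whatever suits your setup.
function startServer(port) {
  const http = require('http');
  return http.createServer(handleRequest).listen(port);
}
```

Calling startServer(3000) would expose the endpoint locally. If the web page is served from a different origin than this API, the API would also need to send CORS headers.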

Is there any way to implement DKI in Squarespace landing pages, or must I create 800 pages with different titles manually?

Well, as in the title: I want to implement dynamic keyword insertion (DKI) in my Squarespace landing pages. I know how bad this engine is, but it's for a company. Last time I created 100 landing pages with different titles, descriptions, etc.; now I am slightly worried that 800 is a bit too much for manual work. All I can do in Squarespace is add JavaScript within the body, so maybe you know how to make use of that.
Is it possible to use dynamic keyword insertion in Squarespace landing pages?
PS: Do yourself a favour, don't use Squarespace, never. Beaver builder + Wordpress > Squarespace

This might be possible using developer mode and a small misuse of the Squarespace tag and/or category URL queries. For example, you could:
Create ads using DKI
Create a custom layout (.region or .block) using JSON-T. Somewhere in your page, you'd include {categoryFilter} or {tagFilter}. Similar to Mustache.js, wherever you insert that reference in your template, the value of the category or tag query parameter will be inserted. This could be used to set the title tag on the page or meta description for example.
Append ?category=My Keyword Text (or ?tag=...) to the destination URL of the ad.
Notes:
The head of your template will likely reference the {squarespace-headers} tag. This is necessary for your Squarespace site to run properly (at least, unless you invest hours into breaking it down into its various components, which has been done, but then requires ongoing maintenance). This will contain its own title and meta description tags as well as other meta that may be working against you. You may have to experiment with forcing your own title and meta description by adding your own code above and/or below {squarespace-headers}. Your page will have multiple such tags; it's been said Google will ignore subsequent ones.
All Squarespace websites have an identical robots.txt, and it is not editable. URLs containing tag= are disallowed. On the other hand, category= was recently removed from this disallow-list. That may influence which you use, if you attempt this route at all.
A Squarespace site can have up to 1000 pages, but they don't recommend adding more than 400. That may influence your approach as well.
As mentioned, this is a misuse of the tag and category filter URL queries, but these are the only query parameters that allow you to insert essentially whatever values you want and have those values accessible from within the templating engine. The category and tag URL queries are intended for collections, not pages, but the value works across all Squarespace page types that I've tested. Depending on your application, it could be that creating a collection of items and having a legitimate category/tag filter may be a relevant approach.
You could attempt to do this all with Javascript-based templating, but I am assuming you're looking to render the page, with keywords, server-side.
Squarespace's templating engine, JSON-T, doesn't have logic such as if value contains X.... This will limit the degree of flexibility you have if trying to render different content based on the keyword. You can check for equality via {.equal?...}, though I doubt that's practical for your application, given the number of possibilities. Of course, you can insert the keyword itself as mentioned previously.
Although I have a lot of experience with Squarespace and developer mode, I've not attempted this specific scenario myself, so this is more theoretical than from experience.
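If you did go the JavaScript-only route mentioned in the notes (the question says JavaScript in the body is all that's available), a minimal sketch might look like the following. The function names, the dki-keyword class and the fallback text are made up for illustration, and since this runs client-side, whether a crawler ever sees the inserted keyword is not guaranteed.

```javascript
// Read the keyword from a query string such as "?category=My+Keyword+Text".
// `paramName` and `fallback` are illustrative parameters.
function keywordFromQuery(search, paramName, fallback) {
  const value = new URLSearchParams(search).get(paramName);
  return value && value.trim() ? value.trim() : fallback;
}

// Browser-only: put the keyword into the title and into every element
// marked with the hypothetical class "dki-keyword".
function applyKeyword(doc, keyword) {
  doc.title = keyword;
  doc.querySelectorAll('.dki-keyword').forEach(function (el) {
    el.textContent = keyword; // textContent avoids injecting markup from the URL
  });
}

// Only run in a browser, where `document` exists.
if (typeof document !== 'undefined') {
  applyKeyword(
    document,
    keywordFromQuery(window.location.search, 'category', 'Default Keyword')
  );
}
```

Using textContent rather than innerHTML matters here, because the keyword comes straight from the URL and could otherwise be used to inject markup.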

Does google robot index text from javascript document.write()?

Let's say I have this:
<div id="cls"> </div>
<script type="text/javascript">
var p = document.getElementById('cls');
p.firstChild.nodeValue = 'Some interesting information';
</script>
So, will Google's robots index the text "Some interesting information" or not?
Thanks!

AFAIK, Googlebot now indexes AJAX and JavaScript content. For reference, see:
http://www.submitshop.com/2011/11/03/google-bot-now-indexing-ajax-javascript
Get google to index links from javascript generated content

Update
Search Engine Watch recently mentioned that Googlebot has been improved to read JavaScript. To quote exactly:
it can now read and understand certain dynamic comments implemented
through AJAX and JavaScript. This includes Facebook comments left
through services like the Facebook social plugin.

We've had a need to hide pieces of information on pages from Googlebot. As the information wasn't extremely sensitive, we used document.write() calls to keep search bots from indexing the content in question.
Later, in Q3 2011, I found that Googlebot did index the scripted content, so I'm now pretty sure that Google is doing much more than just fetching URLs from the content, even though this isn't documented in any depth.

Google doesn't index the JavaScript code or the generated content. You will only see it in the cache because the cached page consists of the complete file including the JavaScript code and your browser renders it. Google does scan JavaScript for URLs to crawl, so if the code is pulling content from an external file via Ajax, etc., there's a chance that the external file will also be indexed, but separate from the parent page. If you want the content to be indexed, it's got to be in plain HTML. Good luck!

mediawiki 1.16.5: Load javascript for a specific namespace

I am developing an extension for MediaWiki which is based on another extension (developed in-house) that will not work with a MediaWiki installation newer than 1.16.5. I need to include JavaScript in pages belonging to a specific namespace, and I cannot use the ResourceLoader http://www.mediawiki.org/wiki/ResourceLoader .
Does someone know if there's a simple way to do this? I need to include jQuery and DataTables for a custom rendering of the pages belonging to the namespace.

There are at least three ways to go about this.
The 1st approach is to edit the magic page MediaWiki:Common.js and add something like this:
if(wgNamespaceNumber == 0) { // NS_MAIN
importScript('MediaWiki:MyScript.js');
}
You can place arbitrary JavaScript in the block; the importScript call there executes JavaScript stored in a wiki page, but there are other ways to include JS on the fly as well (see e.g. this question). See Manual:Interface/JavaScript for details of the MediaWiki side of things.
The 2nd approach would be to hack the PHP that produces the MediaWiki page to inject <script> tags depending on the current namespace, but that's a bit more involved: you'd need to build a custom extension and hook it in at some appropriate point. The ParserAfterTidy hook looks suitable, see Hooks.
The 3rd approach would be to simply edit the skin and load the JS for every page in the wiki -- is there a reason you don't want to do this for every page? They're cached anyway, so it's only a one-time hit.
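As an illustration of the "other ways to include JS on the fly" mentioned under the 1st approach, here is a minimal sketch that appends a script tag only when the page is in a given namespace. The helper name, the namespace number 100 and the file path in the usage note are assumptions for illustration, not anything MediaWiki provides.

```javascript
// Append a <script> tag to the page head, but only when the current
// namespace matches the target one. Returns the new element, or null
// when nothing was loaded.
function loadScriptForNamespace(doc, currentNamespace, targetNamespace, url) {
  if (currentNamespace !== targetNamespace) {
    return null;
  }
  var script = doc.createElement('script');
  script.type = 'text/javascript';
  script.src = url;
  doc.getElementsByTagName('head')[0].appendChild(script);
  return script;
}
```

In MediaWiki:Common.js this might be called as loadScriptForNamespace(document, wgNamespaceNumber, 100, '/path/to/jquery.dataTables.js'). Scripts added this way load asynchronously, so anything that depends on jQuery would have to wait until jQuery itself has finished loading.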

What benefits are there to storing Javascript in external files vs in the <head>?

I have an Ajax-enabled CRUD application. If I display a record from my database it shows that record's values for each column, including its primary key.
For the Ajax actions tied to buttons on the page I am able to set up their calls by printing the ID directly into their onclick functions when rendering the HTML server-side. For example, to save changes to the record I may have a button as follows, with '123' being the primary key of the record.
<button type="button" onclick="saveRecord('123')">Save</button>
Sometimes I have pages with Javascript generating HTML and Javascript. In some of these cases the primary key is not naturally available at that place in the code. In these cases I took a shortcut and generate buttons like so, taking the primary key from a place it happens to be displayed on screen for visual consumption:
...
<td>Primary Key: </td>
<td><span id="PRIM_KEY">123</span></td>
...
<button type="button" onclick="saveRecord(jQuery('#PRIM_KEY').text())">DoSomething</button>
This definitely works, but it seems wrong to drive database queries based on the value of text whose purpose was user consumption rather than method consumption. I could solve this by adding a series of additional parameters to various methods to usher the primary key along until it is eventually needed, but that also seems clunky.
The most natural way for me to solve this problem would be to simply situate all the Javascript which currently lives in external files, in the <head> of the page. In that way I could generate custom Javascript methods without having to pass around as many parameters.
Other than readability, I'm struggling to see what benefit there is to storing Javascript externally. It seems like it makes the already weak marriage between HTML/DOM and Javascript all the more distant.
I've seen some people suggest that I leave the Javascript external, but do set various "custom" variables on the page itself, for example, in PHP:
<script type="text/javascript">
var primaryKey = <?php print $primaryKey; ?>;
</script>
<script type="text/javascript" src="my-external-js-file-depending-on-primaryKey-being-set.js"></script>
How is this any better than just putting all the JavaScript on the page in the first place? The HTML and JavaScript are still strongly dependent on each other.

One point: an external file can be cached by the browser, whereas a JS block in the head is loaded every time the page loads.

Performance (due to browser caching)
Separation of concerns - HTML/CSS/JavaScript should be separate. It makes working with them easier. You know exactly where to locate certain areas, plus other developers can work on the likes of HTML, CSS and JavaScript independently.
Reuse - you can include a source file in multiple locations/projects without duplicating code.

You can YUICompress your javascript (at build/integration time) if it's in separate files. I smash all my Javascript together (lots of separate little jQuery plugins etc) at build time so that there's just one file to fetch/cache.
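A minimal sketch of the "smash everything together" step, assuming a Node.js build script. It only concatenates and does not minify the way YUI Compressor would, and the function names are made up for illustration.

```javascript
// Join script sources into one bundle. The "\n;\n" separator guards against
// files that end without a semicolon or with a trailing line comment.
function concatScripts(sources) {
  return sources
    .map(function (src) { return src.trim(); })
    .join('\n;\n') + '\n';
}

// Build step (Node.js): read the input files and write a single bundle.
function buildBundle(inputPaths, outputPath) {
  const fs = require('fs');
  const sources = inputPaths.map(function (path) {
    return fs.readFileSync(path, 'utf8');
  });
  fs.writeFileSync(outputPath, concatScripts(sources));
}
```

The resulting single file is what the page would reference, so the browser has just one script to fetch and cache.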

It depends on how much JavaScript you are dynamically generating on the server side versus how much of it is static. If all of it is dynamically generated, then it doesn't matter where you put it, as every request will pull a new file without any caching. Putting it in the head has the advantage of one fewer HTTP request, which is hardly any benefit unless your primary concern is performance and bandwidth is a non-issue.
But if most of the Javascript is static, keeping it in separate files at development time keeps things organized.
Dynamically generated Javascript can be served as separate files instead of being part of the page itself. It will add an extra HTTP call.
<script src="myServerSideScript.php" type="text/javascript"></script>
