Cannot read "Google" from object in JavaScript - javascript

I've written a script that detects the referring URL from a couple of search engines and then passes this value (not the mc_u20 variable) to a server to be used somewhere. The script works like a treat except for one big problem: it simply won't track Google search results. Any visit that comes from Google simply doesn't register. Here is the script:
var mc_searchProviders = {"search_google":"google.co","search_bing":"bing.com","search_msn":"search.msn","search_yahoo":"search.yahoo","search_mywebsearch":"mywebsearch.com","search_aol":"search.aol.co", "search_baidu":"baidu.co","search_yandex":"yandex.com"};
var mc_socialNetworks = {"social_facebook":"facebook.co","social_twitter":"twitter.co","social_google":"plus.google."};
var mc_pageURL = window.location +'';
var mc_refURL = document.referrer +'';
function mc_excludeList() {
if (mc_refURL.search('some URL') != -1) {
return false; //exclude some URL
}
if (mc_refURL.search('mail.google.') != -1) {
return false; //exclude Gmail
}
if (mc_refURL.search(mc_paidSearch) != -1) {
return false; //exclude paidsearch
}
else {
mc_checkURL();
}
}
mc_excludeList();
function mc_checkURL() {
var mc_urlLists = [mc_searchProviders, mc_socialNetworks],
i,mc_u20;
for (i = 0; i < mc_urlLists.length; i++) {
for (mc_u20 in mc_urlLists[i]) {
if(!mc_urlLists[i].hasOwnProperty(mc_u20))
continue;
if (mc_refURL.search(mc_urlLists[i][mc_u20]) != -1) {
mc_trackerReport(mc_u20);
return false;
}
else if ((mc_refURL == '') && (mc_directTracking === true)){
mc_u20 = "direct_traffic";
mc_trackerReport(mc_u20);
return false;
}
}
}
}
The most annoying thing is that I have tested this on my local machine (by populating mc_refURL with an actual Google search URL) and it works like a charm. I also thought that the loop over the first mc_searchProviders object might somehow be skipping the first entry, so I added a blank one, but this still doesn't work. What's even more annoying is that for every other search engine, the mc_u20 variable seems to populate with what I need.
This is driving me insane. Can anyone see what's wrong here? I might also mention that I'm signed into Google but I don't see how this would affect the script as their blogpost (in November) said they were filtering keywords not stopping the referring URL from being passed.

Right, so I figured out what was going on. The first part of the script excludes your own URL (see where it says 'some URL'). Say this was set to www.example.com. If I searched Google for "example" and Google returned www.example.com as the first search result, the referring URL would contain www.example.com, so the exclusion check matched and the Google visit was dropped. That is why the script was breaking. Maybe someone will find this useful in future.
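One way to avoid the problem described above is to compare only the referrer's hostname against your own host, rather than searching the whole referrer string. The following is a minimal sketch, not the original poster's fix: the helper name isOwnSite and the sample URLs are hypothetical, and it assumes the URL constructor is available (modern browsers and Node.js).

```javascript
// Hypothetical helper: returns true only when the referrer's *hostname*
// is our own site, ignoring anything in the path or query string.
function isOwnSite(refURL, ownHost) {
  if (!refURL) {
    return false; // empty referrer: direct traffic, not a self-referral
  }
  try {
    return new URL(refURL).hostname === ownHost;
  } catch (e) {
    return false; // referrer was not a parseable absolute URL
  }
}

// Made-up referrers for illustration only.
var googleRef = 'https://www.google.com/url?q=http://www.example.com/&sa=U';
var ownRef = 'http://www.example.com/about';

// The substring test matches the Google referrer too, which is the bug.
console.log(googleRef.search('www.example.com') != -1); // true
// The hostname test only matches genuine self-referrals.
console.log(isOwnSite(googleRef, 'www.example.com'));   // false
console.log(isOwnSite(ownRef, 'www.example.com'));      // true
```

With a check like this, the exclusion at the top of mc_excludeList would no longer fire for search engine URLs that merely mention your domain in their query string.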

Related

Changing css dependent on domain and adding "?" into domain

First post here, so I'll get on with it. I'm having trouble working out how to change a site's CSS depending on the URL. For example, I want "example.com" to display a coming-soon page, but "example.com?bypass=keyhere" to display the actual site that is being built. I mention CSS because I was thinking of changing the visibility of elements, but I don't even know how to get started with detecting which URL was requested.
You can query params using
function urlParam(name){
var results = new RegExp('[\?&]' + name + '=([^&#]*)').exec(window.location.href);
if (results == null){
return null;
}
else {
return decodeURI(results[1]) || 0;
}
}
if (urlParam('bypass') === 'keyhere') {
// if you want to do something
} else {
document.body.classList.add('hideContent')
}

Attempting window.location or window.location.href redirect which is causing a loop

I'm attempting to use javascript to determine if the user is using a certain language and if they're not using english then for the page to load a different page BUT with the params of which I've grabbed from the url.
I have been able to load the page with the params but I keep falling into a loop reloading the page, even after skimming through the countless other examples, such as: this or this.
function locateUserLanguage() {
var languageValue = (navigator.languages ? navigator.languages[0] : (navigator.language || navigator.userLanguage)).split('-');
var url = window.location.href.split('?');
var baseUrl = url[0];
var urlParams = url[1];
if (languageValue[0] === 'en') {
console.log('no redirect needed, stay here.');
} else {
// I tried to set location into a variable but also wasn't working.
// var newURL = window.location.href.replace(window.location.href, 'https://www.mysite.dog/?' + urlParams);
window.location.href = 'https://www.mysite.dog/?' + urlParams
}
} locateUserLanguage();
I've attempted to place a return true; as well as return false; but neither stop the loop.
I've tried window.location.replace(); and setting the window.location.href straight to what I need, but it's continuing to loop.
The script containing this function may be executed on both of your pages (English and non-English) on load. If so, locateUserLanguage runs as soon as either page loads, so the non-English page keeps redirecting to itself, causing the infinite loop.
You need to add a check before you call locateUserLanguage.
Suppose the English website has the URL "www.myside.com" and the non-English website has the URL "www.myside.aus". The condition then needs to be
if (window.location.host === "www.myside.com") { locateUserLanguage() }
This makes sure that locateUserLanguage is called only on the English website.
Another approach is to load this script only on the English website, which avoids the need for the conditional.
Hope it helps. Ask if anything is unclear.

Searching all pages of my website [duplicate]

I want to have a search engine which searches only my own site. I have some JavaScript currently, but it only searches words on that specific page. I need it to search the links within my site if possible.
I cannot use the Google search engine as my site is on an internal intranet.
<SCRIPT language=JavaScript>
var NS4 = (document.layers);
var IE4 = (document.all);
var win = window;
var n = 0;
function findInPage(str) {
var txt, i, found;
if (str == "")
return false;
if (NS4) {
if (!win.find(str))
while(win.find(str, false, true))
n++;
else
n++;
if (n == 0)
alert("Not found.");
}
if (IE4) {
txt = win.document.body.createTextRange();
for (i = 0; i <= n && (found = txt.findText(str)) != false; i++) {
txt.moveStart("character", 1);
txt.moveEnd("textedit");
}
if (found) {
txt.moveStart("character", -1);
txt.findText(str);
txt.select();
txt.scrollIntoView();
n++;
}
else {
if (n > 0) {
n = 0;
findInPage(str);
}
else
alert("Sorry, we couldn't find.Try again");
}
}
return false;
}
</SCRIPT>
(onsubmit="return findInPage(this.string.value);" is in the button tag.)
It works great for searching that page, but I was hoping there was a way to search all pages on my site.
A few suggestions:
Unless you must, don't reinvent the wheel: there are open source libraries such as Tipue Search and others.
You can use jQuery's Ajax helpers (for example .load() or $.get()) to dynamically load page content and search it, while still staying on the same page as far as your DOM and script go.
Node.js is also a good option, but will probably be overkill.
Hope this helps!
You could use search-index. It can run on the server and in the browser. An example on how to run it in the browser and the actual demo of it. You would have to write a crawler/spider that goes through your site. Lunr.js would also work well, I think.
If you had your site as JSON, the indexing would be a small task to fix, or you could have a crawler running in the browser.
Disclaimer: I'm doing some work on search-index.
As you are on an intranet and presumably all your pages are on the same server, I would think it would be possible to make an XMLHttpRequest to each of your pages in turn, store the page in a variable and then do a search on the stored page.
Possibly someone with more experience of XMLHttpRequest would say how efficient or effective this would be.
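The XMLHttpRequest idea above could be sketched roughly like this. It is only an illustration under stated assumptions: the page list and all function names (sitePages, fetchPage, htmlToText, searchPages, searchSite) are hypothetical, the pages are assumed to be on the same origin, and the tag stripping is deliberately crude. The search step is kept separate from the fetching step so it can be tried without a network.

```javascript
// Hypothetical list of same-origin pages to search; replace with your own.
var sitePages = ['/index.html', '/about.html', '/contact.html'];

// Fetch one page's HTML with XMLHttpRequest (browser only).
function fetchPage(url, callback) {
  var xhr = new XMLHttpRequest();
  xhr.open('GET', url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      // Treat any failed request as an empty page.
      callback(xhr.status === 200 ? xhr.responseText : '');
    }
  };
  xhr.send();
}

// Crude tag stripping so that only visible text is searched.
function htmlToText(html) {
  return html
    .replace(/<(script|style)[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ');
}

// Pure search step: pages is an object mapping URL -> HTML source.
// Returns the URLs whose text contains the term (case-insensitive).
function searchPages(pages, term) {
  var needle = term.toLowerCase();
  return Object.keys(pages).filter(function (url) {
    return htmlToText(pages[url]).toLowerCase().indexOf(needle) !== -1;
  });
}

// Fetch every page, then report which ones contain the term.
function searchSite(urls, term, done) {
  var pages = {};
  var remaining = urls.length;
  urls.forEach(function (url) {
    fetchPage(url, function (html) {
      pages[url] = html;
      remaining -= 1;
      if (remaining === 0) {
        done(searchPages(pages, term));
      }
    });
  });
}
```

In a browser you might call searchSite(sitePages, 'holiday', function (hits) { console.log(hits); }); to list the matching pages. Fetching every page on each search will be slow for a large site, so caching the fetched text or building an index up front is probably worth considering.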

differentiate between different urls

I am trying to differentiate between different urls. I have an if/else in place but hopefully this can be done better in vanilla js. No express js please.
/product
/product/1
/product/1/customer
/product
/product/2
/product/2/customer
/customer
/customer/1
/customer/1/product
/customer
/customer/2
/customer/2/product
Current strategy:
if(request.url.indexOf('/product') != -1 && request.url.length == '/product'.length) {
} else if { // /product/:id
if(!request.params) request.params = {};
request.params.id = request.url.match(/^\/product\/([^\\/]+?)(?:\/(?=$))?$/i)[1];
} else { // 3rd case /product/1/customer
}
I think my if/else is not resolving all of the URIs mentioned above. Please suggest a solution that lets me resolve all 3 cases above in a reusable way for different URLs, and run the appropriate queries from there.
Here is a very basic example that will hopefully put you on the right track. You firstly need to store all the urls you want, and their associated functions that you want to run. Then when a new url comes in, you need to compare it to your stored urls to see if there is a match. As urls can have certain varying values, you need a way to deal with wildcards.
So firstly create a function that adds a particular url scheme, and associated run function to an array.
var myUrls = [];
function addUrl(url, associatedFunc){
myUrls.push({
func: associatedFunc,
// parts starting with ':' are wildcards and are stored as null
parts: url.split('/').map(function(item){
return item.startsWith(':') ? null : item;
})
});
}
To add a new url you would put : in front of any part of the url that can be wild.
addUrl('/product/:value/customer', doCustomerStuffFunc);
Next you need a way of comparing incoming url requests with your url array.
function resolveUrl(url){
var pieces = url.split('/');
// a plain loop (rather than forEach) lets us stop at the first match
for(var j=0; j<myUrls.length; j++){
var item = myUrls[j];
if(pieces.length !== item.parts.length){ continue; }
for(var i=0; i<pieces.length; i++){
// a null part is a wildcard; anything else must match exactly
if(item.parts[i] !== null && pieces[i] !== item.parts[i]){
break;
}
// every piece matched: run the function and return its result
if(i === pieces.length - 1){ return item.func(pieces); }
}
}
}
This will run the function associated with the first matching URL, passing in an array containing all the values in the given URL.
Note: This is untested code, written off the top of my head, intended to get you started, and not be a full solution.

Navigating / scraping hashbang links with javascript (phantomjs)

I'm trying to download the HTML of a website that is almost entirely generated by JavaScript. So, I need to simulate browser access and have been playing around with PhantomJS. Problem is, the site uses hashbang URLs and I can't seem to get PhantomJS to process the hashbang -- it just keeps calling up the homepage.
The site is http://www.regulations.gov. The default takes you to #!home. I've tried using the following code (from here) to try and process different hashbangs.
if (phantom.state.length === 0) {
if (phantom.args.length === 0) {
console.log('Usage: loadreg_1.js <some hash>');
phantom.exit();
}
var address = 'http://www.regulations.gov/';
console.log(address);
phantom.state = Date.now().toString();
phantom.open(address);
} else {
var hash = phantom.args[0];
document.location = hash;
console.log(document.location.hash);
var elapsed = Date.now() - new Date().setTime(phantom.state);
if (phantom.loadStatus === 'success') {
if (!first_time) {
var first_time = true;
if (!document.addEventListener) {
console.log('Not SUPPORTED!');
}
phantom.render('result.png');
var markup = document.documentElement.innerHTML;
console.log(markup);
phantom.exit();
}
} else {
console.log('FAIL to load the address');
phantom.exit();
}
}
This code produces the correct hashbang (for instance, I can set the hash to '#!contactus') but it doesn't dynamically generate any different HTML, just the default page. It does, however, correctly output that hash when I call document.location.hash.
I've also tried to set the initial address to the hashbang, but then the script just hangs and doesn't do anything. For example, if I set the url to http://www.regulations.gov/#!searchResults;rpp=10;po=0 the script just hangs after printing the address to the terminal and nothing ever happens.
The issue here is that the content of the page loads asynchronously, but you're expecting it to be available as soon as the page is loaded.
In order to scrape a page that loads content asynchronously, you need to wait to scrape until the content you're interested in has been loaded. Depending on the page, there might be different ways of checking, but the easiest is just to check at regular intervals for something you expect to see, until you find it.
The trick here is figuring out what to look for - you need something that won't be present on the page until your desired content has been loaded. In this case, the easiest option I found for top-level pages is to manually input the H1 tags you expect to see on each page, keying them to the hash:
var titleMap = {
'#!contactUs': 'Contact Us',
'#!aboutUs': 'About Us'
// etc for the other pages
};
Then in your success block, you can set a recurring timeout to look for the title you want in an h1 tag. When it shows up, you know you can render the page:
if (phantom.loadStatus === 'success') {
// poll every 300 milliseconds
var timeoutId = window.setInterval(function () {
// check for the title element you expect to see
var h1s = document.querySelectorAll('h1');
if (h1s.length === 0) {
console.log('No H1 tags found.');
return;
}
var found = false;
// h1s is a node list, not an array, hence the
// weird syntax here
Array.prototype.forEach.call(h1s, function(h1) {
if (h1.textContent.trim() === titleMap[hash]) {
found = true;
}
});
if (!found) {
console.log('Found H1 tags, but not ' + titleMap[hash]);
return;
}
// we found it!
console.log('Found H1: ' + titleMap[hash]);
phantom.render('result.png');
console.log("Rendered image.");
// stop the cycle
window.clearInterval(timeoutId);
phantom.exit();
}, 300);
}
The above code works for me. But it won't work if you need to scrape search results - you'll need to figure out an identifying element or bit of text that you can look for without having to know the title ahead of time.
Edit: Also, it looks like the newest version of PhantomJS now triggers an onResourceReceived event when it gets new data. I haven't looked into this, but you might be able to bind a listener to this event to achieve the same effect.
