I am trying out CasperJS to build a web scraper. I need to scrape every page of one or more sites and extract the data in less than 5 seconds per page.
To do this I have to crawl through all similar pages, go to the appropriate content div, and get the data from there.
So if a site has, say, 1000 pages, I need to complete the whole operation as quickly as possible. I cannot control network latency, page size, or similar parameters. All I can control is the parsing mechanism, so I want it to be as fast as possible. Even a small improvement per page adds up as the number of URLs grows.
I am parsing child elements and creating CSS paths for them.
I need to make sure parsing does not take a long time.
I hear that standard JavaScript performs better than jQuery.
Hence, I need some input.
What would be the standard JS equivalent of the following jQuery code that is as efficient as possible at parsing?
function() {
    var TAG_CSS_PATH = 'div#buttons ul li.tab';
    var selectOptions = $(TAG_CSS_PATH);
    var results = [];
    selectOptions.each(function(index) {
        // each() passes a 0-based index; nth-of-type is 1-based
        results.push(TAG_CSS_PATH + ':nth-of-type(' + (index + 1) + ')');
    });
    return results;
}
If anybody can provide any other suggestions, I will appreciate it.
This should do it:
function() {
    var TAG_CSS_PATH = 'div#buttons ul li.tab',
        selectOptions = document.querySelectorAll(TAG_CSS_PATH),
        results = [],
        l = selectOptions.length + 1;
    for(var i = 1; i < l; i++){
        results.push(TAG_CSS_PATH+':nth-of-type('+i+')');
    }
    return results;
}
The jQuery parts are the $(selector) call and .each(). These can be replaced as follows.
function() {
    var TAG_CSS_PATH = '#buttons ul li.tab',
        selectOptions = document.querySelectorAll(TAG_CSS_PATH),
        results = [];
    for( var i = 1, ln = selectOptions.length + 1; i < ln; i++ ) {
        results.push(TAG_CSS_PATH+':nth-of-type('+ i +')');
    }
    return results;
}
Since you are storing selectors, this still seems really inefficient to me: nth-of-type selectors are expensive. Keep in mind that selectors are evaluated from right to left.
CSS/selector optimisation
Note that the div in div#buttons seems redundant. If you use IDs properly, exactly one element will match id='buttons', so you should be able to drop the div from the selector.
Further, if all your .tab elements are li elements, you can remove the li too. If all your li.tab elements are inside a ul, you can remove the ul as well.
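Separating the selector-building step from the DOM lookup makes it easy to check in isolation. Here is a minimal sketch, assuming all matched elements are siblings of the same tag inside a single parent, with no non-matching siblings of that tag in between. Otherwise :nth-of-type(n), which counts siblings of the same tag rather than matches of the whole selector, would point at a different element. The name buildNthOfTypeSelectors is hypothetical; in the page you would pass document.querySelectorAll(basePath).length as the count.

```javascript
// Hypothetical helper: builds one ':nth-of-type(n)' selector per matched element.
// 'count' is expected to be document.querySelectorAll(basePath).length.
function buildNthOfTypeSelectors(basePath, count) {
    var results = new Array(count);
    for (var i = 0; i < count; i++) {
        // nth-of-type is 1-based, the loop index is 0-based
        results[i] = basePath + ':nth-of-type(' + (i + 1) + ')';
    }
    return results;
}

// Example with the shortened selector and an assumed count of 3 matches
var selectors = buildNthOfTypeSelectors('#buttons .tab', 3);
console.log(selectors);
```

Preallocating the array and assigning by index avoids repeated push calls, although for a few hundred elements the difference is unlikely to be measurable next to network time.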
First question ever, new to programming. I'll try to be as concise as possible.
What I want to do is create a bunch of children inside a selected div and give each child specific HTML content (from a predefined array) and a different id.
I created this loop for the effect:
Game.showOptions = function() {
    var i = 0;
    Game.choiceElement.html("");
    for (i=0; i<Game.event[Game.state].options.length; i++) {
        Game.choiceElement.append(Game.event[Game.state].options[i].response);
        Game.choiceElement.children()[i].attr("id","choice1");
    }
};
Using the predefined values of an array:
Game.event[0] = {
    text: "Hello, welcome.",
    options: [{response: "<a><p>1. Um, hello...</p></a>"},
              {response: "<a><p>2. How are you?</p></a>"}]
};
This method does not seem to be working, because the loop stops running after only one iteration. I sincerely have no idea why. If there is a completely different way of getting what I need, I'm all ears.
If I define the id attribute of each individual p inside the array, it works, but I want to avoid that.
The idea is creating a fully functional algorithm for dialogue choices (text-based rpg style) that would work with a predefined array.
Thanks in advance.
The problem with your loop as I see it could be in a couple different places. Here are three things you should check for, and that I am assuming you have but just didn't show us...
Is Game defined as an object?
var Game = {};
Is event defined as an array?
Game.event = new Array();
Is Game.state returning a number, and the appropriate number at that? I imagine this would be a little more dynamic than what I have written here, but hopefully you'll get the idea.
Game.state = 0;
Now assuming all of the above is working properly...
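Use eq(i) instead of [i]. Indexing a jQuery object with [i] returns the raw DOM element, which has no .attr() method, so the call throws an error and the loop stops after the first iteration. eq(i) returns a jQuery object instead.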
for (var i = 0; i<Game.event[Game.state].options.length; i++) {
    Game.choiceElement.append(Game.event[Game.state].options[i].response);
    Game.choiceElement.children().eq(i).attr("id","choice" + (i + 1));
}
Here is the JSFiddle.
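The ids can also be assigned before the markup reaches the DOM, which avoids indexing into children() at all. Here is a minimal sketch, assuming it is acceptable to wrap each response in a div that carries the id; buildChoices is a hypothetical helper, not part of jQuery.

```javascript
// Hypothetical helper: wraps each response in a div with a unique, 1-based id.
function buildChoices(options) {
    var html = [];
    for (var i = 0; i < options.length; i++) {
        html.push('<div id="choice' + (i + 1) + '">' + options[i].response + '</div>');
    }
    return html;
}

// Same data shape as Game.event[0].options in the question
var options = [
    { response: '<a><p>1. Um, hello...</p></a>' },
    { response: '<a><p>2. How are you?</p></a>' }
];
var choices = buildChoices(options);
console.log(choices.join('\n'));
```

With jQuery, the result could then be inserted in a single call, for example Game.choiceElement.html(choices.join('')).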
I have to increase the z-index by 1 for all spans with class .page. There can be more than 100 matched elements (but never more than 150). Right now I am iterating through each one of them and changing the z-index with the following code.
$('#mydiv span.page').each(function() {
    var zi = parseInt($(this).css('z-index')) + 1;
    $(this).css('z-index', zi);
});
Is there a better way to do this for better performance? I am using jQuery.
One tricky way is to create a new style rule and apply its class to the element:
var style = document.createElement('style');
style.type = 'text/css';
style.innerHTML = '.cssClass { z-index: value; }';
document.getElementsByTagName('head')[0].appendChild(style);
document.getElementById('yourElementId').className = 'cssClass';
The best way would be to rewrite your logic not to depend on a uniform incremental z-index in the element styling. If you are only ever running this logic once, perhaps you can set up some general CSS rules that just involve toggling a class to achieve the layout you want. Assuming that is not an option, there isn't much you can do to make it more performant.
You may be able to detach the '#mydiv' element temporarily to reduce page repainting, although detaching can interfere with other things, and it is hard to give more help without more information.
var div = $('#mydiv');
var prev = div.prev();
div.detach();
// You can clean up your jQuery like this:
div.find('span.page').css('z-index', function(index, zIndex) {
    return parseInt(zIndex, 10) + 1;
});
div.insertAfter(prev);
In terms of performance, @Jaykishan Mehta's approach is the best, followed by a plain for loop.
for (var i = 0, spans = document.querySelectorAll('#mydiv span.page'), len = spans.length; i < len; i += 1) {
    // Note: style.zIndex reads the inline style only; it is '' when z-index comes from a stylesheet
    spans[i].style.zIndex = parseInt(spans[i].style.zIndex, 10) + 1;
}
Using jQuery heavily, e.g. on every iteration, can slow things down overall.
What I mean is that jQuery may perform individual tasks quite quickly, but the sum can cause a general slowdown.
It all depends on your app or website.
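Whichever loop is used, the value read back from the DOM needs care: z-index is reported as a string, and both the default 'auto' and an empty string parse to NaN. Here is a minimal sketch of the increment step, assuming such values should be treated as 0 (that fallback is an assumption, the question does not specify it); nextZIndex is a hypothetical helper.

```javascript
// Hypothetical helper: returns the incremented z-index for a raw CSS value.
// Assumption: non-numeric values such as 'auto' or '' count as 0.
function nextZIndex(current) {
    var parsed = parseInt(current, 10);
    return (isNaN(parsed) ? 0 : parsed) + 1;
}

console.log(nextZIndex('5'));    // 6
console.log(nextZIndex('auto')); // 1
```

With jQuery it could be plugged in as div.find('span.page').css('z-index', function(index, zIndex) { return nextZIndex(zIndex); });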
I want to make my DIV element clickable so that it goes to the detail page, sending the ID as a query parameter to my server.
I have found some examples that I could use, for example:
<div style="cursor:pointer;" onclick="document.location='http://www.google.com'">Foo</div>
I am just confused about how to add the above script to the code that I built.
Part of my code:
for ( var i = 0; i < response.length; ++i ) {
    str = response[i].judul;
    str2 = response[i].waktu_mulai;
    str3 = response[i].channel_code;
    var Year,Month,Date,Time,Strip,Join= ""
    var Year = str2.substr(0,4)
    var Month = str2.substr(5,2)
    var Date = str2.substr(8,2)
    var Time = str2.substr(-8,8)
    var Strip = str2.substr(4,1)
    var Join = Date+Strip+Month+Strip+Year+' '+Time
    listItem = document.createElement('div');
    listItem.setAttribute('data-bb-type', 'item');
    listItem.setAttribute('data-bb-img', 'images/icons/logo/'+str3+'.png');
    listItem.setAttribute('data-bb-title', str);
    listItem.innerHTML = Join+" WIB";
    container = document.createElement('div');
    container.appendChild(listItem);
    bb.imageList.apply([container]);
    dataList.appendChild(container.firstChild);
    if (bb.scroller) {
        bb.scroller.refresh();
    }
}
Maybe someone can help me add the link to each DIV that my application creates in the loop from database data.
Ensure you provide the right context for your question. You are currently posting a snippet of bbUI.js framework script, which might behave completely differently from plain HTML, CSS3 and JavaScript.
You also declare your variables twice: first all together as var ... = "", then again each one individually.
Also try searching Stack Overflow first to see whether you are asking a question that already exists. This particular question has been raised and answered many times before, and you can find it in many "starting with HTML, JavaScript" resources. Though everyone would like to help out, some things are covered by the basics and can easily be found by performing the right searches on e.g. Google. People want to see that you are putting effort into finding the answer yourself and into formulating your question. The better you describe your issue, the more accurate the answer will be.
Back to the subject:
Using just this will solve your problem I think:
<element>.setAttribute('onclick','doSomething();'); // for normal browsers
<element>.onclick = function() {doSomething();}; // for IE
where you can replace 'doSomething();' with your own code, e.g.:
"document.location='http://www.google.com'"
If you want to make it more dynamic, you can also just call a function, passing the element along:
<element>.setAttribute('onclick','myFunction(this);'); // for normal browsers
<element>.onclick = function() {myFunction(this);}; // for IE
Where myFunction is:
function myFunction(element) {
    // 'this' inside a plain function call is not the clicked element, so it is passed in explicitly
    var called_id = element.id;
    var call_url = "http://myurl.com/page?id="+called_id;
    document.location = call_url;
}
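When the id ends up in a query string, it is safer to encode it first. Here is a minimal sketch, assuming the parameter is called id as in the URL above; buildDetailUrl is an illustrative name and the sample id is made up.

```javascript
// Hypothetical helper: builds the detail-page URL for a given id.
// encodeURIComponent escapes characters such as spaces and '&' that would break the query string.
function buildDetailUrl(baseUrl, id) {
    return baseUrl + '?id=' + encodeURIComponent(id);
}

var url = buildDetailUrl('http://myurl.com/page', 'news 42&x');
console.log(url);
```

Inside a handler assigned as element.onclick = function() { ... }, this refers to the clicked element, so this.id can be passed as the second argument.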
And as the others remarked, try to raise your acceptance rate on Stack Overflow; people will be more eager to answer your questions.
Disclaimer: I am fully aware that the id attribute is for unique IDs. In my case, when the ajax request takes longer than usual it can glitch, producing two identical chat messages. I am also aware that there are similar questions out there, but I have been unable to find one that solves my issue.
This is what I want to do:
1. Are there any duplicate IDs inside the div chat_log?
2. What are they?
3. Delete all the duplicates.
4. Make sure that the original one is still there.
I've tried using the following code:
$('[id]').each(function () {
    var ids = $('[id=' + this.id + ']');
    if (ids.length > 1 && ids[0] == this) {
        $(ids[1]).remove();
    }
});
But I'm not sure how I can adapt that method to my situation, nor am I sure if it would be possible.
How can you ensure that something is unique? Let's say you have a bunch of vegetables (cucumbers, turnips, pizzas etc.) You want to ensure colour uniqueness, making sure that any colour only appears once. How'd you do it?
What I'd do is make a list. I'd go through every vegetable, and inspect its colour. If the colour is already on the list, we'll remove that vegetable from the bunch. Otherwise, we leave it as-is and add its colour to our list.
Once that logic is understood, all we need is to convert it to code! What a fantastically trivial thing to do (on paper, of course.)
//assumes $elem is the element you're deleting duplicates in
//create the ids list we'll check against
var ids = {};
//go over each element
var children = $elem.children();
for ( var i = 0, len = children.length; i < len; i++ ) {
    var id = children[ i ].id;
    //children without an id report '', skip them so they are not treated as duplicates
    if ( !id ) {
        continue;
    }
    //was this id previously seen?
    if ( ids.hasOwnProperty(id) ) {
        $( children[i] ).remove();
    }
    //a brand new id was discovered!
    else {
        ids[ id ] = true;
    }
}
//done!
This is the very simple, plain-logic version. You can come up with much fancier approaches using some weird Sizzle selectors, but this should get you started.
Demo (without jquery): http://tinkerbin.com/qGJpPsAQ
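The same bookkeeping also answers the "what are they?" part of the question. Here is a minimal sketch on a plain array of id strings, assuming the ids have already been collected from the children of chat_log; findDuplicateIds is a hypothetical helper and the sample ids are made up.

```javascript
// Hypothetical helper: returns each id that occurs more than once, reported a single time.
function findDuplicateIds(idList) {
    var seen = Object.create(null); // no inherited keys, so any id string is safe as a key
    var duplicates = [];
    for (var i = 0; i < idList.length; i++) {
        var id = idList[i];
        if (!id) {
            continue; // elements without an id are not duplicates of each other
        }
        if (seen[id]) {
            if (seen[id] === 1) {
                duplicates.push(id); // report on the second sighting only
            }
            seen[id] += 1;
        } else {
            seen[id] = 1;
        }
    }
    return duplicates;
}

var dupes = findDuplicateIds(['msg1', 'msg2', 'msg1', 'msg3', 'msg2', 'msg1']);
console.log(dupes);
```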
Your code should work, but it only removes the second element that has the same ID. Try this:
$('[id]').each(function() {
    // quote the attribute value so ids containing special characters still match
    var $ids = $('[id="' + this.id + '"]');
    if ($ids.length > 1) {
        $ids.not(':first').remove();
    }
});
http://jsfiddle.net/3WUwZ/
I need to run this function every X number of posts, and the page uses AJAX to load in new posts as you scroll down the page. I was hoping to use the function below with a for loop and the modulus operator, but it doesn't seem to accomplish what I'm looking for. Any idea how to do this?
$(document).ready(function($) {
    function adTileLoop(){
        var adTile = "<div class='new-box' id='article-tile'></div>";
        var adLoc = 11;
        var p = $('#article-tile');
        var tile = $('#article-tile:nth-child('+adLoc+')');
        for(var i = 0; i <= p.length; i++){
            if(i % adLoc == 0){
                $(tile).after(adTile);
            }
        }
    }
    $('#content').live(adTileLoop);
});
First of all, you need to be careful to keep IDs unique. Using $("#article-tile") and trying to select more than one element with it is a mistake: IDs must be unique.
Here's a better way to run over a bunch of divs with jQuery:
$("div").each(function() {
    console.log($(this));
});
You can then refine the selector to pick only certain children, as you do in your question:
$("div:nth-child(2)"), for example, selects every div that is the second child of its parent, while $("div:nth-child(2n)") selects every div in an even position among its siblings.
To retrieve the information about your posts specifically, use a selector specific to each post, something like: $(".post:nth-child(2)").
As Ashirvad suggested in the comments, you can run this after a successful ajax call and you will be able to retrieve the updated information about your page.
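For the "every X posts" part, the modulus check can be worked out on plain numbers before touching the DOM. Here is a minimal sketch, assuming posts are counted from 1 and an ad should follow every 11th post; adPositions is a hypothetical helper, and its start parameter is an assumption for handling AJAX batches so that only newly loaded posts are examined.

```javascript
// Hypothetical helper: returns the 1-based post numbers after which an ad tile belongs.
// 'start' is the number of posts already processed, 'total' the number now on the page.
function adPositions(start, total, every) {
    var positions = [];
    for (var n = start + 1; n <= total; n++) {
        if (n % every === 0) {
            positions.push(n);
        }
    }
    return positions;
}

console.log(adPositions(0, 30, 11));  // [11, 22]
console.log(adPositions(30, 45, 11)); // [33, 44]
```

Each returned position n could then be used with something like $('.post').eq(n - 1).after(adTile), where .post is an assumed class name for a post. Note that if the inserted tiles also match that selector, the indexes shift, so the ad tiles would need a different class.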