How to profile javascript in PhantomJS

How to profile javascript in PhantomJS - javascript

We use PhantomJs 2.0 to take screenshots of web pages. We've found that one particular page takes several minutes to process. This page does not appear to have this issue (or at least not of any comparable magnitude) when loaded in Chrome.
I believe that this is because the javascript is hanging/running very slowly. During the hang, Phantom is using a lot of CPU (although only one core). It does not appear to be taking up an abnormal amount of memory. I am fairly confident that javascript is the culprit because I can see from logging that all requests complete quickly, but then after the page loads Phantom hangs for awhile and won't run anything (I think this is because Phantom is all single-threaded so if the page is still running javascript my Phantom script won't run anything).
I'd like to debug and try to understand what part of the JS is taking so long, but I can't figure out how to get at this in Phantom. For example, I can't seem to collect any output from console.profile/console.profileEnd. How can I profile the javascript running in Phantom to find the bottleneck?

I use Phantomas, via grunt-phantomas. It's a tool that integrates with PhantomJS to profile a wide variety of performance-related metrics. Definitely worth checking out. If it doesn't give you exactly what you need, you can look at the source and see how they integrate with PhantomJS and get data out.

Related

Optimizing high-speed browser interaction with Selenium/Puppeteer

I wish to create a script that can interact with any webpage with as little delay as possible, excepting network delay. This is to be used for arbitrage trading so any tips in this direction are welcome.
My current approach has been to use Selenium with Python as I am a beginner in web scraping. However, I have investigated some approaches and from my findings it seems that Selenium has a considerable delay (See benchmark results below) I mention other things besides it in this question but the focus is on Selenium.
In general, I need to send a BUY or SELL request to whatever broker I am currently working with. Using the web browser client, I have found a few approaches to do this:
1. Actually clicking the button
a. Use Selenium to Click the button with the corresponding request
b. Use Puppeteer to Click
c. Use Pynput or other direct mouse input manipulation
2. Reverse-engineering the request and sending it directly without clicking any buttons
Now, the delay in sending this request needs to be minimized as much as possible. Network delay is out of our control for the purpose of this optimization.
I have benchmarked the approaches for 1. by opening a page in the standard way you would, either with puppeteer and selenium then waiting for a few seconds. While the script waits, I injected into the browser the following code:
$x('//*[#id="id_demo_webfront"]/iframe')[0].contentDocument.body.addEventListener('click', (data => console.log(new Date().getTime())), true);
The purpose of this code is to log in console the current time when the click is registered by the browser.
Then, I log in my python(Selenium, pynput)/javascript(Puppeteer) script the current time right before issuing the click. I am running on Ubunutu 18.04 as opposed to Windows so my system clock should have good resolution. Additional reading about this here and here.
Actual results, all were run on the same webpage multiple times, for roughly 10 clicks each run:
1. a. ~80ms
1. b. ~10-30ms
1. c. ~5-10ms
For 2., I haven't reliably tested the delay. The idea behind this approach is to inject a function that when fired will send a request that exactly resembles a request that would be sent when the button is clicked. I have not tested the delay between issuing the command to run such an injected function and it actually being run, however I expect this approach to be even faster, basically creating my own client to interact with whatever API the broker has on the backend.
From the results, it is pretty clear that issuing mouse commands seems to be the quickest, but it also is the hardest for me to implement reliably. Also, I seem to find puppeteer running faster across the board, but I prefer selenium in python for ease of development and was wondering if there are any tips and ideas to speed up those delays I am experiencing.
Summary:
Why does Selenium with Python have such a delay to issue commands and can it be improved?
Why does Puppeteer seem to have a lower delay when interacting with the same browser?
Some code snippets of my approach:
class DMMPageInterface:
# These are part of the interface script, self.page is the webpage after initialization
def __init__(self, config):
self.bid_price = self.page.get_element('xpath goes here')
...
# Mostly only the speed of this operation matters, sending the trade request
def bid_click(self):
logger.debug("Clicking bid")
if USE_MOUSE:
mouse.position = (720,390)
mouse.click(Button.left, 1)
else:
self.bid_price.click()

Quite a few questions there!
"Why does Selenium with Python have such a delay to issue commands"
I don't have any direct referencable evidence for this but the selenium delay is probably down to the amount of work it has to do. It was created for testing not for performance. The difference in that is that the most valuable part of a test is that MUST run reliably, if it's slower and does more checks in order to make it more reliable than so be it. You still get a massive performance gain over manual testing, and that's what we're replacing.
This is where you'll need the reliability - and i see you mention this need in your question too. It's great if you have a trading script that can complete an action in <1s - but what if it fails 5% of the time? How much will that cause problems?
Selenium also needs to render the GUI then, depending on how you identify your objects it needs to scan the entire DOM for what you want. Generic locators will logically take longer.
Selenium's greater power is the simplicity and the provided ability synchronisation. There's lots of talk in lots of SO articles about webdriverwait (which is great for in-page script sync), but there's also good recognition of page load wait-times. i.e. no point pushing the button before all the source is loaded.
"Why does Puppeteer seem to have a lower delay when interacting with the same browser?"
Puppeteer works with chrome devtools - selenium works with the DOM. So the two are interacting in a different way. Obviously, puppeteer is only for chrome, but selenium can go to almost any browser. It might not be a problem for you if you don't care about cross-browser capability.
However - how much faster is it really?
In your execution timings, you might also want to factor in the whole life cycle of what you aim to do - namely, include: browser load time, and page render time. From your story, you want to buy or sell something. So, you kick off the script - how long does it take from pressing GO to the end result. You most likely will need synchronisation and that's where puppeteer might not be as strong (or take you much longer to get a handle on).
You say you want to "interact with any webpage with as little delay as possible", but you have to get to the page first and get it in the right state. I wouldn't be concerned about milliseconds per action. Kicking open chrome isn't the quickest action in itself and your overall millisecond test becomes >seconds to minutes.
When you have full selenium script taking 65 seconds and a puppeteer script taking 64 seconds, the overall consideration changes :-)
Some other options to think about if you really need speed:
Try running selenium headless - no gui, less resources, runs faster. (google it) :-)
Try other browsers
If you're all about speed - Drop the GUI entirely. Create an API script or start to investigate the use of performance scripts (for example jmeter or loadrunner).
Another option that isn't selenium or puppeteer is javascript directly in the browser
Try your speed tests with other free web tools like cypress or Microsoft's playwright - they interact with JS in the browser directly and brag about some excellent inbuilt stability features.
If it was me, and it was about a single performant action - my first port of call would be doing it at the API level. There's lots of resources and practice sites out there to help get you started and there's no browser to render, there's barely anything more than the network delay. That's the real way to keep it as a sub-second script.
Final thought, when you ask "can it be improved": if you have some code, there may be other ways to make it faster. Share your script so far and i'm sure folks will love to chip in their 2 cents on how you can streamline it :-)

Angular - Why would a website take a very long time to load?

Im working on a college project thats going well apart from a problem with slowness. It seems to take forever to load.
I have a feeling that im not deploying it properly or something else is slowing it down. Its deployed to a Heroku Hobby server which $7 a month and never has to go to sleep.
How would i properly check through chrome inspect or somethign similar whats causing it to take 20 seconds to see anything?

For me, the page took 9.85s to load the DOM content and 33.91s to fully load. I should also note that I'm in Korea, so it may take slightly longer for me.
Like the others have said, you have far too many script tags. Each is making their own request and they have to wait until the previous request is finished before the next one can fire off.
To drastically speed up the time, I would suggest to concatenate and minify your scripts into as few scripts as possible. Maybe one bundle for third party scripts and another for all of your scripts. Also, cache them if you can.

The networking, and auditing tabs on Chrome DevTools (inspect) can help find your issue, which definitely seems to be the amount of scripts you have loading.
Networking
Audit

CasperJS times out on a page that Chrome can load in much less time. Is this expected?

I'm using casperjs 1.1.0-beta3 to measure page load times of a CPU and data intensive web page. In some scenarios, I'm seeing that casperjs is taking a very long time to detect the test-hook that I have put in place to indicate that the page has finished loading and processing the large amount of data.
The test hook uses the casper.waitFor function to look for a variable that gets set in global scope on the page. This is what indicates to the script that everything is complete and rendered.
Details of what the page does is not not very relevant, other than to say that the it gets a large amount of JSON data from a server and the processing is done on the main thread -- no worker threads involved and no CORS requests either.
The issue is, casperjs seems to take a lot more time (10x +) compared to the identical scenario running on regular chrome.
What might be going on? What are some ways to debug?
I have setup casperjs scripts to take a screenshot of the page when it times out, and it shows that the page is still in a 'processing' state. So it indeed looks like the webkit engine used by casper is much slower than normal chrome.

Can a Javascript error slow the website's load time

I have a web portal cricandcric.com which I have done using the php, java script, and mysql.
I dont see the java script error in Firefox, but i see the error in IE,
I observe the site loads faster in Firefox than in IE.
So my question, does java script errors can slow down the website loading time, even though the java script placed at the end of the page, (Yslow Strategies)

It depends. If the error happens early on and a lot of your script code is bypassed, it could actually make it faster. But every time an error occurs, there is some overhead (the exception object has to be built and sent up the call stack to look for any catches), so if it happens at the end, the script would run slower.
But I doubt your change in load time is noticeably impacted by script errors. How long the script takes to execute on the browser's JS engine or a host of other factors will have more impact.

IE's javascript engine has always been notably behind the other common browsers in performance, so it really might come down to that more than anything else. One of the many improvements in IE9 is JS execution speed that's actually competitive.
That said, the JS error probably is worth looking into, since it's occurring every time the image slideshow advances which happens once every couple of seconds.
If you're concerned about performance in general, there are a couple of tools like YSlow and the recently-opensourced DOM Monster bookmarklet, for giving suggestions on general ways to speed up websites. You might also want to check out some of the writings of Steve Souders.

I have to restart firebug/firefox many times a day. Is it Firebug, or is it me?

After an hour or two of heavy use on the site I'm developing, Firebug develops the following problems:
Breakpoints get glitchy -- it becomes difficult to add/remove breakpoints. Sometimes I click on a line multiple times, see nothing, move to the console tab and back, and then see my breakpoints again.
Console stops logging xhr's, or stops logging debug statements.
Script files become non-viewable.
I'm working with a javascript file which is quite large (over 10k lines). I don't think this is a memory leak issue with my own code. I'm refreshing the page all the time. Looks like an issue on the Firebug side. Is my logic sound? Is there anything I can do to get firebug to behave better? Or do I just need to restart firefox every hour?

Keep in mind Firefox while wonderful has always had many issues with handling memory. You should take a look at your task manager to see Firefox's memory footprint. Additionally I'd break up that JS file if you could into smaller chunks (for many reasons aside of this as well) to be better readable and work with the segments. Finally turn off plugins your not using or that may conflict with Friebug if your not using them.

I spent hours using Firebug without restart Firefox and never crashed, try a clean profile, install on it only Firebug and check if all works fine.
I use a separated development profile with Firebug and other dev oriented extensions installed.
How to configure a profile is described in many sites, on my wiki you find a brief description

i have similar problems! i think it is partially to do with the massive JS files. I just re-start firefox every once and a while. no big deal.

Develop Reference

JavaScript is the programming language of the Web.