Record and replay JavaScript

I know it is possible to record mouse movements, scrolling and keystrokes. But what about changes to the document? How can I record changes to the document?
Here is my attempt. Is there a better, simpler way to store all events?
I am thankful for any tips I can get!
<!DOCTYPE html>
<html>
<head>
<title>Record And replay javascript</title>
</head>
<body id="if_no_other_id_exist">
<div style="height:100px;background:#0F0" id="test1">click me</div>
<div style="height:100px;background:#9F9" class="test2">click me</div>
<div style="height:100px;background:#3F9" id="test3">click me</div>
<div style="height:100px;background:#F96" id="test4">click me</div>
<script src="http://code.jquery.com/jquery-latest.min.js"></script>
<script>
$(document).ready(function() {
var the_time_document_is_redy = new Date().getTime();
var the_replay = '';
// .live() was removed in jQuery 1.9; a delegated .on() handler is the replacement
$(document).on("click", "div", function (){
var the_length_of_visit = new Date().getTime() - the_time_document_is_redy;
// check if the element that is clicked has an id
if (this.id)
{
the_replay =
the_replay
+
"setTimeout(\"$('#"
+
this.id
+
"').trigger('click')\","
+
the_length_of_visit
+
");"
;
alert (
"The following javascript will be included in the file in the replay version:\n\n"
+
the_replay
) // end alert
} // end if
// if it does not have an id, check if the element that is clicked has an class
else if (this.className)
{
// find the closest id to better target the element (needed in my application)
var closest_div_with_id = $(this).closest('[id]').attr('id');
the_replay =
the_replay
+
"setTimeout(\"$('#"
+
closest_div_with_id
+
" ."
+
this.className
+
"').trigger('click')\","
+
the_length_of_visit
+
");"
;
alert (
"The following javascript will be included in the file in the replay version:\n\n"
+
the_replay
) // end alert
} // end if
});
// fall back if there are no other id's
$('body').attr('id','if_no_other_id_exist');
// example of how it will work in the replay version
setTimeout("$('#test1').trigger('click')",10000);
});
</script>
</body>
</html>

This question made me curious, so I implemented a proof of concept here:
https://codesandbox.io/s/jquery-playground-y46pv?fontsize=14&hidenavigation=1&theme=dark
Using the demo
Press record, click on some circles, type something in the input, press record again to stop the recording and finally click play.
You can tweak the size of the playback by editing the REPLAY_SCALE variable in the source code.
You can control the playback speed by changing the SPEED variable in the source code.
Note, I only tested this on Chrome.
Implementation details:
It monitors mousemove, click and typing events. It should be easily extensible to add others such as scroll, window resizing, hover, focus etc.
Playback creates an <iframe>, injects the original HTML and replays the user events.
The event listeners bypass any event.stopPropagation() by using capture when listening for events on the document.
Displaying playback in a different resolution is done using the CSS zoom property.
A transparent canvas could be overlaid to draw the trace lines of the mouse. I use a simple div, so there are no trace lines.
Considerations:
Imagine we are capturing user events on a real website. Since the page served could change between now and the playback, we can't rely on the client's server when replaying the recording in the iframe. Instead we have to snapshot the HTML, all ajax requests and all resource requests made during the recording. In the demo I only snapshot the HTML for simplicity. In practice, however, all extra requests would have to be stored on the server in real time as they are downloaded on the client page. Furthermore, during playback it is important that the requests are played back on the same timeline as the user perceived them. To simulate the request timeline, the offset and duration of each request must also be stored. Uploading all page requests as they are downloaded on the client will slow down the client page. One way to optimize this could be to hash the contents of each request before uploading it: if the hash is already present on the server, the request data need not be uploaded again. This hashing also lets one user's session reuse request data uploaded by another user. Finally, the browser itself need not do the uploading: provided all requests go through a central server, the snapshotting can happen server side so as not to impact the user's experience.
Careful consideration will be needed when uploading all the user events. Since lots of events will be generated, this means lots of data. Some compression of the events could be applied, e.g. dropping some of the less important mousemove events. To minimize the number of requests, an upload should not be made per event. Instead, events should be buffered and uploaded in batches once a buffer size or a timeout is reached. The timeout matters because the user could close the page at any point, losing whatever is still buffered.
During playback outgoing POST requests should be mocked to prevent duplicating events elsewhere.
During playback the user agent should be spoofed, although this may not reliably reproduce the original display.
The custom recording code could conflict with client code, e.g. jQuery. Namespacing will be required to avoid this.
There might be some edge cases where typing and clicking do not reproduce the same resulting HTML as seen in the original, e.g. random numbers or date times. Mutation observers may be required to observe HTML changes, although they are not supported in all browsers. Screenshots could come in useful here but might prove over the top.
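The buffering idea from the considerations above can be sketched as follows. This is a minimal sketch, not the demo's code: createEventBuffer, maxSize, maxWaitMs and the send callback are hypothetical names, and send stands in for whatever upload mechanism is actually used.

```javascript
// Hypothetical batching buffer. `send` receives an array of events and is
// expected to upload it (for example with a single POST request).
function createEventBuffer(send, maxSize, maxWaitMs) {
  var pending = [];
  var timer = null;

  // Upload everything collected so far and cancel the pending timeout.
  function flush() {
    if (timer !== null) {
      clearTimeout(timer);
      timer = null;
    }
    if (pending.length === 0) {
      return;
    }
    var batch = pending;
    pending = [];
    send(batch);
  }

  // Queue one event; flush on size, otherwise make sure a timeout is armed.
  function push(event) {
    pending.push(event);
    if (pending.length >= maxSize) {
      flush();
    } else if (timer === null) {
      timer = setTimeout(flush, maxWaitMs);
    }
  }

  return { push: push, flush: flush };
}
```

Calling flush() from a page-leave handler as well would reduce the number of events lost when the user closes the page, although that handler is not guaranteed to complete an upload.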

Replaying user actions with just JavaScript is a complex problem.
First of all, you can't move the mouse cursor, and you can't emulate mouseover/hover states either. That rules out a big part of the user's interaction with a page.
Second, recorded actions usually have to be replayed in a different environment from the one they were recorded in: a screen with a smaller resolution, a different client browser, different content served based on the replaying browser's cookies, and so on.
If you take the time to study the available services that let you record website visitors' actions (http://clicktale.com and http://userfly.com/, to name a few), you'll see that none of them can fully replay users' actions, especially when it comes to mouseovers, ajax and complex JS widgets.
As to your question about detecting changes made to the DOM: as Chris Biscardi stated in his answer, there are mutation events, which are designed for that. However, keep in mind that they are not implemented in every browser. Namely, IE doesn't support them (they will be supported as of IE 9, according to this blog entry on MSDN: http://blogs.msdn.com/b/ie/archive/2010/03/26/dom-level-3-events-support-in-ie9.aspx).
Relying on those events may be suitable for you, depending on your use case.
As to a "better more simple way to store all events": there are other ways (syntax-wise) of listening for the events of your choice, but handling (= storing) them can't be done in a simple way, unless you serialize whole event objects, which wouldn't be a good idea if you were to send information about those events to a server for storage. Be aware that a massive number of events fire while a website is in use, hence a massive amount of potential data to be stored or sent to the server.
I hope I made myself clear and that you find some of this information useful. I have been involved in a project that aimed to do what you're trying to achieve, so I know how complicated it can get once you start digging into the subject.

I believe you are looking for Mutation Events.
http://www.w3.org/TR/2000/REC-DOM-Level-2-Events-20001113/events.html#Events-eventgroupings-mutationevents
Here are some resources for you:
http://tobiasz123.wordpress.com/2009/01/19/utilizing-mutation-events-for-automatic-and-persistent-event-attaching/
http://forum.jquery.com/topic/mutation-events-12-1-2010
https://github.com/jollytoad/jquery.mutation-events
Update:
In response to a comment, a very, very basic implementation:
//callback function
function onNodeInserted(){alert('inserted')}
//add listener to dom(in this case the body tag)
document.body.addEventListener ('DOMNodeInserted', onNodeInserted, false);
//Add element to dom
$('<div>test</div>').appendTo('body')
Like WTK said, you are getting yourself into complex territory.

Record
Save the initial DOM of the page, remove the scripts from it, and change all relative URLs to absolute ones.
Then record DOM mutations and keyboard/mouse events.
Replay
Start with the saved initial DOM, then apply the mutations and events in timestamp order.
Clicks will not actually do anything, because we have removed the scripts. But because we have saved the DOM changes, we can replay the effect of each click.
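The "timestamp order" step can be sketched like this. The record shape ({ time, type, data }) and the names buildReplaySchedule, replay and apply are assumptions made for illustration; how a mutation or event is actually applied to the saved DOM is left to the apply callback.

```javascript
// Hypothetical record format: { time: msSinceRecordingStart, type: 'mutation' | 'event', data: ... }

// Sort the log by timestamp and convert each timestamp into a playback delay.
function buildReplaySchedule(records, speed) {
  // Copy before sorting so the recorded log itself is left untouched.
  var ordered = records.slice().sort(function (a, b) {
    return a.time - b.time;
  });
  return ordered.map(function (record) {
    return { delay: record.time / speed, record: record };
  });
}

// Schedule every record; `apply` is whatever re-applies a mutation or event.
function replay(records, speed, apply) {
  buildReplaySchedule(records, speed).forEach(function (step) {
    setTimeout(function () {
      apply(step.record);
    }, step.delay);
  });
}
```

A speed of 2 plays the session back twice as fast; mutations and input events share one timeline, so a click is always followed by the DOM changes it originally caused.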

I found these two solutions on GitHub which allow you to capture the events and then replay them on a remote server.
https://github.com/ElliotNB/js-replay
and a more comprehensive solution
https://github.com/rrweb-io/rrweb
https://www.rrweb.io/#demos
Both have demos which you can try.

Nowadays we can use MutationObserver:
MutationObserver provides developers with a way to react to changes in
a DOM. It is designed as a replacement for Mutation Events defined in
the DOM3 Events specification.
Slow demo, because the console.log message is huge.
var mutationObserver = new MutationObserver(function(mutations) {
    mutations.forEach(function(mutation) {
        console.log(mutation)
    })
})
mutationObserver.observe(watchme, {
    attributes: true,
    characterData: true,
    childList: true,
    subtree: true,
    attributeOldValue: true,
    characterDataOldValue: true
})
<div id="watchme" contenteditable>
Hello world!
</div>
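Since logging whole mutation objects is heavy, a recorder would probably store a reduced form of each one. The sketch below is one possible shape, not part of the demo above: summarizeMutation and its output fields (t, name, added, removed, newValue) are hypothetical, while type, attributeName, oldValue, addedNodes, removedNodes and target are standard MutationRecord properties. It deliberately does not identify the target node; a real recorder would also need some stable way to address it.

```javascript
// Reduce a MutationRecord to a small, JSON-friendly object.
function summarizeMutation(record, timeMs) {
  var summary = { t: timeMs, type: record.type };
  if (record.type === 'attributes') {
    summary.name = record.attributeName;
    summary.oldValue = record.oldValue;
  } else if (record.type === 'characterData') {
    summary.oldValue = record.oldValue;
    summary.newValue = record.target.data; // current text of the changed node
  } else if (record.type === 'childList') {
    summary.added = record.addedNodes.length;
    summary.removed = record.removedNodes.length;
  }
  return summary;
}
```

Inside the observer callback this could replace the console.log call, e.g. by pushing summarizeMutation(mutation, Date.now()) into an array.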

Related

How to update a web page javascript counter live when the browser doesn't have focus?

I am making a browser game in html, css, and javascript, written in perl. Health and stamina are kept in the server and I use javascript to show the user a live updated count of these stats while the current page is loaded. This works fine, however if the user switches tabs or switches away from the browser and leaves it running in the background, the count value you see when you return does not keep up properly. So when you switch back to the browser, your counter might say 50/100 stamina when you actually have 100/100. So when you do something in the game (loads a new page) the server updates the counter to the true amount because the javascript is just keeping time to show the user a "live" rolling view in the browser.
Is there a way to ensure the javascript counter will continue to function even if the page/tab isn't active or on the forefront? Aside from completely re-writing my game to include continuous live server pushes in what is displayed on the browser to the user?
Say you are playing the game. You see your health and stamina regenerating. You switch to another program for a minute, then return to the game in the browser. You notice your health and stamina have not updated while you were away. But when you perform an action in the game, this value is updated to what it should be because it is tracked internally on the server. This is what I would like to fix. Hope that makes sense!
I have not tried anything to fix this issue yet besides searching the web and ending up on this site without a really "good" answer in sight, so I decided to ask the question.
Continuous server pushes wouldn't work either. Anything in the main event loop, like a timer or events happening while the page is out of focus, gets slowed down by the browser to conserve resources. Some mobile browsers will stop it altogether.
The answer to the question is to change how your app keeps track of these stats.
Now some will say to use WebWorkers to run the timer in a separate thread but this won't solve all your issues. You'd still have a different version of the issue, like if someone restored your webpage from sleep or something along those lines. No background task can survive that.
You mention that you track these stats also on the server. That's convenient, so the most obvious thing you should do is detect when the tab comes back into focus using the Window focus event. You would then make all the calls to the server to fetch the most up-to-date stats and reset the timers based on that fresh data. To stop it from showing stale data while the request is in flight, you might choose to show a loading spinner or something during that period.
Another common fix is to keep a timestamp of when the data was last updated, refreshed on every timer increment. When the page loses focus, you detect this with the blur event and store that last timestamp somewhere. When the user comes back, you handle the focus event and calculate the difference between the current time and the time recorded at blur. From that interval you may be able to recalculate what the values should be.
But if your server has this info, it'd be far less error-prone, and easier, to just ask the server when the user refocuses.
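For the timestamp approach, the recalculation itself could look like the sketch below. It assumes a simple model that the question does not confirm: the stat regenerates at a constant ratePerSecond up to max. The function name catchUp and all parameter names are hypothetical.

```javascript
// Assumed model: the stat grows by `ratePerSecond` until it reaches `max`.
// blurTimeMs and focusTimeMs are timestamps such as those from Date.now().
function catchUp(valueAtBlur, max, ratePerSecond, blurTimeMs, focusTimeMs) {
  // Guard against clock adjustments producing a negative interval.
  var elapsedSeconds = Math.max(0, (focusTimeMs - blurTimeMs) / 1000);
  var regenerated = Math.floor(elapsedSeconds * ratePerSecond);
  return Math.min(max, valueAtBlur + regenerated);
}
```

In the page, the blur handler would store the current value and Date.now(), and the focus handler would call catchUp with those and a fresh Date.now() before restarting the display timer. If the server's rules are more complicated than a constant rate, asking the server remains the safer option.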

How does HotJar generate their recordings?

Tracking mouse movement/scroll/click events is easy but how do they save the screen and keep it in sync so well?
The pages are rendered quite well (at least for static HTML pages; I haven't tested on Angular or any other SPA), and the sync is almost perfect.
To generate and upload a 23fps recording of my screen (1920x1080) it would take about 2Mbps of bandwidth. Maybe when recording only when there are some mouse events it would still take some 300-500Kbps on average? That seems way too much...
HTML content and DOM changes get pumped through a websocket and stored by Hotjar (minus sensitive information such as form inputs from the user, unless you've whitelisted them). The CSS isn't stored; it gets loaded by you when you watch the recording.
Because they're only recording user activity and DOM changes, there's a lot less data to record than if they were capturing a full video. The downside is that some Javascript driven widgets won't function correctly in the replay.
Relevant information from Hotjar docs:
When it comes to recordings, changes to the page are captured using the MutationObserver API which is built-in into every modern browser.
This makes it efficient since the change itself is already happening
on the page and the browser MutationObserver API allows us to record
this change which we then parse and also send through the websocket.
At regular short intervals, every 100ms or 10 times per second, the cursor position and scroll position are recorded. Clicks are recorded
when they happen, capturing the position of the cursor relative to the
element being clicked. These are functions which in no way hinder a
user's experience as they only capture the location of the pointer
when a click happens or every 100ms. The events are sent to the Hotjar
servers through frames within the websocket, which is more efficient
than sending XHR requests at regular intervals.
Source: https://help.hotjar.com/hc/en-us/articles/115009335727-Will-Hotjar-Slow-Down-My-Site-
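The fixed-interval sampling described in the quote can be illustrated with a small sketch. This is not Hotjar's code; createPointerSampler and its track, sample and drain methods are hypothetical names used only to show why sampling every 100 ms produces far less data than recording every mousemove.

```javascript
// Keeps only the latest pointer position and records it when sampled.
function createPointerSampler() {
  var latest = null; // most recent position seen, or null before any movement
  var samples = [];

  return {
    // Call from a mousemove handler; it only overwrites a variable.
    track: function (x, y) {
      latest = { x: x, y: y };
    },
    // Call on a fixed interval (e.g. every 100 ms) to record one sample.
    sample: function (timeMs) {
      if (latest !== null) {
        samples.push({ t: timeMs, x: latest.x, y: latest.y });
      }
    },
    // Hand over the collected samples and start a fresh batch.
    drain: function () {
      var out = samples;
      samples = [];
      return out;
    }
  };
}
```

In a browser, track would be wired to a mousemove listener using e.clientX and e.clientY, sample to a setInterval with a 100 ms period, and drain to whatever sends the batch over the websocket.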

Track when the user received first bytes of the video

There is a web page which has an HTML5 video in it. When the user clicks start or navigates through the timeline, the video starts (either from the beginning or from the position they selected). But it does not always happen instantly. I want to find out how much time passes between the user's click and the moment the user receives the first bytes of the video.
Getting the time of the user's click is not a problem, but while looking through the HTML5 video API I was not able to find any event that is close to what I am looking for.
Is it possible to track such an event?
The event(s) you listen for after you receive the click (or "play" or "seeking") event depends on the state of the video before the time of the click.
If you have a fresh, unplayed video element with the preload attribute set to "none", then the first data you're going to receive from the network is the metadata, so you can listen for the "loadedmetadata" event.
If preload is set to "metadata", you might have already loaded metadata, depending on the browser and platform. (e.g., Safari on iPad will not load metadata or anything else until the first user interaction.) In that case, you want to listen for either "loadedmetadata" or "progress". It couldn't hurt to listen for "loadeddata" as well, but I think "progress" fires first.
If preload is set to "auto" or if you've already played some of the video, you might have some actual video data. And while you're likely to have data at the current point on the timeline, you may or may not have it at the seek destination. It depends at least on how far ahead (or behind) you're seeking, how fast data is coming in and how much spare room the browser has in the media cache.
If there is no data at the destination time (you can check this in advance if you want with the .buffered property, see TimeRanges), then the next event you see will be either "loadeddata" or "progress", probably followed by "canplay". If there is enough data buffered at the target time of the seek, then the question doesn't really apply because nothing else will be transferred.
However, in any of the above cases, once there is enough data to display the frame at the new point on that timeline and that data has been decoded, the "seeked" event will fire. So if you were to only pick one (no reason you can't use more), this is the one to pick.
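The .buffered check mentioned above can be sketched as a small helper. isTimeBuffered is a hypothetical name; it relies only on the standard TimeRanges interface (length, start(i), end(i)) that video.buffered returns.

```javascript
// Returns true if `time` (in seconds) falls inside any buffered range.
// `buffered` is a TimeRanges-like object, e.g. video.buffered.
function isTimeBuffered(buffered, time) {
  for (var i = 0; i < buffered.length; i++) {
    if (time >= buffered.start(i) && time <= buffered.end(i)) {
      return true;
    }
  }
  return false;
}
```

Before seeking, isTimeBuffered(video.buffered, targetTime) tells you whether a network transfer is expected at all; if it returns false, the time between the click and the next "progress" or "seeked" event approximates the wait the user experienced.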

Recording user data for heatmap with JavaScript

I was wondering how sites such as crazyegg.com store user click data during a session. Obviously there is some underlying script which is storing each clicks data, but how is that data then populated into a database? It seems to me the simple solution would be to send data via AJAX but when you consider that it's almost impossible to get a cross browser page unload function setup, I'm wondering if there is perhaps some other more advanced way of getting metric data.
I even saw a site which records each mouse movement and I am guessing they are definitely not sending that data to a database on each mouse move event.
So, in a nutshell, what kind of technology would I need in order to monitor user activity on my site and then store this information in order to create metric data? I am not looking to recreate GA, I'm just very interested to know how this sort of thing is done.
Thanks in advance
Heatmap analytics turns out to be WAY more complicated than just capturing the cursor coordinates. Some websites are right-aligned, some are left-aligned, some are 100%-width, some are fixed-width-"centered"... A page element can be positioned absolutely or relatively, floated etc. Oh, and there's also different screen resolutions and even multi-monitor configurations.
Here's how it works in HeatTest (I'm one of the founders, have to reveal that due to the rules):
JavaScript handles the onClick event: document.onclick = function(e){ } (this will not work with <a> and <input> elements, have to hack your way around)
Script records the XPath-address of the clicked element (since coordinates are not reliable, see above) in a form //body/div[3]/button[id=search] and the coordinates within the element.
Script sends a JSONP request to the server (JSONP is used because of the cross-domain limitations in browsers)
Server records this data into the database.
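As a rough illustration of the second step, an element address can be built by walking up the parent chain. This is a sketch under assumptions, not HeatTest's implementation: elementPath is a hypothetical name, and it emits positional segments like div[3] only, without the id annotation shown in the example above.

```javascript
// Builds an address such as /html/body/div[3]/button for an element.
// Only parentNode, children and tagName are used.
function elementPath(element) {
  var segments = [];
  for (var node = element; node && node.tagName; node = node.parentNode) {
    var tag = node.tagName.toLowerCase();
    var parent = node.parentNode;
    var siblings = parent && parent.children ? parent.children : [node];
    var sameTagCount = 0;
    var index = 1;
    // Count siblings with the same tag to get a 1-based position.
    for (var i = 0; i < siblings.length; i++) {
      if (siblings[i].tagName === node.tagName) {
        sameTagCount++;
        if (siblings[i] === node) {
          index = sameTagCount;
        }
      }
    }
    // Add the position only when the tag is ambiguous among its siblings.
    segments.unshift(sameTagCount > 1 ? tag + '[' + index + ']' : tag);
  }
  return '/' + segments.join('/');
}
```

In a click handler this would be called with e.target, and the result stored together with the click offset inside that element.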
Now, the interesting part - the server.
To calculate the heatmap the server launches a virtual instance of a browser in-memory (we use Chromium and IE9)
Renders the page
Takes a screenshot,
Finds the elements' coordinates and then builds the heatmap.
It takes a lot of cpu-power and memory usage. A lot. So most of the heatmap-services including both us and CrazyEgg, have stacks of virtual machines and cloud servers for this task.
The fundamental idea used by many tracking systems is a 1x1px image that is requested with extra GET parameters. The request is added to the server log file, and the log files are then processed to generate statistics.
So a minimalist click tracking function might look like this:
document.onclick = function(e){
    var trackImg = new Image();
    trackImg.src = 'http://tracking.server/img.gif?x='+e.clientX+'&y='+e.clientY;
};
AJAX wouldn't be useful because it is subject to the same-origin policy (you won't be able to send requests to your tracking server unless that server explicitly allows cross-origin requests via CORS). And you'd have to add AJAX code to your tracking script.
If you want to send more data (like cursor movements) you'd store the coordinates in a variable and periodically poll for a new image with updated path in the GET parameter.
Now there are many many problems:
cross-browser compatibility - to make the above function work in all browsers that matter at the moment you'd probably have to add 20 more lines of code
getting useful data
many pages are fixed-width, centered, so raw X and Y coordinates won't let you create a visual overlay of clicks on the page
some pages have liquid-width elements, or use a combination of min- and max-height
users may use different font sizes
dynamic elements that appear on the page in response to user's actions
etc. etc.
When you have the tracking script worked out you only need to create a tool that takes raw server logs and turns them into shiny heatmaps :)
I don't know the exact implementation details of how crazyegg does it, but the way I would do it is to store mouse events in an array and send it periodically over AJAX to the backend, e.g. the captured mouse events are collected and sent to the server every 30 seconds. This reduces the strain of creating a request for every event, and it also ensures that I lose at most 30 seconds of data. You can also send the data in the unload event, which increases the amount of data you get, but you shouldn't depend on it.
Some example on how I'd implement it (using jQuery as my vanilla JS skills are a bit rusty):
$(function() {
    var clicks = [];

    // Capture every click on the page ($() with no selector matches nothing)
    $(document).on('click', function(e) {
        clicks.push(e.pageX + ',' + e.pageY);
    });

    // Function to send clicks to server
    var sendClicks = function() {
        // Nothing new since the last send
        if (clicks.length === 0) {
            return;
        }
        // Clicks will be in format 'x1,y1;x2,y2;x3,y3...'
        var clicksToSend = clicks.join(';');
        clicks = [];
        $.ajax({
            url: 'handler.php',
            type: 'POST',
            data: {
                clicks: clicksToSend
            }
        });
    };

    // Send clicks every 30 seconds and on page leave
    setInterval(sendClicks, 30000);
    $(window).on('unload', sendClicks);
});
Note that I haven't tested or tried this in any way but this should give you a general idea.
If you're just looking for interaction, you could replace your <input type="button"> with <input type="image">. These are automatically submitted with the X, Y coordinates of where the user has clicked.
jQuery also has a good implementation of the mousemove event binding that can track the current mouse position. I don't know your desired end result, but you could use setInterval(submitMousePosition, 1000) to send an ajax call with the mouse position every second or something like that.
I really don't see why you think it is impossible to store all click points of a user session in the database.
Their motto is "See Where People Click".
Once you gather enough data, it is fairly easy to make heat maps in batch processes.
People really underestimate databases, indexing and sharding. The only hard thing here is gathering enough money for the underlying architecture :)

Is there a profitable way to record user actions in textarea?

I need to send a bunch of commands to the server on a timer, like:
put(0,"hello")
del(4,1)
put(4," is around the corner")
so I need to monitor and record all of the user input and compile/flush it on the timeout (idle), something like macros.
I can record everything happening in onKeyUp/onKeyDown/onMouseDown/onMouseUp using the textarea cursor position and key information (and make it cross-browser some time later), but I can't handle things like pasting via the right mouse button and 'Paste', or pasting from the browser menu. I can handle onChange, but it gives no information on whether the text was pasted or was already recorded as key presses, and it fires only after a focus change. Even pasting from the context menu fires some useful info, but the browser's own menu is the one thing that gives JavaScript nothing.
Is there a jQuery plugin or something like that, or do I really have no way to implement it other than comparing the current document with the document a second before?
Upd.: There are events for handling cut/copy/paste: http://www.quirksmode.org/dom/events/cutcopypaste.html , but what about undo?
P.S. I will post the macro-recording code when I finish, if someone really needs it. To finish it properly, I just need a way to handle undo. The current version is here: http://code.google.com/p/sametimed/source/browse/WebContent/module-editor.js, look for the compileCommands method.
There are events for cut/copy/paste you may listen to, depending on the browser. If they are triggered you can use them; otherwise fall back to a more tedious workaround.
See: http://www.quirksmode.org/dom/events/cutcopypaste.html
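If the fallback ends up being the snapshot comparison mentioned in the question, it does not have to be expensive. The sketch below is one possible approach, not the code from the linked project: diffToCommands is a hypothetical name, and it assumes each flush only needs to describe a single contiguous edit, found by trimming the common prefix and suffix of the two snapshots. It covers paste and undo as well, since it only looks at the resulting text.

```javascript
// Turns two textarea snapshots into del/put commands for one contiguous edit.
function diffToCommands(before, after) {
  // Length of the common prefix.
  var prefix = 0;
  var maxPrefix = Math.min(before.length, after.length);
  while (prefix < maxPrefix && before.charAt(prefix) === after.charAt(prefix)) {
    prefix++;
  }
  // Length of the common suffix that does not overlap the prefix.
  var suffix = 0;
  var maxSuffix = maxPrefix - prefix;
  while (suffix < maxSuffix &&
         before.charAt(before.length - 1 - suffix) === after.charAt(after.length - 1 - suffix)) {
    suffix++;
  }
  var commands = [];
  var removed = before.length - prefix - suffix;
  var inserted = after.slice(prefix, after.length - suffix);
  if (removed > 0) {
    commands.push({ op: 'del', pos: prefix, len: removed });
  }
  if (inserted.length > 0) {
    commands.push({ op: 'put', pos: prefix, text: inserted });
  }
  return commands;
}
```

Running it on the snapshots "" then "hello" then "hell" then "hell is around the corner" yields the same put(0,"hello"), del(4,1), put(4," is around the corner") sequence as in the question.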
