Collecting user stats/logs on third-party sites with JavaScript

I want to collect statistics (views, clicks, browser info, etc.) for my JavaScript widget, which is installed on third-party websites, and then provide analytics to each domain owner (for example, on my website I will create an interface where a domain's owner can log in and see the stats for their domain).
I assume that I need to integrate a tracking pixel into my widget and then parse all requests for that pixel.
I've got several questions about the architecture and implementation of stats/log collection:
When using a tracking pixel, do I need to add all stats as GET parameters? For example, when the browser loads my JS widget, I could collect all the parameters in the widget's JavaScript and then make an ajax request:
my-stats-domai.com/?widget_id=1&domain=example.com&browser=chrome&city=London&type=view....
Or is there another way to get and send all the parameters?
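A minimal sketch of the GET-parameter approach described above. The endpoint URL and the names buildPixelUrl and firePixel are hypothetical, chosen only for illustration; the parameter names follow the example request in the question.

```javascript
// Build a tracking-pixel URL that carries the stats as GET parameters.
// The endpoint and function names are assumptions, not an existing API.
function buildPixelUrl(endpoint, params) {
  const url = new URL(endpoint);
  for (const [key, value] of Object.entries(params)) {
    // Skip values that were not collected instead of sending "undefined".
    if (value !== undefined && value !== null) {
      url.searchParams.set(key, String(value));
    }
  }
  return url.toString();
}

// In a browser, assigning src triggers the request that delivers the data.
function firePixel(endpoint, params) {
  const img = new Image(1, 1);
  img.src = buildPixelUrl(endpoint, params);
  return img;
}
```

Usage in the widget would look like firePixel('https://stats.example.com/pixel.gif', { widget_id: 1, domain: location.hostname, type: 'view' }). The browser normally sends its User-Agent (and, depending on referrer policy, the Referer) with the image request, so those values can usually be read from the request headers on the server instead of being added as parameters.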
What is the easiest/fastest way to collect all user info (browser info, referrer, URL, GET params, etc.)? Maybe there is a common approach, log format, or specification for user/visitor logs?
When tracking user clicks (or other actions), I assume that I need to make an ajax request from onclick?
When using onclick, do I need to append all browser info, referrer, etc. to the URL as GET parameters?
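One possible shape for the click tracking asked about above, as a sketch only. The names createClickTracker and browserSend, the endpoint, and the parameter names are assumptions. The transport is passed in so the URL-building part can be exercised outside a browser.

```javascript
// Create a click tracker bound to one widget. `send` is injected so the
// sketch does not depend on browser globals. All names are hypothetical.
function createClickTracker(endpoint, widgetId, send) {
  return function trackClick(action) {
    const url = new URL(endpoint);
    url.searchParams.set('widget_id', String(widgetId));
    url.searchParams.set('type', 'click');
    url.searchParams.set('action', action);
    url.searchParams.set('ts', String(Date.now()));
    const target = url.toString();
    send(target);
    return target;
  };
}

// Browser transport: prefer sendBeacon (queued even if the page unloads),
// and fall back to a 1x1 image request when it is unavailable.
function browserSend(url) {
  if (typeof navigator !== 'undefined' && typeof navigator.sendBeacon === 'function') {
    navigator.sendBeacon(url); // note: sendBeacon issues a POST request
  } else if (typeof Image !== 'undefined') {
    new Image(1, 1).src = url;
  }
}
```

In the widget this could be wired up as: const track = createClickTracker('https://stats.example.com/event', 1, browserSend); button.addEventListener('click', () => track('open')). Only the event-specific fields travel in the URL here; browser info can be taken from the request headers on the server, assuming the endpoint logs them.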
Is there any JavaScript/jQuery plugin that could help me collect user stats on third-party sites? And is there an open source PHP log parser for my backend that sends logs to a datastore (MySQL)?
Maybe I should use Piwik or another tracking system, but I think that would be overkill. What are the pros and cons of using Piwik (or something else) for my task?
P.S. If there is some useful reading on this topic, please share a link.

Try using Microsoft Clarity. It gives you short screen recordings of the widget, and you can view errors, clicks, sites, backlinks, and more.

Related

How does Google's analytics.js authenticate the hostname?

I'm building a JS-only plugin that will be installed on multiple websites, each website having its own unique ID, which is passed to a Rails API along with some other data. My API will verify the hostname and ID provided by the JS plugin, but these things can of course be seen and used by anyone to fake impressions or events.
As far as I'm aware, there is no foolproof way of authenticating a website without an invisible, server-side key. That said, how does Google do it?
Analytics requires no server-side implementation, only an ID, which it of course checks against the hostname. Does this not mean that page views and events can be faked by a third party, and if so, why isn't it a prevalent issue?
Thanks in advance

Cross-domain conversion tracking - Custom vs GA?

Say I'm starting a site, refer.com, where I post items on an 'affiliation' basis. When users click on my links, they're directed to the site shop.com. If the user I redirect to shop.com makes a purchase, I need that conversion tracked.
I see two possibilities:
Creating a custom tracking library (probably JavaScript) where I request URLs from refer.com to transfer information from shop.com. I guess PHP would work too, but it reduces compatibility with clients.
Using Google Analytics cross-domain tracking to do this. I don't want the refer.com GA account to interfere with the shop.com GA account, but as I understand it you can use several accounts on the same page, giving them different identifiers.
I feel like I'm stuck with a narrow set of possibilities. Do I do both? Neither? I need it to be as easy to implement as possible for the client, while also providing relatively bulletproof tracking. What's the standard today? Affiliation services are everywhere, so this type of cross-domain tracking has to be a widely used technique. Is there another preferred method of achieving this that I'm not aware of?
This question might seem highly theoretical. While that may be true, answers with code are highly appreciated too.
I have a way for this to work, but it requires both of your domains to have the Universal Analytics code installed. This will not work with the older GA code.
https://support.google.com/analytics/answer/1032400?hl=en
You can install multiple instances of the Google Analytics tracking code on your web pages to send data to multiple properties in your account.
You can, for example, install multiple instances of the Universal Analytics tracking code (analytics.js) on your web pages but only one instance of the Classic Analytics code (ga.js).
So (provided they have your GA code installed), when you refer a user to shop.com, what you should do is this:
Parse your GA cookie. In PHP you can read it from $_COOKIE['_ga']. The cookie holds a string that has four parts, separated by periods (e.g. GA1.3.367110421.1357220305). You want the last two numbers (in this example, 367110421.1357220305).
Pass the parsed cookie data in your referral to shop.com.
shop.com should store the parsed cookie data in its session.
Last but not least, once shop.com has your referral data, it should load your GA code and set up your session like this:
ga('create', 'UA-YOUR-GA-CODE', {'cookieDomain': 'shop.com', 'clientId': 'USERS-PARSED-SESSION'});
What this does is it passes your GA session to their domain. At this point, GA will keep their session going so you can track what happens on shop.com. Any conversion data they pass to their GA code should be passed to your GA as well.
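The cookie-parsing step described above can be sketched as follows. The answer reads the cookie in PHP; this sketch does the same string handling in JavaScript. The function names and the ref_cid query parameter are hypothetical, and the four-part cookie layout is taken from the answer's example.

```javascript
// Extract the client ID (the last two dot-separated parts) from a _ga cookie
// value such as "GA1.3.367110421.1357220305". Returns null for other shapes.
function parseGaClientId(cookieValue) {
  if (typeof cookieValue !== 'string') return null;
  const parts = cookieValue.split('.');
  if (parts.length !== 4) return null;
  return parts.slice(2).join('.');
}

// Append the client ID to the outgoing referral link. The parameter name
// "ref_cid" is an assumption; shop.com would need to agree on it.
function appendClientId(targetUrl, clientId, paramName = 'ref_cid') {
  const url = new URL(targetUrl);
  url.searchParams.set(paramName, clientId);
  return url.toString();
}
```

shop.com would then read that parameter, keep it in its session, and pass it as the clientId option in the ga('create', ...) call shown above.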
Is it bulletproof? No. You have to trust shop.com to properly retain and report your referred GA session ID. But I have to use this methodology to keep my sessions between my primary sites and the centralized checkout we use, and it preserves my AdWords conversions, etc.
I feel like if you're looking for ease of use for the client, Google Analytics is a pretty solid option. It is a widely used tool, with lots of documentation and active forums for feedback. Also, from my research on the topic it seems that they've got this type of behaviour in mind already.
An alternative that comes to mind is that, when redirected from site A to site B, users should be forced to authenticate on site B. You could then set up an authentication form that is unique to this referral from site A, and whose submissions are filtered into your database separately from regular authentications on site B.

Security in embedded iframe/javascript widget

I'm building a website that is functionally similar to Google Analytics. I'm not doing analytics, but I am trying to provide either a single line of javascript or a single line iframe that will add functionality to other websites.
Specifically, the embedded content will be a button that will popup a new window and allow the user to perform some actions. Eventually the user will finish and the window will close, at which point the button will update to a new element reflecting that the user completed the flow.
The popup window will load content from my site, but my question pertains to the embedded line of javascript (or the iframe). What's the best-practice way of doing this? Google Analytics and Optimizely use javascript to modify the host page. Obviously an iframe would work too.
The security concern I have is that someone will copy the embed code from one site and put it on another. Each page/site combination that implements my script/iframe is going to have a unique ID that the site's developers will generate from an authenticated account on my site. I then supply them with the appropriate embed code.
My first thought was to just use an iframe that loads a page off my site with url parameters specific to the page/site combo. If I go that route, is there a way to determine that the page is only loaded from an iframe embedded on a particular domain or url prefix? Could something similar be accomplished with javascript?
I read this post which was very helpful, but my use case is a bit different since I'm actually going to pop up content for users to interact with. The concern is that an enemy of the site hosting my embed will deceptively lure their own users to use the widget. These users will believe they are interacting with my site on behalf of the enemy site but actually be interacting on behalf of the friendly site.
If you want to keep it as a simple, client-side only widget, the simple answer is you can't do it exactly like you describe.
The two solutions that come to mind for this are as follows, the first being a compromise but simple and the second being a bit more involved (for both you and users of your widget).
Referer Check
You could validate the referer HTTP header to check that the domain matches the one expected for the particular Site ID. Keep in mind that not every request will carry it: browsers omit it when an HTTPS page requests a resource over plain HTTP, and a page's referrer policy or a browser privacy plugin can withhold it. In those cases your widget would not work, or you would need an extra, clunky step in the user experience.
Website www.foo.com embeds your widget using say an embedded script <script src="//example.com/widget.js?siteId=1234&pageId=456"></script>
Your widget uses server side code to generate the .js file dynamically (e.g. the request for the .js file could follow a rewrite rule on your server to map to a PHP / ASPX).
The server side code checks the referer HTTP header to see if it matches the expected value in your database.
On match the widget runs as normal.
On mismatch, or if the referer is blank/missing, the widget will still run, but there will be an extra step that asks the user to confirm that they have accessed the widget from www.foo.com
In order for the confirmation to be safe from clickjacking, you must open the confirmation step in a popup window.
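The server-side decision in the steps above could look roughly like this. It is a sketch only: checkReferer is a hypothetical name, and the expected host is assumed to come from your own lookup of the site ID in the database.

```javascript
// Decide how the widget should start, based on the Referer header of the
// request for widget.js. Returns 'run' on a match and 'confirm' when the
// header is missing, unparsable, or points at a different host.
function checkReferer(refererHeader, expectedHost) {
  if (!refererHeader) return 'confirm';
  let host;
  try {
    host = new URL(refererHeader).hostname;
  } catch (err) {
    return 'confirm'; // treat an unparsable header like a missing one
  }
  // Compare the parsed hostname exactly; a substring match would let
  // https://evil.com/?www.foo.com through.
  return host.toLowerCase() === expectedHost.toLowerCase() ? 'run' : 'confirm';
}
```

For example, checkReferer(request.headers.referer, 'www.foo.com') would be called while generating the .js response, and the 'confirm' result would switch the widget into the extra confirmation step.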
Server Check
This could be a bit over-engineered for your purposes, and it runs the risk of becoming too complicated for clients who wish to embed your widget. You decide.
Website www.foo.com wants to embed your widget for the current page request it is receiving from a user.
The www.foo.com server makes an API request (passing a secret key) to an API you host, requesting a one time key for Page ID 456.
Your API validates the secret key, generates a secure one time key and passes back a value whilst recording the request in the database.
www.foo.com embeds the script as follows <script src="//example.com/widget.js?siteId=1234&oneTimeKey=231231232132197"></script>
Your widget uses server side code to generate the js file dynamically (e.g. the .js could follow a rewrite rule on your server to map to a PHP / ASPX).
The server side code checks the oneTimeKey and siteId combination to check it is valid, and if so generates the widget code and deletes the database record.
If the user reloads the page, the above steps would be repeated and a new one-time key would be generated. This would guard against evil.com scraping the embed code and parameters from the page.
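A rough sketch of the issue-and-redeem part of the flow above, for Node.js. The in-memory Map stands in for the database table, and the function names are hypothetical. Validation of the site's secret key and key expiry are deliberately left out to keep the sketch short.

```javascript
const crypto = require('node:crypto');

// Stand-in for the database table of issued keys (assumption: a real
// deployment would persist these and expire them after a short time).
const issuedKeys = new Map();

// Called by your API after it has validated the site's secret key.
function issueOneTimeKey(siteId, pageId) {
  const key = crypto.randomBytes(16).toString('hex'); // unpredictable key
  issuedKeys.set(key, { siteId, pageId });
  return key;
}

// Called while generating widget.js. Returns the page ID on success and
// null otherwise. A successful redemption deletes the record, so the same
// key cannot be used twice.
function redeemOneTimeKey(siteId, key) {
  const record = issuedKeys.get(key);
  if (!record || record.siteId !== siteId) return null;
  issuedKeys.delete(key);
  return record.pageId;
}
```

A scraped embed code then fails on evil.com because its key has already been redeemed by the original page load.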
The response here is very thorough and provides lots of great information and ideas. I solved this problem by validating X-Frame-Options headers on the server side, though support for those is incomplete in browsers and they are possibly spoofable.

Store product details from URL

Is there a method, or is it even possible, to get a product's details from a URL? Let's say I paste the URL of a product from a store like Walmart or Best Buy: would it be possible to write something that retrieves the product info (price, name, info, etc.)? Does this exist, or would it have to be something site-specific that I write for each store?
One solution I see is to parse the HTML code of the page the URL redirects to using for example Tika, but I'm not sure the e-commerce website in question will like that very much :) Maybe you could ask them if they have implemented an API to access their products data?
Yes, it is possible, but not with JavaScript in the browser, because of the same-origin policy. You must send that URL to your server, read the external page on the server side, and return the results back to the browser.
On the server side (in whichever language you are using) download the web page, parse it (using xml/xpath if you can) and extract relevant information.
As already noted, watch out: some websites forbid such access (called web scraping), and others might actively try to prevent it, e.g. by detecting fake clients.
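As one illustration of the "parse it and extract relevant information" step, here is a sketch that runs on the server against HTML you have already downloaded. It swaps in a different technique from the XML/XPath parsing mentioned above: it assumes the store page embeds schema.org Product data as JSON-LD, which many shops do but which is not guaranteed for any particular site. The function name extractProduct is hypothetical.

```javascript
// Look for <script type="application/ld+json"> blocks in the page and return
// the first schema.org Product found, reduced to name, price, and currency.
// Returns null when the page carries no such data (assumption: it may not).
function extractProduct(html) {
  const pattern = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match;
  while ((match = pattern.exec(html)) !== null) {
    let data;
    try {
      data = JSON.parse(match[1]);
    } catch (err) {
      continue; // skip malformed blocks
    }
    const items = Array.isArray(data) ? data : [data];
    for (const item of items) {
      if (item && item['@type'] === 'Product') {
        const offer = Array.isArray(item.offers) ? item.offers[0] : item.offers;
        return {
          name: item.name ?? null,
          price: offer?.price ?? null,
          currency: offer?.priceCurrency ?? null,
        };
      }
    }
  }
  return null;
}
```

Pages without embedded product data would still need the site-specific parsing the question anticipates.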
What you're talking about is website scraping and yes, it's possible and there are loads of tools out there to help you with it. Some websites aren't happy with you doing it though.
You could do it in C# using the HttpWebRequest class to request data from a URL and then parse it with something like XmlReader or the Html Agility Pack (http://html-agility-pack.net/).

Is it possible to use the Google Analytics API to provide stats for customer's page views?

Let's say I run a site where customers are willing to pay for a page that shows some sort of cool info about them. The whole site is tracked using Google Analytics.
To provide stat tracking for the customers, would it be possible to mine the data from Google Analytics, using the AJAX API?
Are there any show-stoppers I should look out for before attempting this?
I'm trying to avoid writing my own stat tracking solution.
Update, a bit more clarification: I'm looking to be able to build a stats page that shows a few stats for a specific url (page views, traffic sources, etc...), not necessarily in real-time. I would cache the page to prevent hitting API rate limits.
There are 2 major impediments: one technological, and one legal-ish. Together, they make the Google Analytics Data Export API an unfit solution.
Technological: Google Analytics data is not available in real time. Delays in data processing run from 3-4 hours to 24-48 hours. (Page views are processed fastest; things like custom variables often take a day or so.) In theory, you could tag each user with a custom variable, and then query against that custom variable for information.
Legal-ish: The Google Analytics Terms of Service prohibit you from collecting personally identifiable information. So, you can't use a custom variable that stores their username on your site without violating the Terms of Service. Here's the relevant section.
PRIVACY. You will not (and will not allow any third party to) use the Service to track or collect personally identifiable information of Internet users, nor will You (or will You allow any third party to) associate any data gathered from Your website(s) (or such third parties' website(s)) with any personally identifying information from any source as part of Your use (or such third parties' use) of the Service. You will have and abide by an appropriate privacy policy and will comply with all applicable laws relating to the collection of information from visitors to Your websites. You must post a privacy policy and that policy must provide notice of your use of a cookie that collects anonymous traffic data.
As far as alternatives go, it depends on what information you want. You can access the visitor's IP address on the server side and use that with a third-party tool or a command line call to find out their rough location (much the same way that Google does). You can similarly access their referer on the server side. Much of the information that gets sent to Google actually gets stored in the Analytics cookies (the __utm-prefixed cookies). There's a wide body of literature on reading these cookies (see: http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=how+to+parse+google+analytics+cookies).
