Client side to server side google analytics MP client ID - javascript

I am trying to send transaction data to UA from a webshop which is only creating orders once it has received a "push" from a payment processing API. A success page is displayed to the customer independently from the order being created, meaning the tracking script (UA via GTM with data layer) does not have access to the order data.
This gives me the headache of trying to track transactions made through the webshop server side, instead of client side. Yet I still want to tie the transaction to the unique visitor ID who made the purchase, not just a random Client ID each time.
My site is using analytics.js (a UA property), so I assumed the client ID would be stored in a cookie somewhere. I do see a _ga cookie, which I believe holds the client ID, and it looks like this:
GA1.2.1586737968.1429871710
The documentation for getting cookie and user identification states the following:
You should not directly access the cookie analytics.js sets, as the cookie format might change in the future. Instead, developers should use the readyCallback to wait until analytics.js is loaded, and then get the clientId value stored on the tracker.
... which is not helpful to me, as I have to do this server side. But anyway, this client ID does not even fit the description of what a client ID looks like, although it does appear to match a legacy format. Sort of.
Couple of questions then:
Is it just those last two numbers 1586737968.1429871710 that I need to be parsing from the _ga cookie and sending as a client ID? Or is the whole thing ok?
Are there any scripts/libraries that will do this for me, so I don't have to worry about Google suddenly giving new visitors a client ID based on the new UUID v4 format?
Does this approach have any obvious flaws?

Answers:
Yes, the cid value should be just the last two numbers, as you rightly point out. (This comes from experience with client setups that already use the Measurement Protocol successfully on an ongoing basis.)
Not as far as I am aware; it is best to do it yourself. It only takes a few lines of code to detect which format you're seeing in the cookie and, therefore, what you need to send through.
Not really. It is a common enough scenario.
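As a rough illustration of those "few lines of code", here is a minimal sketch that pulls the last two numbers out of a _ga cookie value shaped like the one in the question. The function name is made up for this example, and the assumed layout (a GA version field, a second field, then the client ID) is based only on the sample cookie above, so treat it as a starting point rather than a guaranteed format.

```javascript
// Hypothetical helper: extract the Measurement Protocol `cid` from a raw _ga cookie value.
// Assumed layout, based on the sample cookie: GA1.2.<number>.<number>
function clientIdFromGaCookie(cookieValue) {
  if (typeof cookieValue !== "string") return null;
  const parts = cookieValue.trim().split(".");
  // Reject anything that does not start with a GA version field or is too short.
  if (parts.length < 4 || !/^GA\d+$/.test(parts[0])) return null;
  // Everything after the first two fields is treated as the client ID.
  return parts.slice(2).join(".");
}

console.log(clientIdFromGaCookie("GA1.2.1586737968.1429871710"));
```

Returning null for an unrecognised value lets the caller decide what to do if the cookie format ever changes, instead of silently sending a malformed cid.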

Related

How to secure API Routes in Next.js [duplicate]

I have a restaurant locator web application that mashes up the locations of restaurants onto a Google Map.
I use jQuery sliders to limit the number of restaurants shown on the map through search filters such as price, type of food, and locale.
These jQuery sliders call back via AJAX to an API I created, which updates the map without the web page having to refresh.
jQuery calls a RESTful API like so:
http://example.com/search/?city=NYC&max-price=50&cuisine=french
This returns a JSON string of restaurants which match these criteria, so that my web application can display all the matching restaurants on the map.
What I don't want to happen is for someone to come along, figure out my API, and dump ALL of my restaurant listings.
Is there a way to limit who can call the above HTTP API, so that only my web server calls the URL and not spammers/hackers looking to dump my database?
Thanks
First, declare your intentions in robots.txt.
Then, send a Set-Cookie header with a nonce or some kind of unique ID on the main page, but not on your API responses. If the cookie is never sent to your API endpoint, return a 401 Unauthorized response, because it's a bot, a very broken browser, or somebody who is rejecting your cookies. The Referer header can also be used as an additional check, but it's trivial to fake. Keep track of how many API calls have been made by that ID. You may also want to match IDs to IP addresses. If the count goes above your threshold, spit back a 403 Forbidden response. Make your threshold high enough that legitimate users don't get caught by it.
Keep good logs, and highlight 401 and 403 responses.
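The cookie check and per-ID threshold described above could be sketched roughly as follows. The function name, the in-memory Map, and the limit of 1000 calls are all assumptions made for illustration; a real deployment would need expiry of old IDs and storage shared across server processes.

```javascript
// Hypothetical threshold; tune it so legitimate users never reach it.
const MAX_CALLS_PER_ID = 1000;

// In-memory counter keyed by the ID from the cookie.
// Assumption: a single server process and no expiry of old entries.
const callCounts = new Map();

// Returns the HTTP status the API endpoint should answer with for this request.
function statusForApiCall(visitorId, counts = callCounts, limit = MAX_CALLS_PER_ID) {
  if (!visitorId) return 401; // the cookie was never sent
  const used = (counts.get(visitorId) || 0) + 1;
  counts.set(visitorId, used);
  return used > limit ? 403 : 200; // over the threshold: refuse
}
```

The request handler would read the ID from the cookie, call statusForApiCall, and log every 401 and 403 it returns.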
Realistically, if someone is determined enough, they WILL be able to dump this information. Your goal shouldn't be to make this impossible, because you will never succeed. (See all the usual adages about achieving perfect security.) Instead, you want to make it abundantly clear that:
This behavior violates the terms of service.
You are actively trying to prevent this.
You know that the offender exists and roughly who they are.
Scary lawyers might start getting involved if this continues.
(You do have a lawyer, right?)
To achieve this, be sure the body of your 403 Forbidden response conveys a scary sounding message along the lines of "This request exceeds the maximum allowed usage of the API. Your IP address has been logged. Please refer to the terms of service and obey the directives in robots.txt."
IANAL, but I believe the DMCA can be made to apply in this situation if you claim copyright on your database. This essentially means that if you can track illegal usage of your API to an IP address, you can send a nastygram to their ISP. This should always be a last resort of course.
I don't encourage the use of assigned API keys/tokens because they turn out to be a barrier to adoption and kind of a pain in the neck to manage. As a counterpoint to @womp's answer, Google is slowly moving away from their use. Also, I don't think they actually apply in this case, because it sounds like your "API" is more like a JSON call that's used mainly on your own site.
All the big REST APIs tend to use tokenized authentication: before you make a REST request, you have to send another request to the token service to fetch a token to include with your data request. Bing Maps does this, Amazon does this, Flickr does this, and so on.
I don't know too much about it other than having worked with Bing Maps. You'll need to read up on tokenized authentication with REST. Here's a blog post to get you started: http://www.naildrivin5.com/daveblog5000/?p=35

Getting user info using Client ID

I have inserted the analytics.js tracking script into my code, and now I am trying to get user data such as medium, source, etc. using javascript and putting them into variables. Is there a way I can do this using Client Id?
I assume you mean getting the data in realtime for use in your website. That is not possible.
The client ID is not exposed in the interface by default; you'd need to use a custom dimension.
There is a processing delay, so report data may only be reliable the next day.
While there is (less reliable) data from the real-time API, which at least contains medium and source information, it does not support custom dimensions, so you could not use the client ID as a query key.
Also, to retrieve data from the API you need to be authenticated, which the users currently on your webpage are not. So you would need to set up some kind of server-side proxy that handles authentication for you.
Also there are API limits determining how many requests you can make in a given time frame. Even a small site would exhaust those requests pretty quickly.
So while in theory this sounds doable it is not actually feasible for any real-life purpose.
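For the custom dimension mentioned above, a minimal sketch with analytics.js might look like this. It assumes that analytics.js is already loaded, that a tracker has been created, and that a custom dimension exists in the property at the given index (index 1 here is purely hypothetical). The wrapper function is invented for this example; it takes `ga` as a parameter only so that it can be exercised outside a browser.

```javascript
// Hypothetical wrapper: copy the tracker's client ID into a custom dimension.
// Assumption: the dimension index matches one configured in the GA property.
function storeClientIdInDimension(ga, dimensionIndex = 1) {
  // Passing a function to ga() runs it once the library has loaded,
  // with the default tracker as its argument.
  ga(function (tracker) {
    const clientId = tracker.get("clientId");
    // Hits sent after this call carry the client ID in the custom dimension.
    ga("set", "dimension" + dimensionIndex, clientId);
  });
}
```

On a page this would be called as storeClientIdInDimension(window.ga) before the pageview is sent. It only makes the client ID available in reports; it does not remove the processing delay or the API limits described above.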

can someone change the data sent to the server if I have the data in a div

I have a question about security. Let's say I have data in a div like so:
<div id="Q9vX" class="mainContent" data-compname="comp1" data-user="57f70c8e78ae49d41c78876a" data-shortid="Hy85nKVR">
and I make a POST request that sends the compname and user ID. Couldn't someone change the data-user attribute value before it was sent? Since I'm doing DB operations based on the ID in the div, can someone change the ID and have the operation occur for the ID the villain entered rather than the one I intended? I use MongoDB, Heroku, and Express. I'm afraid of sessions because they expire and I'm not too comfortable with them. What is the standard procedure for something like this?
For example, this div is for a review placed by the user with the ID 57f70c8e78ae49d41c78876a. If everything goes normally and the user presses submit, the review will be assigned to that user's ID. But let's say someone goes into Firebug and changes the ID: would the review be registered to this new ID?
The value could be changed by a user through the developer tools, or through a cross-site scripting attack where malicious code is injected onto the page. This could be done in a number of ways, such as adding the code to a file on your server, adding the code to your database if the site uses a CMS, or through other means such as a browser extension.
If you have no server-side access controls, someone could write a script that compromises the availability or integrity of your data. The availability could be compromised through a denial-of-service attack, where thousands of fake requests are sent to the server to exceed the number of concurrent database connections, preventing legitimate users from connecting. The data integrity could be compromised by sending lots of fake requests to the database, which could be difficult and time-consuming to identify and remove. Also, if the review is like a comment box where users can enter data that's displayed on the site, it could be used to inject malicious cross-site scripting code.
If you are concerned about security, I recommend that you implement access control such as sessions, sanitize the data going into and coming out of your database, and use a secure HTTP connection on your web server.
The Express JS website has an article about security best practices.
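To make the session recommendation concrete, here is a framework-agnostic sketch of the idea: the user ID is taken from server-side session state and never from the submitted form. The function name, the `session.userId` property, and the `text` field are hypothetical names chosen for this example; in an Express app the session object would come from whatever session middleware is in use.

```javascript
// Hypothetical shapes: `session.userId` is set by the server at login;
// `body` is the parsed POST payload and is untrusted.
function buildReview(session, body) {
  if (!session || !session.userId) {
    throw new Error("not authenticated");
  }
  const input = body || {};
  return {
    user: session.userId, // trusted: comes from the server-side session
    compname: String(input.compname || ""), // untrusted: validate before use
    text: String(input.text || ""), // untrusted: sanitize before display
  };
}
```

With this shape, editing data-user in the browser has no effect, because any user field in the request body is simply ignored.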

Restrict the registration for a machine C#

I have a web application that uses forms-based authentication.
The application also has registration functionality. Over the last few weeks, I have observed that some users from a specific domain are making fake entries on the website and getting the benefits, as we do not have any approval workflow.
These users either do it manually or run some script. We thought we could restrict the registration process by IP address; however, it is not possible to get the visitor's exact IP address using C# (please correct me if I am wrong).
Can we do it using some other technique? Our requirement is a single registration from a machine per 2 days.
Unfortunately, I would call this mission impossible.
Idea 1: IP address. The user can register multiple accounts through proxies, limited only by how many proxies they can find (there are plenty available for free on the internet),
or they could just fake the IP packet by putting a random IP in the header. Since all they need is to register, it doesn't matter if the confirmation message is sent to some other random person.
Idea 2: one registration per machine. I could fake as many machines as I want with virtual machines, and you would have no way to tell from the HTTP request.
Alternatively, I could fake all the information with raw HTTP requests, and I could do that from a script with no issue.
And from what I know, JavaScript does not have the rights to read a hardware ID (correct me if I'm wrong).
No method is guaranteed to enforce the registration limit, but an IP-based method should work against most normal users. Do keep in mind that everyone behind the same router could have the same IP (for example a school, or public Wi-Fi in an apartment building).
You can find the user's IP address through the HttpContext object.
Whatever your restriction is, it will be based on the data the browser sends (as long as you are restricting a specific computer).
Your main goal is to create a "footprint" on that machine so that you can check it later on each request.
Whatever mechanism you use, you should also obfuscate your JS code.
For example, on page load you can request the HTTP headers specific to that machine and save them in a cache, then generate a GUID for the client, which it is supposed to use in order to register.
Another option is to use AES to encrypt the data before sending it "on the wire", so that it can't be manipulated in transit.
The most important thing is that once you "drop" JS code on the client, they can do whatever they want with it; the question is how hard it would be.
Edit:
A more secure but more complicated approach that I have used once is creating a sync key:
an async AJAX call to the server requests an encryption key.
The server saves the new GUID key in memory and generates a new one for each request.
You can use this idea to keep track of debugging and browser behavior on the user's side.
Since debugging holds the code from running, the sync key will have changed in the meantime and you can "catch" the user.
Neither cookies nor IP can protect against fake entries.
You should look at it from another side. You get unwanted entries and you don't know whether they come from an automated bot, a spammer, or someone who just doesn't care about your data. Instead of banning entries, you should think about how to validate them. For example, if you get "aaaaa" as a name and "bbbbb" as an email address, add at least regexp validation on both the client and the server side to ensure you get data in the required format. The next level would be to verify the email address by querying the mail server or sending a validation email. This will help to stop not only spammers, but also people who don't care. If you think it's an automated bot, add a captcha. In case of emergency, ban the IP in the web.config (see ASP.Net How to limit access to a particular IP address to a particular page through web.config file (.htaccess similar)?)
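A minimal sketch of the format validation step, written in JavaScript so that the same check can run in the browser and be mirrored on the server. Both patterns are deliberately loose assumptions made for this example (the email pattern is not a full address grammar, and the name rule is invented), and the function name is hypothetical.

```javascript
// Loose pattern (an assumption, not a full address grammar): something@something.tld
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
// Hypothetical rule: starts with a letter; letters, spaces, apostrophes, hyphens; 2 to 100 chars.
const NAME_PATTERN = /^\p{L}[\p{L} '-]{1,99}$/u;

// Returns the list of fields that failed validation (empty when everything passes).
function validateRegistration({ name, email }) {
  const errors = [];
  if (typeof name !== "string" || !NAME_PATTERN.test(name.trim())) errors.push("name");
  if (typeof email !== "string" || !EMAIL_PATTERN.test(email.trim())) errors.push("email");
  return errors;
}
```

A format check like this rejects "bbbbb" as an email address but still accepts "aaaaa" as a name, which is why the email verification and captcha steps above are the stronger filters.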

Analytics Measurement Protocol - how do I use the Client ID?

TL;DR: Can't find clear information on how to set/get the Client ID to make any server-side tracking request. Need to understand how to work with the Client ID.
I'm planning to use Analytics Measurement Protocol to send a custom pageview from the server (I'm using PHP).
Standard page-track request looks like this:
v=1 // Version.
&tid=UA-XXXXX-Y // Tracking ID / Property ID.
&cid=555 // Anonymous Client ID.
&t=pageview // Pageview hit type.
&dh=mydemo.com // Document hostname.
&dp=/home // Page.
&dt=homepage // Title.
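For illustration, the same payload could be assembled in code along these lines. This is only a sketch in Node.js rather than PHP; the function names are made up for the example, and it builds the request body without sending it. It generates a random version 4 UUID for cid, as the documentation quoted further down describes; how that value is stored and reused per visitor is left open here.

```javascript
const { randomUUID } = require("crypto");

// Hypothetical helper: a fresh random UUID (version 4) to use as the client ID.
function newClientId() {
  return randomUUID();
}

// Builds the form-encoded body of a pageview hit from the fields listed above.
function buildPageviewPayload({ trackingId, clientId, hostname, page, title }) {
  const params = new URLSearchParams({
    v: "1", // Version.
    tid: trackingId, // Tracking ID / Property ID.
    cid: clientId, // Anonymous Client ID.
    t: "pageview", // Pageview hit type.
    dh: hostname, // Document hostname.
    dp: page, // Page.
    dt: title, // Title.
  });
  return params.toString();
}

console.log(buildPageviewPayload({
  trackingId: "UA-XXXXX-Y",
  clientId: newClientId(),
  hostname: "mydemo.com",
  page: "/home",
  title: "homepage",
}));
```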
In order to make the request, I need to set cid (Client ID). This is what the documentation tells about it:
Required for all hit types.
This anonymously identifies a particular user, device, or browser
instance. For the web, this is generally stored as a first-party
cookie with a two-year expiration. For mobile apps, this is randomly
generated for each particular instance of an application install. The
value of this field should be a random UUID (version 4) as described
in http://www.ietf.org/rfc/rfc4122.txt
For me, the whole point of using the Analytics Measurement Protocol is not to use JS to track specific hits. JS can throw errors, older browsers might not be so developer-friendly, users tend to use browser extensions to block not only ads, but trackers as well. Having said that:
Is there any way to get the Client ID in PHP, and do I even need to
do that?
Can I just generate a random UUID (v.4) every time I need to send a
pageview or an event?
I understand the Client ID should be unique per client. How do I make sure it really is?
Let me add that I'm working with legacy code that uses the old ga.js library for Google Analytics.
UPDATE:
I found a post by Dave Meindl from 2013 showing an example implementation. It seems he basically creates a new UUID every time and uses it as the Client ID. If someone could confirm that that's the way to go, I would be so happy.
