Issue with Adblockers on server-side GTM - javascript

I am using server-side GTM, but I am facing adblocking issues with the request below, which retrieves the gtm.js file:
https://example.gtmdomain.com/gtm.js?id=GTM-MY_GTM_ID
The request works fine when I don't use adblockers.
Is there a way to rename the endpoint to something else, such as https://example.gtmdomain.com/secret_file_name.js?id=GTM-MY_GTM_ID in order to not be blocked by adblockers?

Server-side GTM is exactly what it says: it's executed on the server. It listens for network requests, it has no exposure to what happens on the front-end, and the front-end has no clue that a server-side GTM exists, unless there are explicit calls to its endpoint, which you can proxy through your own backend when needed.
What you're experiencing is adblockers blocking your front-end GTM container. Even though it's theoretically possible to track everything you need, including front-end events, with server-side GTM alone, it's considered best practice to use both GTMs and stream front-end events to the back-end GTM through the front-end GTM.
This, of course, leaves you at the mercy of adblockers, since they will block your front-end GTM. The way to avoid it is not to use the front-end GTM at all and implement your tracking either in a tag manager that isn't blocked (I doubt there is one) or in your own custom JavaScript library that does all the front-end tracking and sends it to the backend GTM to be properly processed and distributed.
Generally, it's too expensive to implement tracking with no TMS, since now you really have to know your JS, so only the cool kids can afford to do this. A good example would be Amazon.
Basically, it would cost about two to five times more (depending on particulars) to implement tracking with no TMS, but adblockers typically cut only about 10% of traffic. Losing 10% is not vital for reporting, measuring funnel effectiveness and the like, and the critically important data isn't reported through analytics anyway; the backend is the real source of it.
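If you go that route, the front-end piece can stay small. Below is a minimal sketch of a custom tracking call that posts events straight to a server-side GTM endpoint; the domain, the /events path and the payload shape are assumptions and depend on which client you configure in the server container.

// Minimal "no front-end GTM" tracking: send events straight to the
// server-side GTM endpoint. Assumes a client in the server container
// is set up to claim requests on a custom path such as /events.
function track(eventName, payload) {
  var body = JSON.stringify(Object.assign({ event: eventName, page: location.pathname }, payload));
  // sendBeacon survives page unloads; fall back to fetch where it's unavailable.
  if (navigator.sendBeacon) {
    navigator.sendBeacon('https://example.gtmdomain.com/events', body);
  } else {
    fetch('https://example.gtmdomain.com/events', { method: 'POST', body: body, keepalive: true });
  }
}

track('purchase', { value: 42, currency: 'EUR' });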

You can easily do this if you use sGTM hosting from https://stape.io
There is a feature called Custom Loader. With it, you can load web GTM from a different path, and all related scripts, for example gtag.js for GA4, will also be downloaded from different URLs.
More info: https://stape.io/blog/avoiding-google-tag-manager-blocking-by-adblockers
You can also create your own custom loader client for web GTM. However, there will be problems with related scripts: UA/GA4 will still be blocked, but GTM itself will not.

So, I finally implemented a great solution using GTM client templates. It works like a charm.
To summarize, the steps are:
Create a client template from your server container. You can import this template from https://raw.githubusercontent.com/gtm-templates-simo-ahava/gtm-loader/main/template.tpl
Create a new client from this client template
Name your path as you want
This article explains the required steps perfectly: https://www.simoahava.com/analytics/custom-gtm-loader-server-side-tagging/
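For reference, the end result is just the standard GTM loader snippet with the script URL swapped for the custom path your new client claims; roughly like this, with the domain and file name taken from the question:

// Standard GTM loader snippet, with j.src pointed at the custom client path
// served by the server-side container instead of googletagmanager.com.
(function (w, d, s, l, i) {
  w[l] = w[l] || [];
  w[l].push({ 'gtm.start': new Date().getTime(), event: 'gtm.js' });
  var f = d.getElementsByTagName(s)[0],
      j = d.createElement(s),
      dl = l != 'dataLayer' ? '&l=' + l : '';
  j.async = true;
  j.src = 'https://example.gtmdomain.com/secret_file_name.js?id=' + i + dl;
  f.parentNode.insertBefore(j, f);
})(window, document, 'script', 'dataLayer', 'GTM-MY_GTM_ID');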

Related

Questions on Security in React.js

I am building my first application in React.js and I am wondering how to implement some basic security on the client side, because there are vulnerabilities I can see: for example, if you view the page source, the script tags show you all the components you made, with all their functionality and rendered pages. Also, if there is a basic method to stop XSS from happening that I can build on, I would like to see that as well.
I am mainly concerned that anyone can view the page source of a React app and see the components from the script tags.
You're not going to be able to prevent people from looking at the source code in the browser, since the browser has to receive it and render it. You can make it a little harder for people to open the inspector, but there is always a way to get to it.
As for XSS, all you can do on the client side is validate and sanitize input, but an attacker can get around that by watching the network traffic and submitting bad data directly through their own HTTP requests.
Client side is just that, served to the client.
What do you mean by sources? It's still JavaScript and everybody can see the source, but it will be uglified and minified by Webpack.
Regarding XSS: with React you mostly don't need to worry about it by default. Your code is already protected thanks to JSX, since string variables in views are escaped automatically.
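For illustration, this is what the automatic escaping looks like in practice (a hypothetical component):

// Strings interpolated in JSX are escaped automatically, so injected markup
// renders as text instead of executing.
function Comment({ text }) {
  // If text is '<img src=x onerror="alert(1)">', the user sees the literal
  // string; React does not insert it as HTML.
  return <p>{text}</p>;
}
// The escape hatch is dangerouslySetInnerHTML; only use it with sanitized input.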
I suggest you secure your service on the back-end, because anyone can send requests to the server through Postman, or record and replay your requests in Burp Suite;
so use security solutions in your back-end code, and after that use an object schema validator in your React app to help prevent XSS attacks.

Difference between WebApp that connects to API vs backend rendering

Sometimes when I create basic web tools, I will start with a Node.js backend, typically creating an API server with Express.js. When certain routes are hit, the server responds by rendering the HTML from EJS using the live state of the connection and then sends it over to the browser.
This app will typically expose a directory for the public static resources and will serve those as well. I imagine this creates a lot of overhead for this form of web app, but I'm not sure.
Other times I will start with an API (which could be the exact same nodeJS structure, with no HTML rendering, just state management and API exposure) and I will build an Angular2 or other HTML web page that will connect to the API, load in information on load, and populate the data in the page.
These pages tend to rely on a lot of AJAX calls and jQuery in order to refresh angular components after a bunch of async callbacks get triggered. In this structure, I'll use a web server like Apache to serve all the files and define the routes, and the JS in the web pages will do the rest.
What are the overall strengths and weaknesses of both? And why should I use one strategy versus the other? Are they both viable and dependent upon scale and use? I imagine horizontal scaling with load balancers could work in both situations.
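For reference, here is a stripped-down sketch of the two setups I'm describing (Express; the route, view and data names are made up):

// Two setups in one Express app: a server-rendered EJS route and a JSON API route.
const express = require('express');
const app = express();

app.set('view engine', 'ejs');
app.use(express.static('public')); // serve static assets in both setups

// Stand-in for real data access (database, session state, etc.)
async function loadStats() {
  return { users: 123, uptime: process.uptime() };
}

// 1) Backend rendering: the server builds HTML from live state and sends it.
app.get('/dashboard', async (req, res) => {
  res.render('dashboard', { stats: await loadStats() });
});

// 2) API only: the server exposes JSON; an Angular/React page fetches and renders it.
app.get('/api/stats', async (req, res) => {
  res.json(await loadStats());
});

app.listen(3000);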
There is no universally good or bad approach to choose here. Each of the approaches you described above has some advantages, and you need to decide which one suits your project best.
Some points that you might consider:
Server-side processing
Security - You don't have to expose sensitive information (API tokens, logins, etc.).
More control - You will have more control over what you do with your resources.
"Better" client support - Some clients (e.g. IE) do not support the same things as others. Rendering HTML on the server rather than manipulating it on the client gives you broader client support.
It can be simpler to pre-render your resources on the server rather than dealing with an asynchronous approach on the client.
SEO, social sharing, etc. - However your server sends resources is how bots see them. If you pre-render everything on the server, a bot will be able to scrape your site, tag it, etc. If you do it on the client, it will just see a non-processed page. That being said, there are ways to work around that.
Client-side processing
Waiting times - Doing work on the client side can improve your load times, but be careful not to do too much, since JS is single-threaded and heavy work will block your UI.
CDN - You can serve static resources (HTML, CSS, JS, etc.) from a CDN, which will be much faster than serving them from your server app directly.
Testing - It is easy to mock the backend server when testing your UI.
The client is a front-end for a particular application/device, etc. The more logic you put into the client, the more code you will have to replicate across different clients. Therefore, if you plan to have a mobile app, it will be better to have a collection of APIs to call rather than putting your logic in the client.
Security - Whatever runs on the client can be fully read by the client. No matter how much you minify, compress, or encrypt everything, a resourceful person will always be able to do whatever they want with your code.
I did not mark pro/con on each point on purpose because it is up to you to decide which it is.
This list could go on and on; I didn't want to add more points because it is very subjective, and in the end it depends on the developer and the application.
I personally tend to choose the "client making AJAX requests" approach or a blend of both: pre-render something on the server and let the client take care of the rest. Be careful with the latter though, as it can break your automated tests, IDE integration, etc. if not implemented correctly.
Last note - You should always do crucial validations on the server. Never rely on data from the client.
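A minimal sketch of that last point, re-checking client data on the server regardless of any client-side validation (Express; the field names and rules are made up):

// Never trust the client: validate again on the server even if the UI already did.
const express = require('express');
const app = express();
app.use(express.json());

app.post('/api/orders', (req, res) => {
  const { email, quantity } = req.body || {};
  const errors = [];
  if (typeof email !== 'string' || !/^\S+@\S+\.\S+$/.test(email)) {
    errors.push('invalid email');
  }
  if (!Number.isInteger(quantity) || quantity < 1 || quantity > 100) {
    errors.push('quantity must be an integer between 1 and 100');
  }
  if (errors.length) return res.status(400).json({ errors });
  // ...process the order using only the validated values
  res.status(201).json({ ok: true });
});

app.listen(3000);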

Send info to google tag manager from backend

Working on a website, I use Google Tag Manager and push some info with the dataLayer in JavaScript. So far so good. However, there is some information that should not be seen on the client side, so I'm wondering if it's possible to do the same thing on the backend.
Basically a request to GTM API that does the equivalent of
dataLayer.push({
  'event': 'transaction',
  'something': {
    'superSecret': 42
  }
});
but on the backend. (I've never used the GTM API and I'm not sure if it permits this kind of request. If it's possible, I would appreciate some help.) Thanks!
GTM for the web is basically a JavaScript injector: the interface is there to configure your tags, then everything is wrapped into a JavaScript function that is inserted into your page and executed by the browser. There is no server-side component that you could push data to.
So quite probably the answer is no (unless you want to try really weird workarounds like running the container in a headless browser on your server or trying to abuse the mobile SDKs for GTM, which work rather differently from the web version). It would probably be easier to send your server-side calls directly to the respective tracking services.
Update 07/2021: Server-side GTM moved out of beta a few months ago, so now you can run a container in a virtual machine that proxies tracking requests to tracking vendors. You can hit the endpoint for server-side GTM from your backend (basically with anything that sends HTTP requests), so by now server-side GTM is the way to go for what you asked for. Technically it's a different beast than client-side GTM, but Google did a very good job of making the interface look and feel familiar.
In 2020 Google released Google Tag Manager for server-side tracking, where you run a container in a cloud environment that then distributes the requests.
https://developers.google.com/tag-manager/serverside
Facebook and Google Analytics support this now, so you can move tracking to the server side.
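As a rough sketch, hitting a server-side GTM container from a Node backend can be as simple as an HTTP POST; the endpoint path and payload shape below are assumptions and depend on the client you configure in the container.

// Send a "transaction" event from the backend to a server-side GTM container.
const https = require('https');

function sendToServerGtm(event) {
  const body = JSON.stringify(event);
  const req = https.request({
    hostname: 'example.gtmdomain.com',
    path: '/backend-events', // custom client path configured in the container
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Content-Length': Buffer.byteLength(body)
    }
  }, res => res.resume());
  req.on('error', console.error);
  req.end(body);
}

sendToServerGtm({ event: 'transaction', something: { superSecret: 42 } });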

Can I call a routine or a function automatically when a File is accessed?

Pardon me if I am asking something really stupid, but this is what I want to implement in my new role as an analytics implementer. Some of our files (mostly PDFs) are stored on a web server (CDN server) to reduce some load on the application server.
We provide links to these files to all our users across the world. What I want is to track these file downloads whenever they occur. So I just wanted to know: is there any way I can call a function or a routine from which I can make those tracking calls?
Not really.
If you are using a 3rd party web hosting as CDN, then you could simply get the Analytics reports using whatever tool your host offers.
If you are running your own hosting box, you could install almost any analytics software on it to monitor access. Just one example is provided here: http://ruslany.net/2011/05/using-piwik-real-time-web-analytics-on-iis/
The clean, simple way, however, would be to have a small web application running on that CDN server that accepts file requests and then returns the file (a rough sketch follows the list below). The advantages are that you could:
record whatever statistics you wish off it.
use widely available tools like Google Analytics
make dynamic decisions, for example deciding which version of a file to send based on factors like user bandwidth
transparently handle missing files and path changes, so links will be valid forever
send different caching headers for different files
implement very simple access control and policy based restrictions
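Here is that sketch in Express; the directory, route and logging call are placeholders, and the tracking call could just as well be a Google Analytics Measurement Protocol hit or a write to your own log pipeline.

// Sit a small app in front of the files: record the download, then stream the file.
const express = require('express');
const path = require('path');
const app = express();

const FILES_DIR = path.join(__dirname, 'files');

app.get('/files/:name', (req, res) => {
  const filePath = path.join(FILES_DIR, path.basename(req.params.name));

  // Record the download however you like: log it, push it to a queue,
  // or forward it to an analytics service.
  console.log('download', req.params.name, req.ip, new Date().toISOString());

  res.sendFile(filePath, err => {
    // Handle missing files gracefully so old links never break with a raw error.
    if (err && !res.headersSent) res.status(404).send('Not found');
  });
});

app.listen(8080);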

How can I check if an XMLHttpRequest to my public API is from my own webapp or from a third-party client (to ensure priority)?

Does anybody know of a way of checking on the API side whether an XMLHttpRequest has been made from my own web application (i.e. from the JS I have written) or from a third-party application?
The problem, to me, seems to be that because the JS runs on the client and is thus accessible to anyone, I have no way of secretly communicating to the API server who I am. This matters because otherwise I cannot prioritize requests from my own application over third-party clients in case of high usage.
I could obviously send some undocumented parameters, but these can be spoofed.
Anybody with some ideas?
I would have your web server application generate a token that it passes to your clients, either in JavaScript or in a hidden field, which they in turn use to call your API. Requests with valid tokens get priority; those with missing or invalid tokens don't. The web server application can create and register the token in your system in a way that limits its usefulness to others trying to reuse it (e.g., time-limited).
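A sketch of what such a time-limited token could look like using an HMAC (Node; the secret name and lifetime are made up for illustration):

const crypto = require('crypto');
const SECRET = process.env.PRIORITY_TOKEN_SECRET;

// Issued while rendering the page, e.g. dropped into a hidden field or a JS variable.
function issueToken() {
  const expires = Date.now() + 10 * 60 * 1000; // valid for 10 minutes
  const sig = crypto.createHmac('sha256', SECRET).update(String(expires)).digest('hex');
  return expires + '.' + sig;
}

// Checked on each API request; a failed check just means "no priority", not a rejection.
function hasPriority(token) {
  const parts = String(token || '').split('.');
  if (parts.length !== 2 || Date.now() > Number(parts[0])) return false;
  const expected = crypto.createHmac('sha256', SECRET).update(parts[0]).digest('hex');
  return parts[1] === expected; // use crypto.timingSafeEqual in real code
}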
If you do approve of third party clients accessing your API, perhaps you could provide them with a slightly different, rate-limited interface and document it well (so that it would be easier to use and thus actually be used by third-party clients).
One way to do this would be to have two different API URLs, for example:
/api?client=ThirdPartyAppName&... for third-party apps (you would encourage use of this URL)
/api?token=<number generated from hidden fields from the HTML page using obfuscated code>&... for your own JS
Note that as you mention, it is not possible to put a complete stop to reverse engineering of your own code. Although it can take longer, even compiled, binary code written in such languages as C++ can be reverse engineered, and that threatens any approach relying on secrecy.
A couple of ideas come to mind. I understand that secrets never last, so I agree that's not a good option.
You could run another instance on a different unadvertised port
You could do it over SSL and use certs to identify the client
A simple but less secure way would be to use cookies
You could go by IP address, but that could be an administrative nightmare
