I have inserted the analytics.js tracking script into my code, and now I am trying to get user data such as medium, source, etc. using javascript and putting them into variables. Is there a way I can do this using Client Id?
I assume you mean getting the data in real time for use on your website. That is not possible.
The client ID is not exposed in the reporting interface by default; you'd need to store it in a custom dimension.
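For example, a minimal sketch of capturing the client ID into a custom dimension once analytics.js is ready (dimension1 is an assumption; use whichever index you created in the property settings):

```javascript
// Minimal sketch: wait for analytics.js to load, read the client ID from the
// tracker, and send it along as a custom dimension. 'dimension1' is an
// assumption; use whichever custom dimension index you created in the property.
ga(function (tracker) {
  var clientId = tracker.get('clientId');
  ga('set', 'dimension1', clientId);
  ga('send', 'pageview');
});
```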
There is also a processing delay; report data may only be reliable the next day.
While there is the (less reliable) data from the Real Time API (which at least contains medium and source information), it does not support custom dimensions, so you could not use the client ID as a query key.
Also, to retrieve data from the API you need to be authenticated, which the current visitors to your web page are not, so you would need to set up some kind of server-side proxy that handles authentication for you.
There are also API quotas limiting how many requests you can make in a given time frame; even a small site would exhaust them pretty quickly.
So while this sounds doable in theory, it is not actually feasible for any real-life purpose.
I have built a web application using AngularJS (front-end) and PHP/MySQL (back-end).
I was wondering if there is a way to "watch" the MySQL database (without Node.js), so if one user adds some data to it, the changes are synced to other users too.
For example, I know Firebase does that, but it's an object-oriented database and I am unable to do the advanced queries there that I can with SQL.
I was thinking of using $interval and $http to do Ajax requests, so that way I could detect changes in the database. That's possible, but it would then make thousands of HTTP requests to the server every day, plus interpret PHP on each request.
I believe nothing is impossible; I just need an idea of how to do this, which I don't have, so that's why I am asking for help here.
If you want a form of "real-time communication" you'll likely have to incorporate some form of long-polling from the client, unless you use WebSockets, but that's a big topic covering a bunch of different things. You're right to be concerned about bandwidth and demand on the DB, though. So here's my suggestion:
If you don't have experience with WebSockets, then log your events in a separate table/view and use the pub/sub method: subscribe entities to an event, and broadcast that event to the table. Then long-poll against the watcher view to see when changes may have occurred; if one did occur, query for the exact values.
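A rough client-side sketch of that long-poll, assuming a hypothetical /api/events.php endpoint that responds with any events newer than the id you pass:

```javascript
// Rough long-polling sketch for the events-table idea. The /api/events.php
// endpoint is hypothetical: it should hold the request open (or return quickly)
// with any events whose id is greater than ?since=<lastEventId>.
// $http is AngularJS's HTTP service, injected into your controller/service.
function pollEvents($http, lastEventId, onChange) {
  $http.get('/api/events.php', { params: { since: lastEventId } })
    .then(function (response) {
      var events = response.data || [];
      if (events.length > 0) {
        lastEventId = events[events.length - 1].id;
        onChange(events); // something changed: now fetch the exact rows you need
      }
      pollEvents($http, lastEventId, onChange); // immediately poll again
    })
    .catch(function () {
      // back off briefly on errors before retrying
      setTimeout(function () { pollEvents($http, lastEventId, onChange); }, 5000);
    });
}
```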
Another option would be to use some queue system with "deciders" that hold messages. Take a look at Amazon's SQS platform for a better explanation of how this could work. Basically you have a queue that holds messages, and a decider chooses where to store each message using some hash or sorting method (to reduce run time). When the client requests an update, the decider finds any messages that apply based on the hash/sort and returns them. Then you just have to decide how and when to destroy the messages.
The second option would require a lot more tinkering though, so it's really about your preference. I think the difficulty you'll find is that most solutions have to deal with the fact that a message has to be delivered one or more times, and you'll need to track when someone received the message and whether it can now be deleted from the queue/event table or whether you still need to wait. Otherwise you'll consume a lot of memory.
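For illustration, a very rough in-memory sketch of the queue/decider idea (all names here are hypothetical; a real system would persist the queue and track per-client acknowledgement):

```javascript
// Very rough in-memory sketch of the queue + decider idea (names hypothetical).
// The decider buckets messages by a simple key so lookups don't scan everything;
// a real implementation would persist this and track per-client delivery.
var buckets = {};

function decider(topic) {
  // trivial "hash": bucket by topic name
  return topic;
}

function publish(topic, payload) {
  var key = decider(topic);
  (buckets[key] = buckets[key] || []).push({ ts: Date.now(), payload: payload });
}

function fetchUpdates(topic, since) {
  var key = decider(topic);
  return (buckets[key] || []).filter(function (msg) { return msg.ts > since; });
}

// Usage: publish('orders', { id: 42 });
//        var news = fetchUpdates('orders', lastSeenTimestamp);
```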
I am using the Azure Application Insights JavaScript library to keep track of some business flow in my application. AppInsights uses a session_id (saved to a cookie) to connect separate events into a flow. This value is automatically generated and managed.
The problem is that now the business flow requires me to track events from multiple domains. Can I somehow tell AppInsights-JS that I want to continue a previous session? If the current session could be serialized into a string, and loaded on an other page, that would be perfect, I could just pass it along as a query parameter to the page on the other domain.
My first thought was to save the ai_user and ai_session cookie values, but it feels like hacking the system.
The solution I am currently using is to maintain a custom sessionid myself, and pass it to every tracked event as a custom dimension. This way I can filter the events based on this field to obtain the events of a business flow. It's a bit harder to use this way.
Is it safe to just save and store the cookie values? Or is there any better way to do this?
The JavaScript SDK doesn't support this functionality today, but you can write a telemetry initializer to override the ai_session and ai_user values.
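Something along these lines, assuming the classic snippet-based JS SDK (the exact API differs between SDK versions); mySessionId and myUserId stand in for the values you carry over from the other domain, e.g. read from a query-string parameter:

```javascript
// Sketch for the classic snippet-based Application Insights JS SDK.
// mySessionId / myUserId are assumed to be the values passed over from the
// other domain (e.g. taken from a query-string parameter on page load).
window.appInsights.queue.push(function () {
  appInsights.context.addTelemetryInitializer(function (envelope) {
    envelope.tags['ai.session.id'] = mySessionId;
    envelope.tags['ai.user.id'] = myUserId;
    return true; // keep the telemetry item
  });
});
```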
I am trying to send transaction data to UA from a webshop which is only creating orders once it has received a "push" from a payment processing API. A success page is displayed to the customer independently from the order being created, meaning the tracking script (UA via GTM with data layer) does not have access to the order data.
This gives me the headache of trying to track transactions made through the webshop server side, instead of client side. Yet I still want to tie the transaction to the unique visitor ID who made the purchase, not just a random Client ID each time.
My site is using analytics.js (UA profile). Therefore I thought a client ID would be stored somewhere as a cookie, and I do see a _ga cookie there which I believe is the client ID; it looks like this:
GA1.2.1586737968.1429871710
The documentation for getting cookie and user identification states the following:
You should not directly access the cookie analytics.js sets, as the cookie format might change in the future. Instead, developers should use the readyCallback to wait until analytics.js is loaded, and then get the clientId value stored on the tracker.
... which is not helpful to me, as I have to do this server side. But anyway, this client ID does not even fit the description of what a client ID looks like, although it does appear to match a legacy format. Sort of.
Couple of questions then:
Is it just those last two numbers 1586737968.1429871710 that I need to be parsing from the _ga cookie and sending as a client ID? Or is the whole thing ok?
Are there any scripts/libraries that will do this for me, so I don't have to worry about Google suddenly giving new visitors a client ID based on the new UUID v4 format?
Does this approach have any obvious flaws?
Answers:
Yes, the cid value should just be the last two numbers, as you rightly point out (this is from experience with client setups that already use the Measurement Protocol successfully on an ongoing basis); see the sketch after these answers.
Nope, not as far as I am aware; best to do it yourself. It will only be a few lines of code to detect what format you're seeing in the cookie and, therefore, what you need to send through.
Not really. It is a common enough scenario.
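For illustration, a hedged sketch of the parsing (clientIdFromGaCookie is just a made-up helper name):

```javascript
// Hedged sketch: extract the client ID from a _ga cookie value.
// For the legacy "GA1.2.1586737968.1429871710" format the cid is the last two
// dot-separated fields; anything else (e.g. a UUID) is passed through unchanged.
function clientIdFromGaCookie(cookieValue) {
  var parts = cookieValue.split('.');
  if (parts.length === 4 && parts[0].indexOf('GA1') === 0) {
    return parts[2] + '.' + parts[3];
  }
  return cookieValue; // assume it's already a usable cid (e.g. UUID v4)
}

// clientIdFromGaCookie('GA1.2.1586737968.1429871710') -> '1586737968.1429871710'
```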
I'm working on a new web app where a large amount of content (text, images, meta-data) is requested via an Ajax request.
No auth or login required for a user to access this.
My concern is that you could easily look up the data source URL and hit it directly outside the app to pull large amounts of data. In some ways, if you can do this, you could probably also scrape the static HTML pages elsewhere that have this content.
Are there any suggestions on methods to obfuscate, hide, or otherwise make it very difficult to access the data directly?
Example: the web app's HTML page contains a key that is republished every 30 minutes. On the server side the data is obfuscated based on this key. In order to get the data outside the app you'd need to figure out the data source, but also take the extra step of scraping the page for a key every 30 minutes.
I realize there is no 100% way to stop someone, but I'm talking more about deterrence.
Use sessions in your web app. Make a note (e.g. a database entry or some other mechanism your server-side code can access) when a valid request for the first page is received, and include code in the second page to exclude the data when processing a request without a corresponding session entry.
Obviously the specifics on how to do this will vary between languages, but most robust web platforms will support sessions, largely for this type of reason.
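A minimal sketch of that gating, with Express and express-session as stand-ins (the same idea applies on any platform with server-side sessions):

```javascript
// Minimal sketch (Express + express-session are assumptions; any platform with
// server-side sessions can do the same thing).
var express = require('express');
var session = require('express-session');
var app = express();

app.use(session({ secret: 'change-me', resave: false, saveUninitialized: false }));

// Serving the first page marks the session as having legitimately entered the app.
app.get('/', function (req, res) {
  req.session.appEntered = true;
  res.sendFile(__dirname + '/index.html');
});

// The data endpoint refuses requests that did not come through the app.
app.get('/api/data', function (req, res) {
  if (!req.session.appEntered) {
    return res.status(403).json({ error: 'Forbidden' });
  }
  res.json({ content: '...' });
});

app.listen(3000);
```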
If you are wanting to display real-time data and are concerned about scrapers, and if this is a big enough concern, then I suggest doing it with Flash instead of JS (AJAX). Have the data displayed within a Flash object. Flash can make real-time send/receive requests to the server just like AJAX, but the benefit of Flash is that the whole stage, data, code, etc. are within a Flash object, which cannot be scraped. The Flash object makes the request, you output the data as an encrypted string, then decrypt it within Flash and display it from there.
"Are there any suggestions on methods to obfuscate, hide, or otherwise make it very difficult to access the data directly?"
This answers your own question: if the data is worth getting, it will be obtained; obfuscating merely makes it harder to find.
You could also, in the server-side script that processes the Ajax request and returns the data, check where the request came from.
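For example (Node/Express here is just an assumption; in PHP you'd look at $_SERVER['HTTP_REFERER']), keeping in mind that the Referer header can be spoofed, so this is only a deterrent:

```javascript
// Tiny sketch of a referrer check in the script that serves the Ajax data.
// Node/Express is an assumption; 'https://yourapp.example/' is a placeholder.
var express = require('express');
var app = express();

app.get('/data', function (req, res) {
  var referer = req.get('Referer') || '';
  if (referer.indexOf('https://yourapp.example/') !== 0) {
    return res.status(403).end(); // request didn't come from our pages
  }
  res.json({ content: '...' });
});

app.listen(3000);
```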
I have a very large RSS feed (which may be 1 MB), so when I read it, it takes a lot of time.
If I set the number of items to read (for example, 4), I think this will not ensure that I read all the data updated since the last time I read it (I will lose some items).
What can I do?
I am using the Google AJAX Feed API to read the RSS/Atom feed.
Updated:
I am using the Google AJAX Feed API to handle the RSS, then I store the data in my database.
Edit, possible specific solution:
If accessing a limited set of items from a feed does speed up the Google Feed API access, then simply keep asking for the most recent items until you encounter an item you have seen before. Unless the feed has been re-ordered, this will ensure all items have been seen (however, remember that feed items may be updated; those changes would be lost).
If accessing a limited set of items does not have a performance benefit, then another approach, such as a server-side helper (or another feed accessor), needs to be considered.
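As a hedged sketch of the first approach, using the Google AJAX Feed API the question mentions (seenLinks and saveEntry are assumptions standing in for your own storage code):

```javascript
// Sketch of "ask for the newest items and stop at the first one already stored",
// using the Google AJAX Feed API. Assumes the loader has already run:
// google.load('feeds', '1'). seenLinks and saveEntry are placeholders for your
// own storage code, and the feed URL is a placeholder.
var feed = new google.feeds.Feed('http://example.com/large-feed.xml');
feed.setNumEntries(20); // small batch; increase it if every item comes back new

feed.load(function (result) {
  if (result.error) { return; }
  for (var i = 0; i < result.feed.entries.length; i++) {
    var entry = result.feed.entries[i];
    if (seenLinks[entry.link]) {
      break; // everything older than this has already been stored
    }
    saveEntry(entry); // e.g. POST it to your server for insertion into the DB
  }
});
```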
General information (not specific to this question):
The feed server should correctly handle the If-Modified-Since header. So, while it won't directly save the 1 MB+ download, you only need to perform the download at all if the feed has been modified.
Additionally, you can request just a range of data from the server, if the server supports Range requests, and manually merge the data in. Even if the server doesn't support Range requests, you can abort the download after you have enough to continue (using this approach will allow you to inspect the inbound data and terminate at exactly the right time).
In either case, you are responsible for ensuring enough is read; from there it may be easiest just to "fix up" the local XML and pass it to a normal feed processor.
And neither of the above is possible to do in plain client JavaScript :-)
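For completeness, a server-side sketch of the conditional GET / Range idea (Node is an assumption here, since it can't be done from the browser; the host, path, and byte range are placeholders):

```javascript
// Server-side sketch: conditional GET with If-Modified-Since, plus a Range
// header so only the first chunk of the feed is transferred. Hostname, path,
// and range size are placeholders.
var https = require('https');

function fetchFeedHead(lastModified, onData) {
  var options = {
    hostname: 'example.com',
    path: '/large-feed.xml',
    headers: {
      'If-Modified-Since': lastModified, // e.g. 'Sat, 25 Apr 2015 00:00:00 GMT'
      'Range': 'bytes=0-65535'           // only honoured if the server supports Range
    }
  };
  https.get(options, function (res) {
    if (res.statusCode === 304) { return; } // feed unchanged, nothing to do
    var chunks = [];
    res.on('data', function (c) { chunks.push(c); });
    res.on('end', function () { onData(Buffer.concat(chunks).toString('utf8')); });
  });
}
```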
Gosh, that would definitely be a whole archive. I know how difficult large XML files can be to parse!