How Google Analytics collects data - javascript

First of all, hi to all of you (I'm new here).
I'm looking at how Google Analytics works because I'm going to develop a similar tracking script to collect the data I need for my websites. As far as I can see, the ga.js script sends its data (maybe not all of it, but a good part) with a GET request for a 1x1 GIF, with all the parameters appended to the URL.
Seen here: How does google analytics collect its data?
So, on the server side, it seems the only way to "read" all these parameters is to analyze the server logs and then store everything in my database?
Is this the best option for getting users' data?
I think the server log could rotate to a new file every 2 hours, so you could analyze the file covering the past 2 hours and show "not that old" data in your graph.
Of course it will never be a "realtime" graph, but a 2-hour delay could be acceptable, I think.

I think you can simply put a script (PHP, for example) at the image path and have the script return the image as its response. That way you can act in real time, since the script has access to all the data that would otherwise end up in your server log.
If you want to try my solution, I think a good starting point (in PHP) would be this to create the GIF image, and then you can use the data in $_SERVER to start gathering data.
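The same idea can be sketched outside PHP. The snippet below is a minimal Python illustration, not a drop-in replacement for ga.js: the function name handle_pixel_request, the field names in the returned dictionary, and the example parameters (page, sw) are all assumptions made for this sketch. It parses the query string of the pixel request, collects what a server log would normally hold, and returns a 1x1 GIF.

```python
import base64
from urllib.parse import urlsplit, parse_qs

# A 1x1 GIF, embedded so the endpoint needs no image file on disk.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(url, headers, remote_addr):
    """Collect tracking data from one pixel request and build the response.

    `url` is the requested path plus query string, `headers` is a dict of
    request headers, `remote_addr` is the client IP as seen by the server.
    Returns (hit, response_headers, body).
    """
    query = parse_qs(urlsplit(url).query)
    # Keep the first value of each parameter sent by the tracking script.
    hit = {name: values[0] for name, values in query.items()}
    # Data that would otherwise only be visible in the server log.
    hit["ip"] = remote_addr
    hit["user_agent"] = headers.get("User-Agent", "")
    hit["referer"] = headers.get("Referer", "")

    response_headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(PIXEL_GIF)),
        # Ask browsers not to cache, so every page view reaches the server.
        "Cache-Control": "no-store",
    }
    return hit, response_headers, PIXEL_GIF
```

In a real deployment the returned hit would be written to the database straight away, which is what removes the 2-hour log delay described in the question.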

Related

Extract html sourcecode from a javascript generated output

I am currently working on a project to find empty classrooms in our school in real time. For that, I need to extract the substitutions published on our school page (https://ssnovohradska.edupage.org/substitution/?), since there might be additional changes.
But when I try to fetch the HTML source and parse it with bs4, it cannot find the divs (class: "section print-nobreak") that contain the substitution text. When I looked at the page source (Ctrl+U), I found that there is only JavaScript that prints it all directly.
Is there any way to extract the HTML after the JavaScript output has been rendered?
Thanks for the help!
Parsing HTML is unfortunately necessary to solve your problem. But I will explain how to find ways to avoid that in your future projects (not based on this website).
You've correctly noticed that the text is created by JavaScript code running on the page. This could also indicate that the data is either loaded from another resource (XHR/fetch call getting a response from an API) or is stored as a JSON/JS inside of the website's code. (Or is generated from an algorithm, but this is unlikely to be the case in such websites.)
The website actually uses both methods (the initial render gets data stored inside the website's code, but when you switch dates on the calendar it makes AJAX requests). You can see this by searching for ReactDOM.render(React.createElement( in the code. They're providing an HTML string to the createElement call, so I would suggest looking into the AJAX way of doing things.
Now, to check where the resource is located, all you need to do is open Developer Tools in your favorite browser (usually Control+Shift+I) and navigate to the Network tab. With the Network tab open, you need to cause the website to load external data, for example by pressing a date on the "calendar bar".
Here you will notice many external requests, but we're actually looking only for XHR calls. Click on the XHR button next to the "Filter" text field. That should result in only one request being shown.
Unfortunately for us, the response only contains HTML. Also, the API calls are protected: they require a PHP session ID and some sort of token (__gsh) in order not to fail. So, going back to step 1, it seems our only solution is to use regular expressions to find the text between "report_html":"<div class and </div></div></div> in the source code, if you're interested in today's date only. If you want the contents for tomorrow or any other date, you will need to either fetch the page, save the cookies, find the token, and then make the request with them, or use something like puppeteer or pyppeteer (since you've mentioned BS4) and load the webpage in that. If you aren't fetching the data that often, you should be fine overall.
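As a rough illustration of the regular-expression route for today's date, here is a small Python sketch. It is a variation on the suggestion above rather than the exact pattern: instead of matching up to </div></div></div>, it captures the whole JSON string that follows "report_html": and decodes it with json.loads, which also undoes the escaping. The function name extract_report_html is made up for this example, and the assumption that the value is a plain JSON string literal has not been checked against the live page.

```python
import json
import re

# Matches "report_html": followed by one complete JSON string literal,
# allowing escaped characters (such as \" and \/) inside the string.
REPORT_RE = re.compile(r'"report_html":("(?:[^"\\]|\\.)*")')


def extract_report_html(page_source):
    """Return the decoded report_html string, or None if it is not present."""
    match = REPORT_RE.search(page_source)
    if match is None:
        return None
    # json.loads turns the quoted, escaped literal back into plain HTML.
    return json.loads(match.group(1))
```

The returned HTML can then be handed to BeautifulSoup as usual, for example to look up the divs with class "section print-nobreak".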

How to notify the front-end of a website of the status of a back-end processing job?

I currently face the following issue:
After a user has uploaded his images, all images are processed through a script that optimizes every image (compresses it and removes EXIF-data).
I got everything working; the only problem is that the process takes quite some time. I want to notify the user of the job status, e.g. the percentage of images processed so far.
Currently, the user has to wait without knowing what's going on in the back-end. What is the best way to accomplish this? I've thought about AJAX calls, but I honestly have no idea where to start with implementing this, also because it looks like I need multiple calls (kind of like a heartbeat call on the processing job).
The application I am developing is a Laravel application. I've made an API controller which handles incoming files via AJAX calls.
Any help is appreciated, thanks.
Laravel has Broadcasting for this. It uses websockets, redis or pusher to send events to the client.
This way you can send the client a message when the processing is done without them having to refresh a webpage all the time.
You'd be better off reading about the principle of how it's done, for example: Progress bar AJAX and PHP
Essentially, the way it's done is that the job (processing images in your case) happens on the server through PHP. Your script will need to produce some sort of output to show how far through it is, e.g. echo some value for the percentage progress. The PHP script itself is responsible for producing this output, so you must work out how to calculate it and then code that in. For example, it could take the number of images to be processed into account and add 1 to a counter each time an image is successfully processed. When the counter equals the number of images, the job is 100% done; if something went wrong, it could output error messages instead.
On the frontend you could have an AJAX script which reads the output from the PHP script. This in turn could update a progress bar, or a div with some sort of percentage message, using the value coming from your PHP script.
Laravel, like other frameworks, has built-in methods to help. But you'd be better off understanding the principles of how it works, such as in the link I posted.
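To make the counter idea concrete, here is a minimal sketch in Python rather than PHP. The names ProgressTracker and process_images are invented for this illustration, and the optimize argument stands in for whatever actually compresses an image and strips its EXIF data.

```python
import threading


class ProgressTracker:
    """Counts processed items so a status endpoint can report a percentage."""

    def __init__(self, total):
        self.total = total
        self.done = 0
        # The worker and the status endpoint may run on different threads.
        self._lock = threading.Lock()

    def mark_done(self):
        with self._lock:
            self.done += 1

    def percent(self):
        with self._lock:
            if self.total == 0:
                return 100
            return int(self.done * 100 / self.total)


def process_images(images, tracker, optimize):
    """Run `optimize` on each image, bumping the counter after each success."""
    for image in images:
        optimize(image)
        tracker.mark_done()
```

A status route would simply return tracker.percent(), and the frontend would call that route every second or so (the "heartbeat" mentioned in the question) to update the progress bar.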

How to track an external JSON file for a change

I am currently checking this file: https://api.steampowered.com/IEconService/GetTradeOffers/v1/?key=700D84417970EEAE593ACB8BE455B16E&format=json&get_sent_offers=1 for a change every 5 seconds, but is there any way I can get notified, e.g. by a POST request, when this resource changes? I had a look at Google push notifications, but it seems that they only work for Google's own APIs, not external APIs.
I don't know what you mean by getting notified.
If you want to get an email when the page changes, you can use something like this Google Chrome extension.
If you mean it programmatically, then you will have to GET the JSON object from the URL every 5 seconds and compare it to the previous version; see this question regarding the compare part.
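The poll-and-compare loop could look roughly like the Python sketch below. The names fingerprint and watch are made up for this example, and fetch_json is a placeholder for whatever function downloads and parses the JSON. Hashing a canonical dump is one way to do the comparison, chosen here so that a mere reordering of keys does not count as a change.

```python
import hashlib
import json
import time


def fingerprint(payload):
    """Hash a parsed JSON value in a way that ignores object key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def watch(fetch_json, on_change, interval_seconds=5, max_polls=None,
          sleep=time.sleep):
    """Call `on_change` whenever the fetched JSON differs from the last poll.

    `max_polls=None` polls forever; a number stops after that many polls.
    """
    previous = None
    polls = 0
    while max_polls is None or polls < max_polls:
        current = fingerprint(fetch_json())
        # The first poll only records a baseline; later polls are compared.
        if previous is not None and current != previous:
            on_change()
        previous = current
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)
```

Here on_change is where a notification, such as a POST request to your own service, would be sent.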

How to retrieve info from database to display with Chrome extension

I am trying to write my first Chrome extension. The workflow goes something like this: when the extension is installed and active and a user hovers over a specific product/ID displayed on the page, the extension retrieves related vendor data about the product with that ID.
This is how I thought about this:
Use jQuery attr to access the ID on mouse over.
Post this ID to a retrieve.php file with .post() method
The retrieve.php file retrieves the data from database
Display the data in a tool tip on the web page.
I have some questions about the above process:
I am able to get this working on a local XAMPP server, but how will it work online, as the Chrome extension will not have access to the server? Is there a way to retrieve the data without using PHP?
I am able to get the logic working but am unsure how to split it across the respective files. Will all my logic reside in background.js?
Any suggestions on getting this started will be much appreciated.
You could build a very simple API on your server that responds with JSON to any request it receives after processing it. Like this:
{"firstVar":"foo","secondVar":"bar" }
Your Chrome extension can then make an XMLHttpRequest to this server and process the returned data. (You could also use JSONP and wrap the response in a callback function, which will execute as soon as you have the response.)
The extension's JavaScript will be able to deal with the JSON nicely, as it understands that format natively, so you can then choose to display the data in whatever way you want.
Essentially, what you want is a server that can take an ID posted to it and return the corresponding data in a nice and readable format, and a Chrome extension that can make a request to a server and then process the response. Build and test them separately: keep posting an ID to the server and check the response, and on the JS side, instead of making requests to your unfinished API at first, just use a static response that matches the response you expect.
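The server half of this can be very small. Below is a hedged Python sketch of such an endpoint as a plain WSGI application; the VENDORS dictionary is a hypothetical stand-in for the real database, the field names are invented, and for brevity the ID is read from the query string (?id=...) rather than from a POST body.

```python
import json
from urllib.parse import parse_qs

# Hypothetical stand-in for the vendor table in the real database.
VENDORS = {
    "42": {"vendor": "Acme", "price": "9.99"},
}


def app(environ, start_response):
    """WSGI app: look up the product ID and answer with JSON."""
    params = parse_qs(environ.get("QUERY_STRING", ""))
    product_id = params.get("id", [""])[0]
    record = VENDORS.get(product_id)

    if record is None:
        status = "404 Not Found"
        payload = {"error": "unknown id"}
    else:
        status = "200 OK"
        payload = record

    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

For local testing it can be served with the standard library, e.g. wsgiref.simple_server.make_server("", 8000, app).serve_forever(), and queried from the extension or a browser tab.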

Upload files asynchronously then save data about it

I am building a way for users to upload tracks with information about that track but I would like to do this asynchronously much like YouTube does.
At the moment there is an API endpoint of tracks that accepts a POST request with the uploaded file and all the meta data. It processes the track, validates everything and will then save the path to the track and all of its meta data in the database. This works perfectly but I am having trouble thinking of ways to do this asynchronously.
The user flow will be:
1) User selects a track and it starts uploading
2) A form to fill in meta data shows and user fills it in
3) Track is uploaded with its metadata to the endpoint
The problem is that the metadata form and the file upload are now two separate entities, so the file can finish uploading before the metadata is saved and vice versa. Ideally, to overcome this, both the track and the metadata would be saved in the browser as a cookie or something until they are both complete. At that point both would be sent to the endpoint and no changes would be required at the back end. As far as I am aware there is no way of saving files client side like this. Oh, apart from that filesystem API, which is pretty much deprecated.
If anyone has any good suggestions about how to do this it would be much appreciated. In a perfect world I would like there to be no changes to the back end at all but little changes are probably going to be required. Preferably no database alterations though.
Oh by the way I'm using laravel and ember.js just in case anyone knows of any packages already doing this.
I thought about this a lot a few months ago.
The closest solution I managed to put together is to upload the file and store its filename, size, upload time (this is crucial) and other attributes in the DB (as usual). Additionally, I added a column named temporary (more like a flag), which is initially set to TRUE and is only set to FALSE once the metadata has been sent.
Separately, I set up a cron job (I used Symfony2, but it's all the same in Laravel) that runs every 15-30 minutes and deletes those files (and the corresponding database records) which have temporary = TRUE and have exceeded the time window. In my case it was 15 minutes, but you could make it coarser (every hour or so).
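A minimal sketch of that scheme, written in Python with SQLite instead of Symfony2 or Laravel, might look as follows. The table name uploads, its columns, and the function names are all assumptions made for this illustration; timestamps are plain Unix seconds.

```python
import sqlite3


def create_schema(conn):
    conn.execute(
        "CREATE TABLE uploads ("
        " id INTEGER PRIMARY KEY,"
        " path TEXT NOT NULL,"
        " uploaded_at INTEGER NOT NULL,"
        " temporary INTEGER NOT NULL DEFAULT 1,"  # 1 until metadata arrives
        " title TEXT)"
    )


def register_upload(conn, path, now):
    """Record a finished file upload; it starts out flagged as temporary."""
    cur = conn.execute(
        "INSERT INTO uploads (path, uploaded_at) VALUES (?, ?)", (path, now)
    )
    return cur.lastrowid


def attach_metadata(conn, upload_id, title):
    """Store the metadata and clear the temporary flag."""
    conn.execute(
        "UPDATE uploads SET title = ?, temporary = 0 WHERE id = ?",
        (title, upload_id),
    )


def purge_stale(conn, now, max_age_seconds=900):
    """Delete temporary rows older than the window; return their file paths."""
    cutoff = now - max_age_seconds
    rows = conn.execute(
        "SELECT id, path FROM uploads"
        " WHERE temporary = 1 AND uploaded_at < ? ORDER BY id",
        (cutoff,),
    ).fetchall()
    conn.execute(
        "DELETE FROM uploads WHERE temporary = 1 AND uploaded_at < ?",
        (cutoff,),
    )
    # The caller (the cron job) is expected to remove these files from disk.
    return [path for _id, path in rows]
```

The cron job would call purge_stale with the current time and then delete the returned paths from storage.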
Hope this helps a bit :)
