I'm developing a website that uses a notification system (like Facebook's).
For this purpose I think I'll write a jQuery polling function that looks for new notifications on the server side, using Ajax.
My question is, is this a good idea?
It will be OK on the client side; it's more likely to be a server issue. http://swerl.tudelft.nl/twiki/pub/Main/TechnicalReports/TUD-SERG-2007-016.pdf
In conclusion: if you want high data coherence and good network performance, you should use push rather than pull. But push consumes more CPU and thus has scalability issues.
The efficiency of pulling depends on the pull interval versus the publish interval. If they are equal, everything is perfect; realistically, they will never match.
One additional personal opinion: if your server script is blocking in nature (like Ruby on Rails with a single-threaded server), you'd better rethink the solution.
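If you do go the polling route, a minimal jQuery sketch could look like the one below; the /notifications endpoint, the "since" cursor parameter, and the 10-second interval are all assumptions for illustration, not something from the question.

// Minimal jQuery polling sketch. The endpoint (/notifications), the
// "since" cursor, and the 10-second interval are hypothetical.
var lastId = 0;

function pollNotifications() {
  $.ajax({
    url: '/notifications',
    data: { since: lastId },
    dataType: 'json'
  }).done(function (items) {
    $.each(items, function (i, n) {
      lastId = Math.max(lastId, n.id);
      $('#notifications').append($('<li>').text(n.text));
    });
  }).always(function () {
    // Schedule the next poll only after this one finishes, so slow
    // responses never stack up overlapping requests.
    setTimeout(pollNotifications, 10000);
  });
}

pollNotifications();

Scheduling the next poll from the completion callback (rather than with setInterval) keeps a slow server from causing overlapping requests.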
I have a node.js app running in Azure as a webApp. On startup it connects to an external service using a websocket subscription. Specifically I'm using the reconnecting-websockets NPM package to wrap it to handle disconnects.
The problem I am having is that because there are 2 instances of the app running on Azure (horizontal scaling for failover) I end up with two subscriptions at any one time.
Is there an obvious way to solve this problem?
For extra context, this is a problem for 2 reasons:
I pay for each message received and am over quota
When messages are received I process them and do database updates; these are also being duplicated.
You basically want to have an App Service with potentially multiple instances, but you don't want your application to run in parallel. At least you don't want to have two subscriptions. Ideally you don't want to touch your application code.
An easy way to implement this would be to wrap your application into a continuous WebJob, and set its scale property to singleton.
Here is one tutorial on how to set up a Node.js WebJob: https://morshemesh.medium.com/continuous-deployment-of-web-jobs-with-node-js-2308f95e63b1
You can then use a settings.job file to ensure that your WebJob only runs on a single instance at any one time. Or you can use the Azure portal to set the value when you manually deploy the WebJob.
{
"is_singleton": true
}
https://github.com/projectkudu/kudu/wiki/WebJobs-API#set-a-continuous-job-as-singleton
PS: Don't forget to enable Always On. It is also mentioned in the docs. But you probably already need that for your current deployment.
If you don't want your subscription to be duplicated then it stands to reason that you only want one process subscribing to the external websocket connection.
Since you mentioned that received messages trigger database updates, it makes sense for this to be an isolated backend process, given that you have multiple instances running for the frontend server (whether or not the backend is separate).
Of course, if you want more redundancy, you could put a load balancer in front and simply distribute messages to any number of instances behind it, perhaps with a persistent queueing system if you feel it's needed.
If you want these messages to be propagated to the client (not clear from the question), this will be a bit more annoying. If it's a simple one-way channel, then you could consider using SSE, which is a rather simple protocol. If it's bidirectional, then I would probably consider running a STOMP server with an intermediary broker (like RabbitMQ) and connecting directly from the client (i.e. the browser, not the server generating the frontend) to the service.
Not sure if you're well versed in Java, but in case you're interested, here is an app I made for some internal demos that you could use as a reference: https://github.com/kimgysen/zwoop-backend/tree/develop/stomp-api/src/main/java/be/zwoop
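If SSE fits your case, here is a minimal sketch of the server side in Node with Express; the /events route, the payload shape, and the 5-second timer are all invented for illustration.

// Minimal SSE sketch with Express. The /events route and the payload
// are hypothetical -- in a real app you'd write to res from your
// websocket/broker subscription rather than a timer.
const express = require('express');
const app = express();

app.get('/events', (req, res) => {
  res.set({
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive'
  });
  res.flushHeaders();

  const timer = setInterval(() => {
    res.write('data: ' + JSON.stringify({ ts: Date.now() }) + '\n\n');
  }, 5000);

  req.on('close', () => clearInterval(timer));
});

app.listen(3000);

On the browser side, new EventSource('/events') is all it takes to start receiving these events.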
For all intents and purposes, I'm not sure all this is worth the hassle for you; it sounds like you're on a tight budget and looking for simple solutions without too much complexity. Have you considered giving up on load balancing the website (is the load really that high?)? I don't have enough background knowledge on your project to judge, but proper caching optimization and scaling vertically at first may be sufficient for a start.
Personally I would start simple and gradually increase complexity when needed.
I'm just throwing ideas at you; hopefully a few of these considerations are helpful in some way.
Btw, I don't understand why other answers on this question were all deleted (?).
I was wondering if there is any good practice for handling long-running requests in Node.js?
I am using Express.js and have been wondering if there is any way to, I don't know, move one of the Express requests to a different instance of Node or something?
The reason I am wondering is that this request uses the CPU heavily, blocking Node's event loop while it runs. The main problem is that first I fetch some data, then calculate and insert missing data into the DB, then fetch other data that depends on the result, and then calculate again and update other stuff in the DB.
I hadn't noticed this problem before, as I was the only one using this request so far. But now I am testing it with other users, and if a few of them use this request, the whole application freezes for a few seconds. And in my monitoring tools I see huge CPU usage (100%-200%, lol).
So I have tried clustering, but it seems it does not work. I used pm2 and ran the app on all cores of my CPU. Because of the complexity of my algorithm, I tried to move several functions to worker threads, but it looks like these threads would be used very, very often, and I am afraid of crashing the whole Node instance.
I have no clue which solution would be best, and I don't know if there are any good tools for Express that would help me offload these requests. I haven't dived into partitioning yet, but it looks like it may work. Has anyone used it for larger projects, and do you know if it could be the solution to distribute one request over several ticks?
What about a job queue like Bull or Kue? Would it be a good idea to push these tasks into a queue? I have never used such tools, and I am asking because I have no idea whether it would make any sense.
Cheers
If you already have the data available, you can send it to the user in the response to the incoming HTTP request.
If you don't find any data, you can send an empty response instead.
The calculation and insertion of the missing data should then be done through a queue service; bee-queue is a nice one.
Use Redis with bee-queue to keep the background jobs fast.
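A minimal sketch of that split, assuming bee-queue with a local Redis; the queue name, the route, the payload, and doHeavyCalculation are all made up for illustration.

// Producer side (inside the Express handler): enqueue the heavy work
// instead of running it on the event loop. Queue name, route, and
// payload are hypothetical.
const Queue = require('bee-queue');
const calcQueue = new Queue('recalculate', {
  redis: { host: '127.0.0.1', port: 6379 }
});

app.post('/recalculate', async (req, res) => {
  const job = await calcQueue.createJob({ userId: req.body.userId }).save();
  res.status(202).json({ jobId: job.id }); // respond immediately
});

// Worker side (run as a separate process so the web process stays
// responsive); doHeavyCalculation is a placeholder for your logic.
calcQueue.process(async (job) => {
  return doHeavyCalculation(job.data.userId);
});

The point of the split is that the HTTP response no longer waits on the CPU-heavy work, so the event loop in the web process stays free.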
I am working on an in-browser game, taking advantage of the Canvas available in HTML5. However, I realized that I have a big vulnerability in the system. The game score and other statistics about gameplay are calculated on the client side in JavaScript, then submitted to the server for storage and comparison with other players through XMLHttpRequest. This obviously exposes the statistics to manipulation and potential cheating.
I am worried about moving these to the server-side due to latency issues. I expect the timing to be close.
Are there other smart ways to deal with this problem? I imagine more and more games will deal with this as HTML5 grows.
Not really. Your server in this scenario is nothing more than a database that trusts the client. You can obfuscate, but people will easily be able to figure out what your API is doing. This is an intractable problem with all standalone games, and is why, for example, you see Blizzard making Diablo 3 a client-server game. The fact that it's a JavaScript game just makes it even more transparent and easy for people to debug and exploit.
Why don't you just send data to the server every time the client scores a point, and then keep a local score of points?
Unfortunately there is not much you can do about this. Minifying/obfuscating your code is always a good option. I'm not entirely sure, but I think putting your code inside
(function() { /* code */ })();
should protect any variables from being edited by users (unless you have them attached to an object like window). But users can still exploit your Ajax call and send whatever score they want to the server. Just never trust anything that is done client-side; validate everything server-side.
EDIT: Another thing I thought of: maybe generate a code server-side and send that to the browser. Then with every Ajax call, send that code back to verify that it's you and not some malicious user.
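One way to read that idea is a per-session token plus an HMAC over each submission. A minimal Node sketch follows, where the routes, the secret handling, and the payload shape are assumptions for illustration; note it only raises the bar, since a determined user can still extract the token from the client.

// Sketch: issue a per-session token, then require each score
// submission to carry an HMAC computed with it. The secret and the
// payload shape are hypothetical.
const crypto = require('crypto');

const SERVER_SECRET = 'replace-with-a-real-secret';

// On game start: hand the client a session token.
function issueToken(sessionId) {
  return crypto.createHmac('sha256', SERVER_SECRET)
               .update(sessionId)
               .digest('hex');
}

// On score submission: recompute the signature and compare.
function verifyScore(sessionId, score, signature) {
  const expected = crypto.createHmac('sha256', issueToken(sessionId))
                         .update(String(score))
                         .digest('hex');
  const got = Buffer.from(String(signature));
  const want = Buffer.from(expected);
  return got.length === want.length && crypto.timingSafeEqual(want, got);
}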
100% security is not achievable when you have to trust data coming from the client. However, you can make cheating hard by obfuscating the JS code and also the data that you send from the client.
I have an idea that is similar to gviews' comment.
On the server, you should keep track of the player's progress through the game via batched updates that the client sends regularly at some interval. The player will not notice it in the latency, and you will have a tool to detect obvious cheaters. You know the starting point of the player's game, so you can easily detect cheating right from the beginning.
Also, I would suggest using some checkpoints where you compare the real state of the game on the client with the state on the server. (The client's state would not change if the cheater only tampers with the XHR updates sent to the server.)
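A client-side sketch of the batching idea above; the /game/progress endpoint and the 15-second flush interval are invented for illustration.

// Queue game events locally and flush them to the server in batches;
// the server can then replay them against the last known state and
// flag impossible jumps. Endpoint and interval are hypothetical.
var pendingEvents = [];

function recordEvent(type, data) {
  pendingEvents.push({ type: type, data: data, t: Date.now() });
}

setInterval(function () {
  if (pendingEvents.length === 0) return;
  var batch = pendingEvents.splice(0, pendingEvents.length);
  $.ajax({
    url: '/game/progress',
    type: 'POST',
    contentType: 'application/json',
    data: JSON.stringify(batch)
  });
}, 15000);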
There are lots of ways to make cheating harder, and it's quite individual for each game; there is no standard that solves it all.
Looking for some general advice and/or thoughts...
I'm creating what I think of as more of a web application than a web page, because I intend it to be like a Gmail-style app where you would leave the page open all day long while getting updates "pushed" to the page (for the interested: I'm using the Comet programming technique). I've never created a web page before that was so rich in Ajax and JavaScript (I am now a huge fan of jQuery). Because of this, time and time again when I'm implementing a new feature that requires a dynamic change in the UI that the server needs to know about, I am faced with the same question:
1) Should I do all the processing on the client in JavaScript and post back as little as possible via Ajax,
or
2) Should I post a request to the server via Ajax, have the server do all the processing, and then send back the new HTML, so that on the Ajax response I do a simple assignment with the new HTML?
I have been inclined to always follow #1. This web app, I imagine, may get pretty chatty with all the Ajax requests. My thought is to minimize the size of the requests and responses as much as possible, and rely on the continuously improving JavaScript engines to do as much of the processing and UI updates as possible. I've discovered that with jQuery I can do so much on the client side that I wouldn't have been able to do very easily before. My JavaScript code is actually much bigger and more complex than my server-side code. There are also simple calculations I need to perform, and I've pushed those to the client side, too.
I guess the main question I have is: should we ALWAYS strive for client-side processing over server-side processing whenever possible? I've always felt that the less the server has to handle, the better for scalability/performance. Let the power of the client's processor do all the hard work (if possible).
Thoughts?
There are several considerations when deciding if new HTML fragments created by an ajax request should be constructed on the server or client side. Some things to consider:
Performance. The work your server has to do is what you should be concerned with. By doing more of the processing on the client side, you reduce the amount of work the server does, and speed things up. If the server can send a small bit of JSON instead of giant HTML fragment, for example, it'd be much more efficient to let the client do it. In situations where it's a small amount of data being sent either way, the difference is probably negligible.
Readability. The disadvantage to generating markup in your JavaScript is that it's much harder to read and maintain the code. Embedding HTML in quoted strings is nasty to look at in a text editor with syntax coloring set to JavaScript and makes for more difficult editing.
Separation of data, presentation, and behavior. Along the lines of readability, having HTML fragments in your JavaScript doesn't make much sense for code organization. HTML templates should handle the markup and JavaScript should be left alone to handle the behavior of your application. The contents of an HTML fragment being inserted into a page is not relevant to your JavaScript code, just the fact that it's being inserted, where, and when.
I tend to lean more toward returning HTML fragments from the server when dealing with ajax responses, for the readability and code organization reasons I mention above. Of course, it all depends on how your application works, how processing intensive the ajax responses are, and how much traffic the app is getting. If the server is having to do significant work in generating these responses and is causing a bottleneck, then it may be more important to push the work to the client and forego other considerations.
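To make the trade-off concrete, here is a sketch of the two approaches with jQuery; the /items endpoint and the markup are invented for illustration.

// Option 1: the server returns JSON and the client builds the markup.
$.getJSON('/items', function (items) {
  var $list = $('#items').empty();
  $.each(items, function (i, item) {
    $list.append($('<li>').text(item.name));
  });
});

// Option 2: the server returns a ready-made HTML fragment.
$.get('/items?format=html', function (html) {
  $('#items').html(html);
});

Option 1 sends less over the wire and keeps rendering off the server; Option 2 keeps markup out of your JavaScript, which is the readability argument above.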
I'm currently working on a pretty computationally-heavy application right now and I'm rendering almost all of it on the client-side. I don't know exactly what your application is going to be doing (more details would be great), but I'd say your application could probably do the same. Just make sure all of your security- and database-related code lies on the server-side, because not doing so will open security holes in your application. Here are some general guidelines that I follow:
Don't ever rely on the user having a super-fast browser or computer. Some people are using Internet Explorer 7 on old machines, and if it's too slow for them, you're going to lose a lot of potential customers. Test on as many different browsers and machines as possible.
Any time you have some code that could potentially slow down or freeze the browser momentarily, show a feedback mechanism (in most cases a simple "Loading" message will do) to tell the user that something is indeed going on, and the browser didn't just randomly freeze.
Try to load as much as you can during initialization and cache everything. In my application, I'm doing something similar to Gmail: show a loading bar, load up everything that the application will ever need, and then give the user a smooth experience from there on out. Yes, they're going to have to potentially wait a couple seconds for it to load, but after that there should be no problems.
Minimize DOM manipulation. Raw number-crunching JavaScript performance might be "fast enough", but access to the DOM is still slow. Avoid creating and destroying elements; instead simply hide them if you don't need them at the moment.
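A small sketch of that last point: build markup off-DOM in one pass and toggle visibility instead of recreating nodes (the names here are invented for illustration).

// Build rows as one string and insert them in a single DOM write,
// then hide/show rows rather than destroying and recreating them.
// Remember to escape label values in real code.
function renderRows(rows) {
  var html = '';
  for (var i = 0; i < rows.length; i++) {
    html += '<tr><td>' + rows[i].label + '</td></tr>';
  }
  $('#grid tbody').html(html); // one DOM write, not one per row
}

function setRowVisible($row, visible) {
  $row.toggle(visible); // hide, don't remove
}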
I recently ran into the same problem and decided to go with browser-side processing. Everything worked great in FF and IE8 (and IE8 in IE7 mode), but then our client, using actual Internet Explorer 7, ran into problems: the application would freeze up and a script-timeout box would appear. I had put too much work into the solution to throw it away, so I ended up spending an hour or so optimizing the script and adding setTimeout wherever possible.
My suggestions?
If possible, keep non-critical calculations client side.
To keep data transfers low, use JSON and let the client side sort out the HTML.
Test your script using the lowest common denominator.
If needed, use the profiling feature in Firebug. Corollary: use the uncompressed (development) version of jQuery.
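For reference, the setTimeout trick mentioned above usually looks like this chunked loop; the chunk size is an assumption to tune per workload.

// Process a large array in chunks, yielding between chunks so the
// browser stays responsive and IE7's script-timeout dialog never
// fires. The chunk size is something to tune.
function processInChunks(items, handleItem, chunkSize) {
  var i = 0;
  (function nextChunk() {
    var end = Math.min(i + chunkSize, items.length);
    for (; i < end; i++) {
      handleItem(items[i]);
    }
    if (i < items.length) {
      setTimeout(nextChunk, 0); // yield to the UI thread
    }
  })();
}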
I agree with you. Push as much as possible to users, but not too much. If your app slows down, or even worse crashes, their browser, you lose.
My advice is to actually test how your application behaves when left open all day. Check that there are no memory leaks. Check that there isn't an Ajax request being created every half second after working with the application for a while (timers in JS can be a pain sometimes).
Apart from that, never perform user-input validation only in JavaScript. Always duplicate it on the server.
Edit
Use jQuery live binding. It will save you a lot of time when rebinding generated content and will make your architecture clearer. Sadly, when I was developing with jQuery it wasn't available yet; we used other tools to the same effect.
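A sketch of what live binding buys you; .live() itself was later deprecated in jQuery, so this uses the delegated .on() form, with the selectors invented for illustration.

// Delegated binding: attach the handler once to a stable ancestor,
// and rows added later by Ajax are covered automatically. (.live()
// did the same against document; .on() is the modern form.)
$('#results').on('click', '.row', function () {
  $(this).toggleClass('selected');
});

// A row appended after the fact still gets the click behavior:
$('#results').append('<div class="row">new row</div>');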
In the past I also had a problem where generating one part of a page via Ajax depended on generating another part. Generating the first part first and the second part second will make your page slower than expected. Plan for this up front. Develop pages so that they already have all their content when opened.
Also (this applies to simple pages too), keep the number of files referenced from one server low. Join JavaScript and CSS libraries into one file on the server side. Keep images on a separate host, or better, separate hosts (just creating a third-level domain will do). This is only worth it in production, though; it makes the development process more difficult.
Of course it depends on the data, but the majority of the time, if you can push it client-side, do. Make the client do more of the processing and use less bandwidth. (Again, this depends on the data; you can get into cases where you have to send more data across to do it client-side.)
Some stuff like security checks should always be done on the server. If you have a computation that takes a lot of data and produces less data, also put it on the server.
Incidentally, did you know you could run Javascript on the server side, rendering templates and hitting databases? Check out the CommonJS ecosystem.
There could also be cross-browser support issues. If you're using a cross-browser, client-side library (e.g. jQuery) and it can handle all the processing you need, then you can let the library take care of it. Generating cross-browser HTML server-side can be harder (it tends to be more manual), depending on the complexity of the markup.
This is possible, but it requires a heavy initial page load and heavy use of caching. Take Gmail as an example:
On initial page load, it downloads most of the JS files it needs to run, and most of them are cached.
Don't overuse images and graphics.
Load all the data that needs to be shown on the initial load, along with subsequent predictable user data. In Gmail and the latest Yahoo Mail, the inbox isn't populated with just a single mail conversation body; the first few full email messages are loaded in advance at page-load time. The secret of high responsiveness comes at a cost (Gmail asks you to load the light version if bandwidth is low; I bet most of us have experienced that).
Follow the KISS principle: keep your design simple.
And never try to render the whole page using JavaScript in any case; you cannot assume all your end users have high-spec systems or high bandwidth.
It's smart to split the workload between your server and client.
If you think that in the future you might want to create an API for your application (communicating with iPhone or Android apps, letting other sites integrate with yours), you would have to duplicate a bunch of code for all those devices if you go with a bare-bones server implementation of your application.
I really love the way Ajax makes a web app perform more like a desktop app, but I'm worried about the hits on a high-volume site. I'm developing a database app right now that's intranet-based, where no more than 2-4 people will be accessing it at one time. I'm Ajaxing the hell out of it, but it got me wondering: how much Ajax is too much?
At what point does the volume of hits outweigh the benefits of using Ajax? It doesn't really seem like it would, versus a whole-page refresh, since in theory you are only updating the parts that need updating.
I'm curious if any of you have used Ajax on high-volume sites and in what capacity you used it. Does it create scaling issues?
On my current project, we do use Ajax and we have had scaling problems. Since my current project is a J2EE site that does timekeeping for the employees of a large urban city, we've found that it's best if the browser side can cache data that won't change for the duration of a user session. Fortunately we're moving to a model where we have a single admin process the timekeeping for as many employees as possible. This would be akin to how an ERP application might work (or an email application). Consequently our business need is that the browser-side can hold a lot of data, but we don't expect the volume of hits to be a serious problem. So we've kept an XML data island on the browser-side. In addition, we load data only on an as-needed basis.
I highly recommend the book Ajax Design Patterns or their site.
Ajax should help your bandwidth on a high volume site if that is your concern as you are, like you said, only updating the parts that need updating. My problem with Ajax is that your site can be rendered useless if the visitors do not have javascript enabled, and most of the time I do not feel like coding the site again for non-javascript users.
Look at it this way: AJAX must not be the only option, because scripting may be disabled; it must exist as a layer on top of an existing architecture to provide a superior experience in some regards. Given that, it is impossible for AJAX to create more requests or more work than plain HTML, because it is handling the exact same data transfer.
Where it can save you bandwidth and server load is that AJAX gives you the ability to transfer only the data. You save on redundant HTML, image, CSS, etc. requests with every page refresh, whilst providing a snappier user experience.
As mike nvck points out, the technique of polling is a big exception to this rule, but that's about the technique, not the tech: you would have the same kind of impact with a simple page poll.
Understand the tool and use it for what it was designed. If AJAX implementation is reducing performance, you've done something wrong.
(FWIW, my experience profiling AJAX vs simple HTML tends to show ~60% bandwidth and ~80-90% performance benefits.)
The most common scaling issue with Ajax apps is when they are set up to check back with the server to see whether content got updated in the meantime, without the user actively requesting it. 5 clients checking every 10 seconds is not 5000 clients checking every 10 seconds.
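If you do poll, one common mitigation is to back the interval off while nothing changes; a small sketch follows, with the intervals, the /check endpoint, and refreshView all invented for illustration.

// Polling with backoff: start at 10s, stretch toward 60s while the
// server reports no changes, reset on activity. The numbers, the
// /check endpoint, and refreshView are hypothetical.
var delay = 10000;

function checkForUpdates() {
  $.getJSON('/check')
    .done(function (result) {
      if (result.updated) {
        delay = 10000;                      // activity: poll eagerly again
        refreshView(result);
      } else {
        delay = Math.min(delay * 2, 60000); // quiet: back off
      }
    })
    .always(function () {
      setTimeout(checkForUpdates, delay);
    });
}

checkForUpdates();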
Ajax, on one side, reduces the server workload because it usually shows or refreshes just part of the page, while on the other side it increases the number of hits to the server. I would say it all depends on the architecture of your web application. If your application needs a lot of processing for every hit (like database access) regardless of the size of the response, then Ajax will hit you hard.