This is a cross-post from the Software Engineering Q&A site.
There are a couple of (well, a lot of) websites that provide an internet speed test. I tried to build the same thing, but I'm still not able to get accurate results.
What I tried:
Created several files on the server, say (1, 8, 32, 64, 128, 256, 512, 1024) KB.
Then, on the client side, I download each of them, measuring
the start time of the request to the server,
the time of the first response from the server,
and the time the download finishes.
Then, internet speed = total transferred size / time taken in seconds.
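In code, the measurement looks roughly like this (a minimal sketch; the file URL is a placeholder for the files created on the server):

// Download one test file and compute throughput from its size and elapsed time.
async function measureDownload(url) {
  const start = performance.now();            // request sent
  const response = await fetch(url, { cache: "no-store" });
  const firstByte = performance.now();        // first response from the server (headers received)
  const data = await response.arrayBuffer();
  const end = performance.now();              // download finished
  const seconds = (end - start) / 1000;
  return {
    bytes: data.byteLength,
    ttfbMs: firstByte - start,
    bitsPerSecond: (data.byteLength * 8) / seconds,
  };
}

// e.g. measureDownload("/speedtest/1024kb.bin").then(r => console.log(r.bitsPerSecond));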
I checked a couple of other websites: they do not download large files or large chunks of data (more than 5 KB), but instead make a lot of requests to the server in parallel.
There is also some smoothing or stabilising factor, something that samples the data and calculates a better result.
Here is how speedtest.net implements it, but I'm still not able to understand it properly.
https://support.speedtest.net/hc/en-us/articles/203845400-How-does-the-test-itself-work-How-is-the-result-calculated-
Can someone help me understand this and point me in the right direction for calculating internet speed?
Edit: I want to show my users, on my web/app, how much speed they are getting on it. For this I'm trying to apply general criteria, similar to Speedtest, but instead of measuring against multiple servers, I want to try with one server only.
The general idea is to work out the parameters needed to saturate the physical communication channel. The main part is to determine how many parallel downloads will reach that goal.
A single connection is clearly not sufficient, because there are many periods of overhead during which you could be sending other packets. In a very rough approximation, to receive data you send a packet from A to B to request it, and the data is then sent back from B to A; you can clearly request something else while that data is on its way back. You can also think about how many data packets can be in flight on the link from X to Y at once, just as you can have several cars on the same road from B to A, each car being a packet belonging to a given connection.
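For instance, a rough sketch of launching several downloads in parallel and measuring the aggregate throughput (the URL and the number of streams are arbitrary placeholders):

// Run several downloads concurrently and compute aggregate bits per second.
async function measureParallel(url, streams = 4) {
  const start = performance.now();
  const sizes = await Promise.all(
    Array.from({ length: streams }, () =>
      fetch(url, { cache: "no-store" })
        .then(r => r.arrayBuffer())
        .then(buf => buf.byteLength)
    )
  );
  const seconds = (performance.now() - start) / 1000;
  const totalBytes = sizes.reduce((a, b) => a + b, 0);
  return (totalBytes * 8) / seconds;
}

Repeating this with an increasing number of streams until the aggregate stops improving gives a rough idea of when the link is saturated.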
Determining the speed of a connection is highly dependent on many factors, and what is obtained is only an approximation.
Related
I've been scratching my head and trying things for about a week now, so I hope I can find help here.
I'm making an application that provides real-time data to the client. I've thought about Server-Sent Events, but AFAIK that doesn't allow per-user responses.
WebSockets are also an option, but I'm not convinced about them; let me sketch the scenario I built with WS:
The server fetches 20 records every second and pushes them into an array.
This array gets sent to all websocket connections every second; see the pseudo-code below:
// Pseudo-code: every new connection starts another interval that broadcasts the shared array.
let items = [ { /* ... some data ... */ } ];

io.on("connection", socket => {
  setInterval(() => {
    io.emit("all_items", items);   // broadcast to every connected client
  }, 1000);
});
The user can select some items in the front end; the websocket receives this per connection.
However, I'm convinced the way I'm approaching this is not good and enormously inefficient. Let me sketch what I want the program to achieve:
There is a database with, let's say, 1,000 records.
A user connects to the back end from a (React) front end and gets attached to the main "stream" of about 20 fetched records (without filters), which the server fetches every second: SELECT * FROM Items LIMIT 20
Here comes the complex part:
The user clicks some checkboxes with custom filters (in the front end), e.g. location = Shelf 2. Now, what's supposed to happen is that the websocket ALWAYS shows 20 records for that user, no matter what the filters are.
I've imagined having a custom query for each user with custom options, but I think that's bad and will absolutely destroy the server if you have, say, 10,000 users.
How would I be able to take this on? Please, everything helps a little, thank you in advance.
I have to do some guessing about your app. Let me try to spell it out while talking just about the server's functionality, without mentioning MySQL or any other database.
I guess your server maintains about 1k datapoints with volatile values. (It may use a DBMS to maintain those values, but let's ignore that mechanism for the moment.) I guess some process within your application changes those values based on some kind of external stimulus.
Your clients, upon first connecting to your server, start receiving a subset of twenty of those values once a second. You did not specify how to choose that initial subset. All newly-connected clients get the same twenty values.
Clients may, while connected, apply a filter. When they do that, they start getting a different, filtered, subset from among all the values you have. They still get twenty values. Some or all the values may still be in the initial set, and some may not be.
I guess the clients get updated values each second for the same twenty datapoints.
You envision running the application at scale, with many connected clients.
Here are some thoughts on system design.
Keep your datapoints in RAM in a suitable data structure.
Write JS code to apply the client-specified filters to that data structure (see the sketch after this list). If that code is efficient you can handle millions of data points this way.
Back up that RAM data structure to a DBMS of your choice; MySQL is fine.
When your server first launches, load the data structure from the database.
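A minimal sketch of that in-memory filtering, assuming hypothetical field names (id, location) based on your "Shelf 2" example:

// Datapoints kept in RAM; loaded from MySQL at startup, updated by whatever
// process changes the values.
const datapoints = new Map();   // id -> { id, location, value, ... }

// Return up to twenty datapoints matching the client-specified filter.
function selectForClient(filter) {
  const matches = [];
  for (const point of datapoints.values()) {
    if (!filter.location || point.location === filter.location) {
      matches.push(point);
      if (matches.length === 20) break;   // the client always gets twenty records
    }
  }
  return matches;
}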
To get to the scale you mention you'll need to load-balance all this across at least five servers. You didn't mention the process for updating your datapoints, but it will have to fan out to multiple servers, somehow. You need to keep that in mind. It's impossible to advise you about that with the information you gave us.
But, YAGNI. Get things working, then figure out how to scale them up. (It's REALLY hard work to get to 10K users; spend your time making your app excellent for your first 10, then 100 users, then scale it up.)
Your server's interaction with clients goes like this (ignoring authentication, etc).
A client connects, implicitly requesting the "no-filtering" filter.
The client gets twenty values pushed once each second.
A client may implicitly request a different filter at any time.
Then the client continues to get twenty values, chosen by the selected filter.
So, most client communication is pushed out, with an occasional incoming filter request.
This lots-of-downstream-traffic, little-bit-of-upstream-traffic pattern is an ideal scenario for Server-Sent Events. WebSockets or socket.io are also fine. You could structure it like this.
New clients connect to the SSE endpoint at https://example.com/stream
When applying a filter they reconnect to another SSE endpoint at https://example.com/stream?filter1=a&filter2=b&filter3=b
The server sends data each second to each open SSE connection, applying that connection's filter. (Streams work very well for this in Node.js; take a look at the server-side code of the signalhub package for an example.)
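A bare-bones sketch of such an SSE endpoint, assuming Express; the route, the query-parameter mapping and selectForClient() (the filter helper sketched above) are assumptions, not a prescribed API:

const express = require("express");
const app = express();

// Placeholder for whatever produces the twenty filtered records (see the sketch above).
function selectForClient(filter) { return []; }

app.get("/stream", (req, res) => {
  res.set({
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    "Connection": "keep-alive",
  });
  const filter = { location: req.query.filter1 };   // hypothetical mapping of query params to a filter
  const timer = setInterval(() => {
    res.write(`data: ${JSON.stringify(selectForClient(filter))}\n\n`);
  }, 1000);
  req.on("close", () => clearInterval(timer));      // stop pushing when the client disconnects
});

app.listen(3000);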
I have to build a website capable of showing many live streams (video-surveillance cameras) at the same time.
So far I'm using MJPEG and JS to play my live videos, and it works well... but only up to 6 streams!
Indeed, I'm stuck with the 6-parallel-downloads limit most browsers have (link).
Does someone know how to bypass this limit? Is there a trick?
So far, my options are:
increase the limit (only possible in Firefox), but I don't like messing with my users' browser settings
merge the streams into one big stream/video on the server side, so that there is only one download at a time. But then I won't be able to deal with each stream individually, will I?
switch to a JPEG stream and deal with a queue of images to be refreshed on the front end (but if I have, say, 15 streams, I'm afraid I'll overwhelm the client browser with requests: 15 × 25 images/s)
Do I have any other options? Is there a trick or a library? For example, could I merge my streams into one big pipe (so one download at a time) but still have access to each one individually in the front-end code?
I'm sure I'm on the right stack-exchange site to ask this, if I'm not please tell me ;-)
Why not stream everything over one connection (if you have control over the server side and the line can handle it)? You make one request for all 15 streams to be sent/streamed over one connection (not as one big stream), so the headers of each chunk have to match the appropriate stream ID. Read more: http://qnimate.com/what-is-multiplexing-in-http2/
More in-depth here: https://hpbn.co/http2/
With HTTP/1.0/1.1 you are out of luck for this scenario. Back when those were developed, one video or MP3 file was already heavy stuff (workarounds were e.g. torrent libraries, but they are unreliable and not suited to most scenarios beyond mere downloading/streaming). For your interactive scenario, HTTP/2 is the way to go, IMHO.
As Codebreaker007 said, I would prefer HTTP/2 stream multiplexing too. It is specifically designed to get around this very problem of too many concurrent connections.
However, if you are stuck with HTTP/1.x, I don't think you're completely out of luck. It is possible to merge the streams in a way that lets the client side destructure and manipulate the individual streams, although admittedly it takes a bit more work, and you might have to resort to client-side polling.
The idea is simple - define a really simple data structure:
[streamCount len1 data1 len2 data2 ...]
Byte 0 ~ 3: 32-bit unsigned int, number of merged streams
Byte 4 ~ 7: 32-bit unsigned int, length of data of stream 1
Byte 8 ~ 8+len1-1: binary data of stream 1
Byte 8+len1 ~ 8+len1+3: length of data of stream 2
...
Each stream's data is allowed to have a length of 0 and is handled no differently in that case.
On the client side, poll continuously for more data, expecting this data structure. Then destructure it and pipe the data into each individual stream's buffer. You can then still manipulate the component streams individually.
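A rough sketch of that client-side destructuring, assuming the layout above with big-endian 32-bit lengths:

// Split one merged response (an ArrayBuffer) into per-stream byte chunks.
function destructureFrame(arrayBuffer) {
  const view = new DataView(arrayBuffer);
  const streamCount = view.getUint32(0);           // byte 0 ~ 3
  const chunks = [];
  let offset = 4;
  for (let i = 0; i < streamCount; i++) {
    const len = view.getUint32(offset);            // length of this stream's data
    offset += 4;
    chunks.push(new Uint8Array(arrayBuffer, offset, len));   // may be empty (len = 0)
    offset += len;
  }
  return chunks;   // chunks[i] belongs to stream i
}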
On the server side, cache the data from the individual component streams in memory. Then, in each response, empty the cache, compose this data structure and send it.
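And a matching server-side sketch in Node.js; the shape of the cache (an array of Buffers, one per stream) is an assumption:

// On each poll, drain the per-stream caches and build one merged response.
function composeFrame(caches) {
  const header = Buffer.alloc(4);
  header.writeUInt32BE(caches.length, 0);            // streamCount
  const parts = [header];
  for (let i = 0; i < caches.length; i++) {
    const lenBuf = Buffer.alloc(4);
    lenBuf.writeUInt32BE(caches[i].length, 0);       // lenN
    parts.push(lenBuf, caches[i]);                   // lenN followed by dataN (may be empty)
    caches[i] = Buffer.alloc(0);                     // empty the cache for the next round
  }
  return Buffer.concat(parts);
}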
But again, this is very much a plaster solution. I would recommend using HTTP2 stream as well, but this would be a reasonable fallback.
Huge JSON server responses: around 50 MB to 100 MB, for example.
From what I know, the browser might crash when loading that much data into a table (I usually use DataTables). The result: memory usage reaches almost 8 GB and the browser crashes. Chrome might never return a result; Firefox will usually ask whether I want to wait or kill the process.
I'm about to start working on a project which will request huge JSONs, all compressed (done on the server side in PHP). The purpose of my report is to fetch the data and display it all in one table that is easy to filter and order, so I can't see how lazy loading would help in this specific case.
I might use a Vue.js datatable library this time (not sure which one specifically).
What exactly is using so much of my memory? I know for sure that the JSON result is received. Is it the rendering/parsing of the JSON into the DOM? (I'm referring to the DataTables example for now: https://datatables.net/examples/data_sources/ajax)
What are the best practices in this kind of situation?
I started researching this issue and noticed that there are posts from 2010 that seem like they're not relevant at all.
There is no limit on the size of an HTTP response. There is a limit on other things, such as:
local storage
session storage
cache
cookies
query string length
memory (per your CPU limitations or browser allocation)
Instead, the problem is most likely with the implementation of your datatable. You can't just insert 100,000 nodes into the DOM and not expect some kind of performance impact. Furthermore, if the datatable performs logic against each of those data points as they come in, processing them before the node insertion, that's also going to be a big no-no.
What you've done here is essentially move the legwork of pagination from the server to the client, with dire consequences.
If you must return a response that big, consider using one of the storage options that browsers provide (a few are mentioned above). Then paginate off of the stored JSON response.
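As an illustration, a minimal sketch of paginating off the stored response on the client (the names and page size are arbitrary; an in-memory variable is used here, but the same idea applies to any of the storage options above that can hold the data):

// Keep the big parsed response once, and only hand one page at a time to the table.
let allRows = [];   // filled once from the big JSON response

function getPage(pageNumber, pageSize = 100) {
  const start = pageNumber * pageSize;
  return allRows.slice(start, start + pageSize);   // only these rows get rendered
}

// fetch("/report.json").then(r => r.json()).then(rows => { allRows = rows; renderTable(getPage(0)); });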
I have created a website for a friend. Because he wished to have a music player continue to play music through page loads, I decided to load content into the page via ajax (facilitated by jQuery). It works fine, it falls back nicely when there is no javascript, and the back/forward buttons are working great, but it's dreadfully slow on the server.
A couple points:
The initial page load is fairly quick. The Chrome developer console tells me that "index.php" loads in about 2.5 seconds. I have it set up so that query-string parameters dictate which page is loaded, and this time frame is approximately the same for all of them. For the homepage, 8.4 KB of data is loaded.
When I load the content via an ajax request, no matter the size of the data downloaded, it takes approximately 20 seconds. The smallest amount of data loaded this way is about 500 bytes. There is obviously a mismatch here.
Chrome tells me that the vast majority of the time is spent "waiting", which I take to mean that the server is handling the request. So that can only mean, I guess, that either my code is taking a long time or something screwy is going on with the server. I don't think it's my code, because it's fairly minimal:
$file = "";
if (isset($_GET['page'])) {
$file = $_GET['page'];
} else if (isset($_POST['page'])) {
$file = $_POST['page'];
} else {
$file = "home";
}
$file = 'content/' . $file . '.php';
if (file_exists($file)) {
include_once($file);
} else {
include_once('content/404.php');
}
This is in a content_loader.php file which my javascript (in this scenario) sends a GET request to along with a "page" parameter. HTML markup is returned and put into a DIV on the page.
I'm using the jQuery .get() shorthand function, so I don't imagine I could be messing anything up there, and I'm confident it's not a JavaScript problem, because the delay is in waiting for the data from the server. And again, even when the data is very small, it takes about 20 seconds.
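The call is basically just this (the selector and page value are illustrative):

// jQuery shorthand GET; the returned HTML is placed into a DIV on the page.
$.get("content_loader.php", { page: "home" }, function (html) {
  $("#content").html(html);
});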
I currently believe it's a problem with the server, but I don't understand why a request made through javascript would be so much slower than a request made the traditional way through the browser. As an additional note, some of the content pages do connect to a MySQL database, but some do not. It doesn't seem to matter what the page requires for processing or how much data it consists of, it takes right around 20 seconds.
I'm at a loss... does anyone know of anything that might explain this? Also, I apologize if this is not the correct place for such a question, none of the other venues seemed particularly well suited for the question either.
As I mentioned in my comment, a definite possibility could be reverse DNS lookups. I've had this problem before and I bet it's the source of your slow requests. There are certain Apache config directives you need to watch out for in both regular apache and vhost configs as well as .htaccess. Here are some links that should hopefully help:
http://www.tablix.org/~avian/blog/archives/2011/04/reverse_dns_lookups_and_apache/
http://betabug.ch/blogs/ch-athens/933
To find more resources just Google something like "apache slow reverse dns".
A little explanation
In a reverse DNS lookup an attempt is made to resolve an IP address to a hostname. Most of the time with services like Apache, SSH and MySQL this is unnecessary and it's a bad idea as it only serves to slow down requests/connections. It's good to look for configuration settings for your different services and disable reverse DNS lookups if they aren't needed.
In Apache there are certain configuration settings that cause a reverse lookup to occur, such as HostnameLookups and allow/deny rules that specify domains instead of IP addresses. See the links above for more info.
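For example, the kind of thing to check in httpd.conf or the vhost config (the IP range below is just a placeholder):

# Make sure hostname lookups are off (this is also Apache's default)
HostnameLookups Off

# Prefer IP-based access rules over domain-based ones, since domain names
# force reverse lookups (Apache 2.2 syntax shown; 2.4 uses Require ip)
Order deny,allow
Deny from all
Allow from 192.0.2.0/24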
As you suggested in your comment, the PHP script is executing quickly once it finally runs. The time is spent waiting on Apache - most likely to do a reverse DNS lookup, and failing. You know the problem isn't with your code, it's with the other services involved in the request.
Hope this helps!
I have a script which uploads a lot of POST data using jQuery, but this interferes with all other requests, as the outgoing data swamps anything else the browser (and other things, like SSH clients) might want to send.
Is it possible (unlikely, yes) to tell the connection to slow down a bit, since it's not a priority, and let other connections through?
jQuery is tagged, because that's the major library I'm using, but I can work on a lower level if the answer needs it.
There was no definitive answer to this question, but the ideas presented by commenters have worked over time.
When I need to upload a lot of similar data these days, I write a function which handles the actual upload. It gathers data and "pauses" for a few ms to see if there will be further upload requests. When sufficient time has passed with no new upload requests (or the queue reaches a certain length or size), the function aggregates all the upload data into a single upload to a server-side function designed to split the items apart, handle them, and return the various results to the caller, which then handles callbacks if there are any.
The above may sound complex, but it has made a huge difference in reducing network swamping.
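As an illustration, a minimal sketch of that batching approach (the endpoint, the 50 ms quiet window and the batch-size limit are arbitrary assumptions):

// Queue upload payloads and flush them as one POST after a short quiet period.
var queue = [];
var flushTimer = null;

function queueUpload(payload, callback) {
  queue.push({ payload: payload, callback: callback });
  clearTimeout(flushTimer);
  if (queue.length >= 20) {
    flush();                                  // queue is long enough, send now
  } else {
    flushTimer = setTimeout(flush, 50);       // otherwise wait briefly for more items
  }
}

function flush() {
  var batch = queue;
  queue = [];
  $.post("/batch_upload.php", { items: JSON.stringify(batch.map(function (b) { return b.payload; })) })
    .done(function (results) {
      // the server returns one result per item, in order; hand each back to its callback
      batch.forEach(function (b, i) {
        if (b.callback) b.callback(results[i]);
      });
    });
}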