I have a website with a backend in Python (Django) and JavaScript, hosted on Heroku. I also have Python code that does image classification with EfficientNet, and I want to integrate this code into my website.
The logical sequence of ideas is as follows:
The user uploads an image to the site;
The image is classified by the Python code;
The algorithm returns an image;
The returned image is posted on the site.
Does anyone know what would be the best way to do this?
First of all, yes, what you describe is possible to implement. I would do the following:
Use Celery to run the work asynchronously: when the photo is uploaded, Django hands the task (in this case, running the CNN) to Celery and can leave the photo in a pending status; once the task is complete, it changes the status and the photo appears published on the platform.
I recommend using asynchronous tasks here for the following reasons:
Running the convolutional neural network can take some time. Remember that the default maximum response time for an HTTP request is 30 seconds (Heroku's router, for example, cuts requests off at that point); the request could be cut short, which the user would see as an error, and even when it succeeds, having to wait that long after uploading a photo makes the site feel slow. With asynchronous tasks, the HTTP response can immediately tell the user that the image is being analyzed, and the analysis itself is not bound by the 30-second limit. Also, many simultaneous image uploads could otherwise crash the server; with Celery you can add queues (using Redis or RabbitMQ) to handle that.
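Roughly, the Celery task and the view that enqueues it could look like the sketch below. This is only an illustration of the flow; model and function names such as Photo, classify_image and the status constants are placeholders I am assuming, not part of the question.

```python
# tasks.py -- sketch of the asynchronous classification task
from celery import shared_task
from .models import Photo                 # assumed model with image, result_image and status fields
from .classifier import classify_image    # assumed wrapper around the EfficientNet code

@shared_task
def classify_photo(photo_id):
    photo = Photo.objects.get(pk=photo_id)
    result_path = classify_image(photo.image.path)   # this is the slow part
    photo.result_image = result_path
    photo.status = Photo.STATUS_PUBLISHED
    photo.save()

# views.py -- the upload view returns immediately and leaves the photo pending
from django.http import JsonResponse
from .models import Photo
from .tasks import classify_photo

def upload_photo(request):
    photo = Photo.objects.create(image=request.FILES["image"],
                                 status=Photo.STATUS_PENDING)
    classify_photo.delay(photo.pk)        # enqueue; does not block the request
    return JsonResponse({"id": photo.pk, "status": "pending"})
```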
If you want the client to know the status of the image in real time, you could also add a WebSocket: when the image is uploaded, the response includes a URL for a WebSocket on which you will receive information about the image once it has been processed. You can use django-channels for this.
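If you go the django-channels route, the consumer could be as minimal as the following sketch (the group-name scheme and event payload are assumptions for illustration); the Celery task above would push a message into the same group once classification finishes.

```python
# consumers.py -- minimal django-channels consumer for per-photo status updates
from channels.generic.websocket import AsyncJsonWebsocketConsumer

class PhotoStatusConsumer(AsyncJsonWebsocketConsumer):
    async def connect(self):
        photo_id = self.scope["url_route"]["kwargs"]["photo_id"]
        self.group_name = f"photo_{photo_id}"
        await self.channel_layer.group_add(self.group_name, self.channel_name)
        await self.accept()

    async def disconnect(self, code):
        await self.channel_layer.group_discard(self.group_name, self.channel_name)

    # handles events sent with type "photo.status"
    async def photo_status(self, event):
        await self.send_json({"status": event["status"]})

# at the end of the Celery task, notify the group:
from asgiref.sync import async_to_sync
from channels.layers import get_channel_layer

def notify_published(photo_id):
    async_to_sync(get_channel_layer().group_send)(
        f"photo_{photo_id}", {"type": "photo.status", "status": "published"}
    )
```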
Related
I'm building an application in React that lets users upload pictures to an S3 bucket. After the upload is finished, I want to spin up some workers to process the pictures through a neural network that will analyze each image and tag it based on its content.
I don't want to run this action on the server itself, rather delegate it to a separate set of instances that will handle the processing. What would be the best solution to handle such a problem? It'll need to scale nicely depending on the amount of data to process.
It'd be great if this could be easily integrated using Node.js or Python.
I'm working on a project in which I have to develop a simple PHP-based web module from which users (admins) can send SMS messages (follow-ups) to students, for advertisement and other needs.
The SMS API is very simple: I just need to send a GET request to a cross-origin domain along with the phone number and message.
I tested it with the file_get_contents("sms_api_url?credentials"); and it works fine.
What worries me is that the SMS will be sent to TONS of numbers, so I have to send the request multiple times in a loop, which will take a lot of time and, I think, consume too many resources.
Also, the max execution time for PHP is set to 30 seconds, which I don't want to change.
I thought about using client-side JavaScript to send the cross-origin requests in a loop so that it won't load my server, but that wouldn't be secure, as it would reveal the API credentials.
What technology should I use to accomplish my goals and send tons of GET requests efficiently?
You've told us nothing about the actual volume you need to handle, the metrics for processing/connection time, or what constraints there are on the implementation.
As it stands this is way too broad to answer. But some approaches you might consider are:
1) Running concurrent requests - but note that, just like domain sharding, this can undermine your bandwidth if overused.
2) You can have PHP scripts running indefinitely outside the webserver (using the CLI SAPI), and these can be launched from a web session.
I thought about using client-side JavaScript to send the cross-origin requests in a loop so that it won't load my server, but that wouldn't be secure, as it would reveal the API credentials.
If you send directly to the endpoint, then yes, you'd need the credentials in the browser. But if you implement a proxy script on your webserver that injects the credentials, the browser only has to talk to your own server and never sees the SMS API credentials.
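To make the proxy idea concrete, here is a rough sketch. It is written in Python/Flask purely for illustration (your stack is PHP, and the provider URL and parameter names are assumptions); the pattern is the same in any language: the browser posts to your server, and your server adds the secret before forwarding.

```python
# proxy_sms.py -- illustrative proxy that keeps the SMS API credentials server-side.
# Provider URL and parameter names are assumptions.
import os
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
SMS_API_URL = "https://sms-provider.example/send"   # assumed endpoint
SMS_API_KEY = os.environ["SMS_API_KEY"]             # never shipped to the browser

@app.route("/send-sms", methods=["POST"])
def send_sms():
    # The browser only supplies the recipient and the message;
    # the credential is injected here, on the server.
    params = {
        "key": SMS_API_KEY,
        "to": request.form["to"],
        "message": request.form["message"],
    }
    resp = requests.get(SMS_API_URL, params=params, timeout=10)
    return jsonify({"ok": resp.ok})
```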
Using cron has certain advantages - but you really don't want to be spawning a task from crond to send one SMS message; it needs to run in batches, and you need to manage the concurrency.
You might want to consider switching to a different aggregator who can offer bulk processing.
Regardless of the approach, you will need a way to store the messages/phone numbers and a locking mechanism around retrieval and processing.
Personally, I'd be tempted to look at using an MTA for this or perhaps even Kannel - but that's more an approach for handling volumes in excess of 300,000 per day.
Sending as many network requests as needed and doing it in less than 30 seconds are two requirements that somewhat contradict each other. Also, raw "efficiency" can just mean squeezing every last resource out of the server, which may not be desirable.
That said, I think the key points are:
I may be wrong but, as far as I know, there are only two ways to prevent an unauthorised party from consuming a web service: private credentials and IP filtering. Neither is possible in browser-based JavaScript.
Don't make a human being sit staring at the computer until a task of this kind completes. There's absolutely no need to, and it can even cause the task to abort.
If you need to send the same text to different recipients, find out whether the SMS provider has an API that allows you to do it in a single request. Large batch deliveries get one or two orders of magnitude harder when this feature is not available.
In short you need:
A command line script
A task scheduler (e.g. cron)
Prefer server stability to maximum efficiency (you may even want to throttle your requests)
Send the requests from the server, but don't do it in the PHP script that generates the page.
Instead, store information about the desired messages in a database.
Write another program which periodically checks the database for unsent messages and makes the calls to the API. You could run it using cron.
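As a rough sketch of that worker (shown in Python only to illustrate the pattern; your stack is PHP, and the table layout, API URL and parameters are all assumptions), it could look like this:

```python
# send_pending_sms.py -- poll the database for unsent messages and call the SMS API.
# Table layout, API URL and parameter names are assumptions for illustration.
import sqlite3
import requests

SMS_API_URL = "https://sms-provider.example/send"       # assumed endpoint
API_CREDENTIALS = {"user": "...", "password": "..."}    # load from config in practice
BATCH_SIZE = 100                                        # throttle: only this many per run

def main():
    db = sqlite3.connect("sms_queue.db")
    rows = db.execute(
        "SELECT id, phone, message FROM sms_queue WHERE sent = 0 LIMIT ?",
        (BATCH_SIZE,),
    ).fetchall()
    for msg_id, phone, message in rows:
        params = dict(API_CREDENTIALS, number=phone, message=message)
        resp = requests.get(SMS_API_URL, params=params, timeout=10)
        if resp.ok:
            # mark as sent so the next run skips it
            db.execute("UPDATE sms_queue SET sent = 1 WHERE id = ?", (msg_id,))
            db.commit()

if __name__ == "__main__":
    main()
```

A crontab entry along the lines of `* * * * * python3 /path/to/send_pending_sms.py` would run it every minute, and the web page only has to insert rows into the queue table.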
I think this should be fairly simple, but I think I'm overthinking it and it's not making sense.
What I am currently doing
I am creating a web app using Node + React to record audio in the browser. I'm using RecordRTC on the client side to record the audio from the user's microphone. All is fine and dandy, but sometimes it takes a long time to upload the audio file after the user is finished singing. I want to process this file before sending it back to the user in the next step, so speed is critical here as they are waiting for this to occur.
In order to make the experience smoother for my users, I want to kick off the upload process as soon as I begin to receive the audio blobs from RecordRTC. I can get access to these blobs because RecordRTC allows me to pass a timeslice value (in ms) and an 'ondataavailable' function that gets passed a blob every timeslice milliseconds.
What I have tried
Currently I have it all easily working with FormData() as I only send the file once the user has finished singing.
My first idea was to find an example of something like the Fetch API being used in a manner that resembles what I'm after. There are plenty of examples, but all of them treat the source file as already being available; since I want to continually add blobs as they arrive (without being able to pre-determine when they will stop coming, as a user may decide to stop singing early), this doesn't look promising.
I then considered a 'write my own' process whereby many requests are made instead of one long continuous request. This would involve attaching a unique identifier to each request and having the server concatenate the chunks whose ids match. However, I'm not sure how flexible this would be in, say, a multi-server environment, not to mention handling dropped connections etc., and there is no real way to tell the server to scrap everything if the user aborts, e.g. by closing the tab/webpage.
Finally, I looked into what was available through the likes of NPM etc without success, before conceding that perhaps my Google Fu was letting me down.
What I want
Ideally, I want to create a SINGLE new request once the recording begins, then take each blob as I receive it in 'ondataavailable' and push it into that request (which pumps it through to my server as soon as it receives something), indefinitely. Once the audio stops (I get this event from RecordRTC as well, so I can control this), I want to finish/close the request so that the server knows it can now begin to process the file. As part of the upload I also need to pass a field or two of text data in the body, so this will need to be handled as well. On the server side, each chunk should be accessible as soon as the server receives it, so that I can create/append to the audio file on the server side and have it ready for processing almost immediately after the user has finished singing.
Note: The server is currently set to look for and process multi-part uploads via the multer library on npm, but I am more than happy to change this in order to get the functionality I want.
Thanks!
Providing an update for anyone that may stumble upon this question in their own search.
We ended up 'rolling our own' custom uploader which, on the client side, sends the audio to the server in chunks of up to five 1-second blobs. Each request contains a 'request number', which is simply the previous request number +1, starting at 1. The reason for sending five 1-second blobs is that RecordRTC, at least at the time, would not capture the final few seconds; e.g. if using 5-second blobs instead, a 38-second song would lose the final 3 seconds. Upon reaching the end of the recording, the client sends a final request (marked with an additional header to let the server know it's the final one). The uploader works in a linked-list style to ensure that each previous request has been processed before sending the next one.
On the server side, the five blobs are appended into a single 5-second audio file via FFMPEG. This does introduce an external dependency, but we were already using FFMPEG for much of our application, so it was an easy decision. The produced file has the request number appended to its filename. Upon receiving the final request, we use FFMPEG again to do a final concatenation of all the received files to get our final file.
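For anyone wanting a concrete picture of that final concatenation step, the usual way to do it is FFMPEG's concat demuxer, roughly as in the sketch below. This is Python purely for illustration (our actual implementation is Node), and the file-naming scheme, directory layout and .webm extension are assumptions.

```python
# concat_chunks.py -- join per-request audio files with ffmpeg's concat demuxer.
# File-naming scheme and paths are assumptions for illustration.
import glob
import subprocess
import tempfile

def concat_chunks(session_dir, output_path):
    # Chunk files are assumed to be named like chunk_1.webm, chunk_2.webm, ...
    chunks = sorted(
        glob.glob(f"{session_dir}/chunk_*.webm"),
        key=lambda p: int(p.rsplit("_", 1)[1].split(".")[0]),
    )
    # The concat demuxer reads a listing file with one "file '<path>'" line per chunk.
    with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as listing:
        for path in chunks:
            listing.write(f"file '{path}'\n")
    subprocess.run(
        ["ffmpeg", "-y", "-f", "concat", "-safe", "0",
         "-i", listing.name, "-c", "copy", output_path],
        check=True,
    )
```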
On very slow connections we're seeing time savings upwards of 60 seconds, so it has significantly improved the app's usability on slower internet connections.
If anyone wants the code to use for themselves please PM through here. (It's fairly unpolished but I will clean it up before sending)
I have written an application in Node.js which takes input from the user and generates PDF files based on a few templates.
I am using the pdfkit npm package for this purpose. My application is running in production, but it is very slow; the reasons are below.
What problem I am facing:
It works synchronously. Let me explain with an example: suppose a request comes to the application to generate a PDF; it starts processing, and after processing it returns the response with the generated PDF URL. But if multiple requests come to the server, it processes each request one by one (synchronously).
All requests in the queue have to wait until the previous one is finished.
Most of the time my application gives a Timeout or Internal Server Error.
Why I cannot change the library:
I have written 40 templates in JS for pdfkit, and each template is 1000-3000 lines.
If I change the library, I have to rewrite those templates for the new one.
It would take many months to rewrite and test them properly.
What solution I am using now:
I am managing a queue: once a request comes in, it gets queued and an acknowledgement message is sent back to the user in the response.
Why this solution is not feasible:
The user should be given a valid PDF URL when the request succeeds, but with the queue approach the user only gets a confirmation message, and the PDF is processed later in the queue.
What kind of solution I am seeking now:
Is there any way I can make this application multi-threaded/asynchronous, so that it is capable of handling multiple requests at a time without blocking resources?
Please save my life.
I hate to break it to you, but doing computation in the order tasks come in is a pretty fundamental part of node. It sounds like loading these templates is a CPU-bound task, and since Node is single-threaded, it knocks these off the queue in the order they come in.
On the other hand, any framework would have a similar problem. Node being single-threaded means it's actually very efficient, because it doesn't lose cycles to context switching.
How many PDF-generations can your program handle at once? What type of hardware are you running this on? If it's failing on a few requests a second, then there's probably a programming fix.
For node, the more things you can make asynchronous the better. For example, any time you're reading a file in, it should be asynchronous.
Can you post the code for one of your PDF-creating request functions?
Just for fun I was creating a JavaScript console for controlling my PC. It involves a small webserver that takes command strings and forwards them to the system using popen calls (to be more specific, popen4 on a Ruby Mongrel server). The stdout channel is redirected to the HTTP response.
The problem is that the response only arrives once the entire contents of stdout have been sent. This is OK for small commands, but not for a command like find /, which lists all the files on the system. In such situations it would be nice to have the results shown progressively in the webview (just like in a regular terminal).
I thought that using XMLHttpRequest synchronously might result in progressive downloading, but it doesn't seem so.
Is there any way to make it work?
Quick question: are you flushing the response stream? If not, the request will wait until the stream is flushed/closed. Just a thought, as this is the case when creating progressive downloads of files etc.
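To illustrate what "flushing progressively" means in practice, here is a minimal sketch of streaming a command's stdout to the HTTP response chunk by chunk instead of buffering it all. It uses Python/Flask purely to show the idea; your server is Ruby/Mongrel, so the equivalent there would be writing and flushing the response body as each line of popen4 output arrives.

```python
# stream_command.py -- stream a command's stdout to the HTTP response as it is produced.
# Flask is used here only to illustrate the idea of a progressively flushed response.
import subprocess
from flask import Flask, Response

app = Flask(__name__)

@app.route("/run")
def run():
    def generate():
        proc = subprocess.Popen(["find", "/"], stdout=subprocess.PIPE, text=True)
        for line in proc.stdout:
            # Each yielded chunk is sent to the client immediately
            # (chunked transfer encoding) instead of waiting for the
            # whole command to finish.
            yield line
    return Response(generate(), mimetype="text/plain")
```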