I'm trying to create a search bar where users can type a query that then searches through my JSON file and retrieves the matching content.
Examples:
1- User input: "15 inch touchscreen"
Match with: "15-inch", "15", "Touchmonitor", "1537L", "Stand for 1501L-1601L"
2- User input: "3243 ids"
Match with: "3243L", "IDS"
Basically I'm going for a full-blown search function, so speed will obviously be a factor.
Questions:
Is there any way to handle partial matches like that in JavaScript or jQuery?
Would it be faster to load all products client-side on page load and then search through them later, or to search the JSON file at the time of the query?
I'm dealing with a JSON file of around 5,000 lines, so about 200 KB.
Hard to say without seeing your code.
The preferred way is to do the heavy calculations and logic on the server side, especially if your data and logic are expected to grow in size and complexity.
So I'd definitely suggest looking into a product like Elastic: https://www.elastic.co/
However, if that file is constant (not expected to grow), you could definitely implement this in plain JavaScript on the client.
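If you do stay client-side, here is a minimal sketch of tokenized partial matching over the loaded products. The /products.json path and the name/model fields are assumptions about your JSON's shape; adjust to your schema.

// Load the JSON once on page load, then filter it in memory on each keystroke.
var products = [];
$.getJSON('/products.json', function (data) { products = data; });

function search(query) {
  var terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return products
    .map(function (p) {
      // One searchable string per product, built from whichever fields you index.
      var haystack = (p.name + ' ' + (p.model || '')).toLowerCase();
      // Score = how many query terms appear as substrings (partial matches).
      var score = terms.reduce(function (n, t) {
        return n + (haystack.indexOf(t) !== -1 ? 1 : 0);
      }, 0);
      return { product: p, score: score };
    })
    .filter(function (r) { return r.score > 0; })        // drop non-matches
    .sort(function (a, b) { return b.score - a.score; }) // best matches first
    .map(function (r) { return r.product; });
}

At roughly 200 KB of data, an in-memory scan per keystroke is usually fast enough; for fuzzier matching (e.g. "touchscreen" also hitting "Touchmonitor") a small client-side library like Fuse.js is a common choice.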
Related
I'm looking for suggestions on how to handle the following use case with the Python Django framework; I'm also open to using JavaScript libraries/AJAX.
I'm working with a pre-existing table/model called revenue_code with over 600 million rows of data.
The user will need to search three fields within one search (code, description, room) and be able to select multiple search results, similar to a Kendo multi-select control. I first started off by combining the codes in django-filter as shown below, but my application became unresponsive; after waiting 10-15 minutes I was able to view the search results but couldn't select anything.
https://simpleisbetterthancomplex.com/tutorial/2016/11/28/how-to-filter-querysets-dynamically.html
I've also tried Kendo controls, Select2, and Chosen, because I need the user to be able to select as many revenue codes as they need (upwards of 10-20), but all gave the same unresponsive page when attempting to load the data into the control/multi-select.
Essentially what I'm looking for is something like the example below, which allows the user to make multiple selections and can handle a massive amount of data without becoming unresponsive. Ideally I'd like to be able to run my search without displaying all the data.
https://petercuret.com/add-ajax-to-django-without-writing-javascript/
Is the Django framework meant to handle this type of volume? Would it be better to export this data into a file and read the file? I'm not looking for code, just some pointers on how to handle this use case.
What is the basic mechanism of "searching 600 million rows"? What a database does is build an index before search time, general enough to serve different types of queries; at search time it searches the index instead, which is much smaller (so it can fit in memory) and faster. But no matter what, "searching" by its nature has no "pagination" concept, and if 600 million records cannot fit in memory at the same time, parts of them must be swapped in and out repeatedly; the more parts, the slower the operation. These details are hidden behind the algorithms in databases like MySQL.
There are very compact representations, such as bitmap indexes, which let you search data like male/female very fast, or any data where one bit per piece of information suffices.
So whether you use Django or not does not really matter. What matters is the tuning of the database, the design of the tables to facilitate the queries (the types of indexes), and the total amount of memory on the server to keep the data in memory.
Check these out:
https://dba.stackexchange.com/questions/20335/can-mysql-reasonably-perform-queries-on-billions-of-rows
https://serverfault.com/questions/168247/mysql-working-with-192-trillion-records-yes-192-trillion
How many rows are 'too many' for a MySQL table?
You can't load all the data into your page at once. 600 million records is too many.
Since you mentioned select2, have a look at their example with pagination.
The trick is to limit your SQL results to maybe 100 or so at a time. When the user scrolls to the bottom of the list, it can automatically load in more.
Send the search query to the server, and do the filtering in SQL (or NoSQL or whatever you use). Database engines are built for that. Don't try filtering/sorting in JS with that many records.
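For illustration, a rough sketch of a Select2 setup that queries the server as the user types and pages results about 100 at a time. The /api/revenue-codes endpoint and its response shape are made up; the actual filtering happens in your database.

// Attached to a <select multiple> element so users can pick many codes.
$('#rev-codes').select2({
  minimumInputLength: 2,        // don't hit the server on 1-character input
  ajax: {
    url: '/api/revenue-codes',  // hypothetical endpoint that filters in SQL
    dataType: 'json',
    delay: 250,                 // debounce keystrokes
    data: function (params) {
      // The server should LIMIT its results (e.g. 100 rows per page).
      return { q: params.term, page: params.page || 1 };
    },
    processResults: function (data) {
      return {
        results: data.items,                // expected: [{ id: ..., text: ... }]
        pagination: { more: data.has_more } // true if another page exists
      };
    }
  }
});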
So, I have a main IndexedDB object store with around 30,000 records on which I have to run full-text search queries. Doing this with the ydn-db full-text search plugin generates a second object store with around 300,000 records. Since generating this "index" store takes quite long, I figured it would be faster to distribute the content of the index as well. This produced a zip file of around 7 MB which, after decompressing on the client side, yields more than 40 MB of data.
Currently I loop over this data, inserting records one by one (async, so callback time is used to parse the next lines), which takes around 20 minutes. Since I am doing this in the background of my application through a web worker, this is not entirely unacceptable, but it still feels hugely inefficient. Once populated, the database is actually fast enough to be used even on mid- to high-end mobile devices; however, a population time of 20 minutes to one hour on mobile devices is just crazy.
Any suggestions how this could be improved? Or is the only option minimizing the number of records? (Which would mean writing my own full-text search... not something I look forward to.)
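For what it's worth, if each record currently gets its own transaction, batching many puts into a single readwrite transaction often speeds up IndexedDB population considerably. A sketch with made-up store and callback names; whether it applies here depends on how the ydn-db plugin manages its transactions internally:

// Insert records in large batches, one transaction per batch,
// instead of one transaction per record.
function bulkInsert(db, storeName, records, batchSize, done) {
  var offset = 0;
  function nextBatch() {
    if (offset >= records.length) { done(); return; }
    var tx = db.transaction(storeName, 'readwrite');
    var store = tx.objectStore(storeName);
    var end = Math.min(offset + batchSize, records.length);
    for (var i = offset; i < end; i++) {
      store.put(records[i]); // all puts share the batch's transaction
    }
    offset = end;
    tx.oncomplete = nextBatch; // chain the next batch after this one commits
    tx.onerror = function (e) { console.error('batch failed', e); };
  }
  nextBatch();
}

// Hypothetical usage, e.g. from your web worker:
// bulkInsert(db, 'ft-index', parsedRecords, 1000, onPopulated);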
Your data size is quite large for a mobile browser. Unless the user is constantly using your app, it is not worth sending all the data to the client. You should use the server side for full-text search, while caching opportunistically, as illustrated by this example app. That way, the user doesn't have to wait for full-text search indexing.
Full-text search requires indexing all tokens (words) except some stop words. Stemming is activated only when lang is set to en. You should profile your app to see which parts are taking the time. I guess the browser is taking most of it; in that case, you cannot do much optimization other than parallelization. Sending pre-indexed data (as you described above) will not help much (but please confirm by comparing), and a web worker will not help either. I assume your app has no problem with slow responses due to indexing.
Do you have any complaint other than the slow indexing time?
I have a long JSON array that needs to be sent to an HTML5 mobile app and parsed. The whole array is around 700 KB (gzipped to 150 KB) and 554976 characters long at the moment, but it will grow over time.
Using jQuery to parse the JSON, my app crashes while trying to parse it. And so do JSONLint, json parser.fr, and any other online JSON validator I try, so I'm guessing eval() is not an option either.
It might be a broad question, but what is the maximum "acceptable" length for a JSON array?
I have already removed as much data as I can from the array; the only option I can think of is to split the array across 3-4 server calls and parse each part separately in the app. Is there any other option?
EDIT
Thanks to @fabien for pointing out that if JSONLint crashes, there is a problem in the JSON. There was a hidden "space" character in one of the nodes. It was parsed correctly on the server but not on the client.
I've parsed way bigger arrays with jquery.
My first guess is that there's an error in your JSON.
You have many ways to find it (Sublime Text can highlight the error, though that sometimes takes a while). Try pasting it into a web tool like http://www.jsoneditoronline.org/ and use any of the buttons (to format, or to send to the right-hand view). It will tell you where the error is.
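As a quick programmatic check, you can also wrap JSON.parse in a try/catch; most engines report roughly where parsing failed. A sketch, with the raw string coming from wherever you load your data:

var raw = loadRawJsonSomehow(); // hypothetical: the raw response text

try {
  var data = JSON.parse(raw);
  console.log('parsed', data.length, 'items');
} catch (err) {
  // V8-based browsers report e.g. "Unexpected token in JSON at position 12345"
  console.error('JSON is invalid:', err.message);
}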
I have about 40,000 contacts in the database and am using ASP.NET (VB.NET).
I have to search through these contacts with one text box that filters contacts instantly. I also need to accommodate multiple words in the textbox.
Loading all the contacts client-side with JSON and using a JavaScript table filter does not work efficiently: it waits for the whole table to load and fails beyond 10,000 or so records.
Please let me know if there is any way to achieve this efficiently.
Well, of course a pure JavaScript solution is going to have to load all of the contacts in order to filter them... JavaScript is purely on the client side of things.
What you need to do is research page methods and have your JavaScript call a page method, passing in exactly what has been typed so far. Your page method should then issue a select call to the database to pull the top N records that begin with what was typed. A good value for N is probably 10.
Also, you should probably have your JavaScript hold off on making the call until at least a few characters have been typed; something like 3 or 4 is usually good.
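A minimal sketch of that client side, assuming a hypothetical Contacts.aspx/Search page method and a renderResults helper; adapt to however your page method is actually exposed:

var timer;
$('#contact-search').on('input', function () {
  var text = $(this).val();
  clearTimeout(timer);
  if (text.length < 3) { return; }  // wait for at least 3 characters
  timer = setTimeout(function () {  // debounce rapid typing
    $.ajax({
      type: 'POST',
      url: 'Contacts.aspx/Search',  // hypothetical ASP.NET page method
      contentType: 'application/json; charset=utf-8',
      data: JSON.stringify({ prefix: text, top: 10 }),
      dataType: 'json'
    }).done(function (response) {
      renderResults(response.d);    // page methods wrap the result in .d
    });
  }, 300);
});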
I have a dictionary list of about 58,040 words and I don't think jQuery autocomplete can handle that many words, as the browser hangs.
The list is
words = ['axxx', 'bxxx', 'cxxx', /* and so on */];
$(".CreateAddKeyWords input").autocomplete({ source: words });
Am I doing something wrong?
Is there another free tool that I can use?
Edit
I am using .NET and have retrieved the data from the database, and I can loop through the data server-side, but how do I send the data back? If it should be JSON, what should the format look like?
Is there another free tool that I can use?
Yes: instead of hardcoding 58,040 words in your HTML or JavaScript file, you could load them from a remote data source using AJAX. Basically, you have a server-side script which, when queried with the current user input, prefilters the results and sends them to the client to display as suggestions.
You should set a minimum length of user entry before searching (so it isn't querying with only 1 or 2 characters).
$(".CreateAddKeyWords input").autocomplete({ source: words, minLength: 3 });
It's possible the browser is hanging because it is trying to search on the very first character, which is not very useful. ~58k entries is not a large dataset by most standards, especially once you narrow it with a 2-3 character minimum.
That's just way too much data to load into your webpage. Limit it to 2 letters.
1) set the autocomplete min length to at least 2
2) Create a webpage that returns JSON data - http://mydomain.com/words.php?q={letters}
You can have the filter sort be 'begins with' before 'contains'; or any variation you prefer.
Use that page as your remote data source. With the min length set, autocomplete knows when to query for new data.
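To answer the format question from the edit above: with a remote source, jQuery UI autocomplete expects the URL to return a JSON array of strings, or of objects with label/value properties. A sketch using the words.php endpoint from this answer; note that a plain string source sends the typed text as the term parameter, so the server should read term (or use a function source if it must receive q):

// The input queries the server instead of holding ~58k words in memory.
$(".CreateAddKeyWords input").autocomplete({
  source: "http://mydomain.com/words.php", // called as words.php?term={letters}
  minLength: 3
});

// Expected server response: a plain JSON array of strings, e.g.
//   ["apple", "applet", "applique"]
// or objects when display text and value differ:
//   [{ "label": "apple (fruit)", "value": "apple" }]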
I thought this was an interesting problem, and hacked up a backend service that solves auto-completion.
My code is at https://github.com/badgerman/fastcgi/ (look for complete.c), and the quick-and-dirty JavaScript proof of concept from that repository is currently at http://enno.kn-bremen.de/prefix.html (no guarantees that it will stay up for long, since it is running on the Raspberry Pi in my home).