How do I match this text faster? - javascript

I'm building an autosuggest for names. When the user types in the textbox, it hits the server and runs this:
var names = [ /* list of 1000 names */ ]; // I have a list of 1000 names; this is static.
var query = 'alex';
var matched_names = [];
//This is when it gets slow....
names.forEach(function(name){
if(name.indexOf(query) >= 0){
matched_names.push(name);
}
});
return matched_names;
How can I make this faster? I'm using Node.js

If the names are static then move this code to the client and run it there. The only reason to run code like this on the server is if the data source is dynamic in some way.
Doing this logic client-side will dramatically improve performance.

You should probably use filter instead, for one thing, because it's native:
var names = [ /* list of 1000 names */ ];
var query = 'alex';
var matched_names = names.filter(function(name) {
return name.indexOf(query) > -1;
});
return matched_names;

If you store the names in sorted order, then you can use binary search to find the region of names within the sorted order that start with the fragment of name that the user has typed so far, instead of checking all the names one by one.
On a system with a rather odd programming language, where I wanted to find all matches containing what the user had typed so far in any position, I got a satisfactory result for not much implementation effort by reviving the Key Word in Context (KWIC) index: http://en.wikipedia.org/wiki/Key_Word_in_Context. (Once at university I searched through a physical KWIC index, printed out from an IBM lineprinter and then bound as a document for just this purpose.)
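Here is a minimal sketch of the binary-search idea, assuming the names are lowercased and sorted once up front. The sample names and the helper names `lowerBound` and `prefixMatches` are made up for illustration. Note that this only finds names that start with the typed fragment, not names that contain it anywhere, which is what the indexOf version in the question does.

```javascript
// Assumption: the list is normalized (lowercased) and sorted once, not per request.
const sortedNames = ['Alexa', 'bob', 'alex', 'Alexander', 'alice', 'carol']
  .map((n) => n.toLowerCase())
  .sort();

// Classic lower-bound binary search: index of the first element >= target.
function lowerBound(arr, target) {
  let lo = 0;
  let hi = arr.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (arr[mid] < target) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

// All entries that start with prefix; they are contiguous in a sorted array.
function prefixMatches(arr, prefix) {
  const start = lowerBound(arr, prefix);
  let end = start;
  while (end < arr.length && arr[end].startsWith(prefix)) {
    end += 1;
  }
  return arr.slice(start, end);
}

const matched = prefixMatches(sortedNames, 'alex');
```

The binary search jumps straight to the first candidate, so the work per keystroke depends on the number of matches rather than on the size of the whole list.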

I would suggest doing this on the client side and, for now, preferring a while loop over the filter/forEach approach:
var names = [ /* list of 1000 names */ ]
, query = 'alex'
, i = names.length
, matched_names = [];
while(i--){
if(names[i].indexOf(query) > -1){
matched_names.push(names[i]);
}
}
return matched_names;
According to this benchmark, this will be much faster (even though filter/forEach are natively supported): http://jsperf.com/function-loops/4

Related

Building a progressive query in Mongoose based on user input

I'm writing a simple REST API that is basically just a wrapper around a Mongo DB. I usually like to use the following query params for controlling the query (using appropriate safeguards, of course).
_filter=<field_a>:<value_a>,<field_b>:<value_b>
some values can be prepended with < or > for integer greater-than/less-than comparison
_sort=<field_c>:asc,<field_d>:desc
_fields=<field_a>,<field_b>,<field_c>
_skip=<num>
_limit=<num>
Anyway, the implementation details on these are not that important, just to show that there's a number of different ways we want to affect the query.
So, I coded the 'filter' section something like this (snipping out many of the validation parts, just to get to the point):
case "filter":
var filters = req.directives[k].split(',');
var filterObj = {};
for(var f in filters) {
// field validation happens here...
splitFields = filters[f].split(':');
if (/^ *>/.test(splitFields[1])) {
filterObj[splitFields[0]] = {$gt: parseInt(splitFields[1].replace(/^ *>/, ''), 10)};
}
else if (/^ *</.test(splitFields[1])) {
filterObj[splitFields[0]] = {$lt: parseInt(splitFields[1].replace(/^ *</, ''), 10)};
}
else {
filterObj[splitFields[0]] = splitFields[1];
}
}
req.directives.filter = filterObj;
break;
// Same for sort, fields, etc...
So, by the end, I have an object to pass into .find(). The issue I'm having, though, is that the $gt gets changed into '$gt' as soon as it's saved as a JS object key.
Does this look like a reasonable way to go about this? How do I get around mongoose wanting keys like $gt or $lt, and Javascript wanting to quote them?
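On the quoting concern: in JavaScript, `{$gt: 5}` and `{'$gt': 5}` create exactly the same property, and the quotes are only how the key is printed. The sketch below illustrates that with a simplified stand-in for the filter parsing above. `parseFilter` is a hypothetical helper, not part of Mongoose, and whether the resulting object is a sensible argument for .find() still depends on the validation that was snipped out.

```javascript
// Quoted and unquoted keys produce identical objects.
const unquoted = { age: { $gt: 21 } };
const quoted = { 'age': { '$gt': 21 } };
const sameKeys = JSON.stringify(unquoted) === JSON.stringify(quoted);

// Hypothetical, simplified version of the 'filter' case from the question.
function parseFilter(directive) {
  const filterObj = {};
  directive.split(',').forEach((pair) => {
    const [field, raw] = pair.split(':');
    // Values like ">21" or "<5" become $gt / $lt comparisons.
    const match = /^\s*([<>])\s*(\d+)\s*$/.exec(raw);
    if (match) {
      const op = match[1] === '>' ? '$gt' : '$lt';
      filterObj[field] = { [op]: parseInt(match[2], 10) };
    } else {
      filterObj[field] = raw;
    }
  });
  return filterObj;
}

const parsed = parseFilter('age:>21,name:bob');
```

Here `Object.keys(parsed.age)` contains the plain string `$gt`, with no extra quote characters in the key itself.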

How to optimize this filtering in javascript?

I have a big object (called "container" in the following code) from a json file. This object contains many elements (about 20 000), each of them being an object with a "rank" property. It is too big to print here, but here is an object with the same structure:
{
guy1: {rank:0, infos:"the first guy ever"},
guy0: {rank:2, infos:"another guy"},
something: {rank:1, infos:"something else"}
}
First I wanted to put the 10 of them with the smallest rank in a list (after I slightly modified them with a function I call here modify), so I did:
var res = [];
for (var key in container) {
if (container.hasOwnProperty(key) && container[key]["rank"]<10) {
res.push(modify(container[key]));
}
}
But now I want to do_something with the 10 of them with the smallest rank but only among those who pass my_test. I could do:
var res = [];
var filt_cont = Array.from(container).filter(my_test);//actually this doesn't work. I currently use a long non-efficient way to do that but i guess I can find something better on my own)
for (var key in filt_cont) {
if (filt_cont.hasOwnProperty(key) && filt_cont[key]["rank"]<10) {
res.push(modify(filt_cont[key]));
}
}
but as I want my code to be the fastest possible I would like to know if there is a faster way to do that. Also I want to keep the 10 best (with the smallest rank) among those who pass the test, and this only gives those who both were in the first 10 before the filter and pass the filter.
If that's relevant, my_test is four comparisons and two attribute reads, and modify is 6 attribute reads and 2 int additions.
In the end I went with this:
var filt_cont = [];
for (var key in container) {
filt_cont[container[key]["rank"]] = container[key];
}
filt_cont = filt_cont.filter(my_test);
return filt_cont.slice(0, 10);
This works fast enough; apparently creating a list like this, adding items by their index, is not too expensive.
I note that if undefined would pass my_test that could be a problem, but as that's not the case it's alright.
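For comparison, here is a sketch of a more conventional way to get the same result: filter first, then sort by rank, then take the first ten. It does not rely on ranks being unique non-negative integers. The sample data, the stand-in predicate `myTest`, and the helper name `smallestRanked` are assumptions for illustration only.

```javascript
// Sample data with the same shape as the question's container.
const container = {
  guy1: { rank: 0, infos: 'the first guy ever' },
  guy0: { rank: 2, infos: 'another guy' },
  something: { rank: 1, infos: 'something else' },
};

// Stand-in for my_test; the real predicate is not shown in the question.
function myTest(item) {
  return item.rank !== 1;
}

// Keep only items that pass the test, then return the `count` smallest ranks.
function smallestRanked(obj, test, count) {
  return Object.values(obj)
    .filter(test)
    .sort((a, b) => a.rank - b.rank)
    .slice(0, count);
}

const best = smallestRanked(container, myTest, 10);
```

Sorting costs more than indexing by rank, but for around 20,000 items that is unlikely to matter, and it keeps working if two items share a rank.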

When to compare several Arrays in a 'for' loop

I'm still quite new to JavaScript and Google Apps Script, and I'm attempting to create a script that takes your friends' Steam IDs, loops over their owned games, lists them in a spreadsheet, and displays whether someone owns a game or not.
I've achieved the first part: looping over all of the owned games for each ID and adding them to an array if they don't already exist in the array works perfectly using:
var steamId = ['SomeSteamIDs'];
var gameIds = [];
var games = [];
function getGames(){
for (var i = 0; i < steamId.length; i++){
var response = UrlFetchApp.fetch("http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/?key=**YourSteamKey**&steamid=" + steamId[i] + "&format=json&include_appinfo=1");//Steam URL.
Logger.log('Response Code: ' + response.getResponseCode());//Checks the request actually connected.
var data = JSON.parse(response.getContentText());//Gets the plaintext JSON response and converts it to an object.
for (var n = 0; n < data.response.games.length; n++){//For the length of the games list
var code = data.response.games[n].appid;
var name = data.response.games[n].name;
if (gameIds.indexOf(code) === -1){//if the AppID doesn't appear in the 'appId' array and sub-arrays
gameIds[gameIds.length] = code;//then put it in the appId array for comparison
games[games.length] = [code,name];// and add the AppId and Game to the games array for writing.
};
};
}
var range = sheet.getRange("A2:B" + (gameIds.length + 1));//+1 here to compensate for starting on line 2, instead of 1.
range.setValues(games);//Perform one write action
}
This works perfectly for compiling a master list of games that are owned across all SteamIDs, but I'm having difficulty finding a way of checking off which games are owned by each individual ID and which are not.
Initially I was experimenting with adding a 'Yes/No' string to the 'games' array when running the 'getGames' function, but any solution I come up with loses data. If I compare the values too early, the 'gameIds' array doesn't contain all of the data, so the first SteamID misses out on comparing against any games that the last SteamID owns.
If I do it too late, the 'data' variable only contains the response data from the last SteamID it checked, so I miss out on checking what games the first SteamID owns.
I've read the answer at How to compare arrays in JavaScript? several times now, but I'm still trying to figure out if I can use it to solve my issue.
Is there a way for me to achieve what I'm looking for, and what would be the most efficient way?
I would approach this a bit differently. I would keep a gameList object keyed by game id, where each entry has a name property and a userList property: an array of the users who own that game. This does a few things for you. One, you can look up a game in constant time instead of looping through the array to find it (which is what indexOf does). Two, you now have a unique list of games (the properties of the gameList object), each with an array of user ids (who owns it) for easy lookup. Here's the code for what I'm describing:
var steamIds = [],
gameList = {};
function getGames(){
steamIds.forEach(function(steamId) {
var response = UrlFetchApp.fetch("http://api.steampowered.com/IPlayerService/GetOwnedGames/v0001/?key=**YourSteamKey**&steamid=" + steamId + "&format=json&include_appinfo=1");//Steam URL.
Logger.log('Response Code: ' + response.getResponseCode());//Checks the request actually connected.
var data = JSON.parse(response.getContentText());//Gets the plaintext JSON response and converts it to an object.
data.response.games.forEach(function(game) {//For each game in the games list
var code = game.appid;
var name = game.name;
if (!gameList.hasOwnProperty(code)) {
gameList[code] = {name: name, userList: [steamId]};
} else {
gameList[code].userList.push(steamId);
}
});
});
}
Here's an example of what gameList will look like when it's done:
gameList = {
123: {
name: 'Some Game',
userList: [80, 90, 52]
},
567: {
name: 'Another Game',
userList: [68]
}
}
Writing to the cells will change a bit but your information is now associated in a way that makes it easy to get information about a particular game or users that own it.
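As a rough sketch of how the write step could change, the rows below are built from gameList with one Yes/No column per Steam ID. The sample ids and the helper name `buildRows` are assumptions. The result is a 2D array, which is the shape that `range.setValues(...)` in the question takes, provided the range has matching dimensions.

```javascript
// Sample data in the shape described above (ids are placeholders).
var steamIds = [80, 90, 52, 68];
var gameList = {
  123: { name: 'Some Game', userList: [80, 90, 52] },
  567: { name: 'Another Game', userList: [68] }
};

// One row per game: [appid, name, 'Yes'/'No' for each Steam ID in order].
function buildRows(games, ids) {
  return Object.keys(games).map(function (code) {
    var game = games[code];
    var owned = ids.map(function (id) {
      return game.userList.indexOf(id) !== -1 ? 'Yes' : 'No';
    });
    return [code, game.name].concat(owned);
  });
}

var rows = buildRows(gameList, steamIds);
```

Because every game already carries its own userList, the Yes/No check no longer depends on the order in which the Steam IDs were fetched, which was the data-loss problem described in the question.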

Filter/Search JavaScript array of objects based on other array in Node JS

I have one array of ids and one array of JavaScript objects. I need to filter/search the array of objects using the values in the id array, in Node.js.
For example
var id = [1,2,3];
var fullData = [
{id:1, name: "test1"}
,{id:2, name: "test2"}
,{id:3, name: "test3"}
,{id:4, name: "test4"}
,{id:5, name: "test5"}
];
Using the above data, I need the following result:
var result = [
{id:1, name: "test1"}
,{id:2, name: "test2"}
,{id:3, name: "test3"}
];
I know I can loop through both and check for matching ids. But is this the only way to do it, or is there a simpler and more resource-friendly solution?
The amount of data which will be compared is about 30-40k rows.
This will do the trick, using Array.prototype.filter:
var result = fullData.filter(function(item){ // Filter fulldata on...
return id.indexOf(item.id) !== -1; // Whether or not the current item's `id`
}); // is found in the `id` array.
Please note that this filter function is not available on IE 8 or lower, but the MDN has a polyfill available.
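A variation on the filter above, sketched under the assumption that a runtime with `Set` is available (as it is in current Node.js): `id.indexOf` scans the id array for every item, whereas a Set lookup does not, which may matter if both arrays grow to tens of thousands of entries. The name `wanted` is made up for this example.

```javascript
var id = [1, 2, 3];
var fullData = [
  { id: 1, name: 'test1' },
  { id: 2, name: 'test2' },
  { id: 3, name: 'test3' },
  { id: 4, name: 'test4' },
  { id: 5, name: 'test5' }
];

// Build the lookup once, then test membership per item.
var wanted = new Set(id);
var result = fullData.filter(function (item) {
  return wanted.has(item.id);
});
```

The output is the same as with indexOf; only the cost of each membership check changes.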
As long as you're starting with an unsorted Array of all possible Objects, there's no way around iterating through it. @Cerbrus' answer is one good way of doing this, with Array.prototype.filter, but you could also use loops.
But do you really need to start with an unsorted Array of all possible Objects?
For example, is it possible to filter these objects out before they ever get into the Array? Maybe you could apply your test when you're first building the Array, so that objects which fail the test never even become part of it. That would be more resource-friendly, and if it makes sense for your particular app, then it might even be simpler.
function insertItemIfPass(theArray, theItem, theTest) {
if (theTest(theItem)) {
theArray.push(theItem);
}
}
// Insert your items by using insertItemIfPass
var i;
for (i = 0; i < theArray.length; i += 1) {
doSomething(theArray[i]);
}
Alternatively, could you use a data structure that keeps track of whether an object passes the test? The simplest way to do this, if you absolutely must use an Array, would be to also keep an index to it. When you add your objects to the Array, you apply the test: if an object passes, then its position in the Array gets put into the index. Then, when you need to get objects out of the Array, you can consult the index: that way, you don't waste time going through the Array when you don't need to touch most of the objects in the first place. If you have several different tests, then you could keep several different indexes, one for each test. This takes a little more memory, but it can save a lot of time.
function insertItem(theArray, theItem, theTest, theIndex) {
theArray.push(theItem);
if (theTest(theItem)) {
theIndex.push(theArray.length - 1);
}
}
// Insert your items using insertItem, which also builds the index
var i;
for (i = 0; i < theIndex.length; i += 1) {
doSomething(theArray[theIndex[i]]);
}
Could you sort the Array so that the test can short-circuit? Imagine a setup where you've got your array set up so that everything which passes the test comes first. That way, as soon as you hit your first item that fails, you know that all of the remaining items will fail. Then you can stop your loop right away, since you know there aren't any more "good" items.
// Insert your items, keeping items which pass theTest before items which don't
var i = 0;
while (i < theArray.length) {
if (!theTest(theArray[i])) {
break;
}
doSomething(theArray[i]);
i += 1;
}
The bottom line is that this isn't so much a language question as an algorithms question. It doesn't sound like your current data structure (an unsorted Array of all possible items) is well-suited for your particular problem. Depending on what else the application needs to do, it might make more sense to use another data structure entirely, or to augment the existing structure with indexes. Either approach, if planned carefully, will save you some time.

optimize search through large js string array?

If I have a large JavaScript string array that has over 10,000 elements,
how do I quickly search through it?
Right now I have a JavaScript string array that stores the description of a job,
and I'm allowing the user to dynamically filter the returned list as they type into an input box.
So say I have a string array like so:
var descArr = ["flipping burgers", "pumping gas", "delivering mail"];
and the user wants to search for: "p"
How would I be able to search a string array that has 10000+ descriptions in it quickly?
Obviously I can't sort the description array since they're descriptions, so binary search is out. And since the user can search by "p" or "pi" or any combination of letters, this partial search means that I can't use associative arrays (i.e. searchDescArray["pumping gas"] )
to speed up the search.
Any ideas anyone?
Regular expression engines in current browsers are extremely fast, so how about doing it that way? Instead of an array, pass one gigantic string and separate the entries with a separator character.
Example:
String "flipping burgers""pumping gas""delivering mail"
Regex: "([^"]*ping[^"]*)"
With the /g (global) flag you get all the matches. Make sure the user cannot search for your string separator.
You can even add an id into the string with something like:
String "11 flipping burgers""12 pumping gas""13 delivering mail"
Regex: "(\d+) ([^"]*ping[^"]*)"
Example: http://jsfiddle.net/RnabN/4/ (30000 strings, limit results to 100)
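A minimal sketch of the joined-string approach, using the three sample descriptions from the question. The helper names `escapeRegExp` and `searchJoined` are assumptions; escaping the user's input is an addition here, so that characters like `.` or `(` in the query are matched literally.

```javascript
var descriptions = ['flipping burgers', 'pumping gas', 'delivering mail'];

// Wrap each entry in double quotes and join into one big string.
var haystack = descriptions.map(function (d) {
  return '"' + d + '"';
}).join('');

// Escape regex metacharacters in user input.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Return every entry that contains the query anywhere.
function searchJoined(joined, query) {
  if (query.indexOf('"') !== -1) {
    return []; // the separator itself is not searchable
  }
  var pattern = new RegExp('"([^"]*' + escapeRegExp(query) + '[^"]*)"', 'g');
  var results = [];
  var match;
  while ((match = pattern.exec(joined)) !== null) {
    results.push(match[1]);
  }
  return results;
}

var found = searchJoined(haystack, 'ping');
```

With this sample data, searching for 'ping' picks up the first two descriptions and skips "delivering mail".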
There's no way to speed up an initial array lookup without making some changes. You can speed up consecutive lookups by caching results and mapping them to patterns dynamically.
1.) Adjust your data format. This makes initial lookups somewhat speedier. Basically, you precache.
var data = {
a : ['Ant farm', 'Ant massage parlor'],
b : ['Bat farm', 'Bat massage parlor']
// etc
}
2.) Set up cache mechanics.
var searchFor = function(str, list, caseSensitive, reduce){
str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
// escape regex metacharacters so the query is matched literally
str = str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
var found = [];
// no 'g' flag: with it, test() keeps state in lastIndex between calls
var reg = new RegExp('^\\s?' + str, caseSensitive ? '' : 'i');
var i = list.length;
while(i--){
if(reg.test(list[i])){
found.push(list[i]);
reduce && list.splice(i, 1); // remove only the entries that matched
}
}
return found;
}
var lookUp = function(str, caseSensitive){
str = str.replace(/(?:^\s*|\s*$)/g, ''); // trim whitespace
if(!str) return []; // nothing typed yet
if(data[str]) return data[str];
var firstChar = caseSensitive ? str[0] : str[0].toLowerCase();
var list = data[firstChar];
if(!list) return (data[str] = []);
// we cache on data since it's already a caching object.
return (data[str] = searchFor(str, list, caseSensitive));
}
3.) Use the following script to create a precache object. I suggest you run this once and use JSON.stringify to create a static cache object. (or do this on the backend)
// we need the searchFor function from above; this might take a while
var preCache = function(arr){
var chars = "abcdefghijklmnopqrstuvwxyz".split('');
var cache = {};
var i = chars.length;
while(i--){
// reduce is true, so we're destroying the original list here.
cache[chars[i]] = searchFor(chars[i], arr, false, true);
}
return cache;
}
Probably a bit more code than you expected, but optimization and performance don't come for free.
This may not be an answer for you, as I'm making some assumptions about your setup, but if you have server side code and a database, you'd be far better off making an AJAX call back to get the cut down list of results, and using a database to do the filtering (as they're very good at this sort of thing).
As well as the database benefit, you'd also benefit from not outputting this much data (10000 variables) to a web based front end - if you only return those you require, then you'll save a fair bit of bandwidth.
I can't reproduce the problem. I created a naive implementation, and most browsers do the search across 10,000 15-character strings in a single-digit number of milliseconds. I can't test in IE6, but I wouldn't expect it to be more than 100 times slower than the fastest browsers, which would still be virtually instant.
Try it yourself: http://ebusiness.hopto.org/test/stacktest8.htm (Note that the creation time is not relevant to the issue, that is just there to get some data to work on.)
One thing you could be doing wrong is trying to render all the results; that would be quite a huge job when the user has only entered a single letter or a common letter combination.
I suggest trying a ready-made JS function, for example the autocomplete from jQuery. It's fast and it has many options to configure.
Check out the jQuery autocomplete demo
Using a Set for large datasets (1M+) is around 3500 times faster than Array .includes()
You must use a Set if you want speed.
I just wrote a node script that needs to look up a string in a 1.3M array.
Using Array's .includes for 10K lookups:
39.27 seconds
Using Set .has for 10K lookups:
0.01084 seconds
Use a Set.
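A small sketch of the comparison described above, with made-up data (`haystack`, `needles`) and no timing claims. One caveat: `Set.prototype.has` only answers exact-match lookups, so it replaces `includes`, not the partial "p" or "pi" search the question asks about.

```javascript
// Build a sample array; the contents are placeholders for illustration.
const haystack = [];
for (let i = 0; i < 100000; i += 1) {
  haystack.push('item-' + i);
}
const needles = ['item-0', 'item-99999', 'missing'];

// The Set is built once; each has() call is then a hash lookup.
const lookup = new Set(haystack);

// includes() scans the array from the start for every needle.
const viaIncludes = needles.map((n) => haystack.includes(n));
const viaSet = needles.map((n) => lookup.has(n));
```

Both arrays of results are identical; the difference is only in how much work each lookup does as the data grows.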
