Create Groups Without Repeating Previous Groupings - javascript

I have created a random group creator, but random doesn't really guarantee that you work with people you haven't worked with before. If someone was able to generate a "Random Group Generator With History" that tracked previous groups and avoided putting people in groups with the same people over and over, I would definitely use it! Does anyone know how to do this?
For clarity: Given an array of strings
["Jason", "Kim", "Callie", "Luke"]
and an array of previous pairings (also arrays)
[[["Jason", "Kim"], ["Callie", "Luke"]], [["Jason", "Luke"], ["Callie", "Kim"]]]
return groupings with the fewest number of repeat group members
[["Jason", "Callie"], ["Luke", "Kim"]]
I'm imagining that the number I am trying to minimize is the number of repeat partners: for each pair of people, every previous time they were on a team together adds one to the score if the result puts them on the same team again. For the example, the "scoring" to arrive at the return value could look like this:
["Jason", "Kim"] have a score of 1, they have been paired together before
["Callie", "Luke"] have a score of 1, they have been paired together before
["Jason", "Luke"] have a score of 1, they have been paired together before
["Callie", "Kim"] have a score of 1, they have been paired together before
["Jason", "Callie"] have a score of 0, they have not been paired together before
["Luke", "Kim"] have a score of 0, they have not been paired together before
Choose the sets that cover the entire list while generating the smallest score. In this case, the pairings ["Jason", "Callie"] and ["Luke", "Kim"] cover the entire set, and have a score of 0 (no repeated groupings) and therefore it is an optimal solution (0 being the best possible outcome).
This is probably the wrong way to do this (since I'm imagining it would take n squared time), but hopefully it gives a sense of what I'm trying to optimize for. This would not need to be a perfect optimization, just a "decent answer" that doesn't put the same groups together every single time.
Ideally, it would be able to handle any size group, and also be able to handle the fact that someone might be out that day (not all people will be in all of the arrays). I would love a javascript answer, but I should be able to translate if someone can come up with the logic.

You could collect all pairings in an object and count them, then take only the ones with the smallest counts.
function getKey(array) {
return array.slice().sort().join('|');
}
var strings = ["Jason", "Kim", "Callie", "Luke"],
data = [[["Jason", "Kim"], ["Callie", "Luke"]], [["Jason", "Luke"], ["Callie", "Kim"]]],
object = {},
i, j,
keys;
for (i = 0; i < strings.length - 1; i++) {
for (j = i + 1; j < strings.length; j++) {
object[getKey([strings[i], strings[j]])] = 0;
}
}
data.forEach(function (a) {
a.forEach(function (b, i) {
object[getKey(b)]++;
});
});
keys = Object.keys(object).sort(function (a, b) {
return object[b] - object[a];
});
keys.forEach(function (k) {
console.log(k, object[k]);
});
console.log(object);
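Counting is only half the job, though. To actually pick the groups, one option that's feasible for small rosters is to enumerate every way of splitting the list into pairs and keep the split with the lowest total repeat count. A sketch (the function names are mine, and it assumes pair-sized groups and an even head-count):

```javascript
// Sketch: brute-force the optimal pairing. The number of pairings grows
// as (n-1)!! (double factorial), so this is only viable for small groups.
function getKey(pair) {
  return pair.slice().sort().join('|');
}

// Count how often each pair has occurred in the history.
function countPairs(history) {
  var counts = {};
  history.forEach(function (round) {
    round.forEach(function (pair) {
      var key = getKey(pair);
      counts[key] = (counts[key] || 0) + 1;
    });
  });
  return counts;
}

// Recursively generate every perfect matching of `people`.
function allPairings(people) {
  if (people.length === 0) return [[]];
  var first = people[0];
  var result = [];
  for (var i = 1; i < people.length; i++) {
    var pair = [first, people[i]];
    var rest = people.slice(1, i).concat(people.slice(i + 1));
    allPairings(rest).forEach(function (sub) {
      result.push([pair].concat(sub));
    });
  }
  return result;
}

function bestPairing(people, history) {
  var counts = countPairs(history);
  var best = null, bestScore = Infinity;
  allPairings(people).forEach(function (pairing) {
    var score = pairing.reduce(function (s, pair) {
      return s + (counts[getKey(pair)] || 0);
    }, 0);
    if (score < bestScore) {
      bestScore = score;
      best = pairing;
    }
  });
  return best;
}

var history = [[["Jason", "Kim"], ["Callie", "Luke"]],
               [["Jason", "Luke"], ["Callie", "Kim"]]];
console.log(bestPairing(["Jason", "Kim", "Callie", "Luke"], history));
// → [["Jason","Callie"],["Kim","Luke"]] (score 0)
```

For larger rosters you'd swap the enumeration for a greedy or randomized heuristic, since the question only asks for a "decent answer" rather than a perfect optimization.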


Search through a big collection of objects

I have a really big collection of objects that I want to search through.
The array has more than 60,000 items, and search performance can be really slow at times.
One object in that array looks like this:
{
"title": "title",
"company": "abc company",
"rating": 13, // internal rating based on comments and interaction
...
}
I want to search for the title and the company info and order that by the rating of the items.
This is what my search currently look like:
onSearchInput(searchTerm) {
(<any>window).clearTimeout(this.searchInputTimeout);
this.searchInputTimeout = window.setTimeout(() => {
this.searchForFood(searchTerm);
}, 500);
}
searchForFood(searchTerm) {
if (searchTerm.length > 1) {
this.searchResults = [];
this.foodList.map(item => {
searchTerm.split(' ').map(searchTermPart => {
if (item.title.toLowerCase().includes(searchTermPart.toLowerCase())
|| item.company.toLowerCase().includes(searchTermPart.toLowerCase())) {
this.searchResults.push(item);
}
});
});
this.searchResults = this.searchResults.sort(function(a, b) {
return a.rating - b.rating;
}).reverse();
} else {
this.searchResults = [];
}
}
Question: Is there any way to improve the search logic and performance wise?
A bunch of hints:
It's a bit excessive to put searching through 60,000 items on the front-end. Is there any way you can perform part of the search on the back-end? If you really must do it on the front-end, consider searching in chunks of e.g. 10,000 and then using a setImmediate() to perform the next part of the search so the user's browser won't completely freeze during processing time.
Do the splitting and lowercasing of the search term outside of the loop.
map() like you're using it is weird as you don't use the return value. Better to use forEach(). Better still, use filter() to get the items that match.
When iterating over the search terms, use some() (as pointed out in the comments) as it's an opportunity to early return.
sort() mutates the original array so you don't need to re-assign it.
sort() with reverse() is usually a smell. Instead, swap the sides of your condition to be b - a.
At this scale, it may make sense to run performance tests comparing includes(), indexOf(), a roll-your-own for loop, and match() (though I can almost guarantee match() will be slower).
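Put together, the hints might look like this. This is a sketch, not a drop-in replacement for the original class method; note that filter() also avoids the duplicate pushes the original map() could produce when several term parts match one item:

```javascript
// Sketch applying the hints: split/lowercase once, filter() + some(),
// and sort descending directly with b - a instead of sort().reverse().
function searchFood(foodList, searchTerm) {
  if (searchTerm.length <= 1) return [];
  // Do the splitting and lowercasing outside the loop.
  const parts = searchTerm.toLowerCase().split(' ');
  return foodList
    .filter(item => {
      const title = item.title.toLowerCase();
      const company = item.company.toLowerCase();
      // some() returns early on the first matching term part.
      return parts.some(p => title.includes(p) || company.includes(p));
    })
    .sort((a, b) => b.rating - a.rating); // descending, no reverse()
}

const results = searchFood(
  [{title: 'Pizza', company: 'abc company', rating: 13},
   {title: 'Pasta', company: 'xyz company', rating: 20}],
  'pizza'
);
console.log(results.map(r => r.title)); // → ['Pizza']
```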
Alex's suggestions are good. My only suggestion would be: if you can afford to pre-process the data during idle time (preferably without holding up first render or interaction), you could process the data into a modified prefix trie. That would let you search for items in O(k) time, where k is the length of the search term. Right now you are searching in O(kn) time, because you look at every item and then do an includes() which takes k time (it's actually a little worse because of the toLowerCase() calls, but I don't want to get into the weeds of it).
If you aren't familiar with what a trie is, hopefully the code below gives you the idea or you can search for information with your search engine of choice. It's basically a mapping of characters in a string in nested hash maps.
Here's some sample code of how you might construct the trie:
function makeTries(data){
let companyTrie = {};
let titleTrie = {};
data.forEach(item => {
addToTrie(companyTrie, item.company, item, 0);
addToTrie(titleTrie, item.title, item, 0);
});
return {
companyTrie,
titleTrie
}
}
function addToTrie(trie, str, item, i){
trie.data = trie.data || [];
trie.data.push(item);
if(i >= str.length)
return;
if(! trie[str[i]]){
trie[str[i]] = {};
}
addToTrie(trie[str[i]], str, item, ++i);
}
function searchTrie(trie, term){
if(trie == undefined)
return [];
if(term == "")
return trie.data;
return searchTrie(trie[term[0]], term.substring(1));
}
var testData = [
{
company: "abc",
title: "def",
rank: 5
},{
company: "abd",
title: "deg",
rank: 5
},{
company: "afg",
title: "efg",
rank: 5
},{
company: "afgh",
title: "efh",
rank: 5
},
];
const tries = makeTries(testData);
console.log(searchTrie(tries.companyTrie, "afg"));
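One caveat: the trie above is case-sensitive, while the original search lowercases everything. A possible refinement is to normalize case once, at insert time and at query time, instead of calling toLowerCase() per item on every keystroke. A sketch (the search here is iterative, but roughly equivalent to the recursive version above):

```javascript
// Case-insensitive variant: lowercase each character when building the
// trie, and lowercase the whole term once when searching.
function addToTrieCI(trie, str, item, i) {
  trie.data = trie.data || [];
  trie.data.push(item); // every node along the path knows its items
  if (i >= str.length) return;
  var ch = str[i].toLowerCase();
  if (!trie[ch]) trie[ch] = {};
  addToTrieCI(trie[ch], str, item, i + 1);
}

function searchTrieCI(trie, term) {
  term = term.toLowerCase();
  for (var i = 0; i < term.length; i++) {
    trie = trie[term[i]];
    if (trie === undefined) return []; // no item has this prefix
  }
  return trie.data || [];
}

var trie = {};
addToTrieCI(trie, 'ABC Company', {id: 1}, 0);
console.log(searchTrieCI(trie, 'abc')); // → [{id: 1}]
```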

dc.js Using two reducers without a simple dimension and second grouping stage

Quick question following up on the response from this post:
dc.js Box plot reducer using two groups
Just trying to fully get my head around reducers and how to filter and collect data so I'll step through my understanding first.
Data Format:
{
"SSID": "eduroam",
"identifier": "Client",
"latitude": 52.4505,
"longitude": -1.9361,
"mac": "dc:d9:16:##:##:##",
"packet": "PR-REQ",
"timestamp": "2018-07-10 12:25:26",
"vendor": "Huawei Technologies Co.Ltd"
}
(1) Using the following should result in an output array of key value pairs (Key MAC Address & Value Count of networks connected to):
var MacCountsGroup = mac.group().reduce(
function (p, v) {
p[v.mac] = (p[v.mac] || 0) + v.counter;
return p;
},
function (p, v) {
p[v.mac] -= v.counter;
return p;
},
function () {
return {}; // KV Pair of MAC -> Count
}
);
(2) Then, in order to use it, the object must be flattened so it can be passed to a chart as follows:
function flatten_object_group(group) {
return {
all: function () {
return group.all().map(function (kv) {
return {
key: kv.key,
value: Object.values(kv.value).filter(function (v) {
return v > 0;
})
};
});
}
};
}
var connectionsGroup = flatten_object_group(MacCountsGroup);
(3) Then I pass mac as the pie chart's dimension and connectionsGroup as its group. This gives back a chart with roughly 50,000 slices based on my dataset.
var packetPie = dc.pieChart("#packetPie");
packetPie
.height(495)
.width(350)
.radius(180)
.renderLabel(true)
.transitionDuration(1000)
.dimension(mac)
.ordinalColors(['#07453E', '#145C54', '#36847B'])
.group(connectionsGroup);
This works A'OK and I follow up to this point.
(4) Now I want to group by the values given out by the first reducer, i.e I want to combine all of the mac addresses with 1 network connection, 2 network connections and so on as slices.
How would this be done as a dimension of "Network connections"? How can I produce this summarized data which doesn't exist in my source data and is generated from mac?
Or would this require an intermediate function between the first reducer and flattening to combine all of the values from the first reducer?
You don't need to do all of that to get a pie chart of mac addresses.
There are a few faulty understandings in points 1-3, which I guess I'll address first. It looks like you copy and pasted code from the previous question, so I'm not really sure if this helps.
(1) If you have a dimension of mac addresses, reducing it like this won't have any further effect. The original idea was to dimension/group by vendor and then reduce counts for each mac address. This reduction will group by mac address and then further count instances of each mac address within each bin, so it's just an object with one key. It will produce a map of key value pairs like
{key: 'MAC-123', value: {'MAC-123': 12}}
(2) This will flatten the object within the values, dropping the keys and producing just an array of counts
{key: 'MAC-123', value: [12]}
(3) Since the pie chart is expecting simple key/value pairs with the value being a number, it is probably unhappy with getting values like the array [12]. The values are probably coerced to NaN.
(4) Okay, here's the real question, and it's actually not as easy as your previous question. We got off easy with the box plot because the "dimension" (in crossfilter terms, the keys you filter and group on) existed in your data.
Let's forget the false lead in points 1-3 above, and start from first principles.
There is no way to look at an individual row of your data and determine, without looking at anything else, if it belongs to the category "has 1 connection", "has 2 connections", etc. Assuming you want to be able to click on slices in the pie chart and filter all the data, we'll have to find another way to implement that.
But first let's look at how to produce a pie chart of "number of network connections". That's a little bit easier, but as far as I know, it does require a true "double reduce".
If we use the default reduction on the mac dimension, we'll get an array of key/value pairs, where the key is a mac address, and the value is the number of connections for that address:
[
{
"key": "1c:b7:2c:48",
"value": 8
},
{
"key": "1c:b7:be:ef",
"value": 3
},
{
"key": "6c:17:79:03",
"value": 2
},
...
How do we now produce a key/value array where the key is number of connections, and the value is the array of mac addresses for that number of connections?
Sounds like a job for the lesser-known Array.reduce. This function is the likely inspiration for crossfilter's group.reduce(), but it's a bit simpler: it just walks through an array, combining each value with the result of the last. It's great for producing an object from an array:
var value_keys = macPacketGroup.all().reduce(function(p, kv) {
if(!p[kv.value])
p[kv.value] = [];
p[kv.value].push(kv.key);
return p;
}, {});
Great:
{
"1": [
"b8:1d:ab:d1",
"dc:d9:16:3a",
"dc:d9:16:3b"
],
"2": [
"6c:17:79:03",
"6c:27:79:04",
"b8:1d:aa:d1",
"b8:1d:aa:d2",
"dc:da:16:3d"
],
But we wanted an array of key/value pairs, not an object!
var key_count_value_macs = Object.keys(value_keys)
.map(k => ({key: k, value: value_keys[k]}));
Great, that looks just like what a "real group" would produce:
[
{
"key": "1",
"value": [
"b8:1d:ab:d1",
"dc:d9:16:3a",
"dc:d9:16:3b"
]
},
{
"key": "2",
"value": [
"6c:17:79:03",
"6c:27:79:04",
"b8:1d:aa:d1",
"b8:1d:aa:d2",
"dc:da:16:3d"
]
},
...
Wrapping all that in a "fake group", which when asked to produce .all(), queries the original group and does the above transformations:
function value_keys_group(group) {
return {
all: function() {
var value_keys = group.all().reduce(function(p, kv) {
if(!p[kv.value])
p[kv.value] = [];
p[kv.value].push(kv.key);
return p;
}, {});
return Object.keys(value_keys)
.map(k => ({key: k, value: value_keys[k]}));
}
}
}
Now we can plot the pie chart! The only fancy thing here is that the value accessor should look at the length of the array for each value (instead of assuming the value is just a number):
packetPie
// ...
.group(value_keys_group(macPacketGroup))
.valueAccessor(kv => kv.value.length);
Demo fiddle.
However, clicking on slices won't work. I'll return to that in a minute - just want to hit "save" first!
Part 2: Filtering based on counts
As I remarked at the start, it's not possible to create a crossfilter dimension which will filter based on the count of connections. This is because crossfilter always needs to look at each row and determine, based only on the information in that row, whether it belongs in a group or filter.
If you add another chart at this point and try clicking on a slice, everything in the other charts will disappear. This is because the keys are now counts, and counts are invalid mac addresses, so we're telling it to filter to a key which doesn't exist.
However, we can obviously filter by mac address, and we also know the mac addresses for each count! So this isn't so bad. It just requires a filterHandler.
Although, hmmm, in producing the fake group, we seem to have forgotten value_keys. It's hidden away inside the function, and then let go.
It's a little ugly, but we can fix that:
function value_keys_group(group) {
var saved_value_keys;
return {
all: function() {
var value_keys = group.all().reduce(function(p, kv) {
if(!p[kv.value])
p[kv.value] = [];
p[kv.value].push(kv.key);
return p;
}, {});
saved_value_keys = value_keys;
return Object.keys(value_keys)
.map(k => ({key: k, value: value_keys[k]}));
},
value_keys: function() {
return saved_value_keys;
}
}
}
Now, every time .all() is called (every time the pie chart is drawn), the fake group will stash away the value_keys object. Not a great practice (.value_keys() would return undefined if you called it before .all()), but safe based on the way dc.js works.
With that out of the way, the filterHandler for the pie chart is relatively simple:
packetPie.filterHandler(function(dimension, filters) {
if(filters.length === 0)
dimension.filter(null);
else {
var value_keys = packetPie.group().value_keys();
var all_macs = filters.reduce(
(p, v) => p.concat(value_keys[v]), []);
dimension.filterFunction(k => all_macs.indexOf(k) !== -1);
}
return filters;
});
The interesting line here is another call to Array.reduce. This function is also useful for producing an array from another array, and here we use it just to concatenate all of the values (mac addresses) from all of the selected slices (connection counts).
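With hypothetical data, that reduce-concatenation step looks like this:

```javascript
// Hypothetical selection: the user clicked the "1" and "2" slices.
// value_keys maps a connection count to the mac addresses having it.
const value_keys = {1: ['mac-a'], 2: ['mac-b', 'mac-c']};
const filters = ['1', '2'];
// Concatenate the mac-address arrays of every selected slice.
const all_macs = filters.reduce((p, v) => p.concat(value_keys[v]), []);
console.log(all_macs); // → ['mac-a', 'mac-b', 'mac-c']
```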
Now we have a working filter. It doesn't make too much sense to combine it with the box plot from the last question, but the new fiddle demonstrates that filtering based on number of connections does work.
Part 3: what about zeroes?
As commonly comes up, crossfilter considers a bin with value zero to still exist, so we need to "remove the empty bins". However, in this case, we've added a non-standard method to the first fake group, in order to allow filtering. (We could have just used a global there, but globals are messy.)
So, we need to "pass through" the value_keys method:
function remove_empty_bins_pt(source_group) {
return {
all:function () {
return source_group.all().filter(function(d) {
return d.key !== '0';
});
},
value_keys: function() {
return source_group.value_keys();
}
};
}
packetPie
.group(remove_empty_bins_pt(value_keys_group(macPacketGroup)))
Another oddity here is we are filtering out the key zero, and that's a string here!
Demo fiddle!
Alternately, here's a better solution! Do the bin filtering before passing to value_keys_group, and then we can use the ordinary remove_empty_bins!
function remove_empty_bins(source_group) {
return {
all:function () {
return source_group.all().filter(function(d) {
//return Math.abs(d.value) > 0.00001; // if using floating-point numbers
return d.value !== 0; // if integers only
});
}
};
}
packetPie
.group(value_keys_group(remove_empty_bins(macPacketGroup)))
Yet another demo fiddle!!

sorting two associative arrays/stacks

I am implementing an algorithm I designed and am exploring different approaches
This isn't a homework problem, but I am going to explain it like one: let's say a merchant has bought inventory of apples on different days, and also sold some on different days. I want the weighted average timestamp of their current purchases.
I am storing this data as an object mapping a timestamp string (epoch time) to a quantity of apples. My dataset actually has the purchases and the sells in separate data sets, like so:
//buys
var incomingArray = {
"1518744389": 10,
"1318744389": 30
};
//sells
var outgoingArray = {
"1518744480": 3,
"1418744389": 5,
"1408744389": 8
};
and I would like the outcome to show only the remaining incomingArray timestamp/purchase pairs.
var incomingArrayRemaining = {
"1518744389": 7,
"1318744389": 17
};
Here there was one outgoing transaction for 3 apples at a later timestamp, which subtracts from the 10. And there were 13 apples outgoing (in two transactions) before the buy of 10 but after the purchase of 30, so they only subtract from the 30.
Note: if more than 10 had gone out after the buy of 10, the excess would subtract from the 30 as well. The number of apples can never go below 0.
First, to accomplish my goals it seems that I need to know how many are actually still owned from the lot they were purchased in.
Instead of doing stack subtracting in the LIFO method, it seems like this has to be more like Tax Lot Accounting. Where the lots themselves have to be treated independently.
Therefore I would have to take the timestamp of the first index of the sell in the outgoing array and find the nearest older timestamp of the buy in the incoming array
Here is what I tried:
for (var ink in incomingArray) {
var inInt = parseInt(ink);
for (var outk in outgoingArray) {
if (inInt >= 0) {
var outInt = parseInt(outk);
if (outInt >= inInt) {
inInt = inInt - outInt;
if (inInt < 0) {
outInt = inInt * -1; //remainder
inInt = 0;
} //end if
} //end if
} //end if
} //end innter for
} //end outer for
It is incomplete and the nested for loop solution will already have poor computational time.
That function merely tries to sort the transactions so that only the remaining balance remains, by subtracting an outgoing from the nearest incoming balance, and carrying that remainder to the next incoming balance
I feel like a recursive solution would be better, or maybe something more elegant that I hadn't thought of (nested Object forEach accessor in javascript)
After I get them sorted then I need to actually do the weighted average method, which I have some ideas for already.
First sorting, then weighted average of the remaining quantities.
Anyway, I know the javascript community on StackOverflow is particularly harsh about asking for help but I'm at an impasse because not only do I want a solution, but a computationally efficient solution, so I will probably throw a bounty on it.
You could convert the objects into an array of timestamp-value pairs. Outgoing ones could be negative. Then you can easily sort them after the timestamp and accumulate how you like it:
const purchases = Object.entries(incomingArray).concat(Object.entries(outgoingArray).map(([ts, val]) => ([ts, -val])));
purchases.sort(([ts1], [ts2]) => ts1 - ts2);
Now you could iterate over the timespan and store the delta in a new array when the value increases (a new ingoing):
const result = [];
let delta = 0, lastIngoing = purchases[0][0];
for(const [time, value] of purchases){
if(value > 0){
// Store the old
result.push([lastIngoing, delta]);
// Set up new
delta = 0;
lastIngoing = time;
} else {
delta += value;
}
}
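The accumulation above is only a skeleton. Here is a fuller sketch of the lot accounting under my reading of the question (each sell consumes the most recent buy at or before its timestamp, spilling any remainder into the next older lot), plus the weighted average the question ultimately asks for. The function name is mine:

```javascript
// Sketch: tax-lot style subtraction. O(n log n) for the sorts plus a
// scan, instead of the nested loops in the question.
function remainingLots(buys, sells) {
  // Work on copies, newest-first, so index order matches "nearest older".
  const lots = Object.entries(buys)
    .map(([ts, qty]) => ({ts: +ts, qty}))
    .sort((a, b) => b.ts - a.ts);
  const outs = Object.entries(sells)
    .map(([ts, qty]) => ({ts: +ts, qty}))
    .sort((a, b) => b.ts - a.ts);
  for (const out of outs) {
    let left = out.qty;
    for (const lot of lots) {
      if (left === 0) break;
      if (lot.ts > out.ts) continue; // only buys at or before the sell
      const used = Math.min(lot.qty, left);
      lot.qty -= used; // consume this lot
      left -= used;    // carry the remainder to the next older lot
    }
  }
  const remaining = {};
  for (const lot of lots) if (lot.qty > 0) remaining[lot.ts] = lot.qty;
  return remaining;
}

const incomingArray = {"1518744389": 10, "1318744389": 30};
const outgoingArray = {"1518744480": 3, "1418744389": 5, "1408744389": 8};
const rem = remainingLots(incomingArray, outgoingArray);
console.log(rem); // → { '1318744389': 17, '1518744389': 7 }

// Weighted average timestamp of the remaining holdings:
const total = Object.values(rem).reduce((s, q) => s + q, 0);
const avgTs = Object.entries(rem).reduce((s, [ts, q]) => s + ts * q, 0) / total;
```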

Most efficient way to merge sorted collections into a new sorted collection?

I have to merge various collections that are already sorted into a single sorted collection. This operation needs to be done for collections that are not very large (around 5 elements each) and not many sources (maybe about 10), but the output should be calculated fast (non-blocking server).
In the case of two source collections it's pretty trivial, but when the number of source collections increases, there are different strategies to consider (n source collections, each with m elements):
Concatenate, then sort. It has O(n*m*log(n*m)) complexity (quicksorting n*m elements).
Scan all heads, select the lowest, push it to the destination collection. I guess that it has O(n*n*m) complexity (scan n heads n*m times, that is the total number of elements).
Merging collections in pairs until you have only one collection. I guess that it has O(n*n*m) complexity too (do n merging phases, each phase with at most n*m elements).
Keep the collections in a min-heap (using the head as the key), removing the min and re-adding the collection with the next head in every step. I guess that this has O(log(n)*n*m) (the heap remove and insert operations are log(n) and they are done n*m times).
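For concreteness, here is how I imagine option 4 could look, with a hand-rolled binary min-heap keyed on each source's current head (just a sketch to illustrate the strategy):

```javascript
// Sketch of the min-heap k-way merge. Each heap entry is a cursor
// [list, index], ordered by the element the cursor points at.
function mergeSorted(lists, cmp = (a, b) => (a < b ? -1 : a > b ? 1 : 0)) {
  const heap = [];
  const less = (x, y) => cmp(x[0][x[1]], y[0][y[1]]) < 0;
  const push = (e) => {
    heap.push(e);
    for (let i = heap.length - 1; i > 0; ) { // sift up
      const p = (i - 1) >> 1;
      if (!less(heap[i], heap[p])) break;
      [heap[i], heap[p]] = [heap[p], heap[i]];
      i = p;
    }
  };
  const pop = () => {
    const top = heap[0], last = heap.pop();
    if (heap.length) {
      heap[0] = last;
      for (let i = 0; ; ) { // sift down
        let s = i;
        const l = 2 * i + 1, r = 2 * i + 2;
        if (l < heap.length && less(heap[l], heap[s])) s = l;
        if (r < heap.length && less(heap[r], heap[s])) s = r;
        if (s === i) break;
        [heap[i], heap[s]] = [heap[s], heap[i]];
        i = s;
      }
    }
    return top;
  };
  lists.forEach((l) => { if (l.length) push([l, 0]); });
  const out = [];
  while (heap.length) {
    const [list, i] = pop();     // remove the overall minimum head
    out.push(list[i]);
    if (i + 1 < list.length) push([list, i + 1]); // re-add next head
  }
  return out;
}

console.log(mergeSorted([[1, 4, 7], [2, 5], [3, 6, 8]]));
// → [1, 2, 3, 4, 5, 6, 7, 8]
```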
So, the computational complexity (if I haven't made any mistakes) of the sophisticated heap-based selection of the min element every time seems to be the best. Memory-wise the heap might be more garbage-collector friendly, as it doesn't need as many intermediate temporary collections.
Have I missed something (mistakes calculating the complexities or missed any alternative way to do it)? My implementation is most likely to be done in Javascript or maybe Scala, so any runtime concerns that might apply to those execution environments are welcome!
BTW this is not related: Most efficient way to merge collections preserving order?
Since your data set is so small, your asymptotic considerations are pretty much irrelevant (although correct) because the much more important factor will be micro-optimization. I would suggest you to compare at least the following options, in increasing order of effort needed to implement them:
Concatenate and just use the built-in sort algorithm. This is good because it is likely to be very fast for short arrays and because in the case of Javascript, it is written in the host language and likely to be much faster than any given pure JS solution (even with JIT compilation)
Concatenate and use insertion sort. It is O(n + k), where k is the number of inversions. For 5 lists of size 10 each, the worst case for k is about 1000. But in reality it will be much less. You could probably even prevent the worst case, for example by presorting the lists in order of increasing first element or just randomizing the order.
Split the sources into two groups, concatenate each and sort them using insertion sort (for 25 elements insertion sort is likely to be the overall fastest comparison sort) and then merge the results.
for 10 sources, merge them in a binary-tree fashion (merge 1 with 2, 3 with 4 etc. and then repeat for the resulting lists of size 20). Essentially this is just the second phase of merge sort
These are general suggestions. Depending on your data set, there is likely to be a customly tailored option that is even better. For example, use counting sort if the input are small enough numbers. Maybe use radix sort for larger numbers, but that's probably not going to give you a lot of speed over insertion sort. If you have any pattern in the input, exploit it.
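Since you mentioned JavaScript, the binary-tree merging from the last suggestion (essentially the second phase of merge sort) might be sketched like this:

```javascript
// Merge two sorted arrays into one, the classic two-cursor way.
function merge2(a, b) {
  const out = [];
  let i = 0, j = 0;
  while (i < a.length && j < b.length)
    out.push(a[i] <= b[j] ? a[i++] : b[j++]);
  while (i < a.length) out.push(a[i++]);
  while (j < b.length) out.push(b[j++]);
  return out;
}

// Merge in rounds: 1 with 2, 3 with 4, ... then repeat on the results.
function mergeAll(lists) {
  let round = lists.slice();
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i < round.length; i += 2)
      next.push(i + 1 < round.length ? merge2(round[i], round[i + 1])
                                     : round[i]); // odd one out carries over
    round = next;
  }
  return round[0] || [];
}

console.log(mergeAll([[1, 5], [2, 6], [3, 7], [4, 8]]));
// → [1, 2, 3, 4, 5, 6, 7, 8]
```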
tl;dr
Merge.
Full version
There's nothing like testing.
Let's suppose your underlying collection is a List. We'll store them in an Array.
val test = Array(
List("alpha", "beta", "gamma", "delta", "epsilon"),
List("one", "two", "three", "four", "five", "six"),
List("baby", "child", "adult", "senior"),
List("red", "yellow", "green"),
List("red", "orange", "yellow", "green", "blue", "indigo", "violet"),
List("tabby", "siamese", "manx", "persian"),
List("collie", "boxer", "bulldog", "rottweiler", "poodle", "terrier"),
List("budgie", "cockatiel", "macaw", "galah", "cockatoo"),
List("Alabama", "California", "Georgia", "Maine", "Texas", "Vermont", "Wyoming"),
List("I", "have", "to", "merge", "into")
).map(_.sorted)
Then your initial idea might be to just flatten and sort.
scala> val ans = th.pbench{ test.flatten.sorted.toList }
Benchmark (20460 calls in 183.2 ms)
Time: 8.246 us 95% CI 8.141 us - 8.351 us (n=18)
Garbage: 390.6 ns (n=1 sweeps measured)
ans: List[String] = List(Alabama, California, Georgia, I, Maine, ...)
Or you might implement a custom flatten-and-sort:
def flat(ss: Array[List[String]], i0: Int, i1: Int): Array[String] = {
var n = 0
var i = i0
while (i < i1) {
n += ss(i).length
i += 1
}
val a = new Array[String](n)
var j = 0
i = i0
while (i < i1) {
var s = ss(i)
while (s ne Nil) {
a(j) = s.head
j += 1
s = s.tail
}
i += 1
}
a
}
def mrg(ss: Array[List[String]]): List[String] = {
val a = flat(ss, 0, ss.length)
java.util.Arrays.sort(a, new java.util.Comparator[String]{
def compare(x: String, y: String) = x.compare(y)
})
a.toList
}
scala> val ans = th.pbench{ mrg(test) }
Benchmark (20460 calls in 151.7 ms)
Time: 6.883 us 95% CI 6.850 us - 6.915 us (n=18)
Garbage: 293.0 ns (n=1 sweeps measured)
ans: List[String] = List(Alabama, California, Georgia, I, Maine, ...)
Or a custom pairwise merge
def mer(s1: List[String], s2: List[String]): List[String] = {
var s3 = List.newBuilder[String]
var i1 = s1
var i2 = s2
while (true) {
if (i2.head < i1.head) {
s3 += i2.head
i2 = i2.tail
if (i2 eq Nil) {
do {
s3 += i1.head
i1 = i1.tail
} while (i1 ne Nil)
return s3.result
}
}
else {
s3 += i1.head
i1 = i1.tail
if (i1 eq Nil) {
do {
s3 += i2.head
i2 = i2.tail
} while (i2 ne Nil)
return s3.result
}
}
}
Nil // Should never get here
}
followed by a divide-and-conquer strategy
def mge(ss: Array[List[String]]): List[String] = {
var n = ss.length
val a = java.util.Arrays.copyOf(ss, ss.length)
while (n > 1) {
var i,j = 0
while (i < n) {
a(j) = if (i+1 < n) mer(a(i), a(i+1)) else a(i)
i += 2
j += 1
}
n = j
}
a(0)
}
and then you see
scala> val ans = th.pbench{ mge(test) }
Benchmark (40940 calls in 141.1 ms)
Time: 2.806 us 95% CI 2.731 us - 2.882 us (n=19)
Garbage: 146.5 ns (n=1 sweeps measured)
ans: List[String] = List(Alabama, California, Georgia, I, Maine, ...)
So, there you go. For data of the size you indicated, and for using lists (which merge very cleanly), a good bet is indeed divide-and-conquer merges. (Heaps probably won't be any better and may be worse because of the additional complexity of maintaining a heap; heapsort tends to be slower than mergesort for the same reason.)
(Note: th.pbench is a call to my microbenchmarking utility, Thyme.)
Some additional suggestions involve an insertion sort:
def inst(xs: Array[String]): Array[String] = {
var i = 1
while (i < xs.length) {
val x = xs(i)
var j = i
while (j > 0 && xs(j-1) > x) {
xs(j) = xs(j-1)
j -= 1
}
xs(j) = x
i += 1
}
xs
}
But these are not competitive with the merge sort either with one sweep:
scala> val ans = th.pbench( inst(flat(test, 0, test.length)).toList )
Benchmark (20460 calls in 139.2 ms)
Time: 6.601 us 95% CI 6.414 us - 6.788 us (n=19)
Garbage: 293.0 ns (n=1 sweeps measured)
ans: List[String] = List(Alabama, California, Georgia, I, Maine, ...)
or two:
scala> th.pbench( mer(inst(flat(test, 0, test.length/2)).toList,
inst(flat(test, test.length/2,test.length)).toList) )
Benchmark (20460 calls in 119.3 ms)
Time: 5.407 us 95% CI 5.244 us - 5.570 us (n=20)
Garbage: 390.6 ns (n=1 sweeps measured)
res25: List[String] = List(Alabama, California, Georgia, I, Maine,

Sort an array by a preferred order

I'd like to come up with a good way to have a "suggested" order for how to sort an array in javascript.
So say my first array looks something like this:
['bob','david','steve','darrel','jim']
Now all I care about, is that the sorted results starts out in this order:
['jim','steve','david']
After that, I want the remaining values to be presented in their original order.
So I would expect the result to be:
['jim','steve','david','bob','darrel']
I have an API that I am communicating with, and I want to present the results important to me in the list at the top. After that, I'd prefer they are just returned in their original order.
If this can be easily accomplished with a javascript framework like jQuery, I'd like to hear about that too. Thanks!
Edit for clarity:
I'd like to assume that the values provided in the array that I want to sort are not guaranteed.
So in the original example, if the provided was:
['bob','steve','darrel','jim']
And I wanted to sort it by:
['jim','steve','david']
Since 'david' isn't in the provided array, I'd like the result to exclude it.
Edit2 for more clarity:
A practical example of what I'm trying to accomplish:
The API will return something looking like:
['Load Average','Memory Usage','Disk Space']
I'd like to present the user with the most important results first, but each of these fields may not always be returned. So I'd like the most important (as determined by the user in some other code), to be displayed first if they are available.
Something like this should work:
var presetOrder = ['jim','steve','david']; // needn't be hardcoded
function sortSpecial(arr) {
var result = [],
i, j;
for (i = 0; i < presetOrder.length; i++)
while (-1 != (j = $.inArray(presetOrder[i], arr)))
result.push(arr.splice(j, 1)[0]);
return result.concat(arr);
}
var sorted = sortSpecial( ['bob','david','steve','darrel','jim'] );
I've allowed for the "special" values appearing more than once in the array being processed, and assumed that duplicates should be kept as long as they're shuffled up to the front in the order defined in presetOrder.
Note: I've used jQuery's $.inArray() rather than Array.indexOf() only because the latter isn't supported by IE until IE9, and you've tagged your question with "jQuery". You could of course use .indexOf() if you don't care about old IE, or if you use a shim.
var important_results = {
// object keys are the important results, values is their order
jim: 1,
steve: 2,
david: 3
};
// results is the orig array from the api
results.sort(function(a,b) {
// If compareFunction(a, b) is less than 0, sort a to a lower index than b.
// See https://developer.mozilla.org/en/JavaScript/Reference/Global_Objects/Array/sort
var important_a = important_results[a],
important_b = important_results[b],
ret;
if (important_a && !important_b) {ret = -1}
else if (important_b && !important_a) {ret = 1}
else if (important_a && important_b) {ret = important_a - important_b}
else {ret = 0}; // keep original order if neither a or b is important
return(ret);
}
)
Use a sorting function that treats the previously known important results specially--sorts them to the head of the results if present in results.
items in important_results don't have to be in the results
Here's a simple test page:
<html>
<head>
<script language="javascript">
function test()
{
var items = ['bob', 'david', 'steve', 'darrel', 'jim'];
items.sort(function(a,b)
{
var map = {'jim':-3,'steve':-2,'david':-1};
return (map[a] || 0) - (map[b] || 0); // unmapped names tie at 0 instead of producing NaN
});
alert(items.join(','));
}
</script>
</head>
<body>
<button onclick="javascript:test()">Click Me</button>
</body>
</html>
It works in most browsers because JavaScript engines typically use what is called a stable sort algorithm, the defining feature of which is that it preserves the original order of equivalent items. There have been exceptions, though ES2019 finally made stability a requirement of Array.prototype.sort. You can guarantee stability yourself by using the array index of each remaining item as its tiebreaker value.
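To make that stability guarantee explicit rather than implied, decorate each item with its original index and use it as a tiebreaker. A sketch (the function name is mine; presetOrder is the preferred-order list from the question):

```javascript
// Decorate with the original index, sort with the index as tiebreaker,
// then strip the decoration. Stable on every engine, old or new.
function preferredSort(arr, presetOrder) {
  const rank = {};
  presetOrder.forEach((name, i) => { rank[name] = i + 1; });
  const big = presetOrder.length + 1; // non-preferred items sort after all preferred
  return arr
    .map((v, i) => [v, i])
    .sort(([a, ai], [b, bi]) =>
      (rank[a] || big) - (rank[b] || big) || ai - bi)
    .map(([v]) => v);
}

console.log(preferredSort(['bob', 'david', 'steve', 'darrel', 'jim'],
                          ['jim', 'steve', 'david']));
// → ['jim', 'steve', 'david', 'bob', 'darrel']
```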
http://tinysort.sjeiti.com/
I think this might help. The $('#yrDiv').tsort({place:'start'}); call will add your important list at the start.
You can also sort using this function the way you like.
Live demo ( jsfiddle seems to be down)
http://jsbin.com/eteniz/edit#javascript,html
var priorities=['jim','steve','david'];
var liveData=['bob','david','steve','darrel','jim'];
var output=[],temp=[];
for ( i=0; i<liveData.length; i++){
if( $.inArray( liveData[i], priorities) ==-1){
output.push( liveData[i]);
}else{
temp.push( liveData[i]);
}
}
var temp2=$.grep( priorities, function(name,i){
return $.inArray( name, temp) >-1;
});
output=$.merge( temp2, output);
There is another way, sorting by an order list; it also works when the values are objects:
const inputs = ["bob", "david", "steve", "darrel", "jim"].map((val) => ({
val,
}));
const order = ["jim", "steve", "david"];
const vMap = new Map(inputs.map((v) => [v.val, v]));
const sorted = [];
order.forEach((o) => {
if (vMap.has(o)) {
sorted.push(vMap.get(o));
vMap.delete(o);
}
});
const result = sorted.concat(Array.from(vMap.values()));
const plainResult = result.map(({ val }) => val);
Have you considered using Underscore.js? It contains several utilities for manipulating lists like this.
In your case, you could:
Filter the results you want using filter() and store them in a collection.
var priorities = _.filter(['bob','david','steve','darrel','jim'],
function(pName){
return pName == 'jim' || pName == 'steve' || pName == 'david';
});
Get a copy of the other results using without()
var leftovers = _.without(['bob','david','steve','darrel','jim'], 'jim', 'steve', 'david');
Union the arrays from the previous steps using union()
var finalList = _.union(priorities, leftovers);
