How to compare if two dates are closer to each other - javascript

I want to check if two dates are within a range of each other, and reorder my array. I need to compare the dates from an array with the current date.
I have this:
var currentDate = new Date(); /* current date = 2021/08/18 */
listOfObjects = [ { "user": "John", "date": "2021-08-20" }, { "user": "Bob", "date": "2021-08-17" }, { "user": "Joe", "date": "2021-08-09" } ]
The result should look like this:
[ { "user": "Bob", "date": "2021-08-17" }, { "user": "John", "date": "2021-08-20" }, { "user": "Joe", "date": "2021-08-09" } ]

By default, JavaScript's Array.prototype.sort() sorts an array by the textual representation of its items.
The default sort order is ascending, built upon converting the elements into strings, then comparing their sequences of UTF-16 code units values. (Source: Array.prototype.sort() - JavaScript | MDN)
See following example:
const numbers = [1, 30, 4, 21, 100000];
numbers.sort();
console.log(numbers);
We see that the output is the array with alphabetically sorted numbers:
1, 100000, 21, 30, 4
In most cases, this is not what we want (or what most people expect). To sort numbers numerically, we pass a custom compare function to sort:
function i_cmp(a, b) {
let d = a-b;
if (d < 0)
return -1;
if (d > 0)
return 1;
return 0;
}
numbers.sort(i_cmp);
console.log(numbers);
output:
1,4,21,30,100000
To sort an array by a criterion that depends on a runtime value, it's handy to let one function create the compare function, so that the compare function closes over that value. Here we sort items by their absolute distance from a fixed value x.
function d_cmp(x) {
return function(a, b) {
let d = Math.abs(a-x)-Math.abs(b-x);
if (d < 0)
return -1;
if (d > 0)
return 1;
return 0;
}
}
numbers.sort(d_cmp(50));
console.log(numbers);
output:
30,21,4,1,100000
Hemera already answered how to get date distances. The rest, accessing date attributes, should be easy to implement.
For a live demo of above code (combined) see: https://ideone.com/e7DaOx
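Putting the two ideas together for the dates in the question could look like the sketch below. It assumes each item carries an ISO date string in its date property; the helper name byDistanceTo is made up for this illustration, and the reference date is fixed (instead of new Date()) so the result is reproducible.

```javascript
// Hypothetical helper: builds a compare function bound to a reference date.
function byDistanceTo(reference) {
  const ref = reference.getTime();
  return function (a, b) {
    const da = Math.abs(new Date(a.date).getTime() - ref);
    const db = Math.abs(new Date(b.date).getTime() - ref);
    return da - db; // smaller distance sorts first
  };
}

const listOfObjects = [
  { user: "John", date: "2021-08-20" },
  { user: "Bob", date: "2021-08-17" },
  { user: "Joe", date: "2021-08-09" }
];

// Copy first, because sort() reorders the array in place.
const sortedByCloseness = listOfObjects
  .slice()
  .sort(byDistanceTo(new Date("2021-08-18")));

console.log(sortedByCloseness.map(function (o) { return o.user; }));
```

With 2021-08-18 as the reference, Bob is 1 day away, John 2 days and Joe 9 days, so the order is Bob, John, Joe.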

You can subtract the dates and compare the results. For example, new Date("2021-08-18") - new Date("2021-08-17") evaluates to 86400000, because dates are stored as milliseconds since a fixed reference point (the Unix epoch, 1970-01-01T00:00:00 UTC).
Then you can use this difference by using Math.abs(number) as a condition for finding the nearest dates to the given one and put it in a simple sorting function like below:
function orderByDateDistance(nDate, nList){
// selection sort: repeatedly pick the closest remaining item (picking the farthest works the same way)
for(let tA=0;tA<nList.length-1;tA++){ // iterate over all but the last item, which ends up in place automatically
let tIndex = tA; // current index
let tDifference = Math.abs(nDate-new Date(nList[tA]["date"])); // current difference
for(let tB=tA+1;tB<nList.length;tB++){ // iterating over unsorted list part
if(Math.abs(nDate-new Date(nList[tB]["date"])) < tDifference){ // compare current difference with stored
tIndex = tB; // save index
tDifference = Math.abs(nDate-new Date(nList[tB]["date"])); // save value optional
}
}
// change items
let tBuffer = nList[tA]; // save current object
nList[tA] = nList[tIndex]; // copy next lowest object
nList[tIndex] = tBuffer; // copy buffered object
}
return nList; // optional returning
}
// Your example
console.log(
orderByDateDistance(
new Date("2021/08/18"),
[
{"user": "John", "date": "2021-08-20"},
{"user": "Bob", "date": "2021-08-17"},
{"user": "Joe", "date": "2021-08-09"}
]
)
);

Related

Remove array from array of arrays javascript

I want to delete an array from an array of arrays without iterating over them.
The first array looks like this:
array1 = [
[{"id":"A1","y":12},{"id":"A4","y":12}],
[{"id":"A2","y":1}],
[{"id":"A3","y":6}]
]
The second array is:
array2 = [{"id":"A1","y":12},{"id":"A4","y":12}]
I am using JavaScript and I want to delete array2 from array1.
Sorry for the wrong description. I found that in my example the second array is actually
array2 = [{ "id": "A1", "y": 12 }, { "id": "A4", "y": 2 }]
So the two arrays have just one column that is equal. How can I delete array2 from array1 even though they have just one attribute that is equal?
Thank you.
splice is usually used to remove a particular item from the array. I think you want to do something like this:
var array = [
[{"name": "sik", "gender": "male"}],
[{"name": "sug", "gender": "female"}],
[{"name": "hyd", "gender": "male"}]
];
// it removes the second array from the array: removes the female one.
array.splice( 1, 1 );
Description: splice takes the start index as its first argument and the number of items to delete as its second.
For example, splice( 2, 5 ) means: starting at index 2, delete 5 items.
Hope this helps, Thanks.
Feels weird, but you can check every element of the array and match to the second array. Either way you need to iterate through your array. You can't simply type array1[i] == array2, you need to check it differently. A hack is to stringify them as JSON and compare strings, but there are other methods you can use to compare arrays.
Here is a simple demo with a for loop
array1 = [
[{ "id": "A1", "y": 12 }, { "id": "A4", "y": 12 }],
[{ "id": "A2", "y": 1 }],
[{ "id": "A3", "y": 6 }]
]
array2 = [{ "id": "A1", "y": 12 }, { "id": "A4", "y": 12 }]
l = array1.length
for (var i = 0; i < l; i++) {
// if arrays match
if (JSON.stringify(array1[i]) == JSON.stringify(array2)) {
// delete the element at `i` position
array1.splice(i, 1);
i--;
l--;
// break; // if you can guarantee that no more instances will occur
}
}
console.log(array1)
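For the updated question, where the arrays agree only on id, a string comparison of the whole array no longer matches. A minimal sketch of one alternative is below, assuming "equal" means the ids agree position by position; the helper name sameIds and the variable names are invented for this example.

```javascript
// Hypothetical helper: two inner arrays "match" when their ids agree position by position.
function sameIds(a, b) {
  return a.length === b.length && a.every(function (item, i) {
    return item.id === b[i].id;
  });
}

const nested = [
  [{ id: "A1", y: 12 }, { id: "A4", y: 12 }],
  [{ id: "A2", y: 1 }],
  [{ id: "A3", y: 6 }]
];
const toRemove = [{ id: "A1", y: 12 }, { id: "A4", y: 2 }];

// filter() still iterates internally, but it leaves the original array untouched.
const remaining = nested.filter(function (inner) {
  return !sameIds(inner, toRemove);
});

console.log(JSON.stringify(remaining));
```

Here the first inner array is dropped even though its second y value differs, because only the id fields are compared.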
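In plain JavaScript, you cannot simply filter with an equality check. That works only for primitive values; arrays are reference types, so two arrays with the same contents are not equal under == or ===. So here we need to write our own search function (it compares elements with ===, so it works for arrays of primitive values):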
function searchForArray(haystack, needle){
var i, j, current;
for(i = 0; i < haystack.length; ++i){
if(needle.length === haystack[i].length){
current = haystack[i];
for(j = 0; j < needle.length && needle[j] === current[j]; ++j);
if(j === needle.length)
return i;
}
}
return -1;
}
var arr = [[1,3],[1,2]];
var n = [1,3];
console.log(searchForArray(arr,n));
You can refer to this question, which has some reference resources too:
Check whether an array exists in an array of arrays?
I know you asked for a solution without iteration, but I thought I would give you some references that may be useful.

javascript array sorting / array match sorted array [duplicate]

For example, if I have these arrays:
var name = ["Bob","Tom","Larry"];
var age = ["10", "20", "30"];
And I use name.sort() the order of the "name" array becomes:
var name = ["Bob","Larry","Tom"];
But, how can I sort the "name" array and have the "age" array keep the same order? Like this:
var name = ["Bob","Larry","Tom"];
var age = ["10", "30", "20"];
You can sort the existing arrays, or reorganize the data.
Method 1:
To use the existing arrays, you can combine, sort, and separate them:
(Assuming equal length arrays)
var names = ["Bob","Tom","Larry"];
var ages = ["10", "20", "30"];
//1) combine the arrays:
var list = [];
for (var j = 0; j < names.length; j++)
list.push({'name': names[j], 'age': ages[j]});
//2) sort:
list.sort(function(a, b) {
return ((a.name < b.name) ? -1 : ((a.name == b.name) ? 0 : 1));
//Sort could be modified to, for example, sort on the age
// if the name is the same. See Bonus section below
});
//3) separate them back out:
for (var k = 0; k < list.length; k++) {
names[k] = list[k].name;
ages[k] = list[k].age;
}
This has the advantage of not relying on string parsing techniques, and could be used on any number of arrays that need to be sorted together.
Method 2: Or you can reorganize the data a bit, and just sort a collection of objects:
var list = [
{name: "Bob", age: 10},
{name: "Tom", age: 20},
{name: "Larry", age: 30}
];
list.sort(function(a, b) {
return ((a.name < b.name) ? -1 : ((a.name == b.name) ? 0 : 1));
});
for (var i = 0; i<list.length; i++) {
alert(list[i].name + ", " + list[i].age);
}
For the comparisons, -1 means a is placed before b (lower index), 0 means the two are considered equal, and 1 means a is placed after b (higher index). It is also worth noting that sort() changes the underlying array in place.
Also worth noting, method 2 is more efficient as you do not have to loop through the entire list twice in addition to the sort.
http://jsfiddle.net/ghBn7/38/
Bonus Here is a generic sort method that takes one or more property names.
function sort_by_property(list, property_name_list) {
list.sort((a, b) => {
for (var p = 0; p < property_name_list.length; p++) {
var prop = property_name_list[p];
if (a[prop] < b[prop]) {
return -1;
} else if (a[prop] > b[prop]) {
return 1;
}
}
return 0;
});
}
Usage:
var list = [
{name: "Bob", age: 10},
{name: "Tom", age: 20},
{name: "Larry", age: 30},
{name: "Larry", age: 25}
];
sort_by_property(list, ["name", "age"]);
for (var i = 0; i<list.length; i++) {
console.log(list[i].name + ", " + list[i].age);
}
Output:
Bob, 10
Larry, 25
Larry, 30
Tom, 20
You could get the indices of name array using Array.from(name.keys()) or [...name.keys()]. Sort the indices based on their value. Then use map to get the value for the corresponding indices in any number of related arrays
const indices = Array.from(name.keys())
indices.sort( (a,b) => name[a].localeCompare(name[b]) )
const sortedName = indices.map(i => name[i]);
const sortedAge = indices.map(i => age[i])
Here's a snippet:
const name = ["Bob","Tom","Larry"],
age = ["10", "20", "30"],
indices = Array.from(name.keys())
.sort( (a,b) => name[a].localeCompare(name[b]) ),
sortedName = indices.map(i => name[i]),
sortedAge = indices.map(i => age[i])
console.log(indices)
console.log(sortedName)
console.log(sortedAge)
This solution (my work) sorts multiple arrays, without transforming the data to an intermediary structure, and works on large arrays efficiently. It allows passing arrays as a list, or object, and supports a custom compareFunction.
Usage:
let people = ["john", "benny", "sally", "george"];
let peopleIds = [10, 20, 30, 40];
sortArrays([people, peopleIds]);
[["benny", "george", "john", "sally"], [20, 40, 10, 30]] // output
sortArrays({people, peopleIds});
{"people": ["benny", "george", "john", "sally"], "peopleIds": [20, 40, 10, 30]} // output
Algorithm:
Create a list of indexes of the main array (sortableArray)
Sort the indexes with a custom compareFunction that compares the values, looked up with the index
For each input array, map each index, in order, to its value
Implementation:
/**
* Sorts all arrays together with the first. Pass either a list of arrays, or a map. Any key is accepted.
* Array|Object arrays [sortableArray, ...otherArrays]; {sortableArray: [], secondaryArray: [], ...}
* Function comparator(?,?) -> int optional compareFunction, compatible with Array.sort(compareFunction)
*/
function sortArrays(arrays, comparator = (a, b) => (a < b) ? -1 : (a > b) ? 1 : 0) {
let arrayKeys = Object.keys(arrays);
let sortableArray = Object.values(arrays)[0];
let indexes = Object.keys(sortableArray);
let sortedIndexes = indexes.sort((a, b) => comparator(sortableArray[a], sortableArray[b]));
let sortByIndexes = (array, sortedIndexes) => sortedIndexes.map(sortedIndex => array[sortedIndex]);
if (Array.isArray(arrays)) {
return arrayKeys.map(arrayIndex => sortByIndexes(arrays[arrayIndex], sortedIndexes));
} else {
let sortedArrays = {};
arrayKeys.forEach((arrayKey) => {
sortedArrays[arrayKey] = sortByIndexes(arrays[arrayKey], sortedIndexes);
});
return sortedArrays;
}
}
See also https://gist.github.com/boukeversteegh/3219ffb912ac6ef7282b1f5ce7a379ad
If performance matters, there is sort-ids package for that purpose:
var sortIds = require('sort-ids')
var reorder = require('array-rearrange')
var name = ["Bob","Larry","Tom"];
var age = [30, 20, 10];
var ids = sortIds(age)
reorder(age, ids)
reorder(name, ids)
That is ~5 times faster than the comparator function.
It is very similar to jwatts1980's answer (Update 2).
Consider reading Sorting with map.
name.map(function (v, i) {
return {
value1 : v,
value2 : age[i]
};
}).sort(function (a, b) {
return ((a.value1 < b.value1) ? -1 : ((a.value1 == b.value1) ? 0 : 1));
}).forEach(function (v, i) {
name[i] = v.value1;
age[i] = v.value2;
});
You are trying to sort 2 independent arrays by calling sort() on only one of them.
One way of achieving this would be to write your own sorting method that takes care of it: whenever it swaps 2 elements in place in the "original" array, it should also swap the 2 corresponding elements in the "attribute" array.
Here is some pseudocode showing how you might try it.
function mySort(originals, attributes) {
// Start of your sorting code here
swap(originals, i, j);
swap(attributes, i, j);
// Rest of your sorting code here
}
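One way to fill in that outline is a simple insertion sort that mirrors every swap in the second array. This is only a sketch: the function name sortTogether is invented here, ascending order with the < and > operators is assumed, and insertion sort is chosen for brevity, not for speed on large inputs.

```javascript
function swap(arr, i, j) {
  const tmp = arr[i];
  arr[i] = arr[j];
  arr[j] = tmp;
}

// Insertion sort on `originals`; every swap is repeated on `attributes`
// so that both arrays stay aligned. Both arrays are modified in place.
function sortTogether(originals, attributes) {
  for (let i = 1; i < originals.length; i++) {
    for (let j = i; j > 0 && originals[j - 1] > originals[j]; j--) {
      swap(originals, j - 1, j);
      swap(attributes, j - 1, j);
    }
  }
}

const names = ["Bob", "Tom", "Larry"];
const ages = ["10", "20", "30"];
sortTogether(names, ages);
console.log(names, ages);
```

After the call, names is ["Bob", "Larry", "Tom"] and ages is ["10", "30", "20"].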
Inspired by @jwatts1980's answer and @Alexander's answer here, I merged both answers into a quick and dirty solution.
The main array is the one to be sorted, the rest just follows its indexes
NOTE: Not very efficient for very large arrays
/* @sort argument is the array that has the values to sort
@followers argument is an array of arrays, all of the same length as 'sort';
all will be reordered accordingly
example:
sortMultipleArrays(
[0, 6, 7, 8, 3, 4, 9],
[ ["zr", "sx", "sv", "et", "th", "fr", "nn"],
["zero", "six", "seven", "eight", "three", "four", "nine"]
]
);
// Will return (with the descending comparator used in getSortedIndex)
{
sorted: [9, 8, 7, 6, 4, 3, 0],
followed: [
["nn", "et", "sv", "sx", "fr", "th", "zr"],
["nine", "eight", "seven", "six", "four", "three", "zero"]
]
}
*/
You probably want to change the method signature/return structure, but that should be easy though. I did it this way because I needed it
var sortMultipleArrays = function (sort, followers) {
var index = getSortedIndex(sort)
, followed = [];
followers.unshift(sort);
followers.forEach(function(arr){
var _arr = [];
for(var i = 0; i < arr.length; i++)
_arr[i] = arr[index[i]];
followed.push(_arr);
});
var result = {sorted: followed[0]};
followed.shift();
result.followed = followed;
return result;
};
var getSortedIndex = function (arr) {
var index = [];
for (var i = 0; i < arr.length; i++) {
index.push(i);
}
index = index.sort((function(arr){
/* this will sort ints in descending order, change it based on your needs */
return function (a, b) {return ((arr[a] > arr[b]) ? -1 : ((arr[a] < arr[b]) ? 1 : 0));
};
})(arr));
return index;
};
I was looking for something more generic and functional than the current answers.
Here's what I came up with: an es6 implementation (with no mutations!) that lets you sort as many arrays as you want given a "source" array
/**
* Given multiple arrays of the same length, sort one (the "source" array), and
* sort all other arrays to reorder the same way the source array does.
*
* Usage:
*
* sortMultipleArrays( objectWithArrays, sortFunctionToApplyToSource )
*
* sortMultipleArrays(
* {
* source: [...],
* other1: [...],
* other2: [...]
* },
* (a, b) => { return a - b }
* )
*
* Returns:
* {
* source: [..sorted source array]
* other1: [...other1 sorted in same order as source],
* other2: [...other2 sorted in same order as source]
* }
*/
export function sortMultipleArrays( namedArrays, sortFn ) {
const { source } = namedArrays;
if( !source ) {
throw new Error('You must pass in an object containing a key named "source" pointing to an array');
}
const arrayNames = Object.keys( namedArrays );
// First build an array combining all arrays into one, eg
// [{ source: 'source1', other: 'other1' }, { source: 'source2', other: 'other2' } ...]
return source.map(( value, index ) =>
arrayNames.reduce((memo, name) => ({
...memo,
[ name ]: namedArrays[ name ][ index ]
}), {})
)
// Then have user defined sort function sort the single array, but only
// pass in the source value
.sort(( a, b ) => sortFn( a.source, b.source ))
// Then turn the source array back into an object with the values being the
// sorted arrays, eg
// { source: [ 'source1', 'source2' ], other: [ 'other1', 'other2' ] ... }
.reduce(( memo, group ) =>
arrayNames.reduce((ongoingMemo, arrayName) => ({
...ongoingMemo,
[ arrayName ]: [
...( ongoingMemo[ arrayName ] || [] ),
group[ arrayName ]
]
}), memo), {});
}
You could append the original index of each member to the value, sort the array, then remove the index and use it to re-order the other array. It will only work where the contents are strings or can be converted to and from strings successfully.
Another solution is to keep a copy of the original array, then after sorting, find where each member is now and adjust the other array appropriately.
I was having the same issue and came up with this incredibly simple solution. First combine the associated elements into strings in a separate array, then use parseInt in your sort comparison function like this:
<html>
<body>
<div id="outPut"></div>
<script>
var theNums = [13,12,14];
var theStrs = ["a","b","c"];
var theCombine = [];
for (var x in theNums)
{
theCombine[x] = theNums[x] + "," + theStrs[x];
}
// parseInt reads only the leading number of each "number,string" entry
var theSorted = theCombine.sort(function(a,b)
{
var c = parseInt(a,10);
var d = parseInt(b,10);
return c-d;
});
document.getElementById("outPut").innerHTML = theSorted;
</script>
</body>
</html>
How about:
var names = ["Bob","Tom","Larry"];
var ages = ["10", "20", "30"];
var n = names.slice(0).sort()
var a = [];
for (x in n)
{
i = names.indexOf(n[x]);
a.push(ages[i]);
names[i] = null;
}
names = n
ages = a
The simplest explanation is the best: merge the arrays, and then extract after sorting.
Create an array:
name_age=["Bob#10","Tom#20","Larry#30"];
Sort the array as before, then extract the name and the age; you can use # to recognise where the name ends and the age begins. Maybe not a method for the purist, but I had the same issue and this is my approach.
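A small sketch of this merge-and-split approach, assuming that "#" never occurs inside a name or an age (the variable names are chosen for this example only):

```javascript
const names = ["Bob", "Tom", "Larry"];
const ages = ["10", "20", "30"];

// 1) merge each name with its age, 2) sort the merged strings
const merged = names
  .map(function (n, i) { return n + "#" + ages[i]; })
  .sort();

// 3) split every entry back into its two parts
const sortedNames = merged.map(function (s) { return s.split("#")[0]; });
const sortedAges = merged.map(function (s) { return s.split("#")[1]; });

console.log(sortedNames, sortedAges);
```

This logs ["Bob", "Larry", "Tom"] and ["10", "30", "20"]. Note that the ages come back as strings, since everything passes through a string.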

sorting array by predefined number

Let's say I have a multidimensional array
var arr = [{
"id": "1",
"firstname": "SUSAN",
"dezibel": "91"
}, {
"id": "2",
"firstname": "JOHNNY",
"dezibel": "74"
}, {
"id": "3",
"firstname": "ANDREA",
"dezibel": "67"
}];
How can I sort it by "dezibel", not ascending or descending, but by closeness to a given number? For example,
var num = 78;
so the target value is 78, and the final order must be: 74, 67, 91.
You'll need to use a custom sort function that compares the absolute difference of each object's dezibel attribute from 78.
var arr = [{
"id": "1",
"firstname": "SUSAN",
"dezibel": "91"
}, {
"id": "2",
"firstname": "JOHNNY",
"dezibel": "74"
}, {
"id": "3",
"firstname": "ANDREA",
"dezibel": "67"
}];
num = 78;
arr.sort(
function(first,second){
var a = Math.abs(num - (+first.dezibel));
var b = Math.abs(num - (+second.dezibel));
return a - b;
});
alert(JSON.stringify(arr));
Write a sort function which calculates the distance to your number:
arr.sort(function(a, b){
return Math.abs(num - a.dezibel) - Math.abs(num - b.dezibel);
});
This sorts the array by its dezibel properties. For each pair of items, the function calculates their distances to num; the item with the smaller distance is placed first, and sort() repeats these comparisons until the whole array is ordered.
Just sort by the absolute difference.
var arr = [{ "id": "1", "firstname": "SUSAN", "dezibel": "91" }, { "id": "2", "firstname": "JOHNNY", "dezibel": "74" }, { "id": "3", "firstname": "ANDREA", "dezibel": "67" }],
num = 78;
arr.sort(function (a, b) {
return Math.abs(a.dezibel - num) - Math.abs(b.dezibel - num);
});
document.write('<pre>' + JSON.stringify(arr, 0, 4) + '</pre>');
.sort optionally takes a function. The function takes 2 values at a time, and compares them:
If the first value should sort higher than the second, the function should return a positive number.
If the first value should sort lower than the second, the function should return a negative number.
If the values are equal, the function should return 0.
So, if you wanted to sort by dezibel in ascending order, you could do
arr.sort(function(a,b){
return a.dezibel- b.dezibel;
});
However, you want to sort by dezibel's distance from some number. To find the magnitude of the difference from 78 and the dezibel value, take the absolute value of the difference:
Math.abs(78 - a.dezibel)
Now, if we want to sort based on that value for each object, we can take the difference of that Math.abs call for both a and b:
arr.sort(function(a,b){
return Math.abs(78 - a.dezibel) - Math.abs(78 - b.dezibel);
});
You can use the array sort function for this:
arr.sort(function(a, b) {
return Math.abs(num - 1 * a.dezibel) - Math.abs(num - 1 * b.dezibel)
})
I would just add a distance and then sort by it (CoffeeScript):
num = 78
for e in arr
  e.dist = Math.abs(e.dezibel - num)
arr.sort (l, r) =>
  l.dist - r.dist
Delete the dist property afterwards.
Putting it all in one expression is fine, too:
arr.sort (l, r) =>
  Math.abs(l.dezibel - num) - Math.abs(r.dezibel - num)
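The same precompute-then-sort idea can be sketched in plain JavaScript without touching the original objects, by wrapping each item together with its distance and unwrapping after the sort. The variable names below are illustrative only.

```javascript
const people = [
  { id: "1", firstname: "SUSAN", dezibel: "91" },
  { id: "2", firstname: "JOHNNY", dezibel: "74" },
  { id: "3", firstname: "ANDREA", dezibel: "67" }
];
const target = 78;

const byCloseness = people
  // wrap: compute each distance once, converting the string to a number
  .map(function (item) {
    return { item: item, dist: Math.abs(Number(item.dezibel) - target) };
  })
  // sort the wrappers by the precomputed distance
  .sort(function (l, r) { return l.dist - r.dist; })
  // unwrap: keep only the original objects
  .map(function (pair) { return pair.item; });

console.log(byCloseness.map(function (p) { return p.dezibel; }));
```

The distances are 13, 4 and 11, so the logged order is "74", "67", "91", and no dist property needs to be deleted afterwards.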

Merging multiple duplicate objects into one from JavaScript array

My JSON file looks like the following, somewhere around 1000-2000 objects.
[{
"date": "2015-01-25T22:13:18Z",
"some_object": {
"first_group": 20,
"second_group": 90,
"third_group": 39,
"fourth_group": 40
}
}, {
"date": "2015-01-25T12:20:32Z",
"some_object": {
"first_group": 10,
"second_group": 80,
"third_group": 21,
"fourth_group": 60
}
}, {
"date": "2015-02-26T10:53:03Z",
"some_object": {
"first_group": 12,
"second_group": 23,
"third_group": 13,
"fourth_group": 30
}
}]
After copying it in an array I need to perform the following manipulation on it:
First. Remove duplicate objects. 2 objects are considered the same if they have the same date (without taking the time into consideration). So in my JSON, the first two objects are considered the same. Now the tricky part is that when a duplicate is found, we shouldn't just randomly remove one of them, but merge (not sure if merge is the right word) the fields from some_object, so it becomes one object in the array. Therefore, with the JSON above, the first two objects would become one:
{
"date": "2015-01-25T00:00:00Z",
"some_object": {
"first_group": 30, //20+10
"second_group": 170, //90+80
"third_group": 60, //39+21
"fourth_group": 100 //40+60
}
}
Even trickier is that there could be some 3-10 objects with the same date, but different time in the array. Therefore those should be merged into 1 object according to the rule above.
Second. Sort this array of objects ascending (from oldest to newest of the date field).
So what's so hard? Where did you get stuck?
I found out how to sort the array ascending (based on date) by using this and some of this.
But I have no idea how to do the first point of removing the duplicates and merging, in a time-efficient manner. Maybe something inside:
var array = [];//reading it from the JSON file
var object_date_sort_asc = function (obj1, obj2) {
if (obj1.date > obj2.date) return 1;
if (obj1.date < obj2.date) return -1;
//some magic here
return 0;
};
array.sort(object_date_sort_asc);
Any ideas?
Use an object whose properties are the dates, to keep track of dates that have already been seen, and the values are the objects. When you encounter a date that's been seen, just merge the elements.
var seen = {};
for (var i = 0; i < objects.length; i++) {
var cur = objects[i];
var day = cur.date.slice(0, 10); // "YYYY-MM-DD": ignore the time part
if (day in seen) {
// merge: add every group of the duplicate to the stored object
var seen_cur = seen[day];
for (var group in cur.some_object) {
seen_cur.some_object[group] += cur.some_object[group];
}
} else {
seen[day] = cur;
}
}
Once this is done, you can convert the seen object to an array and sort it.
var arr = [];
for (var k in seen) {
arr.push(seen[k]);
}
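Both steps (merge per day, then sort) can be sketched as one function. This is only an illustration under a few assumptions: the dates are ISO strings as in the question, the merged date is normalised to midnight UTC, and the name mergeByDay is made up here.

```javascript
function mergeByDay(items) {
  const byDay = new Map();
  for (const item of items) {
    const day = item.date.slice(0, 10); // "YYYY-MM-DD"
    let merged = byDay.get(day);
    if (!merged) {
      merged = { date: day + "T00:00:00Z", some_object: {} };
      byDay.set(day, merged);
    }
    // add every group value to the running total for that day
    for (const [group, value] of Object.entries(item.some_object)) {
      merged.some_object[group] = (merged.some_object[group] || 0) + value;
    }
  }
  // ISO date strings sort chronologically when compared as strings
  return Array.from(byDay.values()).sort(function (a, b) {
    return a.date < b.date ? -1 : a.date > b.date ? 1 : 0;
  });
}

const sample = [
  { date: "2015-02-26T10:53:03Z", some_object: { first_group: 12, second_group: 23 } },
  { date: "2015-01-25T22:13:18Z", some_object: { first_group: 20, second_group: 90 } },
  { date: "2015-01-25T12:20:32Z", some_object: { first_group: 10, second_group: 80 } }
];
const mergedDays = mergeByDay(sample);
console.log(JSON.stringify(mergedDays));
```

For this sample the result has two entries: 2015-01-25 with first_group 30 and second_group 170, followed by 2015-02-26 with the unchanged values. The input objects are not modified.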
To remove duplicate objects, you can loop through your array using jQuery's $.map(), which drops every item for which the callback returns nothing. In each iteration, you push the date, parsed using a simple regex (which removes the time), into an array if and only if it is not already present in that array:
If it is not in the array, push into array (of unique dates) and return the object
If it is in the array, do nothing
The logic above can be written as follows, assuming your array is assigned to the data variable:
// Remove duplicates
var dates = [];
var data_noDupes = $.map(data, function(item){
var item_date = item.date.replace(/([\d\-]+)T.*/gi, '$1');
if (dates.indexOf(item_date) === -1) {
dates.push(item_date);
return item;
}
});
This should remove all recurring instances of the same date (it keeps the first object for each date and does not merge the values).
With regard to the second part: to sort, you simply sort the returned array by the date, again parsed using a simple regex that removes the time:
// Sort data_noDupes
function sortByDate(a, b){
var a_item_date = a.date.replace(/([\d\-]+)T.*/gi, '$1'),
b_item_date = b.date.replace(/([\d\-]+)T.*/gi, '$1');
return ((a_item_date < b_item_date) ? -1 : ((a_item_date > b_item_date) ? 1 : 0));
}
If you want to be extra safe, you should use momentjs to parse your date objects instead. I have simply modified how the dates are parsed in the functional example below, but with exactly the same logic as described above:
$(function() {
var data = [{
"date": "2015-02-26T10:53:03Z",
"some_object": {
"first_group": 12,
"second_group": 23,
"third_group": 13,
"fourth_group": 30
}
}, {
"date": "2015-01-25T12:20:32Z",
"some_object": {
"first_group": 10,
"second_group": 80,
"third_group": 21,
"fourth_group": 60
}
}, {
"date": "2015-01-25T22:13:18Z",
"some_object": {
"first_group": 20,
"second_group": 90,
"third_group": 39,
"fourth_group": 40
}
}];
// Remove duplicates
var dates = [];
var data_noDupes = $.map(data, function(item) {
// Get date and format date
var item_date = moment(new Date(item.date)).format('YYYY-MM-DD');
// If it is not present in array of unique dates:
// 1. Push into array
// 2. Return object to new array
if (dates.indexOf(item_date) === -1) {
dates.push(item_date);
return item;
}
});
// Sort data_noDupes
function sortByDate(a, b) {
var a_item_date = moment(new Date(a.date));
return ((a_item_date.isBefore(b.date)) ? -1 : ((a_item_date.isAfter(b.date)) ? 1 : 0));
}
data_noDupes.sort(sortByDate);
console.log(data_noDupes);
$('#input').val(JSON.stringify(data));
$('#output').val(JSON.stringify(data_noDupes));
});
body {
padding: 0;
margin: 0;
}
textarea {
padding: 0;
margin: 0;
height: 100vh;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.10.6/moment.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="input"></textarea>
<textarea id="output"></textarea>

Comparing arrays of strings for similarity

I have available to me hundreds of JSON strings. Each of these contains an array of 15-20 words sorted by some predetermined weight. This weight, if it's worth noting, is the number of times these words are found in some chunk of text. What's the best way of finding similarity between arrays of words that are structured like this?
First idea that came to my head was to create a numerical hash of all the words together and basically compare these values to determine similarity. I wasn't very successful with this, since the resulting hash values of very similar strings were not very close. After some research regarding string comparison algorithms, I come to Stackoverflow in hopes of receiving more guidance. Thanks in advance, and please let me know if you need more details of the problem.
Edit 1: Clarifying what I'm trying to do: I want to determine how similar two arrays are according to the words each of these have. I would also like to take into consideration the weight each word carries in each array. For example:
var array1 = [{"word":"hill","count":5},{"word":"head","count":5}];
var array2 = [{"word":"valley","count":7},{"word":"head","count":5}];
var array3 = [{"word":"head", "count": 6}, {"word": "valley", "count": 5}];
var array4 = [{"word": "valley", "count": 7}, {"word":"head", "count": 5}];
In that example, array 4 and array 2 are more similar than array 2 and array 3 because, even though both have the same words, the weight is the same for both of them in array 4 and 2. I hope that makes it a little bit easier to understand. Thanks in advance.
I think that what you want is "cosine similarity", and you might also want to look at vector space models. If you are coding in Java, you can use the open source S-space package.
(added on 31 Oct) Each element of the vector is the count of one particular string. You just need to transform your arrays of strings into such vectors. In your example, you have three words - "hill", "head", "valley". If your vector is in that order, the vectors corresponding to the arrays would be
// array: #hill, #head, #valley
array1: {5, 5, 0}
array2: {0, 5, 7}
array3: {0, 6, 5}
array4: {0, 5, 7}
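As a rough illustration of cosine similarity on these vectors: the dot product of two count vectors divided by the product of their lengths. The sketch below works directly on the word/count arrays from the question, so the vectors never need to be written out explicitly; the function names are invented for this example.

```javascript
// Turn [{word, count}, ...] into a Map from word to count.
function toCounts(arr) {
  const counts = new Map();
  for (const entry of arr) counts.set(entry.word, entry.count);
  return counts;
}

// Euclidean length of the count vector.
function norm(counts) {
  let sum = 0;
  for (const value of counts.values()) sum += value * value;
  return Math.sqrt(sum);
}

function cosineSimilarity(a, b) {
  const ca = toCounts(a);
  const cb = toCounts(b);
  let dot = 0;
  // words missing from the other array contribute 0 to the dot product
  for (const [word, count] of ca) dot += count * (cb.get(word) || 0);
  const denom = norm(ca) * norm(cb);
  return denom === 0 ? 0 : dot / denom;
}

const array1 = [{ word: "hill", count: 5 }, { word: "head", count: 5 }];
const array2 = [{ word: "valley", count: 7 }, { word: "head", count: 5 }];
const array3 = [{ word: "head", count: 6 }, { word: "valley", count: 5 }];
const array4 = [{ word: "valley", count: 7 }, { word: "head", count: 5 }];

console.log(cosineSimilarity(array2, array4));
console.log(cosineSimilarity(array2, array3));
console.log(cosineSimilarity(array1, array2));
```

array2 and array4 score 1 (up to floating-point rounding), array2 and array3 about 0.97, and array1 and array2 about 0.41, which matches the ranking asked for in the question.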
Given that each array has to be compared to every other array, you are looking at a serious amount of processing: n arrays need n(n-1)/2 pairwise comparisons, and each comparison costs roughly the product of the two arrays' word counts. You'll need to store the score for each comparison, then make some sense of it.
e.g.
var array1 = [{"word":"hill","count":5},{"word":"head","count":5}];
var array2 = [{"word":"valley","count":7},{"word":"head","count":5}];
var array3 = [{"word":"head", "count": 6}, {"word": "valley", "count": 5}];
var array4 = [{"word": "valley", "count": 7}, {"word":"head", "count": 5}];
// Comparison score is summed product of matching word counts
function compareThings() {
var a, b, i = arguments.length,
j, m, mLen, n, nLen;
var word, score, result = [];
if (i < 2) return;
// For each array
while (i--) {
a = arguments[i];
j = i;
// Compare with every other array
while (j--) {
b = arguments[j];
score = 0;
// For each word in array
for (m=0, mLen = b.length; m<mLen; m++) {
word = b[m].word
// Compare with each word in other array
for (n=0, nLen=a.length; n<nLen; n++) {
// Add to score
if (a[n].word == word) {
score += a[n].count * b[m].count;
}
}
}
// Put score in result
result.push(i + '-' + j + ':' + score);
}
}
return result;
}
var results = compareThings(array1, array2, array3, array4);
alert('Raw results:\n' + results.join('\n'));
/*
Raw results:
3-2:65
3-1:74
3-0:25
2-1:65
2-0:30
1-0:25
*/
results.sort(function(a, b) {
a = a.split(':')[1];
b = b.split(':')[1];
return b - a;
});
alert('Sorted results:\n' + results.join('\n'));
/*
Sorted results:
3-1:74
3-2:65
2-1:65
2-0:30
3-0:25
1-0:25
*/
So 3-1 (array4 and array2) has the highest score. Fortunately, the comparison need only be done one way: you don't have to compare both a to b and b to a.
Here is an attempt. The algorithm is not very smart (a difference > 20 is the same as not having the same words), but could be a useful start:
var wordArrays = [
[{"word":"hill","count":5},{"word":"head","count":5}]
, [{"word":"valley","count":7},{"word":"head","count":5}]
, [{"word":"head", "count": 6}, {"word": "valley", "count": 5}]
, [{"word": "valley", "count": 7}, {"word":"head", "count": 5}]
]
function getSimilarTo(index){
var src = wordArrays[index]
, weighted
if (!src) return null;
// compare with other arrays
weighted = wordArrays.map(function(arr, i){
var diff = 0
src.forEach(function(item){
arr.forEach(function(other){
if (other.word === item.word){
// add the absolute distance in count
diff += Math.abs(item.count - other.count)
} else {
// mismatches
diff += 20
}
})
})
return {
arr : JSON.stringify(arr)
, index : i
, diff : diff
}
})
return weighted.sort(function(a,b){
if (a.diff > b.diff) return 1
if (a.diff < b.diff) return -1
return 0
})
}
/*
getSimilarTo(3)
[ { arr: '[{"word":"valley","count":7},{"word":"head","count":5}]',
index: 1,
diff: 40 },
{ arr: '[{"word":"valley","count":7},{"word":"head","count":5}]',
index: 3,
diff: 40 },
{ arr: '[{"word":"head","count":6},{"word":"valley","count":5}]',
index: 2,
diff: 43 },
{ arr: '[{"word":"hill","count":5},{"word":"head","count":5}]',
index: 0,
diff: 60 } ]
*/
Sort the arrays by word before attempting comparison. Once this is complete, comparing two arrays will require exactly 1 pass through each array.
After sorting the arrays, here is a compare algorithm (pseudo-Java):
int compare(array1, array2)
{
returnValue = 0;
array1Index = 0;
array2Index = 0;
// walk both sorted arrays in step, like the merge phase of merge sort.
while (array1Index < array1.length && array2Index < array2.length)
{
order = array1[array1Index].word.compareTo(array2[array2Index].word);
if (order == 0) // words match.
{
returnValue += abs(array1[array1Index].count - array2[array2Index].count);
++array1Index;
++array2Index;
}
else if (order < 0) // array1 word has no partner in array2.
{
// 100 is just a number to give extra weight to unmatched words.
returnValue += 100 + array1[array1Index].count;
++array1Index;
}
else // array2 word has no partner in array1.
{
returnValue += 100 + array2[array2Index].count;
++array2Index;
}
}
// account for any leftover unmatched words in either array.
while (array1Index < array1.length)
{
returnValue += 100 + array1[array1Index].count;
++array1Index;
}
while (array2Index < array2.length)
{
returnValue += 100 + array2[array2Index].count;
++array2Index;
}
return returnValue;
}
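For reference, a runnable JavaScript sketch of this sorted, single-pass comparison might look as follows. The function name distance and the penalty parameter are invented for this example; the default of 100 is the same arbitrary weight for unmatched words as in the pseudocode, and a lower result means more similar arrays.

```javascript
function distance(a, b, penalty = 100) {
  const byWord = function (x, y) {
    return x.word < y.word ? -1 : x.word > y.word ? 1 : 0;
  };
  // sort copies so the caller's arrays keep their weight order
  const left = a.slice().sort(byWord);
  const right = b.slice().sort(byWord);
  let i = 0, j = 0, total = 0;
  while (i < left.length && j < right.length) {
    const order = byWord(left[i], right[j]);
    if (order === 0) {
      // same word in both arrays: add the difference in counts
      total += Math.abs(left[i].count - right[j].count);
      i++;
      j++;
    } else if (order < 0) {
      total += penalty + left[i].count; // word only in a
      i++;
    } else {
      total += penalty + right[j].count; // word only in b
      j++;
    }
  }
  // leftover words in either array have no partner
  for (; i < left.length; i++) total += penalty + left[i].count;
  for (; j < right.length; j++) total += penalty + right[j].count;
  return total;
}

const hillHead = [{ word: "hill", count: 5 }, { word: "head", count: 5 }];
const valleyHead = [{ word: "valley", count: 7 }, { word: "head", count: 5 }];
const headValley = [{ word: "head", count: 6 }, { word: "valley", count: 5 }];

console.log(distance(valleyHead, valleyHead));
console.log(distance(valleyHead, headValley));
console.log(distance(hillHead, valleyHead));
```

Identical arrays give 0, valleyHead against headValley gives 3 (count differences of 1 and 2), and hillHead against valleyHead gives 212 (two unmatched words: 105 for "hill" and 107 for "valley").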
