How to exclude redundant patterns from a large array of glob strings

I have been working on this algorithm for days and still have no idea how to arrive at the most suitable, simple and optimized solution.
I have a large array of strings like the following:
[
*.*.complete
*.*.read
*.*.update
*.order.cancel
accounting.*.delete
accounting.*.update
accounting.*.void
accounting.account.*
admin.user.read
admin.user.update
admin.format.delete
...
]
// the array may be in random order
All the values are wildcard patterns (in fact, they are the permissions for my system).
What I want to do is remove the redundant patterns. For example, admin.json_api.read is redundant because it is already covered by *.*.read.
Can someone suggest an approach?

The following approach also takes different glob segment lengths into account.
In a first step, the glob array is reduced into one or more segment-length-specific arrays of glob items that are easier to inspect.
Each such item carries, for example, a regex pattern derived from its glob value.
In a final step, each segment-length-specific array is sanitized separately into an array of non-redundant glob values.
This is achieved by first sorting each array in descending order by each item's glob value (which, combined with popping items from the end, ensures that the more generic glob values are processed before the less generic ones) and then rejecting every item whose glob value is already covered by a more generic glob value.
The detection is based on the glob-specific regex, where the asterisk wildcard translates into a regex pattern with the same meaning: any glob segment '*.' becomes the regex /[^.]+\./ and any terminating '.*' becomes /\.[^.]+/.
Since the sanitizing task is done via flatMap, the end result is a flat array again:
function createGlobInspectionItem(glob) {
  const segments = glob.split('.');
  return {
    value: glob,
    pattern: glob
      .replace((/\*\./g), '[^.]+.')
      .replace((/\.\*$/), '.[^.]+')
      .replace((/(?<!\^)\./g), '\\.'),
    segmentCount: segments.length,
  };
}
function collectGlobInspectionItems({ index, result }, glob) {
  const globItem = createGlobInspectionItem(glob);
  const groupKey = globItem.segmentCount;
  let groupList = index[groupKey];
  if (!groupList) {
    groupList = index[groupKey] = [];
    result.push(groupList);
  }
  groupList.push(globItem);
  return { index, result };
}
function createSanitizedGlobList(globItemList) {
  const result = [];
  let globItem;
  globItemList.sort(({ value: aValue }, { value: bValue }) =>
    (aValue > bValue && -1) || (aValue < bValue && 1) || 0
  );
  while (globItem = globItemList.pop()) {
    // anchor the pattern so that it has to match the entire glob value
    const regex = RegExp(`^${globItem.pattern}$`);
    globItemList = globItemList.filter(({ value }) => !regex.test(value));
    result.push(globItem);
  }
  return result.map(({ value }) => value);
}
const sampleData = [
  // 3 segments
  '*.*.complete',
  '*.*.read',
  '*.*.update',
  '*.order.cancel',
  'accounting.*.delete',
  'accounting.*.update',
  'accounting.*.void',
  'accounting.account.user',
  'accounting.account.*',
  'accounting.account.admin',
  'admin.user.read',
  'admin.user.update',
  'admin.format.delete',
  // 2 segments
  '*.read',
  '*.update',
  'user.read',
  'user.update',
  'format.delete',
  'format.account',
];
console.log(
  '... intermediate inspection result grouped by segment length ...',
  sampleData
    .reduce(collectGlobInspectionItems, { index: {}, result: [] })
    .result
);
console.log(
  '... final sanitized and flattened glob array ...',
  sampleData
    .reduce(collectGlobInspectionItems, { index: {}, result: [] })
    .result
    .flatMap(createSanitizedGlobList)
);
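For comparison, the same coverage check can also be sketched without any regex by comparing the globs segment by segment. This is only a minimal sketch, not the approach used above: the names covers and removeRedundantGlobs are made up for illustration, and it assumes that '*' always stands for exactly one dot-separated segment.

```javascript
// Hypothetical helper: true when glob `a` matches everything glob `b` can match.
// Assumption: '*' stands for exactly one dot-separated segment.
function covers(a, b) {
  const aSegments = a.split('.');
  const bSegments = b.split('.');
  return aSegments.length === bSegments.length &&
    aSegments.every((segment, i) => segment === '*' || segment === bSegments[i]);
}

// Hypothetical helper: keeps only the globs that no other glob covers.
// Exact duplicates are collapsed first so that they cannot remove each other.
function removeRedundantGlobs(globs) {
  const unique = [...new Set(globs)];
  return unique.filter(glob =>
    !unique.some(other => other !== glob && covers(other, glob))
  );
}

console.log(removeRedundantGlobs([
  '*.*.read',
  'admin.user.read',
  'accounting.*.update',
  '*.*.update',
  '*.order.cancel',
  'admin.format.delete',
]));
// [ '*.*.read', '*.*.update', '*.order.cancel', 'admin.format.delete' ]
```

The pairwise comparison is quadratic in the number of globs, which is probably fine for a permission list; the input order does not matter.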

General idea:
Each of your patterns can be transformed into a regex using:
new RegExp('^' + pattern
  .replace(/[./]/g, '\\$&') // escape special chars (this list is not complete)
  .replace(/\*/g, '(.*)') // replace each asterisk with '(.*)', i.e. any char(s)
  + '$') // match only the full pattern
If the regex of one pattern matches another pattern, you don't need both, because the pattern containing * already includes the second one:
if (pattern1.includes('*') && regex1.test(pattern2)) { // regex1 is the regex built from pattern1
  // delete pattern2
}
A simple implementation can be found below (it could still be optimized a bit).
Full code:
// Your initial array
const patterns = [
  '*.*.complete',
  '*.*.read',
  '*.*.update',
  '*.order.cancel',
  'accounting.*.delete',
  'accounting.*.update',
  'accounting.*.void',
  'accounting.account.*',
  'admin.user.read',
  'admin.user.update',
  'admin.format.delete',
]
// Build a new one with regexes
const withRegexes = patterns.map(pattern => {
  // Create a regex if the pattern contains an asterisk
  const regexp = pattern.includes('*') ? new RegExp('^' + pattern
    .replace(/[./]/g, '\\$&')
    .replace(/\*/g, '(.*)')
    + '$') : null;
  return { pattern, regexp };
});
// Indexes of elements whose pattern is already matched by another pattern
const duplicateIndexes = new Set();
// Compare every pair in both directions, since the array may be in random order
for (let i = 0; i < withRegexes.length; i++) {
  // Skip literal patterns and patterns that are already covered by another one
  if (!withRegexes[i].regexp || duplicateIndexes.has(i)) continue;
  for (let j = 0; j < withRegexes.length; j++) {
    if (i !== j && withRegexes[i].regexp.test(withRegexes[j].pattern)) {
      duplicateIndexes.add(j);
    }
  }
}
// Delete in descending index order so that earlier removals don't shift later ones
for (const index of [...duplicateIndexes].sort((a, b) => b - a)) {
  patterns.splice(index, 1);
}
// New one
console.log(patterns);
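Since the escape list in the regex conversion above is not complete, here is a hedged sketch of a conversion helper that escapes all regex metacharacters. The name globToRegExp is made up for illustration. Unlike the '(.*)' replacement, it assumes that '*' should match exactly one dot-separated segment, which may or may not be what your permission system needs.

```javascript
// Hypothetical helper: converts a dot-separated glob into an anchored RegExp.
// Assumption: '*' matches one or more characters that are not a dot.
function globToRegExp(glob) {
  const source = glob
    .split('*')
    // escape every regex metacharacter in the literal parts
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('[^.]+');
  return new RegExp(`^${source}$`);
}

console.log(globToRegExp('*.*.read').test('admin.json_api.read')); // true
console.log(globToRegExp('*.*.read').test('admin.read'));          // false
```

Because the wildcard cannot cross a dot, '*.read' does not accidentally cover 'admin.user.read'.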

Related

JavaScript: Sentence variations using the reduce function

For search purposes, given a string like BBC Sport I want to construct an array that looks like:
[ 'BBC', 'BB', 'B', 'Sport', 'Spor', 'Spo', 'Sp', 'S' ]
I've implemented it using two for loops:
const s = "BBC sport";
const tags = [];
const words = s.split(" ");
for (let word of words) {
  const wl = word.length;
  for (let i = 0; i < wl; i++) {
    tags.push(word.slice(0, wl - i));
  }
}
// tags now equals [ 'BBC', 'BB', 'B', 'Sport', 'Spor', 'Spo', 'Sp', 'S' ]
However, I'd like to implement it, if possible, with the reduce function instead of for loops.
How would you solve it?

Honestly, I'd write the code the way you did. Two loops are readable, maintainable and fast.
If you really need a one-liner (longest prefix first, as in your expected output):
s.split(" ").flatMap(word => Array.from(word, (_, i) => word.slice(0, word.length - i)))

Here is a solution relying on generator functions (which I would use) and a solution with reduce (as you asked, although I wouldn't personally use it). The generator accepts an input string and a separator.
In your case the separator is a space, of course, but it can be customized.
The code below iterates through the input string and slices off the relevant part of each word, capitalizing it (since it looks like that is what you want).
This should be flexible enough and, at the same time, easy to customize, either by adding further parameters to the toTagList function or by applying further transformations, since the result is iterable.
const s = "BBC sport";
function* toTagList(input, separator) {
  // Split by the separator.
  for (const block of input.split(separator)) {
    // For each string block, split the whole word into characters.
    const splitted = block.split('');
    // Slice from the first character to the last one, then shrink the slice to get the shorter prefixes of the word.
    for (let i = splitted.length; i > 0; i--) {
      // Finally, yield the capitalized string.
      yield capitalize(splitted.slice(0, i).join(''));
    }
  }
}
// This just capitalizes the string.
function capitalize(input) {
  return input.charAt(0).toUpperCase() + input.substring(1, input.length);
}
console.log([...toTagList(s, ' ')]);
If you really want to do that with reduce:
const s = "BBC sport";
const tags = s.split(' ').reduce((acc, word) => {
  // append every prefix of the word, longest first
  acc.push(...Array.from(word, (_, i) => word.slice(0, word.length - i)));
  return acc;
}, []);
console.log(tags);

JS - How to find the index of deepest pair of parentheses?

I have a string:
"5 * ((6 + 2) - 1)"
I need to find the deepest pair of parentheses and their contents.
I've googled a lot and I can't find anything specific to finding the index. A lot of solutions found how many levels there are, and things like that, but nothing was really helpful. I was thinking of counting the layers, and then using a loop to solve and repeat solving until done, but it seems like this would be really slow.
I have no idea where to start, so I haven't written any code yet.
I want a function that returns 5 for the string above, the index of the deepest opening parenthesis. I also need the same for the matching ")", since I need the pair. Example:
const deepestPair = (str) => {
  // Find deepest pair of parentheses
}
deepestPair("(2(5)4)(3)") // Returns [2, 4], the indexes of the deepest open/close parentheses

You could check opening and closing parentheses and use a counter for getting the most nested indices.
const deepestPair = str => {
  var indices,
    max = 0,
    count = 0,
    last;
  [...str].forEach((c, i) => {
    if (c === '(') {
      last = i;
      count++;
      return;
    }
    if (c === ')') {
      if (count > max) {
        indices = [last, i];
        max = count;
      }
      count--;
    }
  });
  return indices;
}
console.log(deepestPair("(2(5)4)(3)")); // [2, 4]

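You can use the RegExp /([(])[^()]+[)]/ to match a ( followed by one or more characters that are neither ( nor ) and a closing ), and /[)]/ to match the closing parenthesis, then return the indexes of the matches. Note that this finds the first innermost pair in the string, which is not necessarily the deepest one when the string contains several groups of different depth.
const deepestPair = (str, index = null) =>
  [index = str.match(/([(])[^()]+[)]/).index
  , str.slice(index).match(/[)]/).index + index]
console.log(deepestPair("(2(5)4)(3)"));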

Here is a simple way to get the deepest pair using two stacks. It also returns the depth of the pair in a structure, together with the open and close indices.
It uses a singles stack to hold the open parentheses found so far, and another stack (pairs) for matched parentheses.
Each time a closing parenthesis is found, the last open parenthesis is popped from the singles stack and pushed onto pairs.
Then you just have to sort the pairs stack by the depth property and take the first item.
const deepestPair = str => {
  const singles = [];
  const pairs = [];
  [...str].forEach((c, i) => {
    if (c === '(') {
      singles.push({ depth: singles.length + 1, open: i });
    } else if (c === ')' && singles.length) {
      pairs.push({ ...singles.pop(), close: i });
    }
  })
  pairs.sort((a, b) => b.depth - a.depth);
  return pairs.length ? pairs[0] : {};
};
console.log(deepestPair('(2(5)4)(3)'));
console.log(deepestPair('(2(5)(1)(11(14))4)(3)'));
If you want to get an array as the result you can replace the last line by this:
return pairs.length ? [pairs[0].open, pairs[0].close] : [];
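If sorting all the pairs feels like too much work, the deepest pair can also be tracked while scanning. The following is only a sketch of that variation: the name deepestPairSinglePass is made up, and it assumes that unmatched closing parentheses should simply be ignored.

```javascript
// Hypothetical variant: remember the deepest pair seen so far instead of sorting all pairs.
function deepestPairSinglePass(str) {
  const openStack = []; // indexes of '(' that are not closed yet
  let best = null;      // { depth, open, close } of the deepest pair so far
  for (let i = 0; i < str.length; i++) {
    if (str[i] === '(') {
      openStack.push(i);
    } else if (str[i] === ')' && openStack.length > 0) {
      const depth = openStack.length; // nesting depth of the pair being closed
      const open = openStack.pop();
      // strictly greater, so the first pair of the maximum depth wins
      if (best === null || depth > best.depth) {
        best = { depth, open, close: i };
      }
    }
  }
  return best; // null when the string contains no matched pair
}

console.log(deepestPairSinglePass('(2(5)4)(3)'));        // { depth: 2, open: 2, close: 4 }
console.log(deepestPairSinglePass('5 * ((6 + 2) - 1)')); // { depth: 2, open: 5, close: 11 }
```

This runs in a single pass over the string and needs no sort.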

JS sort array by three types of sorting

I need to sort an array by the following order based on a search term.
Exact string.
Starts with.
Contains.
Code :
var arr = ['Something here Hello', 'Hell', 'Hello'];
var term = 'Hello';
var sorted = arr.slice().sort((a, b) => {
  let value = 0;
  if (a.startsWith(term)) {
    value = -1;
  }
  if (a.indexOf(term) > -1) {
    value = -1;
  }
  if (a === term) {
    value = -1;
  }
  return value;
});
console.log(sorted);
The expected result is:
["Hello", "Hell", "Something here Hello"]
I'm not sure how to do this with the built-in sort function, because it doesn't seem to be meant for cases like this. Any advice, please?

You need a function which returns a value for the staged sorting.
Inside of the callback for sorting, you need to return the delta of the two values which reflects the relation between the two strings.
const compareWith = term => string => {
  if (string === term) return 1;
  if (term.startsWith(string)) return 2; // switch string and term
  if (string.includes(term)) return 3; // use includes
  return Infinity; // unknown strings move to the end
};
var array = ['Something here Hello', 'Hell', 'Hello'],
  term = 'Hello',
  order = compareWith(term);
array.sort((a, b) => order(a) - order(b));
console.log(array);

Finding a jumbled character sequence in minimum steps

There is a string consisting of all the letters of the alphabet (a-z) in jumbled order. I have to guess the jumbled sequence in the minimum number of steps. After each guess, I will know for each of my characters whether it is in the right position or not.
I'm using the following approach:
Maintaining a list of indices where each character can go
Generating a random sequence from the above and updating the list on each response
Here's my code:
var validIDs = {};
function initialise() {
  let indices = [0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25];
  for(let i=0;i<26;i++) {
    validIDs[String.fromCharCode(97+i)] = [...indices];
  }
}
// response is a bool array[26]
// indicating the matching positions of pattern with the secret jumbled sequence.
function updateIDs(pattern, response) {
  let index;
  for(let i=0;i<pattern.length;i++) {
    if(response[i]) {
      validIDs[pattern[i]] = [i];
    } else {
      index = validIDs[pattern[i]].indexOf(i);
      validIDs[pattern[i]].splice(index,1);
    }
  }
}
My validIDs is an object with [a-z] as keys and stores the possible positions of each character. Example: { a: [0, 1, 2], b: [3], ...and so on till 'z' }. The aim is to tighten the constraints on this object and finally arrive at the secret pattern.
I'm trying to create a valid pattern from this object without using brute force and would like to have some randomness as well. I wrote the following function to take a random index for each letter and create a sequence, but this fails if all the available indices of a letter are already taken.
function generateNewSequence() {
  let sequence = [], result = [];
  let rand, index = 0;
  for(let letter of Object.keys(validIDs)) {
    //Finding a random index for letter which is not already used
    rand = Math.floor(Math.random()*validIDs[letter].length);
    while(sequence.indexOf(validIDs[letter][rand]) !== -1) rand = Math.floor(Math.random()*validIDs[letter].length);
    index = validIDs[letter][rand];
    sequence.push(index);
    result[index] = letter;
  }
  return result.join('');
}
Note: Another constraint is that the generated sequence should not contain any duplicates.
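One possible way around the dead end described above is randomized backtracking: place the letters one at a time in shuffled order of their allowed positions, and undo a placement when a later letter has no free position left. The sketch below is only an illustration under stated assumptions. The names shuffled and generateSequence are made up, validIDs is passed in as a parameter with the shape described above, and every position is assumed to lie between 0 and the number of letters minus 1.

```javascript
// Fisher-Yates shuffle that returns a new array and leaves the input untouched.
function shuffled(items) {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Hypothetical alternative to generateNewSequence, using randomized backtracking.
// `validIDs` maps each letter to the array of positions it may still occupy.
function generateSequence(validIDs) {
  // Handle the most constrained letters first to keep the search small.
  const letters = Object.keys(validIDs)
    .sort((a, b) => validIDs[a].length - validIDs[b].length);
  const result = new Array(letters.length).fill(null);

  function place(k) {
    if (k === letters.length) return true; // every letter has a position
    const letter = letters[k];
    for (const position of shuffled(validIDs[letter])) {
      if (result[position] !== null) continue; // position already taken
      result[position] = letter;
      if (place(k + 1)) return true;
      result[position] = null; // undo and try the next candidate
    }
    return false; // no free position left for this letter
  }

  return place(0) ? result.join('') : null;
}

console.log(generateSequence({ a: [0, 1, 2], b: [0], c: [0, 1] })); // 'bca'
```

The function returns null when the constraints cannot be satisfied. Backtracking can be slow in the worst case; a bipartite matching algorithm would bound that, at the cost of more code.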

Can I use wildcards when searching an array of strings in Javascript?

Given an array of strings:
x = ["banana","apple","orange"]
is there a built in shortcut for performing wildcard searches?
ie., maybe
x.indexOf("*na*") //returns index of a string containing the substring na

Expanding on Pim's answer, the correct way to do it (without jQuery) would be this:
Array.prototype.find = function(match) {
  return this.filter(function(item){
    return typeof item == 'string' && item.indexOf(match) > -1;
  });
}
But really, unless you're using this functionality in multiple places, you can just use the existing filter method:
var result = x.filter(function(item){
  return typeof item == 'string' && item.indexOf("na") > -1;
});
The RegExp version is similar, but I think it will create a little bit more overhead:
Array.prototype.findReg = function(match) {
  var reg = new RegExp(match);
  return this.filter(function(item){
    return typeof item == 'string' && item.match(reg);
  });
}
It does provide the flexibility to allow you to specify a valid RegExp string, though.
x.findReg('a'); // returns all three
x.findReg("a$"); // returns only "banana" since it's looking for 'a' at the end of the string.

Expanding on @Shmiddty's answer, here are some useful JavaScript ideas:
Extend Array with a new method: Array.prototype.method = function(arg) { return result; }
Filter arrays using: Array.prototype.filter(function(e) { return true|false; })
Apply a formula to elements in an array: Array.prototype.map(function(e) { return formula(e); })
Use regular expressions: either /.*na.*/ or new RegExp('.*na.*')
Use regular expressions to match: let result = regex.test(input);
Use Array.prototype.reduce to aggregate a result after running a function on every element of an array
I prefer the input argument to be a regex, since it gives you both:
A short but universal pattern-matching input,
e.g. contains, starts with, ends with, as well as more sophisticated matches
The ability to specify an input pattern as a string
SOLUTION 1: filter, test, map and indexOf
Array.prototype.find = function(regex) {
  const arr = this;
  const matches = arr.filter( function(e) { return regex.test(e); } );
  return matches.map(function(e) { return arr.indexOf(e); } );
};
let x = [ "banana", "apple", "orange" ];
console.log(x.find(/na/)); // Contains 'na'? Outputs: [0]
console.log(x.find(/a/)); // Contains 'a'? Outputs: [0,1,2]
console.log(x.find(/^a/)); // Starts with 'a'? Outputs: [1]
console.log(x.find(/e$/)); // Ends with 'e'? Outputs: [1,2]
console.log(x.find(/pear/)); // Contains 'pear'? Outputs: []
SOLUTION 2: reduce, test
Array.prototype.find = function(regex) {
  return this.reduce(function (acc, curr, index, arr) {
    if (regex.test(curr)) { acc.push(index); }
    return acc;
  }, [ ]);
}
let x = [ "banana", "apple", "orange" ];
console.log(x.find(/na/)); // Contains 'na'? Outputs: [0]
console.log(x.find(/a/)); // Contains 'a'? Outputs: [0,1,2]
console.log(x.find(/^a/)); // Starts with 'a'? Outputs: [1]
console.log(x.find(/e$/)); // Ends with 'e'? Outputs: [1,2]
console.log(x.find(/pear/)); // Contains 'pear'? Outputs: []

You can extend the array prototype to find matches in an array:
Array.prototype.find = function(match) {
  var matches = [];
  $.each(this, function(index, str) {
    if(str.indexOf(match) !== -1) {
      matches.push(index);
    }
  });
  return matches;
}
You can then call find on your array like so:
// returns [0,3]
["banana","apple","orange", "testna"].find('na');

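This can be done with a regex in JavaScript:
// item is the string to search in, columnId is the wildcard pattern
function matchesWildcard(item, columnId) {
  var searchin = item.toLowerCase();
  var str = columnId;
  str = str.replace(/[*]/g, ".*").toLowerCase().trim();
  return new RegExp("^" + str + "$").test(searchin);
}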
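To apply this idea to the whole array without touching Array.prototype, a standalone function is one option. This is a minimal sketch: the name findWildcardIndexes is made up, '*' is assumed to mean "any run of characters, including none", and all other regex metacharacters in the pattern are escaped so that they match literally.

```javascript
// Hypothetical helper: indexes of all strings that match a '*' wildcard pattern.
function findWildcardIndexes(strings, pattern) {
  const source = pattern
    .split('*')
    // escape regex metacharacters in the literal parts of the pattern
    .map(part => part.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  const regex = new RegExp(`^${source}$`);
  return strings.reduce((indexes, item, index) => {
    if (typeof item === 'string' && regex.test(item)) indexes.push(index);
    return indexes;
  }, []);
}

const fruits = ['banana', 'apple', 'orange'];
console.log(findWildcardIndexes(fruits, '*na*')); // [ 0 ]
console.log(findWildcardIndexes(fruits, 'a*'));   // [ 1 ]
console.log(findWildcardIndexes(fruits, '*e'));   // [ 1, 2 ]
```

Returning an array of indexes covers multiple matches; use the first element if only one index is needed.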

In addition to everything else that has been said, you can do this:
var x = ["banana", "apple", "orange"];
var y = [];
for (var i = 0; i < x.length; i++) {
  if (x[i].indexOf('na') > -1) {
    y.push(i);
  }
}
Results: y = [0]
