Generate JS Regex from a set of strings [closed] - javascript

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 7 years ago.
Is there any way or any library out there that can compute a JS RegEx from a set of strings that I want to be matched?
For example, I have this set of strings:
abc123
abc212
And generate abc\d\d\d ?
Or this set:
aba111
abb111
abc
And generate ab. ?
Note that I don't need a very precise RegEx, I just want one that can do strings, . and .*

Not without enumerating every string the target grammar can produce, and for some grammars that set is infinite. So in the general case you cannot recover one specific grammar from a finite set of inputs. Even in your examples, you would have to supply every possible production of the grammar (regular expression) to pin down exactly which regular expression you are looking for. Take the first set: several regular expressions match it, for example:
abc[0-9][0-9][0-9]
abc[1-2][0-5][2-3]
abc[1-2][0-5][2-3]a*
abc\d*
abc\d+
abc\d+a*b*c*
...
And so on. That said, you can find a grammar that happens to match that set's conditions. One way is to brute-force the similarities and differences between the input items. Doing this with the second example:
aba111
abb111
abc
The ab part is the same for all of them so we start with ab as the regexp. Then the next character can be a, b or c so we can say (a|b|c). Then 1 or empty three times. That would result in:
ab(a|b|c)(1|)(1|)(1|)
Which is a correct regular expression, but maybe not the one you wanted.
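The brute-force idea above can be sketched roughly as follows. This is only an illustration of the position-by-position comparison, not a general regex inference tool; columnwiseRegex and escapeRegExp are hypothetical names chosen for this example and do not come from any library.

```javascript
// Hypothetical helper: escape characters that are special inside a regex.
function escapeRegExp(ch) {
  return ch.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: compare the samples column by column.
// A column where all samples agree becomes a literal character;
// otherwise it becomes an alternation of everything seen there.
function columnwiseRegex(samples) {
  const maxLen = Math.max(...samples.map(s => s.length));
  let pattern = '';
  for (let i = 0; i < maxLen; i++) {
    // '' stands for "this sample has already ended at position i".
    const seen = [...new Set(samples.map(s => (i < s.length ? s[i] : '')))];
    if (seen.length === 1) {
      pattern += escapeRegExp(seen[0]);
    } else {
      pattern += '(' + seen.map(escapeRegExp).join('|') + ')';
    }
  }
  return pattern;
}

const pattern = columnwiseRegex(['aba111', 'abb111', 'abc']);
console.log(pattern); // ab(a|b|c)(1|)(1|)(1|)
```

The result only describes the samples it was given; it makes no attempt to generalize digits to \d or to pick the "intended" expression.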

Maybe this is too simple, but you can use this:
var arr = ['abc121','abc212','cem23'];
var regex_arr = [];
arr.sort(function(a, b){return -a.length+b.length;});
for(var i in arr[0]){
for(var j in arr){
if(i>=arr[j].length){
regex_arr[i] = {value:'',reg:'*',use_self:false};
}else{
var c = arr[j][i];
var current_r = '.';
if(isNaN(c)){
if(/^[A-Za-z]$/.test(c)){
current_r = '\\w';
}else{
current_r = '\\W';
}
//... may be more control
}else{
current_r = '\\d';
}
if(!regex_arr[i]){
regex_arr[i] = {value:c,reg:current_r,use_self:true};
}else{
if(regex_arr[i].value!=c){
if(regex_arr[i].reg!=current_r){
regex_arr[i].reg = '.';
}
regex_arr[i].use_self = false;
regex_arr[i].value = c;
}
}
}
}
}
var result = '';
for(var i in regex_arr){
if(regex_arr[i].use_self){
result += regex_arr[i].value;
}else{
result += regex_arr[i].reg;
}
if(regex_arr[i].reg=='*'){
break;
}
}
console.log("regex = "+result);
for(var i in arr){
var r = new RegExp(result);
console.log(arr[i] + ' = '+r.test(arr[i]));
}
Results
regex = \w\w\w\d\d*
abc121 = true
abc212 = true
cem23 = true

Related

Need to have one string from Looped Chars JS [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 1 year ago.
I need to get the output as one string instead of one character per loop iteration.
In the output I get, each letter is on its own line.
I need the second output shown below, which is a single word.
Thanks in advance :D
let start = 0;
let swappedName = "elZerO";
for (let i = start; i<swappedName.length; i++){
if (swappedName[i] == swappedName[i].toUpperCase()) {
console.log(swappedName[i].toLowerCase());
}else {
console.log(swappedName[i].toUpperCase());
}
}
//Output
E
L
z
E
R
o
// Need to be
"ELzERo"
Use string = string0 + string1, or keep adding values to an array and then join the array with array.join().
MasteringJs has a great guide on ways to merge characters and strings.
let start = 0;
let swappedName = "elZerO";
var outputString="";
var outputStringArray=[];
var newChar="";
for (let i = start; i<swappedName.length; i++){
if (swappedName[i] == swappedName[i].toUpperCase()) {
newChar = swappedName[i].toLowerCase();
}else {
newChar=swappedName[i].toUpperCase();
}
outputStringArray.push(newChar);
outputString+=newChar;
}
console.log("[Output using string1 + string2] is "+outputString); // built with += concatenation
console.log("[Output using array.join] is "+outputStringArray.join("")); // built with push() and join("")
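The same idea can be written more compactly by mapping over the characters and joining the result once. This is a minimal sketch; swapCase is a hypothetical name used only for this example.

```javascript
// Hypothetical helper: swap the case of every character and return one string.
function swapCase(text) {
  return Array.from(text, ch =>
    // A character equal to its own upper-case form is treated as upper case.
    ch === ch.toUpperCase() ? ch.toLowerCase() : ch.toUpperCase()
  ).join('');
}

console.log(swapCase("elZerO")); // ELzERo
```

Characters without case, such as digits, pass through unchanged because both conversions return the same character.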

Best way to compare 2 similar strings? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Original Question:
I have a lot of products with various names, I have two variations of
the names I need compared (Basically finding out if these two strings
are the same products). I don't want any false flags, does anyone have
recommendations on how I can achieve this?
Here is a product example:
Canon 50mm f/1.2L vs Canon EF 50mm f/1.2L USM Lens
There are other variations, but this will be the typical difference.
Is there any easy functionality I could implement to get a certain
answer? Only thing I can think of is maybe splitting the strings and
comparing and say if x matches a, b, or c.
My original question was a bit vague. The end goal is to be able to compare two strings and see how similar they are - e.g. 0%, 50%, or 100% similar. In this scenario I am using lens products from different sources, they use similar names - however I have no product sku/id for proper comparison.
The string score plugin has solved my issue, providing a value of how similar these products are.
In the bioinformatics world, and I believe in other domains, this kind of pattern matching/searching algorithm is called fuzzy search.
There is a Node.js module called string_score for it. Essentially you feed the API two strings and it returns a score of how similar they are.
Example:
var test = require('string_score');
var match_percent = "Canon EF 50mm f/1.2L USM Lens".score("Canon 50mm f/1.2L");
console.log("Match score= " + match_percent);
Output:
Match score= 0.7938133874239354
You can use the score as a baseline for comparison: for example, treat a score equal to or above 0.8 (80%) as a match.
More examples:
var score = 0;
score = "hello world".score("he");
console.log("Match score => " + score);
score = "hello world".score("hel");
console.log("Match score => " + score);
score = "hello world".score("hell");
console.log("Match score => " + score);
score = "hello world".score("hello");
console.log("Match score => " + score);
<script type="text/javascript" src="//cdnjs.cloudflare.com/ajax/libs/string_score/0.1.10/string_score.min.js"></script>
References:
String_score: https://github.com/joshaven/string_score
You have to think about how would you recognize if two strings are the same product yourself, just by reading them.
Based solely on the examples you provided, it seems that the way to tell two strings representing a product are the same is if every word (a token separated by spaces) from the shorter string is contained in the longer string.
You might also want to ignore capitalization.
Something like this should work for the basic usage:
const tokens = s => s.toLowerCase().split(/\s+/g);
const sameProducts = (s1, s2) => {
const s1Tokens = tokens(s1);
const s2Tokens = tokens(s2);
const [shorterTokens, longerTokens] = s1Tokens.length > s2Tokens.length
? [s2Tokens, s1Tokens]
: [s1Tokens, s2Tokens];
return shorterTokens.every(st => longerTokens.includes(st));
}
console.log(
sameProducts(
'Canon 50mm f/1.2L',
'Canon EF 50mm f/1.2L USM Lens'
)
)
This code has quadratic time complexity: for every token in the shorter string, includes() iterates through every token in the longer string.
A simple optimization is to build a Set<token> from the longer string. This makes the operation linear because a Set lookup is O(1) on average.
const tokens = s => s.toLowerCase().split(/\s+/g);
const sameProducts = (s1, s2) => {
const s1Tokens = tokens(s1);
const s2Tokens = tokens(s2);
const [shorterTokens, longerTokens] = s1Tokens.length > s2Tokens.length
? [s2Tokens, s1Tokens]
: [s1Tokens, s2Tokens];
const longerTokensSet = longerTokens.reduce((s, t) => {
s.add(t);
return s;
}, new Set());
return shorterTokens.every(st => longerTokensSet.has(st));
}
console.log(
sameProducts(
'Canon 50mm f/1.2L',
'Canon EF 50mm f/1.2L USM Lens'
)
)
Now you have to consider, do all tokens have to match? Maybe only tokens corresponding to the brand and focal-length have to match.
If this is the case, you might also want to validate both strings while parsing them and return false immediately if the product strings are invalid.
Here's a rough idea:
const productSet = new Set(['canon'])
const focalLengthsSet = new Set(['50mm']);
const isMeaningful = t => productSet.has(t) || focalLengthsSet.has(t);
const meaningfulTokens = s => s.toLowerCase().split(/\s+/g).filter(isMeaningful);
const validTokens = (tokens, s) => {
const valid = tokens.length === 2; // <-- could do better validation here
console.assert(valid, `Missing token(s) in ${s}`);
return valid;
}
const sameProducts = (s1, s2) => {
const s1Tokens = meaningfulTokens(s1);
if (!validTokens(s1Tokens, s1)) { return false; }
const s2Tokens = meaningfulTokens(s2);
if (!validTokens(s2Tokens, s2)) { return false; }
const [shorterTokens, longerTokens] = s1Tokens.length > s2Tokens.length
? [s2Tokens, s1Tokens]
: [s1Tokens, s2Tokens];
const longerTokensSet = longerTokens.reduce((s, t) => {
s.add(t);
return s;
}, new Set());
return shorterTokens.every(st => longerTokensSet.has(st));
}
console.log(
sameProducts(
'Canon 50mm f/1.3',
'Canon EF 50mm f/1.2'
)
)
console.log(
sameProducts(
'Canon 50mm f/1.3',
'Canon EF f/1.2' // <-- missing focal length
)
)
Now you could consider: does every focal length correspond to every product, or is it more product-specific?
Do tokens contain logic that explicitly depends on previously matched tokens?
All of the above are just basic approaches and techniques you could use but the actual solution would heavily depend on your exact circumstances.
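Since the question ultimately asks for a degree of similarity rather than a yes/no answer, the token approach can also be turned into a ratio. The sketch below is an assumption layered on top of the code above, not part of the original answer: tokenOverlap is a hypothetical name, and it simply reports which fraction of the shorter string's tokens appear in the longer one.

```javascript
// Lower-case and split on whitespace, dropping empty tokens.
const tokens = s => s.toLowerCase().split(/\s+/).filter(Boolean);

// Hypothetical helper: fraction (0..1) of the shorter string's tokens
// that also occur in the longer string.
function tokenOverlap(s1, s2) {
  const [shorter, longer] = [tokens(s1), tokens(s2)]
    .sort((x, y) => x.length - y.length);
  if (shorter.length === 0) return 0;
  const longerSet = new Set(longer);
  const hits = shorter.filter(t => longerSet.has(t)).length;
  return hits / shorter.length;
}

console.log(tokenOverlap('Canon 50mm f/1.2L', 'Canon EF 50mm f/1.2L USM Lens')); // 1
console.log(tokenOverlap('Canon 50mm f/1.3', 'Canon EF 50mm f/1.2'));
```

The second call reports two matching tokens out of three; where to put the cut-off for "same product" remains a judgment call.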
A common algorithm for measuring string similarity is the Levenshtein distance.
The Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.
This algorithm would let you match the strings directly if your edit-distance threshold is strict enough (although this could produce false positives). You could also account for misspelled products when comparing individual tokens by checking that they are within a specific edit distance of one another.
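For reference, here is a minimal sketch of the Levenshtein distance using the usual dynamic-programming table, kept to two rows. The similarity helper, which scales the distance into a 0..1 value by dividing by the longer length, is an assumption added for illustration and is only one of several possible normalizations.

```javascript
// Minimum number of single-character insertions, deletions or
// substitutions needed to turn string a into string b.
function levenshtein(a, b) {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i]; // turning i chars of a into an empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // deletion
        curr[j - 1] + 1,    // insertion
        prev[j - 1] + cost  // substitution (or match)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical normalization: 1 means identical, 0 means nothing in common.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a, b) / longest;
}

console.log(levenshtein('kitten', 'sitting')); // 3
```

For product names it is usually worth lower-casing both strings first, and comparing token by token rather than whole strings, as the answer suggests.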

JavaScript prototypes - technical interview [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
I had a JavaScript interview last Wednesday, and I had trouble with one of the questions. Maybe you guys can give me a hand with it?
The question was: how would you go about printing var a and s to the console, in camel case, with the help of a prototype function...
var s = "hello javier";
var a = "something else";
String.prototype.toCamelCase = function() {
/* code */
return capitalize(this);
};
...so the result is the same as doing this?
console.log(s.toCamelCase());
console.log(a.toCamelCase());
>HelloJavier
>SomethingElse
Thanks!
var s = 'hello javier';
var a = 'something else';
String.prototype.toCamelCase = function() {
return capitalize(this);
};
function capitalize(string) {
return string.split(' ').map(function(string) {
return string.charAt(0).toUpperCase() + string.slice(1);
}).join('');
}
console.log(a.toCamelCase());
console.log(s.toCamelCase());
Reference
How do I make the first letter of a string uppercase in JavaScript?
I would go with something like this:
var s = "hello javier";
var a = "something else";
String.prototype.toCamelCase = function() {
function capitalize(str){
var strSplit = str.split(' ');
// starting the loop at 1 because we don't want
// to capitalize the first word
for (var i = 1; i < strSplit.length; i+=1){
var item = strSplit[i];
// we take the substring beginning at character 0 (the first one)
// and having a length of one (so JUST the first one)
// and we set that to uppercase.
// Then we concatenate (add on) the substring beginning at
// character 1 (the second character). We don't give it a length
// so we get the rest.
var capitalized = item.substr(0,1).toUpperCase() + item.substr(1);
// then we set the value back into the array.
strSplit[i] = capitalized;
}
return strSplit.join('');
}
return capitalize(this);
};
// added for testing output
console.log(s.toCamelCase());
console.log(a.toCamelCase());
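Note that the two answers above do not produce the same result: the first capitalizes every word (HelloJavier, as the question asks), while the second leaves the first word alone (helloJavier). A minimal sketch that makes the choice explicit, without touching String.prototype, might look like this; toCamelCase as a standalone function and its upperFirst parameter are hypothetical names for this example.

```javascript
// Hypothetical helper: join words into camel case.
// upperFirst = true  -> "HelloJavier" (often called PascalCase)
// upperFirst = false -> "helloJavier"
function toCamelCase(text, upperFirst = false) {
  return text
    .trim()
    .split(/\s+/)
    .map((word, index) =>
      index === 0 && !upperFirst
        ? word.toLowerCase()
        : word.charAt(0).toUpperCase() + word.slice(1)
    )
    .join('');
}

console.log(toCamelCase('hello javier'));        // helloJavier
console.log(toCamelCase('hello javier', true));  // HelloJavier
console.log(toCamelCase('something else', true)); // SomethingElse
```

Keeping it a plain function avoids extending a built-in prototype, although the interview question explicitly asked for the prototype version.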

Javascript Numbers and Comma with input pattern [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Want to improve this question? Add details and clarify the problem by editing this post.
Closed 8 years ago.
Hi, I need to combine two error-checking procedures. I don't use jQuery.
I only want these characters to appear: 0123456789,
My HTML (I need to know the pattern for other instances on my website):
<input type="text" pattern="?" maxlength="10" id="f2f11c3" value="0"></input>
My JS
document.getElementById("f2f11c3").addEventListener("keyup", function(){addcommas("f2f11c3")}, false);
function addcommas(id)
{
//i dont know what to place here
//every 3 numbers must have a comma
//ie. input is 123.c39,1mc
//it must also remove if a comma is placed manually
//result must be 123,391
}
Hope someone could help. Thanks!
document.getElementById('f2f11c3').
addEventListener("input", function(){addcommas();}, false);
function addcommas()
{
var v = document.getElementById('f2f11c3');
var t = v.value.replace(/\D/g, '');
var i,temp='';
for(i=t.length; i>=0;i-=3){
if(i==t.length) {
temp=t.substring(i-3,i);
}
else
{
if(t.substring(i-3,i)!="")
temp = t.substring(i-3,i)+','+temp;
}
if(i<0) {temp=t.substring(0,i+3)+','+temp; break;}
}
v.value = temp;
}
function addcommas(id) {
var arr = [];
// loop over the id pushing numbers into the array
for (var i = 0, l = id.length; i < l; i++) {
if (id[i] >= 0 && id[i] <= 9) {
arr.push(id[i]);
}
}
// loop over the array splicing in commas at every 3rd position
for (var i = 0, l = arr.length; i < l; i += 3) {
arr.splice(i, 0, ',');
i++;
l++;
}
// remove the first unnecessary comma
arr.shift()
// return the comma-separated string
return arr.join('');
}
The id is an HTML element's id, not the value
function addcommas(id)
{
    // Look up the input element once
    var x = document.getElementById(id);
    // Strip everything except digits, including any commas typed manually
    var nocomma = x.value.replace(/[^\d]/g,'');
    // Skip empty input so parseInt never returns NaN
    if (nocomma.length>0)
    {
        // Parse as an integer, which also drops leading zeros
        nocomma = parseInt(nocomma, 10);
        // Convert back to a string so the commas can be inserted
        nocomma = nocomma+"";
        // Insert a comma after every digit that is followed by a multiple of three digits up to the end
        x.value = nocomma.replace(/(\d)(?=(\d{3})+$)/g, '$1,');
    }
}
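To see what the grouping regex does in isolation, here is a small DOM-free sketch. groupThousands is a hypothetical name for this example. The lookahead (?=(\d{3})+$) only succeeds when the digits remaining after the current one can be split into whole groups of three, so a comma lands exactly at the thousands boundaries.

```javascript
// Hypothetical helper: keep only digits, then group them in threes from the right.
function groupThousands(raw) {
  const digits = String(raw).replace(/\D/g, '');
  // (\d)            capture one digit ...
  // (?=(\d{3})+$)   ... that is followed by 3, 6, 9, ... digits up to the end
  return digits.replace(/(\d)(?=(\d{3})+$)/g, '$1,');
}

console.log(groupThousands('123.c39,1mc')); // 123,391
console.log(groupThousands('1234567'));     // 1,234,567
console.log(groupThousands('12'));          // 12
```

Unlike the version above, this sketch does not go through parseInt, so leading zeros are preserved.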
My input in the HTML is as follows. On iPhone, this pops up the number pad with the choices 0-9 only, which is easier to type on and better for mobile usability:
<input type="text" pattern="\d*" maxlength="12" id="f2f11c3" value="0">

How to extract a number from a two line string in javascript [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
I would like to extract the number "40" from following two lines:
Total Boys:4 (40 min)
Main Students:0 (0 min)
How can I do that using JavaScript? Thanks in advance!
Or without a regex
str.split('min').shift().split('(').pop().trim();
Simply use a regular expression:
var str = 'Total Boys:4 (40 min)\nMain Students: 0 (0 min)';
var number = str.match(/\((\d+)/)[1]; // 40
Here's a simple regex to pull that value out:
var str = 'Total Boys:4 (40 min)\nMain Students:0 (0 min)';
var regexp = /.*\((\d+) min\)\n.*/;
var matches = regexp.exec(str);
alert('match: ' + matches[1]);
Another way of doing it would be simply str.match(/\d+/g)[1] using regex.
Suppose your string is in str and you want to store the number in n.
Another approach is this:
var index1,
index2,
index3,
n;
index1 = str.indexOf('Total Boys:', 0);
index2 = str.indexOf('(', index1) + 1;
index3 = str.indexOf(' ', index2);
n = str.substring(index2, index3);
Note that this approach gets only the min value of "Total Boys", not "Main Students".
It works even if there is another similar line before "Total Boys".
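If the minutes for every line are needed rather than just the first, one option is to match line by line and key the result by label. This is a sketch under the assumption that each line has the shape "Label:count (N min)" shown in the question; minutesByLabel is a hypothetical name.

```javascript
// Hypothetical helper: map each label to its minutes value.
// Assumes lines shaped like "Total Boys:4 (40 min)".
function minutesByLabel(text) {
  const result = {};
  // g: find every match, m: let ^ and $ anchor at each line.
  const linePattern = /^(.+?):\s*\d+\s*\((\d+) min\)$/gm;
  for (const match of text.matchAll(linePattern)) {
    result[match[1]] = Number(match[2]);
  }
  return result;
}

const str = 'Total Boys:4 (40 min)\nMain Students:0 (0 min)';
const minutes = minutesByLabel(str);
console.log(minutes['Total Boys']);    // 40
console.log(minutes['Main Students']); // 0
```

Lines that do not fit the assumed shape are simply skipped, so a missing label shows up as undefined rather than as an exception.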
