regex validating single occurrences of characters


I want to check an input string to validate a proper text. The validation will be done with javascript and right now I'm using this code:
keychar = String.fromCharCode(keynum);
var text = txtBox.value + keychar;
textcheck = /(?!.*(.)\1{1})^[fenFN,]*$/;
return textcheck.test(text);
The strings that are allowed are for example:
f
e
f,e
n,f,e,F,N
Examples of not allowed:
ff
fe
f,f
f,ee
f,e,n,f
n,,(although this could be ok)
Is this possible to solve with regex in Javascript?

Although it is possible with a regex, the result is a rather large pattern that can be hard to understand (and therefore to maintain). I'd go for a "manual" option as Benjam suggested.
Using regex however, you could do it like this:
var tests = [
'f',
'e',
'f,e',
'n,f,e,F,N',
'ff',
'fe',
'f,f',
'f,ee',
'f,e,n,f',
'n,,',
'f,e,e'
];
for(var i = 0; i < tests.length; i++) {
var t = tests[i];
print(t + ' -> ' + (t.match(/^([a-zA-Z])(?!.*\1)(,([a-zA-Z])(?!.*\3))*$/) ? 'pass' : 'fail'));
}
which will print:
f -> pass
e -> pass
f,e -> pass
n,f,e,F,N -> pass
ff -> fail
fe -> fail
f,f -> fail
f,ee -> fail
f,e,n,f -> fail
n,, -> fail
f,e,e -> fail
as you can see on Ideone.
A small explanation:
^ # match the start of the input
([a-zA-Z]) # match a single ascii letter and store it in group 1
(?!.*\1) # make sure there's no character ahead of it that matches what is inside group 1
( # open group 2
,([a-zA-Z])(?!.*\3) # match a comma followed by a single ascii letter (in group 3) that is not repeated
)* # close group 2 and repeat it zero or more times
$ # match the end of the input
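The pattern above accepts any ASCII letter, while the question only allows f, e, n, F and N. A minimal sketch of the same technique restricted to that alphabet could look like this (the name isValidList is made up for illustration, and the inner group is made non-capturing so the backreference becomes \2):

```javascript
// Assumption: only the letters f, e, n, F, N are allowed, as in the question.
// Same idea as above: each letter is followed by a lookahead that rejects
// any later occurrence of that same letter.
var singleUse = /^([fenFN])(?!.*\1)(?:,([fenFN])(?!.*\2))*$/;

// Hypothetical helper name, not part of the original answer.
function isValidList(text) {
  return singleUse.test(text);
}

var samples = ['f', 'n,f,e,F,N', 'ff', 'f,f', 'f,e,n,f', 'n,,', 'a'];
for (var i = 0; i < samples.length; i++) {
  console.log(samples[i] + ' -> ' + (isValidList(samples[i]) ? 'pass' : 'fail'));
}
```

Note that this rejects a trailing comma such as "f,", so if it runs on every keypress it would block the user from typing the next item; in that case the trailing comma would need to be allowed explicitly.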

I don't think you can do it with regexps alone, as they are not very good at looking around in the text for duplicates. I'm sure it can be done, but it won't be pretty at all.
What you might want to do is parse the string character by character and store the current character in an array, and while you're parsing the string, check to see if that character has already been used, as follows:
function test_text(string) {
    // split the string into individual pieces
    var arr = string.split(',');
    var used = [];
    // look through the pieces for duplicates
    for (var idx = 0; idx < arr.length; idx++) {
        // check for duplicate letters (indexOf returns -1 when the item is absent)
        if (used.indexOf(arr[idx]) !== -1) {
            return false;
        }
        // check for letters that did not have a comma between
        if (1 < arr[idx].length) {
            return false;
        }
        used.push(arr[idx]);
    }
    return true;
}
You might also want to make sure that the browser you are running this on supports Array.indexOf by including this script somewhere: Mozilla indexOf

Related

regex to extract numbers starting from second symbol

Sorry for one more to the tons of regexp questions but I can't find anything similar to my needs. I want to output the string which can contain number or letter 'A' as the first symbol and numbers only on other positions. Input is any string, for example:
---INPUT--- -OUTPUT-
A123asdf456 -> A123456
0qw#$56-398 -> 056398
B12376B6f90 -> 12376690
12A12345BCt -> 1212345
What I tried is replace(/[^A\d]/g, '') (I use JS), which almost does the job except the case when there's A in the middle of the string. I tried to use ^ anchor but then the pattern doesn't match other numbers in the string. Not sure what is easier - extract matching characters or remove unmatching.

I think you can do it like this using a negative lookahead and then replace with an empty string.
In a non-capturing group (?:, use a negative lookahead (?! to assert that what follows is neither an A at the beginning of the string (^A) nor a digit (\d). If the assertion holds, match any character with .
(?:(?!^A|\d).)+
var pattern = /(?:(?!^A|\d).)+/g;
var strings = [
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
for (var i = 0; i < strings.length; i++) {
console.log(strings[i] + " ==> " + strings[i].replace(pattern, ""));
}

You can match and capture desired and undesired characters within two different sides of an alternation, then replace those undesired with nothing:
^(A)|\D
JS code:
var inputStrings = [
"A-123asdf456",
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
console.log(
inputStrings.map(v => v.replace(/^(A)|\D/g, "$1"))
);
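To make the alternation easier to follow, here is a sketch of the same replacement written with a callback instead of "$1". The function name keepLeadingAAndDigits is only an illustration; the regex is the one from the answer above.

```javascript
// Hypothetical wrapper around the /^(A)|\D/g replacement.
function keepLeadingAAndDigits(input) {
  return input.replace(/^(A)|\D/g, function (match, leadingA) {
    // Left side of the alternation: a leading "A" was captured, so keep it.
    // Right side: a non-digit matched and the group is undefined, so drop it.
    return leadingA || '';
  });
}

console.log(keepLeadingAAndDigits('A123asdf456')); // A123456
console.log(keepLeadingAAndDigits('12A12345BCt')); // 1212345
```

Because the left side is tried first, an A at position 0 is always consumed by ^(A) and never reaches the \D branch.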

You can match the characters you want to keep with the regex /(^A|\d)/g and join them:
var arr = ['A123asdf456','0qw#$56-398','B12376B6f90','12A12345BCt', 'A-123asdf456'],
result = arr.map(s => s.match(/(^A|\d)/g).join(''));
console.log(result);

Regex match cookie value and remove hyphens

I'm trying to extract out a group of words from a larger string/cookie that are separated by hyphens. I would like to replace the hyphens with a space and set to a variable. Javascript or jQuery.
As an example, the larger string has a name and value like this within it:
facility=34222%7CConner-Department-Store;
(notice the leading "C")
So first, I need to match()/find facility=34222%7CConner-Department-Store; with regex. Then break it down to "Conner Department Store"
var cookie = document.cookie;
var facilityValue = cookie.match( REGEX ); ??

var test = "store=874635%7Csomethingelse;facility=34222%7CConner-Department-Store;store=874635%7Csomethingelse;";
var test2 = test.replace(/^(.*)facility=([^;]+)(.*)$/, function(matchedString, match1, match2, match3){
return decodeURIComponent(match2);
});
console.log( test2 );
console.log( test2.split('|')[1].replace(/[-]/g, ' ') );
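If you would rather not rely on one large regex, the same steps (find the facility pair, decode it, take the part after the |, replace the hyphens) can be sketched as a small helper. The function name readFacilityName is made up, and the cookie string is passed in as an argument so the sketch can be tried outside a browser; in a page you would pass document.cookie.

```javascript
// Hypothetical helper: returns the facility name with spaces, or null.
function readFacilityName(cookieString) {
  var pairs = cookieString.split(';');
  for (var i = 0; i < pairs.length; i++) {
    var pair = pairs[i].trim();
    if (pair.indexOf('facility=') === 0) {
      // "%7C" decodes to "|", e.g. "34222|Conner-Department-Store"
      var value = decodeURIComponent(pair.slice('facility='.length));
      var parts = value.split('|');
      // The /g flag replaces every hyphen, not just the first one
      return parts.length > 1 ? parts[1].replace(/-/g, ' ') : null;
    }
  }
  return null;
}

console.log(readFacilityName(
  'store=874635%7Csomethingelse; facility=34222%7CConner-Department-Store; other=1'
));
```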

If I understood it correctly, you want to make a phrase by getting all the words between hyphens and disallowing two successive Uppercase letters in a word, so I'd prefer using Regex in that case.
This is a Regex solution, that works dynamically with any cookies in the same format and extract the wanted sentence from it:
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Demo:
var str = "facility=34222%7CConner-Department-Store;";
var matches = str.match(/([A-Z][a-z]+)-?/g);
console.log(matches.map(function(m) {
return m.replace('-', '');
}).join(" "));
Explanation:
Use the regex /([A-Z][a-z]+)-?/g to match the words between the - characters.
Remove the trailing - from each matched word.
Then join the array of matches with a space.

Ok,
first, you should decode this string as follows:
var str = "facility=34222%7CConner-Department-Store;"
var decoded = decodeURIComponent(str);
// decoded = "facility=34222|Conner-Department-Store;"
Then you have multiple possibilities to split up this string.
The easiest way is to use substring()
var solution1 = decoded.substring(decoded.indexOf('|') + 1, decoded.length)
// solution1 = "Conner-Department-Store;"
solution1 = solution1.replace(/-/g, ' ');
// solution1 = "Conner Department Store;"
As you can see, substring(arg1, arg2) returns the part of the string that starts at index arg1 and ends just before index arg2.
If you want to cut the last ; just set decoded.length - 1 as arg2 in the snippet above.
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1)
//returns "Conner-Department-Store"
or all above in just one line:
decoded.substring(decoded.indexOf('|') + 1, decoded.length - 1).replace(/-/g, ' ')
If you want still to use a regular Expression to retrieve (perhaps more) data out of the string, you could use something similar to this snippet:
var solution2 = "";
var regEx= /([A-Za-z]*)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/;
if (regEx.test(decoded)) {
solution2 = decoded.match(regEx);
/* returns
[0:"facility=34222|Conner-Department-Store",
1:"facility",
2:"34222",
3:"Conner-Department-Store",
index:0,
input:"facility=34222|Conner-Department-Store;"
length:4] */
solution2 = solution2[3].replace(/-/g, ' ');
// "Conner Department Store"
}
I have applied some rules for the regex to work; feel free to modify them according to your needs.
facility can be any Word built with alphabetical characters lower and uppercase (no other chars) at any length
= needs to be the char =
34222 can be any number but no other characters
| needs to be the char |
Conner-Department-Store can be any characters except one of the following (reserved delimiters): :/?#[]#;,'
Hope this helps :)
edit: to find only the part
facility=34222%7CConner-Department-Store; just modify the regex to
match facility= instead of ([A-Za-z]*)=:
/(facility)=([0-9]*)\|(\S[^:\/?#\[\]\#\;\,']*)/

You can use cookies.js, a mini framework from MDN (Mozilla Developer Network).
Simply include the cookies.js file in your application, and write:
docCookies.getItem("facility");
// getItem takes the cookie name and returns its decoded value, here "34222|Conner-Department-Store"

What's the JS RegExp for this specific string?

I have a rather isolated situation in an inventory management program where our shelf locations have a specific format, which is always Letter: Number-Letter-Number, such as Y: 1-E-4. Most of us coworkers just type in "y1e4" and are done with it, but that obviously creates issues with inconsistent formats in a database. Are JS RegExp's the ideal way to automatically detect and format these alphanumeric strings? I'm slowly wrapping my head around JavaScript's Perl syntax, but what's a simple example of formatting one of these strings?

spec: detect string format of either "W: D-W-D" or "WDWD" and return "W: D-W-D"
This function accepts either format. It returns the formatted string when the input matches and undefined when it doesn't.
function validateInventoryCode(input) {
var regexp = /^([a-zA-Z]+)(?:\:\s*)?(\d+)-?(\w+)-?(\d+)$/
var r = regexp.exec(input);
if(r != null) {
return `${r[1]}: ${r[2]}-${r[3]}-${r[4]}`;
}
}
var possibles = ["y1e1", "y:1e1", "Y: 1r3", "y: 32e4", "1:e3e"];
possibles.forEach(function(possibility) {
console.log(`input(${possibility}), result(${validateInventoryCode(possibility)})`);
})

I understand the question as "convert LetterNumberLetterNumber to Letter: Number-Letter-Number".
You may use
/^([a-z])(\d+)([a-z])(\d+)$/i
and replace with $1: $2-$3-$4
Details:
^ - start of string
([a-z]) - Group 1 (referenced with $1 from the replacement pattern) capturing any ASCII letter (as /i makes the pattern case-insensitive)
(\d+) - Group 2 capturing 1 or more digits
([a-z]) - Group 3, a letter
(\d+) - Group 4, a number (1 or more digits)
$ - end of string.
See the regex demo.
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re, '$1: $2-$3-$4');
console.log(result);
OR - if the letters must be turned to upper case:
var re = /^([a-z])(\d+)([a-z])(\d+)$/i;
var s = 'y1e2';
var result = s.replace(re,
(m,g1,g2,g3,g4)=>`${g1.toUpperCase()}: ${g2}-${g3.toUpperCase()}-${g4}`
);
console.log(result);
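If the input may arrive either as the short form ("y1e4") or already formatted ("Y: 1-E-4"), one possible sketch is a single pattern with the colon, hyphens and spaces made optional. The function name normalizeShelfLocation and the decision to return null for non-matching input are assumptions made for this example.

```javascript
// Hypothetical helper: accepts "y1e4", "y:1-e-4", "Y: 1-E-4", ...
// and always returns the canonical "Y: 1-E-4" form, or null if nothing matches.
function normalizeShelfLocation(input) {
  var re = /^\s*([a-z])\s*:?\s*(\d+)\s*-?\s*([a-z])\s*-?\s*(\d+)\s*$/i;
  var m = re.exec(input);
  if (!m) {
    return null;
  }
  return m[1].toUpperCase() + ': ' + m[2] + '-' + m[3].toUpperCase() + '-' + m[4];
}

console.log(normalizeShelfLocation('y1e4'));     // Y: 1-E-4
console.log(normalizeShelfLocation('Y: 1-E-4')); // Y: 1-E-4
console.log(normalizeShelfLocation('1e4'));      // null
```

Normalizing on input, before the value reaches the database, keeps the stored format consistent no matter how people type it.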

This function matches and replaces the pattern:
function findAndFormat(text){
    var splittedText=text.split(' ');
    for(var i=0, textLength=splittedText.length; i<textLength; i++){
        // a whole token of the form letter, digit, letter, digit
        var analyzed=splittedText[i].match(/^[A-Za-z]\d[A-Za-z]\d$/);
        if(analyzed){
            var code=analyzed[0];
            // rewrite the token in place as "Y: 1-E-4"
            splittedText[i]=code[0].toUpperCase()+': '+code[1]+'-'+code[2].toUpperCase()+'-'+code[3];
        }
    }
    return splittedText.join(' ');
}

I think it's just as it reads:
y1e4
Letter, number, letter, number:
/([A-Za-z][0-9][A-Za-z][0-9])/g
And yes, it's fine to use a regex in this case, for form validation and the like. There are just some cases where overusing regular expressions hurts performance, such as intensive data processing.
Example
"HelloY1E4world".replace(/([A-Za-z][0-9][A-Za-z][0-9])/g, ' ');
should return: "Hello world"
regexr.com always comes in handy

Remove Any Non-Digit And Check if Formatted as Valid Number

I'm trying to figure out a regex pattern that allows a string but removes anything that is not a digit, a ., or a leading -.
I am looking for the simplest way of removing any non "number" variables from a string. This solution doesn't have to be regex.
This means that it should turn
1.203.00 -> 1.20300
-1.203.00 -> -1.20300
-1.-1 -> -1.1
.1 -> .1
3.h3 -> 3.3
4h.34 -> 4.34
44 -> 44
4h -> 4
The rule would be that the first period is a decimal point, and every following one should be removed. There should only be one minus sign in the string and it should be at the front.
I was thinking there should be a regex for it, but I just can't wrap my head around it. Most regex solutions I have figured out allow the second decimal point to remain in place.

You can use this replace approach:
In the first replace we remove all characters that are neither digits nor dots. The only exception is a leading hyphen, which we exclude from the match using a negative lookahead.
In the second replace, with a callback, we remove every dot after the first one.
Code & Demo:
var nums = ['..1', '1..1', '1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3',
'4h.34', '4.34', '44', '4h'
]
document.writeln("<pre>")
for (var i = 0; i < nums.length; i++)
document.writeln(nums[i] + " => " + nums[i].replace(/(?!^-)[^\d.]+/g, "").
replace(/^(-?\d*\.\d*)([\d.]+)$/,
function($0, $1, $2) {
return $1 + $2.replace(/[.]+/g, '');
}))
document.writeln("</pre>")
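Since the title also asks whether the result is a valid number, here is a minimal sketch that does the same two-step cleanup and then checks the outcome. Both function names (cleanNumber, looksLikeNumber) and the exact definition of "valid" (optional minus, then digits with at most one dot and at least one digit) are assumptions for this example.

```javascript
// Hypothetical helper: strip everything except digits, dots and a leading minus,
// then keep only the first dot.
function cleanNumber(input) {
  // (?!^-) skips a hyphen at position 0; every other non-digit, non-dot is removed
  var kept = input.replace(/(?!^-)[^\d.]/g, '');
  var firstDot = kept.indexOf('.');
  if (firstDot !== -1) {
    kept = kept.slice(0, firstDot + 1) + kept.slice(firstDot + 1).replace(/\./g, '');
  }
  return kept;
}

// Hypothetical check: "-", "." and "" are rejected, "4", "4.", ".1", "-1.1" pass.
function looksLikeNumber(text) {
  return /^-?(?:\d+\.?\d*|\.\d+)$/.test(text);
}

var cleaned = cleanNumber('-1.203.00');
console.log(cleaned, looksLikeNumber(cleaned)); // -1.20300 true
```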

A non-regex solution, implementing a trivial single-pass parser.
Uses ES5 Array features because I like them, but will work just as well with a for-loop.
function generousParse(input) {
var sign = false, point = false;
return input.split('').filter(function(char) {
if (char.match(/[0-9]/)) {
return sign = true;
}
else if (!sign && char === '-') {
return sign = true;
}
else if (!point && char === '.') {
return point = sign = true;
}
else {
return false;
}
}).join('');
}
var inputs = ['1.203.00', '-1.203.00', '-1.-1', '.1', '3.h3', '4h.34', '4.34', '4h.-34', '44', '4h', '.-1', '1..1'];
console.log(inputs.map(generousParse));
Yes, it's longer than multiple regex replaces, but it's much easier to understand and see that it's correct.

I can do it with a regex search-and-replace. num is the string passed in.
num.replace(/[^\d\-\.]/g, '').replace(/(.)-/g, '$1').replace(/\.(\d*\.)*/, function(s) {
return '.' + s.replace(/\./g, '');
});

OK weak attempt but seems fine..
var r = /^-?\.?\d+\.?|(?=[a-z]).*|\d+/g,
str = "1.203.00\n-1.203.00\n-1.-1\n.1\n3.h3\n4h.34\n44\n4h",
sar = str.split("\n").map(s=> s.match(r).join("").replace(/[a-z]/,""));
console.log(sar);

Javascript regular expression is returning # character even though it's not captured

text = 'ticket number #1234 and #8976 ';
r = /#(\d+)/g;
var match = r.exec(text);
log(match); // ["#1234", "1234"]
In the above case I would like to capture both 1234 and 8976. How do I do that? Also, the sentence can have any number of '#' followed by integers, so the solution should not be hard-coded on the assumption that there will be at most two occurrences.
Update:
Just curious. Check out the following two cases.
var match = r.exec(text); // ["#1234", "1234"]
var match = text.match(r); //["#1234", "#8976"]
Why am I getting # in the second case even though I am not capturing it? It looks like string.match does not obey capturing rules.

exec it multiple times to get the rest.
while((match = r.exec(text)))
log(match);
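To end up with just the numbers, you can collect the first capture group on each iteration. A small sketch (the function name ticketNumbers is made up for this example):

```javascript
// Hypothetical helper: returns every number that follows a "#".
function ticketNumbers(text) {
  var r = /#(\d+)/g; // the g flag makes exec resume after the previous match
  var numbers = [];
  var match;
  while ((match = r.exec(text)) !== null) {
    numbers.push(match[1]); // match[0] is "#1234", match[1] is "1234"
  }
  return numbers;
}

console.log(ticketNumbers('ticket number #1234 and #8976 ')); // ["1234", "8976"]
```

The regex is created inside the function so that its lastIndex always starts at 0 for each call.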

Use String.prototype.match instead of RegExp.prototype.exec:
var match = text.match(r);
That will give you all matches at once (requires g flag) instead of one match at a time.
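Note that with the g flag, match returns only the full matches and ignores capture groups, which is why the # shows up in the result. If the runtime supports String.prototype.matchAll (ES2020, an assumption about your environment), you can get the captured digits for every match. The function name captureAll is made up for this sketch:

```javascript
// Assumes an ES2020+ runtime where String.prototype.matchAll is available.
function captureAll(text) {
  var r = /#(\d+)/g; // matchAll requires the g flag
  return Array.from(text.matchAll(r), function (m) {
    return m[1]; // m[0] is the full match, m[1] is the captured digits
  });
}

var sample = 'ticket number #1234 and #8976 ';
console.log(sample.match(/#(\d+)/g)); // ["#1234", "#8976"], full matches only
console.log(captureAll(sample));      // ["1234", "8976"], captured groups
```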

Here's another way
var text = 'ticket number #1234 and #8976 ';
var r = /#(\d+)/g;
var matches = [];
text.replace( r, function( all, first ) {
matches.push( first )
});
log(matches);
// ["1234", "8976"]
