Regular expression to isolate numbers from symbols and merge them as decimal numbers in JS - javascript

I am trying to isolate groups with a specific format within strings and convert them using JS or jQuery and regex, from strings like these
#aba #abc #33-25
#02-20 #abe #abf
#abg #abe #00-50 #aba
#aja #255-45
to these
33.25€
2.20€
0.50€
255.45€
1. At the regex level, my attempt so far at isolating the #xx-xx groups within strings is
s = "#aba #abc #33-25"
s.match( /(#[W\d-]+)/ ) //["#33-25", "#33-25"]
It recognizes the #33-25 substring but outputs it twice in the array, which is not what I want.
2. Also, how can I do the following in JS or jQuery? (solved by #Kosh's answer)
remove the # symbols
replace the hyphen (-) with a dot (.)
convert #01-xx or #0-xx to 1.xx or 0.xx (where xx is always exactly two decimal digits)

Use match and replace:
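// [1-9]\d* (rather than [1-9]+) lets the integer part contain zeros, e.g. #10-25 or #100-05
const convert = (s) => (s.match(/#0*(0|[1-9]\d*)-(\d\d)\b/g) || []) // [] when nothing matches
.map(m => m.replace(/#0*(0|[1-9]\d*)-(\d\d)/, '$1.$2€'));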
console.log(convert(`#aba #abc #33-25
#02-20 #abe #abf
#abg #abe #00-50 #aba
#aja #255-45`))
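On the first point in the question, the doubled entry is expected behaviour rather than a fault: without the g flag, match returns the full match followed by one entry per capture group, and in the question's pattern the single group wraps the whole expression. A minimal sketch of the difference (the variable names are illustrative, and the character class is simplified to [\d-], which is an assumption about what the question intended):

```javascript
const sampleTag = '#aba #abc #33-25';

// Without the g flag: index 0 is the full match, index 1 is capture group 1.
const single = sampleTag.match(/(#[\d-]+)/);
console.log(single[0], single[1]);

// With the g flag: capture groups are not reported, only the full matches.
const all = sampleTag.match(/#\d+-\d\d/g);
console.log(all);
```

Dropping the parentheses, or reading only index 0, avoids the duplicate.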

You can use
const texts = ['#aba #abc #33-25','#02-20 #abe #abf','#abg #abe #00-50 #aba','#aja #255-45'];
texts.forEach( x => {
const m = x.match(/#(\d+)-(\d+)/);
if (m) {
console.log(x, '=>', `${m[1]}.${m[2]}€`);
}
});
Here, x.match(/#(\d+)-(\d+)/) finds the first match of a /#(\d+)-(\d+)/ regex (# matches a # char, (\d+) captures one or more digits into Group 1, - matches a hyphen and (\d+) captures one or more digits into Group 2) in a string, and then ${m[1]}.${m[2]}€ builds the final result where m[1] is Group 1 and m[2] is Group 2 value.
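If a single string can hold several tags, the same two-group idea can be applied with matchAll. The sketch below is one possible variant, not part of the answer above: toPrices and tagLines are hypothetical names, the fractional part is assumed to be exactly two digits, and Number() is used to drop leading zeros from the integer part.

```javascript
// Hypothetical helper: returns every "#int-dd" tag in the text as a price string.
const toPrices = (text) =>
  Array.from(text.matchAll(/#(\d+)-(\d{2})\b/g), ([, whole, cents]) =>
    `${Number(whole)}.${cents}€` // Number() drops leading zeros: "02" -> 2, "00" -> 0
  );

const tagLines = `#aba #abc #33-25
#02-20 #abe #abf
#abg #abe #00-50 #aba
#aja #255-45`;

console.log(toPrices(tagLines));
// ["33.25€", "2.20€", "0.50€", "255.45€"]
```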

Related

Regex match apostrophe inside, but not around words, inside a character set

I'm counting how many times different words appear in a text using Regular Expressions in JavaScript. My problem is when I have quoted words: 'word' should be counted simply as word (without the quotes, otherwise they'll behave as two different words), while it's should be counted as a whole word.
(?<=\w)(')(?=\w)
This regex can identify apostrophes inside, but not around words. Problem is, I can't use it inside a character set such as [\w]+.
(?<=\w)(')(?=\w)|[\w]+
Will count it's a 'miracle' of nature as 7 words, instead of 5 (it, ', s becoming 3 different words). Also, the third word should be selected simply as miracle, and not as 'miracle'.
To make things even more complicated, I need to capture diacritics too, so I'm using [A-Za-zÀ-ÖØ-öø-ÿ] instead of \w.
How can I accomplish that?
1) You can simply use /[^\s]+/g regex
const str = `it's a 'miracle' of nature`;
const result = str.match(/[^\s]+/g);
console.log(result.length);
console.log(result);
2) If you are calculating total number of words in a string then you can also use split as:
const str = `it's a 'miracle' of nature`;
const result = str.split(/\s+/);
console.log(result.length);
console.log(result);
3) If you want each word without a quote at the start and at the end, then you can do:
const str = `it's a 'miracle' of nature`;
const result = str.match(/[^\s]+/g).map((s) => {
s = s[0] === "'" ? s.slice(1) : s;
s = s[s.length - 1] === "'" ? s.slice(0, -1) : s;
return s;
});
console.log(result.length);
console.log(result);
You might use an alternation with 2 capture groups, and then check for the values of those groups.
(?<!\S)'(\S+)'(?!\S)|(\S+)
(?<!\S)' Negative lookbehind, assert a whitespace boundary to the left and match '
(\S+) Capture group 1, match 1+ non whitespace chars
'(?!\S) Match ' and assert a whitespace boundary to the right
| Or
(\S+) Capture group 2, match 1+ non whitespace chars
const regex = /(?<!\S)'(\S+)'(?!\S)|(\S+)/g;
const s = "it's a 'miracle' of nature";
Array.from(s.matchAll(regex), m => {
if (m[1]) console.log(m[1])
if (m[2]) console.log(m[2])
});
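None of the snippets above covers the diacritics requirement or the actual counting. One way to sketch both, swapping in Unicode property escapes (\p{L} with the u flag) for the explicit À-ÿ ranges from the question, is shown below. The names wordPattern and countWords are hypothetical, lower-casing the keys is an assumption about how words should be grouped, and digits are deliberately not treated as word characters here.

```javascript
// An apostrophe is part of a word only when letters sit on both sides of it.
// \p{L} matches any Unicode letter, so accented characters are covered.
const wordPattern = /\p{L}+(?:'\p{L}+)*/gu;

// Hypothetical helper: returns a Map from lower-cased word to its count.
const countWords = (text) => {
  const counts = new Map();
  for (const [word] of text.matchAll(wordPattern)) {
    const key = word.toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return counts;
};

const wordCounts = countWords("it's a 'miracle' of nature, a café miracle");
console.log([...wordCounts.keys()]);
console.log(wordCounts.get('miracle'));
```

With this pattern the surrounding quotes in 'miracle' are never part of a match, so no trimming step is needed afterwards.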

How to match 2 separate numbers in Javascript

I have this regex that should match when there are two numbers in brackets
/(P|C\(\d+\,{0,1}\s*\d+\))/g
for example:
C(1, 2) or P(2 3) //expected to match
C(43) or C(43, ) // expect not to match
but it also matches the ones with only one number. How can I fix it?
You have a couple of issues. Firstly, your regex will match either P on its own or C followed by numbers in parentheses; you should replace P|C with [PC] (you could use (?:P|C) but [PC] is more performant, see this Q&A). Secondly, since your regex makes both the , and spaces optional, it can match 43 without an additional number (the 4 matches the first \d+ and the 3 the second \d+). You need to force the string to either include a , or at least one space between the numbers. You can do that with this regex:
[PC]\(\d+[ ,]\s*\d+\)
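To make the two issues concrete, here is a small comparison of the pattern from the question against the corrected one (originalPattern and fixedPattern are illustrative names only):

```javascript
const originalPattern = /(P|C\(\d+\,{0,1}\s*\d+\))/g;
const fixedPattern = /[PC]\(\d+[ ,]\s*\d+\)/g;

for (const input of ['C(1, 2) or P(2 3)', 'C(43)']) {
  // The original matches a bare "P" and also accepts "C(43)" by splitting 43 into 4 and 3.
  console.log(input, '| original:', input.match(originalPattern), '| fixed:', input.match(fixedPattern));
}
```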
Try this regex
[PC]\(\d+(?:,| +) *\d+\)
Explanation:
[PC]\( - matches either P( or C(
\d+ - matches 1+ digits
(?:,| +) - matches either a , or 1+ spaces
 *\d+ - matches 0+ spaces (this part of the pattern starts with a literal space) followed by 1+ digits
\) - matches )
You can relax the separator between the numbers by allowing any combination of commas and whitespace, using \d+[,\s]+\d+. Test case:
const regex = /[PC]\(\d+[,\s]+\d+\)/g;
[
'C(1, 2) or P(2 3)',
'C(43) or C(43, )'
].forEach(str => {
let m = str.match(regex);
console.log(str + ' ==> ' + JSON.stringify(m));
});
Output:
C(1, 2) or P(2 3) ==> ["C(1, 2)","P(2 3)"]
C(43) or C(43, ) ==> null
Your regex should require the presence of at least one delimiting character between the numbers.
I suppose you want to get the numbers out of it separately, like in an array of numbers:
let tests = [
"C(1, 2)",
"P(2 3)",
"C(43)",
"C(43, )"
];
for (let test of tests) {
console.log(
test.match(/[PC]\((\d+)[,\s]+(\d+)\)/)?.slice(1)?.map(Number)
);
}
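When several pairs can appear in one longer string, the same capturing pattern can be combined with the g flag and matchAll. This is only a sketch; pairPattern and extractPairs are hypothetical names, not something from the answers above.

```javascript
const pairPattern = /[PC]\((\d+)[,\s]+(\d+)\)/g;

// Hypothetical helper: returns each matched pair as an array of two numbers.
const extractPairs = (text) =>
  Array.from(text.matchAll(pairPattern), ([, first, second]) => [Number(first), Number(second)]);

console.log(extractPairs('C(1, 2) or P(2 3), but not C(43) or C(43, )'));
// [[1, 2], [2, 3]]
```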

Javascript replace regex to accept only numbers, including negative ones, two decimals, replace 0s in the beginning, except if number is 0

The question became a bit long, but it explains the expected behaviour.
let regex = undefined;
const format = (string) => string.replace(regex, '');
format('0')
//0
format('00')
//0
format('02')
//2
format('-03')
//-3
format('023.2323')
//23.23
format('00023.2.3.2.3')
//23.23
In the above example you can see the expected results in comments.
To summarize, I'm looking for a regex to use with replace (not with test) that formats a string so that it:
removes 0s from the beginning if it's followed by any numbers
allows decimal digits, but just 2
allows negative numbers
allows decimal points, but just one (followed by min 1, max 2 decimal digits)
The last one is a bit difficult to handle, as the user can't type the period and the digit after it at the same time. I'll have two formatter functions: one for the text shown in the input field, and one for the closest valid value at that moment (for example, '2.' is shown as '2.' in the input field, but the handler receives the value '2').
If it's not too much to ask, I'd like to see an explanation of the solution: why it works, and what the purpose of each part is.
Right now I'm having string.replace(/[^\d]+(\.\[^\d{1,2}])+|^0+(?!$)/g, ''), but it doesn't fulfill all the requirements.
You may use this code:
const arr = ['0', '00', '02', '-03', '023.2323', '00023.2.3.2.3', '-23.2.3.2.3']
var narr = []
// to remove leading zeroes
const re1 = /^([+-]?)0+(?=\d)/
// to remove multiple decimals
const re2 = /^([+-]?\d*\.\d+)\.(\d+).*/
arr.forEach( el => {
el = el.replace(re1, '$1').replace(re2, '$1$2')
if (el.indexOf('.') >= 0)
el = Number(el).toFixed(2)
narr.push(el)
})
console.log(narr)
//=> ["0", "0", "2", "-3", "23.23", "23.23", "-23.23"]
If you aren't bound to the String#replace method, you can try this regex:
/^([+-])?0*(?=\d+$|\d+\.)(\d+)(?:\.(\d{1,2}))?$/
It collects the parts of the number into capturing groups, as follows:
Sign: the sign of the number, +, - or undefined
Integer: the integer part of the number, without leading zeros
Decimal: the decimal part of the number, undefined if absent
This regex won't match if more than two decimal places are present. To strip the extra digits instead, use this:
/^([+-])?0*(?=\d+$|\d+\.)(\d+)(?:\.(\d{1,2})\d*)?$/
To format a number using one of the above, you can use something like:
let regex = /^([+-])?0*(?=\d+$|\d+\.)(\d+)(?:\.(\d{1,2}))?$/
const format = string => {
try{
const [, sign, integer, decimal = ''] = string.match(regex)
return `${(sign !== '-' ? '' : '-')}${integer}${(decimal && `.${decimal}`)}`
}catch(e){
//Invalid format, do something
return
}
}
console.log(format('0'))
//0
console.log(format('00'))
//0
console.log(format('02'))
//2
console.log(format('-03'))
//-3
console.log(format('023.23'))
//23.23
console.log(format('023.2323'))
//undefined (invalid format)
console.log(format('00023.2.3.2.3'))
//undefined (invalid format)
//Using the 2nd regex
regex = /^([+-])?0*(?=\d+$|\d+\.)(\d+)(?:\.(\d{1,2})\d*)?$/
console.log(format('0'))
//0
console.log(format('00'))
//0
console.log(format('02'))
//2
console.log(format('-03'))
//-3
console.log(format('023.23'))
//23.23
console.log(format('023.2323'))
//23.23
console.log(format('00023.2.3.2.3'))
//undefined (invalid format)
Another option is to use a pattern with three capturing groups. In the replacement, use all three groups: "$1$2$3"
If the string is empty after the replacement, return a single zero.
If the string is not empty, concatenate groups 1, 2 and 3, where for group 3 you remove every dot except the first (keeping it as the decimal point) and take the first 3 characters (the dot and 2 digits).
^([-+]?)0*([1-9]\d*)((?:\.\d+)*)|0+$
In parts
^ Start of string
( Capture group 1
[-+]? Match an optional + or -
) Close group
0* Match 0+ times a zero
( Capture group 2
[1-9]\d* Match a digit 1-9 followed by optional digits 0-9
) Close group
( Capture group 3
(?:\.\d+)* Repeat 0+ times matching a dot and 1+ digits
) Close group
| Or
0+ Match 1+ times a zero
$ End of string
const strings = ['0', '00', '02', '-03', '023.2323', '00023.2.3.2.3', '-23.2.3.2.3', '00001234', '+0000100005.0001']
let pattern = /^([-+]?)0*([1-9]\d*)((?:\.\d+)*)|0+$/;
let format = s => {
s = s.replace(pattern, "$1$2$3");
return s === "" ? '0' : s.replace(pattern, (_, g1, g2, g3) =>
g1 + g2 + g3.replace(/(?!^)\./g, '').substring(0, 3)
);
};
strings.forEach(s => console.log(format(s)));
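The question also mentions needing two formatters, one for the text shown while typing and one for the closest valid value. A rough sketch of that split is below. It is not a single replace regex as the question requested, but a few small steps, and formatTyped and toValue are hypothetical names. It assumes a minus sign counts only when it is the first character and that every character other than digits and dots can simply be dropped.

```javascript
// Hypothetical display formatter: keeps a trailing dot so the user can go on typing.
const formatTyped = (raw) => {
  const sign = raw.trim().startsWith('-') ? '-' : '';
  // Keep digits and dots only, then split: first piece is the integer part.
  const [intPart, ...rest] = raw.replace(/[^\d.]/g, '').split('.');
  // Strip leading zeros, but leave one digit; an empty integer part becomes "0".
  const integer = intPart.replace(/^0+(?=\d)/, '') || '0';
  if (rest.length === 0) return sign + integer;
  // Merge everything after the first dot and keep at most two decimals.
  return `${sign}${integer}.${rest.join('').slice(0, 2)}`;
};

// Hypothetical value formatter: same result without a dangling dot.
const toValue = (raw) => formatTyped(raw).replace(/\.$/, '');

['0', '00', '02', '-03', '023.2323', '00023.2.3.2.3', '2.'].forEach((s) =>
  console.log(JSON.stringify(s), '=>', formatTyped(s), '/', toValue(s))
);
```

Edge cases such as a lone '-' or an empty string are not handled specially in this sketch.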

Masking phone number with regex in javascript

My application has a specific phone number format which looks like 999.111.222, and I have a regex pattern to mask it on the front end:
/[0-9]{3}\.[0-9]{3}\.([0-9]{3})/
But recently, the format was changed to allow the middle three digits to have one less digit, so now both 999.11.222 and 999.111.222 match. How can I change my regex accordingly?
"999.111.222".replace(/[0-9]{3}\.[0-9]{3}\.([0-9]{3})/, '<div>xxx.xxx.$1</div>')
expected output:
"999.111.222" // xxx.xxx.222
"999.11.222" // xxx.xx.222
Replace {3} with {2,3} to match two or three digits.
/[0-9]{3}\.[0-9]{2,3}\.([0-9]{3})/
For reference see e.g. MDN
Use
console.log(
"999.11.222".replace(/[0-9]{3}\.([0-9]{2,3})\.([0-9]{3})/, function ($0, $1, $2)
{ return '<div>xxx.' + $1.replace(/\d/g, 'x') + '.' + $2 + '</div>'; })
)
The first capturing group, ([0-9]{2,3}), will match 2 or 3 digits, and in the callback function used as the replacement argument, all the digits from the first group are replaced with x.
You may further customize the pattern for the first set of digits, too.
In fact, you should change not only your regex but also your callback replace function:
const regex = /[0-9]{3}\.([0-9]{2,3})\.([0-9]{3})/;
const cbFn = (all, g1, g2) =>`<div>xxx.xx${(g1.length === 3 ? 'x' : '')}.${g2}</div>`;
const a = "999.11.222".replace(regex, cbFn);
const b = "999.111.222".replace(regex, cbFn);
console.log(a, b);
To change the regex, you can use the {2,3} quantifier, as already suggested, and wrap the middle digits in a new capture group. Then, in the replace callback, you can check the group's length to decide whether a third x is needed.
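A different way to get the same masking, sketched here as an alternative rather than as what the answers above do, is to replace each digit that still has a dot later in its own digit run, using a lookahead. The name maskPhone is hypothetical, and the sketch assumes the string contains only the phone number in the dotted format.

```javascript
// \d(?=\d*\.) matches a digit only if, after any further digits, a dot follows.
// Digits in the last group have no dot after them, so they are left alone.
const maskPhone = (phone) => phone.replace(/\d(?=\d*\.)/g, 'x');

console.log(maskPhone('999.111.222')); // xxx.xxx.222
console.log(maskPhone('999.11.222'));  // xxx.xx.222
```

Because the replacement works digit by digit, the length of the middle group never needs to be checked.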

regex to extract numbers starting from second symbol

Sorry for one more to the tons of regexp questions but I can't find anything similar to my needs. I want to output the string which can contain number or letter 'A' as the first symbol and numbers only on other positions. Input is any string, for example:
---INPUT--- -OUTPUT-
A123asdf456 -> A123456
0qw#$56-398 -> 056398
B12376B6f90 -> 12376690
12A12345BCt -> 1212345
What I tried is replace(/[^A\d]/g, '') (I use JS), which almost does the job except the case when there's A in the middle of the string. I tried to use ^ anchor but then the pattern doesn't match other numbers in the string. Not sure what is easier - extract matching characters or remove unmatching.
I think you can do it like this using a negative lookahead and then replace with an empty string.
In a non-capturing group (?:, use a negative lookahead (?! to assert that what follows is neither an A at the very beginning of the string (^A) nor a digit (\d). If the assertion holds, match any character with .
(?:(?!^A|\d).)+
var pattern = /(?:(?!^A|\d).)+/g;
var strings = [
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
for (var i = 0; i < strings.length; i++) {
console.log(strings[i] + " ==> " + strings[i].replace(pattern, ""));
}
You can match and capture desired and undesired characters within two different sides of an alternation, then replace those undesired with nothing:
^(A)|\D
JS code:
var inputStrings = [
"A-123asdf456",
"A123asdf456",
"0qw#$56-398",
"B12376B6f90",
"12A12345BCt"
];
console.log(
inputStrings.map(v => v.replace(/^(A)|\D/g, "$1"))
);
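You can use the following regex: /(^A|\d)/g. It matches either an A at the start of the string or any single digit, and the matches are then joined back together: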
var arr = ['A123asdf456','0qw#$56-398','B12376B6f90','12A12345BCt', 'A-123asdf456'],
result = arr.map(s => s.match(/(^A|\d)/g).join(''));
console.log(result);
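Since only the first character is special, the check for a leading A can also be pulled out of the regex entirely. A minimal sketch of that idea (keepLeadingA is a hypothetical name, and the comparison is assumed to be case-sensitive, as in the examples from the question):

```javascript
// Keep an "A" only if it is the very first character, then keep digits only.
const keepLeadingA = (s) => (s.startsWith('A') ? 'A' : '') + s.replace(/\D/g, '');

console.log(['A123asdf456', '0qw#$56-398', 'B12376B6f90', '12A12345BCt'].map(keepLeadingA));
// ["A123456", "056398", "12376690", "1212345"]
```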
