Regex to match any currency in page - javascript

Would like to take a string filled with text and extract the prices from it. For example, here's what it should match:
$1,234.55
$90.99
$90
$100.30
$203
Regex help here would be amazing, thank you so much for your time! This will be used in either PHP or JavaScript.

You could use the below regex to match all the price strings,
\$\d+(?:,\d+)*(?:\.\d+)?
DEMO
Explanation:
\$ Matches the literal $ symbol.
\d+ Matches one or more digits.
(?:,\d+)* Matches a comma and the following digits zero or more times.
(?:\.\d+)? Matches a dot and the following digits. The trailing ? makes this whole group optional.
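For the JavaScript case, a minimal sketch of how this pattern could be used to pull every price out of a block of text. The helper name extractPrices and the sample sentence are made up for illustration; the g flag is added so that all matches are returned rather than only the first.

```javascript
// Pattern from the answer above, plus the global flag so every price is found.
const PRICE_RE = /\$\d+(?:,\d+)*(?:\.\d+)?/g;

// extractPrices is a hypothetical helper name used only for this sketch.
function extractPrices(text) {
  // With a /g regex, String.prototype.match returns all matches, or null if there are none.
  return text.match(PRICE_RE) || [];
}

const sample = "Lunch was $90.99, the sofa $1,234.55 and shipping $90 flat.";
console.log(extractPrices(sample)); // [ '$90.99', '$1,234.55', '$90' ]
```

Note that the pattern does not check that commas separate groups of exactly three digits, so something like $5,6 would also be returned as a single match.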

Tada:
\$[\d,]+(?:\.\d+)?
Here it is in practice

Try this one /^\$(\d{1,3}(?:,\d{3})*)(?:\.(\d{1,2}))?$/:
"$1,121,234.55".match(/^\$(\d{1,3}(?:,\d{3})*)(?:\.(\d{1,2}))?$/);
(JS code)
This should work for the following patterns:
$1
$2.21
$56,231
$12,212.12
$56,823,163.12
The first group ($1) holds the whole dollars (i.e. without cents) and the second group ($2) holds the cents.
Description in detail:
\$ is dollar sign
(\d{1,3}(?:,\d{3})*) captures whole dollar value
\d{1,3} matches the first one to three digits
(?:,\d{3})* matches every following comma-separated group of three digits
(?:\.(\d{1,2}))? captures the cents, it's optional
(?:\. matches the dot without capturing it
(\d{1,2}) and captures cents
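Here is a rough sketch of reading the two capture groups in JavaScript. The helper name parsePrice is hypothetical, and treating a single fractional digit such as .5 as 50 cents is an assumption of this sketch, not something the answer states. Because of the ^ and $ anchors, the pattern validates one whole string rather than finding prices inside a longer text.

```javascript
const STRICT_PRICE_RE = /^\$(\d{1,3}(?:,\d{3})*)(?:\.(\d{1,2}))?$/;

// parsePrice is a hypothetical helper; it returns null when the whole string is not a price.
function parsePrice(input) {
  const m = STRICT_PRICE_RE.exec(input);
  if (m === null) return null;
  return {
    // Group 1: whole dollars, with the thousands separators stripped.
    dollars: Number(m[1].replace(/,/g, "")),
    // Group 2 is undefined when there is no fractional part.
    // Assumption: one digit after the dot means tenths, so ".5" becomes 50 cents.
    cents: m[2] === undefined ? 0 : Number(m[2].padEnd(2, "0")),
  };
}

console.log(parsePrice("$1,121,234.55")); // { dollars: 1121234, cents: 55 }
console.log(parsePrice("$2.5"));          // { dollars: 2, cents: 50 }
console.log(parsePrice("$1234"));         // null (four digits without a comma)
```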
Good luck!

Related

RegEx for matching all chars except for comma separated digits

I have an input that I want to apply validation to. User can type any integer (positive or negative) numbers separated with a comma. I want to
Some examples of allowed inputs:
1,2,3
-1,2,-3
3
4
22,-33
Some examples of forbidden inputs:
1,,2
--1,2,3
-1,2,--3
asdas
[]\%$1
I know a little about regex and have tried lots of approaches, but none of them works well. See this online regex checker:
^[-|\d][\d,][\d]
You can use
^(?:-?[0-9]+(?:,(?!$)|$))+$
https://regex101.com/r/PAyar7/2
-? - Lead with optional -
[0-9]+ - Repeat digits
(?:,(?!$)|$) - After the digits, match either a comma, or the end of the string. When matching a comma, make sure you're not at the end of the string with (?!$)
As per your requirements I'd use something simple like
^-?\d+(?:,-?\d+)*$
at start ^ an optional minus -? followed by \d+ one or more digits.
followed by (?:,-?\d+)* a quantified non capturing group containing a comma, followed by an optional hyphen, followed by one or more digits until $ end.
See your updated demo at regex101
Another perhaps harder to understand one which might be a bit less efficient:
^(?:(?:\B-)?\d+,?)+\b$
The quantified non capturing group contains another optional non capturing group with a hyphen preceded by a non word boundary, followed by 1 or more digits, followed by optional comma.
\b the word boundary at the $ end ensures, that the string must end with a word character (which can only be a digit here).
You can test this one here at regex101
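To compare the three patterns suggested above, here is a small sketch that runs each of them against the allowed and forbidden inputs from the question. The object keys and the helper name agreesOn are made up for this example.

```javascript
// The three patterns from the answers above; the key names are illustrative only.
const validators = {
  lookahead: /^(?:-?[0-9]+(?:,(?!$)|$))+$/,
  simple: /^-?\d+(?:,-?\d+)*$/,
  boundary: /^(?:(?:\B-)?\d+,?)+\b$/,
};

const allowed = ["1,2,3", "-1,2,-3", "3", "4", "22,-33"];
const forbidden = ["1,,2", "--1,2,3", "-1,2,--3", "asdas", "[]\\%$1"];

// True when every pattern gives the expected verdict for this input.
function agreesOn(input, expected) {
  return Object.values(validators).every((re) => re.test(input) === expected);
}

console.log(allowed.every((s) => agreesOn(s, true)));    // true
console.log(forbidden.every((s) => agreesOn(s, false))); // true
```

None of the patterns uses the g flag, so calling test() repeatedly on the same regex object is safe here.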

Regex for a valid hashtag

I need a regular expression for validating a hashtag. Each hashtag should start with a hash sign ("#").
Valid inputs:
1. #hashtag_abc
2. #simpleHashtag
3. #hashtag123
Invalid inputs:
1. #hashtag#
2. #hashtag#hashtag
I have been trying with this regex /#[a-zA-z0-9]/ but it is accepting invalid inputs also.
Any suggestions for how to do it?
The current accepted answer fails in a few places:
It accepts hashtags that have no letters in them (i.e. "#11111", "#___" both pass).
It will exclude hashtags that are separated by spaces ("hey there #friend" fails to match "#friend").
It doesn't allow you to place a min/max length on the hashtag.
It doesn't offer a lot of flexibility if you decide to add other symbols/characters to your valid input list.
Try the following regex:
/(^|\B)#(?![0-9_]+\b)([a-zA-Z0-9_]{1,30})(\b|\r)/g
It'll close up the above edge cases, and furthermore:
You can change {1,30} to your desired min/max
You can add other symbols to the [0-9_] and [a-zA-Z0-9_] blocks if you wish to later
Here's a link to the demo.
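A short sketch of how that regex behaves on the edge cases listed above. The helper name findHashtags and the sample strings are assumptions for illustration; the tag text without the # sits in the second capture group.

```javascript
const HASHTAG_RE = /(^|\B)#(?![0-9_]+\b)([a-zA-Z0-9_]{1,30})(\b|\r)/g;

// findHashtags is a hypothetical helper: it returns the tag names without the leading #.
function findHashtags(text) {
  // matchAll needs the g flag; group 2 holds the tag body.
  return Array.from(text.matchAll(HASHTAG_RE), (m) => m[2]);
}

console.log(findHashtags("hey there #friend, see #11111 and #___ plus #tag_1"));
// [ 'friend', 'tag_1' ]
console.log(findHashtags("abc#def")); // [] because # directly follows a word character
```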
To answer the current question...
There are 2 issues:
[A-z] allows more than just letter chars (it also matches [, \, ], ^, _ and `)
There is no quantifier after the character class, so it only matches 1 char
Since you are validating the whole string, you also need anchors (^ and $) to ensure a full string match:
/^#\w+$/
See the regex demo.
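As a quick check, here is a sketch that runs the anchored pattern against the inputs from the question and also shows the A-z range problem. The function name isValidHashtag is made up for this example.

```javascript
// isValidHashtag is a hypothetical name; the pattern is the one given above.
const isValidHashtag = (s) => /^#\w+$/.test(s);

console.log(isValidHashtag("#hashtag_abc"));    // true
console.log(isValidHashtag("#hashtag123"));     // true
console.log(isValidHashtag("#hashtag#"));       // false
console.log(isValidHashtag("#hashtag#hashtag")); // false

// The A-z range covers the ASCII characters that sit between Z and a.
console.log(/[A-z]/.test("^"));     // true
console.log(/[A-Za-z]/.test("^"));  // false
```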
If you want to extract specific valid hashtags from longer texts...
This is a bonus section as a lot of people seek to extract (not validate) hashtags, so here are a couple of solutions for you. Just mind that \w in JavaScript (and a lot of other regex libraries) is equal to [a-zA-Z0-9_]:
#\w{1,30}\b - a # char followed with one to thirty word chars followed with a word boundary
\B#\w{1,30}\b - a # char that is either at the start of string or right after a non-word char, then one to thirty word (i.e. letter, digit, or underscore) chars followed with a word boundary
\B#(?![\d_]+\b)(\w{1,30})\b - # that is either at the start of string or right after a non-word char, then one to thirty word (i.e. letter, digit, or underscore) chars (that cannot be just digits/underscores) followed with a word boundary
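To see how the three extraction patterns above differ, here is a sketch that runs them on one made-up sample string (the variable names and the sample are assumptions, and the g flag is added to collect all matches).

```javascript
const text = "mail a#b, #ok, #123 and #a1";

const anyHash = /#\w{1,30}\b/g;                       // no check on what precedes #
const afterNonWord = /\B#\w{1,30}\b/g;                // # must not follow a word char
const notOnlyDigits = /\B#(?![\d_]+\b)(\w{1,30})\b/g; // also rejects digit/underscore-only tags

console.log(text.match(anyHash));       // [ '#b', '#ok', '#123', '#a1' ]
console.log(text.match(afterNonWord));  // [ '#ok', '#123', '#a1' ]
console.log(text.match(notOnlyDigits)); // [ '#ok', '#a1' ]
```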
And last but not least, here is a Twitter hashtag regex from https://github.com/twitter/twitter-text/tree/master/js... Sorry, too long to paste in the SO post, here it is: https://gist.github.com/stribizhev/715ee1ee2dc1439ffd464d81d22f80d1.
You could try this: /#[a-zA-Z0-9_]+/
This will only include letters, numbers & underscores.
A regex code that matches any hashtag.
In this approach any character is accepted in hashtags except main signs !##$%^&*()
(?<=(\s|^))#[^\s\!\#\#\$\%\^\&\*\(\)]+(?=(\s|$))
Usage Notes
Turn on "g" and "m" flags when using!
It is tested for Java and JavaScript languages via https://regex101.com and VSCode tools.
It is available on this repo.
Unicode general categories can help with that task:
/^#[\p{L}\p{Nd}_]+$/gu
I use \p{L} and \p{Nd} unicode categories to match any letter or decimal digit number. You can add any necessary category for your regex. The complete list of categories can be found here: https://unicode.org/reports/tr18/#General_Category_Property
Regex live demo:
https://regexr.com/5tvmo
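A small sketch of the Unicode version in JavaScript. One thing to be aware of when validating with test(): a regex object carrying the g flag remembers lastIndex between calls, so this sketch drops g for validation. The variable names and sample tags are made up for illustration.

```javascript
// Same pattern as above, but without g, which is not needed to validate a single string.
const UNICODE_HASHTAG_RE = /^#[\p{L}\p{Nd}_]+$/u;

console.log(UNICODE_HASHTAG_RE.test("#привет"));    // true
console.log(UNICODE_HASHTAG_RE.test("#日本語123")); // true
console.log(UNICODE_HASHTAG_RE.test("#hash-tag"));  // false

// With the g flag, test() is stateful on the same regex object.
const withG = /^#[\p{L}\p{Nd}_]+$/gu;
console.log(withG.test("#abc")); // true, lastIndex is now 4
console.log(withG.test("#abc")); // false, the search resumed at index 4
```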
A useful and tested regex for detecting hashtags in text:
/(^|\s)(#[a-zA-Z\d_]+)/ig
Examples of valid matching hashtags:
#abc
#ab_c
#ABC
#aBC
/\B(?:#|#)((?![\p{N}_]+(?:$|\b|\s))(?:[\p{L}\p{M}\p{N}_]{1,60}))/ug
It allows characters from any language, optionally mixed with numbers or _.
Numbers alone, or numbers with only _, are not allowed.
It's a Unicode regex, so if you are using Python, you may need to install the regex module.
To test it: https://regex101.com/r/NLHUQh/1

Regex lookaround for a group doesn't work

Happy Saturday,
I'm wondering if Stackoverflow's users could give me a clue about one specific Regex..
(^visite\d+)(?!\D)
The above regex works well..
It says that :
visite12345 --> is a good answer (the string does match)
visite1a --> is not a good answer (the string doesn't match)
However for:
visite12345a --> It doesn't work.
Indeed, the output is visite1234, whereas I'd like to get the same result as for visite1a (the string doesn't match)...
I use http://regexr.com/ to test my regexp.
Do you have any idea how to do so?
Thank you very much.
The regex (^visite\d+)(?!\D) matches visite at the start of the string, followed with one or more digits that should not be followed with a non-digit.
The "issue" is that the engine can backtrack within the \d+ pattern: in visite12345a it gives up the last digit and matches visite1234, because the next character (5) is a digit and so the (?!\D) lookahead succeeds.
The best way to solve it is to check the actual requirements and adjust the pattern.
If the digits are the last characters in the string, you should just replace the lookahead with the $ anchor.
A generic solution for this is making the subpattern atomic with a capturing group inside a positive lookahead and a backreference, and making sure the lookahead is changed to something like (?![a-zA-Z]) (fail if there is a letter):
/^visite(?=(\d+))\1(?![a-z])/i
See the regex demo
Or, if a word boundary should follow the digits (i.e. the digits should not be followed with a letter, digit or underscore), use \b instead of the lookahead:
/^visite\d+\b/
See another demo
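The backtracking behaviour and the suggested fixes can be checked with a short sketch like this one. The variable names are made up; the patterns are the ones from the question and the answer, with anchored being the variant where $ replaces the lookahead.

```javascript
const original = /(^visite\d+)(?!\D)/;        // pattern from the question
const atomic = /^visite(?=(\d+))\1(?![a-z])/i; // lookahead + backreference emulates an atomic group
const boundary = /^visite\d+\b/;               // word boundary after the digits
const anchored = /^visite\d+$/;                // digits must end the string

// \d+ backtracks by one digit so that (?!\D) sees the digit 5 instead of the letter a.
console.log("visite12345a".match(original)[0]); // visite1234

console.log(atomic.test("visite12345a"));   // false
console.log(boundary.test("visite12345a")); // false
console.log(anchored.test("visite12345a")); // false

console.log(atomic.test("visite12345"));    // true
console.log(boundary.test("visite12345"));  // true
console.log(anchored.test("visite12345"));  // true
```

The atomic variant works because a lookahead is not re-entered on backtracking, so \1 always holds the full run of digits.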

Regex match being overwritten

I'm trying to sanitise a phone number using Regex.
I don't want any separating characters between digits and I don't want the local (0) part. Separators could be any non-digit character.
ie. the number could be:
+44 (00) 845 740 4404
+44-(00)-845-740-4404
+44-(00)-845-740=4404 (unlikely but could be a typo)
This matches the (0) part fine:
http://regex101.com/r/cB6hN4/3
But if I add |\D+ to match a non-digit character, it overwrites my first match:
http://regex101.com/r/cB6hN4/2
How do I keep both matches within the one regex?
Instead of using |\D+ at the end try to use |[^()\d]+
The regex will be \((\d+)\)|[^()\d]+
DEMO
But take into account that parentheses then cannot be used as separators, as you can see in the demo
I think you want something like this,
\((\d+)\)|(?:(?!\(\d+\))\D)+
DEMO
(?:(?!\(\d+\))\D)+ matches one or more non-digit characters, stopping before any parenthesised digit group that \(\d+\) would match
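A sketch of how the two suggested patterns could be used to sanitise the numbers from the question in JavaScript. It assumes the goal is simply to delete every match with String.prototype.replace; the helper name sanitise and the g flag are additions for this example.

```javascript
const SIMPLE_RE = /\((\d+)\)|[^()\d]+/g;
const TEMPERED_RE = /\((\d+)\)|(?:(?!\(\d+\))\D)+/g;

// sanitise is a hypothetical helper: it removes every match and keeps the remaining digits.
const sanitise = (phone, re) => phone.replace(re, "");

console.log(sanitise("+44 (00) 845 740 4404", SIMPLE_RE));   // 448457404404
console.log(sanitise("+44-(00)-845-740=4404", TEMPERED_RE)); // 448457404404

// A stray parenthesis used as a separator shows the difference between the two.
console.log(sanitise("845)740", SIMPLE_RE));   // 845)740
console.log(sanitise("845)740", TEMPERED_RE)); // 845740
```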

JavaScript Regular Expression to match digits too

I use this regex /^[-.a-zA-Z\s]+$/ to match any string that contains only English letters, dashes, dots and whitespace.
I would like to modify it to make it match any digit too,
so that all these strings will be accepted:
first
first-floor
1st floor
floor No. 1
How can I do this?
Just add digits to your character class:
/^[-.a-zA-Z\d\s]+$/
you can also write:
/^[-.a-zA-Z0-9\s]+$/
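A quick sketch to confirm that both spellings accept the strings from the question. In JavaScript \d matches only the ASCII digits 0-9, so the two forms should behave the same. The variable names and the extra rejected sample are made up for this example.

```javascript
const WITH_D = /^[-.a-zA-Z\d\s]+$/;
const WITH_RANGE = /^[-.a-zA-Z0-9\s]+$/;

const samples = ["first", "first-floor", "1st floor", "floor No. 1"];

console.log(samples.every((s) => WITH_D.test(s)));     // true
console.log(samples.every((s) => WITH_RANGE.test(s))); // true
console.log(WITH_D.test("floor #1"));                  // false, # is not in the class
console.log(WITH_D.test(""));                          // false, + needs at least one character
```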
