check whether csv form or not [duplicate] - javascript

What is the regular expression to validate a comma delimited list like this one:
12365, 45236, 458, 1, 99996332, ......

I suggest doing it the following way:
(\d+)(,\s*\d+)*
which would work for a list containing 1 or more elements.
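As written, the pattern is unanchored, so a test against it succeeds on any string that merely contains a number. A minimal sketch of using it for validation, assuming the whole string must be a list; the helper name `isNumberList` is made up for illustration:

```javascript
// Hypothetical helper: ^ and $ force the entire string to be a number list.
function isNumberList(input) {
  return /^\d+(?:,\s*\d+)*$/.test(input);
}

console.log(isNumberList("12365, 45236, 458, 1, 99996332")); // true
console.log(isNumberList("12365,, 458"));                    // false
console.log(isNumberList("abc 12 def"));                     // false
```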

This regex extracts an element from a comma separated list, regardless of contents:
(.+?)(?:,|$)
If you just replace the comma with something else, it should work for any delimiter.
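A rough sketch of that idea in JavaScript, assuming `String.prototype.matchAll` is available (ES2020). The function name `extractItems` and the escaping step are illustrative additions, not part of the answer. Note that `.+?` needs at least one character, so empty elements are not handled.

```javascript
// Hypothetical helper: builds the pattern for an arbitrary delimiter.
function extractItems(input, delimiter = ",") {
  // Escape regex metacharacters so delimiters like "|" or "." are taken literally.
  const escaped = delimiter.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(`(.+?)(?:${escaped}|$)`, "g");
  // Group 1 of each match is one element, without its trailing delimiter.
  return Array.from(input.matchAll(pattern), (m) => m[1]);
}

console.log(extractItems("12365, 45236, 458")); // ["12365", " 45236", " 458"]
console.log(extractItems("a;b;c", ";"));        // ["a", "b", "c"]
```

Surrounding whitespace is kept in each element, so trim the results if that matters.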

It depends a bit on your exact requirements. I'm assuming: all numbers, of any length; numbers cannot have leading zeros or contain commas or decimal points; individual numbers are always separated by a comma then a space; and the last number does NOT have a comma and space after it. Any of these being wrong would simplify the solution.
([1-9][0-9]*,[ ])*[1-9][0-9]*
Here's how I built that mentally:
[0-9] any digit.
[1-9][0-9]* leading non-zero digit followed by any number of digits
[1-9][0-9]*, as above, followed by a comma
[1-9][0-9]*,[ ] as above, followed by a space
([1-9][0-9]*,[ ])* as above, repeated 0 or more times
([1-9][0-9]*,[ ])*[1-9][0-9]* as above, with a final number that doesn't have a comma.
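The same build-up can be written out piece by piece in JavaScript. This is only a sketch: the variable names are invented here, and the `^`/`$` anchors are an added assumption so that the pattern validates the whole string.

```javascript
// Assemble the pattern from the pieces described above (names are illustrative).
const number = "[1-9][0-9]*";   // leading non-zero digit, then any number of digits
const item = `${number},[ ]`;   // a number followed by a comma and a space
// Zero or more "number, " items, then a final number with no comma; anchored.
const listPattern = new RegExp(`^(${item})*${number}$`);

console.log(listPattern.test("12365, 45236, 458")); // true
console.log(listPattern.test("12365,45236"));       // false (no space after the comma)
console.log(listPattern.test("012, 5"));            // false (leading zero)
console.log(listPattern.test("1, 2, "));            // false (trailing comma and space)
```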

Match duplicate comma-delimited items:
(?<=,|^)([^,]*)(,\1)+(?=,|$)
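A small sketch of the duplicate-matching pattern in JavaScript. It relies on lookbehind, which JavaScript engines only support from ES2018 on, and it only collapses adjacent duplicates. The function name is hypothetical.

```javascript
// Matches an item followed by one or more identical items, bounded by commas or the string ends.
const duplicateRun = /(?<=,|^)([^,]*)(,\1)+(?=,|$)/g;

// Hypothetical helper: keeps one copy of each run of adjacent duplicates.
function collapseAdjacentDuplicates(list) {
  return list.replace(duplicateRun, "$1");
}

console.log(collapseAdjacentDuplicates("a,b,b,c")); // "a,b,c"
console.log(collapseAdjacentDuplicates("x,x,x,y")); // "x,y"
console.log(collapseAdjacentDuplicates("ab,b"));    // "ab,b" (a partial match is not a duplicate)
```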
This regex can be used to split the values of a comma-delimited list. List elements may be quoted, unquoted or empty. Commas inside a pair of quotation marks are not matched.
,(?!(?<=(?:^|,)\s*"(?:[^"]|""|\\")*,)(?:[^"]|""|\\")*"\s*(?:,|$))

/^\d+(?:, ?\d+)*$/

I used this for a list of items that had to be alphanumeric, with no underscore at the start of any item.
^(([0-9a-zA-Z][0-9a-zA-Z_]*)([,][0-9a-zA-Z][0-9a-zA-Z_]*)*)$

You might want to specify language just to be safe, but
(\d+, ?)+(\d+)?
ought to work

I had a slightly different requirement, to parse an encoded dictionary/hashtable with escaped commas, like this:
"1=This is something, 2=This is something,,with an escaped comma, 3=This is something else"
I think this is an elegant solution, with a trick that avoids a lot of regex complexity:
if (string.IsNullOrEmpty(encodedValues))
{
    return null;
}
else
{
    var retVal = new Dictionary<int, string>();
    // Each field is "<id>=<value>,"; a doubled comma inside the value is an escaped comma.
    var reFields = new Regex(@"([0-9]+)\=(([A-Za-z0-9\s]|(,,))+),");
    // Append a trailing comma so the last field is terminated like the others.
    foreach (Match match in reFields.Matches(encodedValues + ","))
    {
        var id = match.Groups[1].Value;
        var value = match.Groups[2].Value;
        retVal[int.Parse(id)] = value.Replace(",,", ",");
    }
    return retVal;
}
I think it can be adapted to the original question with an expression like @"([0-9]+),\s?", parsing Groups[1] of each match (Groups[0] is the whole match, including the comma).
I hope it's helpful to somebody and thanks for the tips on getting it close to there, especially Asaph!
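Since the question is tagged JavaScript, here is a rough port of the same trick. It is a sketch, not the answer's own code: the function name `parseEncoded` and the choice of a `Map` as the return type are assumptions.

```javascript
// Hypothetical port: parses "id=value" fields where ",," is an escaped comma.
function parseEncoded(encodedValues) {
  if (!encodedValues) return null;
  const result = new Map();
  // Same pattern as above: digits, "=", then letters/digits/whitespace or ",,", then a single ",".
  const fields = /([0-9]+)=((?:[A-Za-z0-9\s]|,,)+),/g;
  // The appended comma terminates the last field like all the others.
  for (const match of (encodedValues + ",").matchAll(fields)) {
    result.set(Number(match[1]), match[2].replace(/,,/g, ","));
  }
  return result;
}

const parsed = parseEncoded(
  "1=This is something, 2=This is something,,with an escaped comma, 3=This is something else"
);
console.log(parsed.get(2)); // "This is something,with an escaped comma"
```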

In JavaScript, use split to help out, and catch any negative digits as well:
'-1,2,-3'.match(/(-?\d+)(,\s*-?\d+)*/)[0].split(',');
// ["-1", "2", "-3"]
// may need trimming if digits are space-separated
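A sketch that combines the validation, the split and the trimming mentioned in the comment. The function name and the decision to return `null` for invalid input are assumptions for illustration.

```javascript
// Hypothetical helper: validates the whole string, then splits and converts.
function parseIntegerList(input) {
  // Optional minus sign, digits, and commas with optional surrounding whitespace.
  const valid = /^\s*-?\d+(?:\s*,\s*-?\d+)*\s*$/;
  if (!valid.test(input)) return null;
  return input.split(",").map((part) => Number(part.trim()));
}

console.log(parseIntegerList("-1, 2, -3")); // [-1, 2, -3]
console.log(parseIntegerList("1,,2"));      // null
```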

The following will match any comma delimited word/digit/space combination
(((.)*,)*)(.)*

Why don't you work with groups:
^(\d+(, )?)+$

If you have a more complicated regex, e.g. one for valid URLs rather than just numbers, you could split the list, loop through the elements, and test each one individually against your regex:
const validRelativeUrlRegex = /^(^$|(?!.*(\W\W))\/[a-zA-Z0-9\/-]+[^\W_]$)/;
const relativeUrls = "/url1,/url-2,url3";
const startsWithComma = relativeUrls.startsWith(",");
const endsWithComma = relativeUrls.endsWith(",");
const areAllURLsValid = relativeUrls
  .split(",")
  .every(url => validRelativeUrlRegex.test(url));
const isValid = areAllURLsValid && !endsWithComma && !startsWithComma;

Related

How to split a string by a character not directly preceded by a character of the same type?

Let's say I have a string: "We.need..to...split.asap". What I would like to do is to split the string by the delimiter ., but I only wish to split by the first . and include any recurring .s in the succeeding token.
Expected output:
["We", "need", ".to", "..split", "asap"]
In other languages, I know that this is possible with a look-behind /(?<!\.)\./ but Javascript unfortunately does not support such a feature.
I am curious to see your answers to this question. Perhaps there is a clever use of look-aheads that presently evades me?
I was considering reversing the string, then re-reversing the tokens, but that seems like too much work for what I am after... plus controversy: How do you reverse a string in place in JavaScript?
Thanks for the help!
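For readers on current engines: lookbehind assertions were added to JavaScript in ES2018, so the pattern mentioned in the question can now be used directly where that is supported. A minimal sketch:

```javascript
// Split on a "." only when it is not directly preceded by another ".".
// Requires an engine with ES2018 lookbehind support.
const tokens = "We.need..to...split.asap".split(/(?<!\.)\./);

console.log(tokens); // ["We", "need", ".to", "..split", "asap"]
```

The answers below remain useful for environments without lookbehind.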
Here's a variation of the answer by guest271314 that handles more than two consecutive delimiters:
var text = "We.need.to...split.asap";
var re = /(\.*[^.]+)\./;
var items = text.split(re).filter(function(val) { return val.length > 0; });
It uses the detail that if the split expression includes a capture group, the captured items are included in the returned array. These capture groups are actually the only thing we are interested in; the tokens are all empty strings, which we filter out.
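To make the role of the capture group visible, this sketch prints the array before filtering. The variable name `raw` is introduced here for illustration.

```javascript
const text = "We.need.to...split.asap";

// With a capture group in the separator, split() interleaves the captured text
// with the (mostly empty) pieces found between separators.
const raw = text.split(/(\.*[^.]+)\./);
console.log(raw);
// ["", "We", "", "need", "", "to", "", "..split", "asap"]

// Dropping the empty strings leaves the wanted items.
const items = raw.filter((val) => val.length > 0);
console.log(items); // ["We", "need", "to", "..split", "asap"]
```

The final element, "asap", is an ordinary split piece rather than a capture, because no delimiter follows it.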
EDIT: Unfortunately there's perhaps one slight bug with this. If the text to be split starts with a delimiter, that will be included in the first token. If that's an issue, it can be remedied with:
var re = /(?:^|(\.*[^.]+))\./;
var items = text.split(re).filter(function(val) { return !!val; });
(I think this regex is ugly and would welcome an improvement.)
You can do this without any lookaheads:
var subject = "We.need.to....split.asap";
var regex = /\.?(\.*[^.]+)/g;
var matches, output = [];
while(matches = regex.exec(subject)) {
output.push(matches[1]);
}
document.write(JSON.stringify(output));
It seemed like it'd work in one line, as it did on https://regex101.com/r/cO1dP3/1, but it had to be expanded into the loop above because, with the /g flag, .match returns only the full matches and not the capturing groups (i.e. the correct data was in the capturing groups, but we couldn't access them without doing the above).
See: JavaScript Regex Global Match Groups
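Where `String.prototype.matchAll` is available (ES2020), the capture groups of a global regex can be read without the explicit `exec` loop. A short sketch using the same pattern:

```javascript
const subject = "We.need.to....split.asap";
const regex = /\.?(\.*[^.]+)/g;

// matchAll yields one match object per hit; index 1 is the capture group.
const output = Array.from(subject.matchAll(regex), (m) => m[1]);

console.log(output); // ["We", "need", "to", "...split", "asap"]
```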
An alternative solution with the original one liner (plus one line) is:
document.write(JSON.stringify(
"We.need.to....split.asap".match(/\.?(\.*[^.]+)/g)
.map(function(s) { return s.replace(/^\./, ''); })
));
Take your pick!
Note: This answer can't handle more than 2 consecutive delimiters, since it was written according to the example in the revision 1 of the question, which was not very clear about such cases.
var text = "We.need.to..split.asap";
// split "." if followed by "."
var res = text.split(/\.(?=\.)/).map(function(val, key) {
// if `val[0]` does not begin with "." split "."
// else split "." if not followed by "."
return val[0] !== "." ? val.split(/\./) : val.split(/\.(?!.*\.)/)
});
// concat arrays `res[0]` , `res[1]`
res = res[0].concat(res[1]);
document.write(JSON.stringify(res));

Best way to remove thousand separators from string amount using a regex

I have variables that contain amounts and would like to remove the (US) thousand separators. I also have to cover non-US formatted amounts, where the comma marks the decimals rather than the thousands; in those cases I don't want to replace the comma.
Examples:
1,234,567.00 needs to become 1234567.00
1,234.00 needs to become 1234.00
but
1.234.567,00 needs to remain unchanged as not US format (i.e. comma here is used for decimals)
1.234,00 needs to remain unchanged as not US format (i.e. comma here is used for decimals)
I was thinking of using the following but wasn't sure about it as I am pretty new to Regex:
myVar.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
What is best solution here? Note: I just need to cover normal amounts like the above examples, no special cases like letter / number combinations or things like 1,2,3 etc.
This one may suit your needs:
,(?=[\d,]*\.\d{2}\b)
if (string.match(/\.\d{2}$/)) {
string = string.replace(/,/g, ''); // a string pattern would only replace the first comma
}
or
string.replace(/,(?=.*\.\d+)/g, '');
Replace /,(?=\d*[\.,])/g with empty string?
http://regexr.com/39v2m
You can use the replace() method to remove the commas, replacing each with an empty string. I'm using a regular expression with a lookahead assertion to detect whether a comma is followed by three digits; if so, that comma is removed.
string.replace(/,(?=\d{3})/g, '')
Examples:
'12,345,678.90'.replace(/,(?=\d{3})/g, '')
// '12345678.90'
'1,23,456.78'.replace(/,(?=\d{3})/g, '')
// '1,23456.78'
'$1,234.56'.replace(/,(?=\d{3})/g, '')
// '$1234.56'
This code worked for me; you can use it to remove the separators when setting the amount value:
t.replace(/,(?=\d{3})/g, '')
myVar = myVar.replace(/([.,])(\d\d\d\D|\d\d\d$)/g,'$2');
Removes the period . or comma , when used as a thousand separator.
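One way to keep the non-US amounts untouched is to check the overall shape first and strip commas only when the string looks US-formatted. This is a sketch under that assumption; the function name is hypothetical, and an ambiguous value such as "1,234" is treated as US here.

```javascript
// Hypothetical helper: strips commas only from amounts shaped like 1,234,567.00.
function stripUsThousands(amount) {
  // 1 to 3 digits, one or more ",ddd" groups, then an optional ".dd..." part.
  const usFormat = /^\d{1,3}(?:,\d{3})+(?:\.\d+)?$/;
  return usFormat.test(amount) ? amount.replace(/,/g, "") : amount;
}

console.log(stripUsThousands("1,234,567.00")); // "1234567.00"
console.log(stripUsThousands("1,234.00"));     // "1234.00"
console.log(stripUsThousands("1.234.567,00")); // "1.234.567,00" (unchanged)
console.log(stripUsThousands("1.234,00"));     // "1.234,00" (unchanged)
```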

Match a string between two other strings with regex in javascript

How can I use regex in javascript to match the phone number and only the phone number in the sample string below? The way I have it written below matches "PHONE=9878906756", I need it to only match "9878906756". I think this should be relatively simple, but I've tried putting negating like characters around "PHONE=" with no luck. I can get the phone number in its own group, but that doesn't help when assigning to the javascript var, which only cares what matches.
REGEX:
/PHONE=([^,]*)/g
DATA:
3={STATE=, SSN=, STREET2=, STREET1=, PHONE=9878906756,
MIDDLENAME=, FIRSTNAME=Dexter, POSTALCODE=, DATEOFBIRTH=19650802,
GENDER=0, CITY=, LASTNAME=Morgan
The way you're doing it is right, you just have to get the value of the capture group rather than the value of the whole match:
var result = str.match(/PHONE=([^,]*)/); // Or result = /PHONE=([^,]*)/.exec(str);
if (result) {
console.log(result[1]); // "9878906756"
}
In the array you get back from match, the first entry is the whole match, and then there are additional entries for each capture group.
You also don't need the g flag.
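If the whole match really has to be just the number, a lookbehind is one option on engines that support it (ES2018 and later). A sketch, with a shortened copy of the sample data; the capture-group approach above works everywhere and is usually simpler.

```javascript
const data = "3={STATE=, SSN=, STREET2=, STREET1=, PHONE=9878906756, MIDDLENAME=, FIRSTNAME=Dexter";

// (?<=PHONE=) asserts the prefix without making it part of the match.
const match = data.match(/(?<=PHONE=)[^,]*/);
const phone = match ? match[0] : null;

console.log(phone); // "9878906756"
```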
Just use dataAfterRegex.substring(6) to take out the first 6 characters (i.e.: the PHONE= part).
Try
var str = "3={STATE=, SSN=, STREET2=, STREET1=, PHONE=9878906756, MIDDLENAME=, FIRSTNAME=Dexter, POSTALCODE=, DATEOFBIRTH=19650802, GENDER=0, CITY=, LASTNAME=Morgan";
var ph = str.match(/PHONE\=\d+/)[0].slice(-10);
console.log(ph);

What are elegant ways to pair characters in a string?

For example, if the initial string s is "0123456789", desired output would be an array ["01", "23", "45", "67", "89"].
Looking for elegant solutions in JavaScript.
What I was thinking (very non-elegantly) is to iterate through the string by splitting on the empty string and using the Array.forEach method, insert a delimiter after every two characters, and then split by that delimiter. This is not a good solution, but it's my starting point.
Edit: A RegExp solution has been posted. I'd love to see if there are any other approaches.
How about:
var array = ("0123456789").match(/\w{1,2}/g);
Here we use .match() on your string to match any two or single ({1,2}) word characters (\w) and return an array of the results.
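Because \w only covers letters, digits and underscore, other characters are skipped by the pattern above. A sketch of a variant that pairs any characters, assuming the `s` (dotAll) flag from ES2018 is available; the function name is made up for illustration.

```javascript
// Hypothetical helper: pairs up arbitrary characters, not just word characters.
function pairUp(s) {
  // With the s flag, "." also matches line breaks; match() returns null for "".
  return s.match(/.{1,2}/gs) || [];
}

console.log(pairUp("0123456789")); // ["01", "23", "45", "67", "89"]
console.log(pairUp("a-b c"));      // ["a-", "b ", "c"]
console.log(pairUp(""));           // []
```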
Regarding your edit for a non-regex solution; you could do a far less elegant function like this:
String.prototype.getPairs = function()
{
var pairs = [];
for(var i = 0; i < this.length; i += 2)
{
pairs[pairs.length] = this.substr(i, 2);
}
return pairs;
}
var array = ("01234567890").getPairs();
If you want to use split (and why not), you could do the following:
s.split(/([^][^])/).filter(function(x){return x})
This splits using two consecutive characters as a delimiter, but because they're in a capture group, they're also part of split's result. Filtering that with the identity function serves to eliminate the empty strings (between the "delimiters"). Note that in the case of an odd number of characters, the last character will be output as a split piece, not a delimiter, but it doesn't matter since it will still test truthy.
([^] is how you spell . in javascript if you really want to match any character. I had to look that up.)

Using Regular Expressions with Javascript replace method

Friends,
I'm new to both Javascript and Regular Expressions and hope you can help!
Within a Javascript function I need to check to see if a comma(,) appears 1 or more times. If it does then there should be one or more numbers either side of it.
e.g.
1,000.00 is ok
1,000,00 is ok
,000.00 is not ok
1,,000.00 is not ok
If these conditions are met I want the comma to be removed so 1,000.00 becomes 1000.00
What I have tried so far is:
var x = '1,000.00';
var regex = new RegExp("[0-9]+,[0-9]+", "g");
var y = x.replace(regex,"");
alert(y);
When run the alert shows ".00" Which is not what I was expecting or want!
Thanks in advance for any help provided.
Edit
Thanks all for the input so far and the 3 answers given. Unfortunately I don't think I explained my question well enough.
What I am trying to achieve is:
If there is a comma in the text and there are one or more numbers either side of it then remove the comma but leave the rest of the string as is.
If there is a comma in the text and there is not at least one number either side of it then do nothing.
So using my examples from above:
1,000.00 becomes 1000.00
1,000,00 becomes 100000
,000.00 is left as ,000.00
1,,000.00 is left as 1,,000.00
Apologies for the confusion!
Your regex isn't going to be very flexible with higher orders than 1000 and it has a problem with inputs which don't have the comma. More problematically you're also matching and replacing the part of the data you're interested in!
Better to have a regex which matches the forms which are a problem and remove them.
The following matches (in order) commas at the beginning of the input, at the end of the input, preceded by a number of non digits, or followed by a number of non digits.
var y = x.replace(/^,|,$|[^0-9]+,|,[^0-9]+/g,'');
As an aside, all of this is much easier if you happen to be able to do lookbehind but almost every JS implementation doesn't.
Edit based on question update:
Ok, I won't attempt to understand why your rules are as they are, but the regex gets simpler to solve it:
var y = x.replace(/(\d),(\d)/g, '$1$2');
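One caveat with consuming the digit on both sides: in an input like "1,2,3" the middle digit is used up by the first match, so the second comma survives. A sketch of a variant that uses a lookahead for the right-hand digit instead; the function name is hypothetical.

```javascript
// Hypothetical helper: removes a comma only when it sits between two digits.
function stripInnerCommas(text) {
  // The lookahead checks the following digit without consuming it,
  // so that digit can still serve as the left neighbour of the next comma.
  return text.replace(/(\d),(?=\d)/g, "$1");
}

console.log(stripInnerCommas("1,000.00"));  // "1000.00"
console.log(stripInnerCommas("1,000,00"));  // "100000"
console.log(stripInnerCommas(",000.00"));   // ",000.00"
console.log(stripInnerCommas("1,,000.00")); // "1,,000.00"
console.log(stripInnerCommas("1,2,3"));     // "123"
```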
I would use something like the following:
^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?$
[0-9]{1,3}: 1 to 3 digits
(,[0-9]{3})*: [Optional] More digit triplets separated by a comma
(\.[0-9]+)?: [Optional] Dot + more digits
If this regex matches, you know that your number is valid. Just replace all commas with the empty string afterwards.
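A sketch of that validate-then-strip approach, with the decimal part marked optional as the description says. The function name and the `null` return for invalid input are assumptions. Note that this stricter check rejects "1,000,00", which the question lists as acceptable.

```javascript
// Hypothetical helper: returns the number without commas, or null if the format is invalid.
function normalizeAmount(input) {
  const valid = /^[0-9]{1,3}(,[0-9]{3})*(\.[0-9]+)?$/;
  return valid.test(input) ? input.replace(/,/g, "") : null;
}

console.log(normalizeAmount("1,000.00"));  // "1000.00"
console.log(normalizeAmount("1,000,00"));  // null (last group has only two digits)
console.log(normalizeAmount(",000.00"));   // null
console.log(normalizeAmount("1,,000.00")); // null
```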
It seems to me you have three error conditions
",1000"
"1000,"
"1,,000"
If any one of these is true, you should reject the field. If they are all false, you can strip the commas in the normal way and move on. This can be a simple alternation:
^,|,,|,$
I would just remove anything except digits and the decimal separator ([^0-9.]) and send the output through parseFloat():
var y = parseFloat(x.replace(/[^0-9.]+/g, ""));
// invalid cases:
// - standalone comma at the beginning of the string
// - comma next to another comma
// - standalone comma at the end of the string
var i,
inputs = ['1,000.00', '1,000,00', ',000.00', '1,,000.00'],
invalid_cases = /(^,)|(,,)|(,$)/;
for (i = 0; i < inputs.length; i++) {
if (inputs[i].match(invalid_cases) === null) {
// wipe out everything but digits and the dot
inputs[i] = inputs[i].replace(/[^\d.]+/g, '');
}
}
console.log(inputs); // ["1000.00", "100000", ",000.00", "1,,000.00"]
