Get numbers and characters after 3 expression with regex - javascript

I'm using a regular expression to get the value that follows a particular word in a large body of text.
Example:
Money: 0,00 0,00 0,00 50,00
Currently I'm getting the first value (0,00) with the following regex:
var obj = {};
var key = 'Money'; // the word to look for
var text = 'HUGE TEXT HERE';
var reg = new RegExp(key + '.*?(\\d\\S*)');
var match = reg.exec(text);
if (match === null) {
obj[key] = '';
} else {
obj[key] = match[1];
}
Output:
obj.Money = '0,00'
This value is dynamic and sometimes the word 'Money' changes, so I need to be able to pass the word in.
I would like to extend my regular expression to skip the first three values and then capture the next one. Is it possible?
Thanks.

You can look for (and ignore) the three repeating numbers, then catch the last one:
(?:\d\S*\s+){3}(\d\S*)
Where:
(?:...)
means "don't include this group in the returned list of captured groups" and
{3}
means "match three of them".
We're including \s+ at the end to catch the whitespace between the numbers.
Something like (simplified):
var obj = {};
var key = 'Money';
var text = "Something something:\n" +
"Money: 1,00 2,00 3,00 3,00\n" +
"Something else";
var reg = new RegExp(key + '\\b.*?(?:\\d\\S*\\s+){3}(\\d\\S*)');
var match = reg.exec(text);
if (match === null)
obj[key] = '';
else
obj[key] = match[1];
console.log(obj[key]);
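If the word and the number of values to skip both vary, the same pattern can be wrapped in a small function. This is only a sketch: valueAfter and its parameters are names made up for illustration, and it assumes the values sit on the same line as the word.

```javascript
// Hypothetical helper (not from the question): returns the value that comes
// after `skip` earlier values following `word`, or '' when nothing matches.
function valueAfter(text, word, skip) {
  // Escape regex metacharacters so `word` is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const reg = new RegExp(escaped + '\\b.*?(?:\\d\\S*\\s+){' + skip + '}(\\d\\S*)');
  const match = reg.exec(text);
  return match === null ? '' : match[1];
}

const sample = 'Something something:\nMoney: 0,00 0,00 0,00 50,00\nSomething else';
console.log(valueAfter(sample, 'Money', 3));  // fourth value
console.log(valueAfter(sample, 'Money', 0));  // first value
console.log(valueAfter(sample, 'Salary', 3)); // word not present
```

Escaping the word matters once it can contain characters such as . or ( that have a special meaning in a regex.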

Related

string to dictionary in javascript

I have a string that I want to transform into a dictionary, the string looks like this:
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
I want to transform that string into a dictionary that looks like this:
const dic = {
"name": "this is the name",
"age": "21",
"faculty": "some faculty"
}
So the string is formatted this way: ::name{parameters...}, and it can have any parameters, not only name, faculty, ... How can I take any string that looks like that and transform it into a dictionary?
Also, is there a way to check that the string I'm parsing follows this structure ::name{parameters...}, so that I can throw an error when it does not?
Any help would be greatly appreciated!
Assuming that the quoted values contain only alphanumeric characters and spaces (or are empty), and that there is never a space between the key and the value in key="value", you can match with the following regular expression and then iterate over the matches to construct your desired object.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}'
const matches = str.match(/[\w]+\=\"[\w\s]*(?=")/g)
const result = {}
matches.forEach(match => {
const [key, value] = match.split('="')
result[key] = value
})
console.log(result)
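The regex is composed of the following parts:
[\w]+ matches one or more word characters (the key)
\=\" matches the literal characters ="
[\w\s]* matches zero or more word or whitespace characters (the value)
(?=") is a lookahead that requires a closing quote without including it in the match
The g flag returns every match rather than only the first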
You can use https://regexr.com to experiment with your regular expressions. Depending on the string to process, you'll need to refine your regex.
This example uses exec to look for matches in the string using a regular expression.
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regex = /([a-z]+)="([a-z0-9 ?]+)"/g;
let match, output = {};
while (match = regex.exec(str)) {
output[match[1]] = match[2];
}
console.log(output);
If there is no consistent pattern this is hard, but when the pattern is always the same you can split the string:
var str = '::student{name="this is the name" age="21" faculty="some faculty"}'
console.log(createObj(str));
function createObj(myString){
// Reject strings that do not start with "::" or that lack either curly brace
if(!myString.startsWith("::") || !myString.includes("{") || !myString.includes("}")){
console.log("format error")
return null;
}
var a = myString.split("{")[1];
var c = a.replace('{','').replace('}','');
if(c.search('= ""') !== -1){
console.log("format incorrect");
return null;
}
var d = c.split('="')
var keyValue = [];
for(var item of d){
var e = item.split('" ')
if(e.length === 1){
keyValue.push(e[0].replace('"',''));
}else{
for(var item2 of e){
keyValue.push(item2.replace('"',''));
}
}
}
var myObj = {}
if(keyValue.length % 2 === 0){
for(var i = 0; i<keyValue.length; i=i+2){
myObj[keyValue[i]] = keyValue[i+1]
}
}
return myObj;
}
You could use 2 patterns. The first pattern to match the format of the string and capture the content between the curly braces in a single capture group. The second pattern to get the key value pairs using 2 capture groups.
For the full match you can use
::\w+{([^{}]*)}
::\w+ match :: and 1+ word characters
{ Match the opening curly
([^{}]*) Capture group 1, match from opening till closing curly
} Match the closing curly
For the keys and values you can use
(\w+)="([^"]+)"
(\w+) Capture group 1, match 1+ word chars
=" Match literally
([^"]+) Capture group 2, match from an opening till closing double quote
" Match closing double quote
const str = '::student{name="this is the name" age="21" faculty="some faculty"}';
const regexFullMatch = /::\w+{([^{}]*)}/;
const regexKeyValue = /(\w+)="([^"]+)"/g;
const m = str.match(regexFullMatch);
if (m) {
const dic = Object.fromEntries(
Array.from(m[1].matchAll(regexKeyValue), v => [v[1], v[2]])
);
console.log(dic)
}
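To cover the "throw an error" part of the question, the two patterns can be combined in one function. This is a minimal sketch: parseDirective is a hypothetical name, the full-match pattern is anchored with ^ and $ here, and empty values are allowed (an assumption). It does not check that everything between the braces is a well-formed key="value" pair.

```javascript
// Hypothetical helper: parses '::name{key="value" ...}' into a plain object
// and throws when the string does not follow that overall structure.
function parseDirective(input) {
  const full = /^::(\w+)\{([^{}]*)\}$/.exec(input);
  if (full === null) {
    throw new SyntaxError('Expected ::name{key="value" ...} but got: ' + input);
  }
  // Collect every key="value" pair found between the curly braces.
  const pairs = full[2].matchAll(/(\w+)="([^"]*)"/g);
  return Object.fromEntries(Array.from(pairs, m => [m[1], m[2]]));
}

const directive = '::student{name="this is the name" age="21" faculty="some faculty"}';
console.log(parseDirective(directive));

try {
  parseDirective('student{name="x"}'); // missing the leading ::
} catch (err) {
  console.log(err.message);
}
```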

Split and replace text by two rules (regex)

I'm trying to split text by two rules:
Split by whitespace
Split words longer than 5 characters into two separate words (for example, aaaaawww into aaaaa- and www)
I created a regex that detects these rules (https://regex101.com/r/fyskB3/2), but I can't figure out how to make both rules work in text.split(/REGEX/).
Current regex: (([\s]+)|(\w{5})(?=\w))
For example, if the initial text is hello i am markopollo, the result should be ['hello', 'i', 'am', 'marko-', 'pollo']
It would probably be easier to use .match: match up to 5 characters that aren't whitespace:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer';
console.log(
str.match(/[^ ]{1,5}/g)
)
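The .match call above returns the chunks without the trailing hyphen. One possible way to add it, sketched here with a hypothetical splitWithHyphens function, is to let a lookahead capture the next character whenever the word continues:

```javascript
// Sketch: take chunks of up to 5 non-whitespace characters. The optional
// group inside the lookahead captures the next character only when the word
// continues, which is the signal to append a hyphen.
function splitWithHyphens(text) {
  const chunkPattern = /(\S{1,5})(?=(\S)?)/g;
  return Array.from(text.matchAll(chunkPattern), m => (m[2] ? m[1] + '-' : m[1]));
}

console.log(splitWithHyphens('hello i am markopollo'));
// → ['hello', 'i', 'am', 'marko-', 'pollo']
```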
My approach would be to process the string before splitting (I'm a big fan of RegEx):
1- Search for every run of 5 consecutive characters that does not end a word and replace it with "$1- " (the match, a hyphen, and a space).
The pattern (\w{5}\B) will do the trick: \w{5} matches exactly 5 word characters, and \B matches only if the fifth character is not the last character of the word.
2- Split the string by spaces.
var text = "hello123467891234 i am markopollo";
var regex = /(\w{5}\B)/g;
var processedText = text.replace(regex, "$1- ");
var result = processedText.split(" ");
console.log(result)
Hope it helps!
Something like this should work:
const str = "hello i am markopollo";
const words = str.split(/\s+/);
const CHUNK_SIZE=5;
const out = [];
for(const word of words) {
if(word.length > CHUNK_SIZE) {
let chunks = chunkSubstr(word,CHUNK_SIZE);
let last = chunks.pop();
out.push(...chunks.map(c => c + '-'),last);
} else {
out.push(word);
}
}
console.log(out);
// credit: https://stackoverflow.com/a/29202760/65387
function chunkSubstr(str, size) {
const numChunks = Math.ceil(str.length / size)
const chunks = new Array(numChunks)
for (let i = 0, o = 0; i < numChunks; ++i, o += size) {
chunks[i] = str.slice(o, o + size)
}
return chunks
}
i.e., first split the string into words on whitespace, then find words longer than 5 characters and 'chunk' them. I popped off the last chunk to avoid adding a - to it, but there might be a more efficient way if you patch chunkSubstr instead.
Splitting with a regex doesn't work so well here because split removes whatever the separator matches from the output. You want to strip the whitespace but keep the word characters, so a single separator that matches both won't work.
This uses the regex from @CertainPerformance's answer ([^ ]{1,5}), applies regex.exec in a loop, and appends a hyphen to every 5-character match that is followed by a non-space character.
See the demo below:
const str = 'wqerweirj ioqwejr qiwejrio jqoiwejr qwer qwer'
let regex1 = RegExp('[^ ]{1,5}', 'g')
function customSplit(targetString, regexExpress) {
let result = []
let matchItem = null
while ((matchItem = regexExpress.exec(targetString)) !== null) {
result.push(
matchItem[0] + (
matchItem[0].length === 5 && targetString[regexExpress.lastIndex] && targetString[regexExpress.lastIndex] !== ' '
? '-' : '')
)
}
return result
}
console.log(customSplit(str, regex1))
console.log(customSplit('hello i am markopollo', regex1))

Javascript Regex to split line of log with key value pairs

I have a log like
t=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService
and I want to turn it into a javascript object like
{
t: "2016-08-03T18:47:26+0000",
lvl: "dbug",
msg: "Event Received",
Service: "SomeService"
}
But I am having trouble coming up with a regex that will detect the string "Event Received" in the log line.
I want to split the log line by space but because of the string it is much more difficult.
I am trying to come up with a regex that will detect the fields and parameters so that I can isolate them and split with the equal sign.
I suggest a regex without any lookahead:
var re = /(\w+)=(?:"([^"]*)"|(\S*))/g;
The point is that the first group ((\w+)) captures the attribute name and the 2nd and 3rd are placed into a non-capturing "container" as alternative branches. Their values can be checked and then either one will be used to fill out the object.
Pattern details:
(\w+) - Group 1 (attribute name) matching 1+ word chars (from [a-zA-Z0-9_] ranges)
= - an equal sign
(?:"([^"]*)"|(\S*)) - a non-capturing "container" group matching either of the two alternatives:
"([^"]*)" - a quote, then Group 2 capturing 0+ chars other than ", and a quote
| - or
(\S*) - Group 3 capturing 0+ non-whitespace symbols.
var rx = /(\w+)=(?:"([^"]*)"|(\S*))/g;
var s = "t=2016-08-03T18:47:26+0000 lvl=dbug msg=\"Event Received\" Service=SomeService";
var obj = {};
var m;
while ((m = rx.exec(s)) !== null) {
// Group 2 holds a quoted value (possibly empty); otherwise use the unquoted Group 3
obj[m[1]] = m[2] !== undefined ? m[2] : m[3];
}
console.log(obj);
You can use this regex to capture various name=value pairs:
/(\w+)=(.*?)(?= \w+=|$)/gm
Code:
var re = /(\w+)=(.*?)(?= \w+=|$)/gm;
var str = 't=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService';
var m;
var result = {};
while ((m = re.exec(str)) !== null) {
if (m.index === re.lastIndex)
re.lastIndex++;
result[m[1]] = m[2];
}
console.log(result);
Use this pattern:
/^t=([^ ]+) lvl=([^ ]+) msg=(.*?[a-z]") Service=(.*)$/gm
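A short sketch of how this fixed-field pattern could be applied. It assumes every line has exactly the fields t, lvl, msg and Service in that order; the g flag is dropped so that match() returns the capture groups, and the variable names are made up for illustration.

```javascript
// Same pattern as above, without the g flag so match() exposes the groups.
const fieldPattern = /^t=([^ ]+) lvl=([^ ]+) msg=(.*?[a-z]") Service=(.*)$/m;
const logLine = 't=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService';

const fields = logLine.match(fieldPattern);
// Group 3 still includes the surrounding double quotes of the message.
const entry = fields ? { t: fields[1], lvl: fields[2], msg: fields[3], Service: fields[4] } : null;
console.log(entry);
```

Unlike the generic key=value patterns, this one stops matching as soon as a field is added, removed or reordered.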
This simpler approach builds a string rather than an object, and it also splits the quoted msg value at its space:
var x = 't=2016-08-03T18:47:26+0000 lvl=dbug msg="Event Received" Service=SomeService';
var y = x.replace(/=/g,':').split(' ');
var z = '{'+ y+'}';
console.log(z);
http://codepen.io/nagasai/pen/oLPRAy

Extracting a List of Decimal numbers

I have a textarea that will have one or more pairings that look like this,
Lat = 38.7970308
Long = -100.8665928
How can I extract the numbers using Javascript (negative sign included if there)? I want all the numbers listed.
You can use this regex:
var s ='Lat = 38.7970308\n' +
'Long = -100.8665928';
var m = s.match(/[+-]?(?:\d+(?:\.\d+)?|\.\d+)/g);
//=> ["38.7970308", "-100.8665928"]
Try this regex (use the g and m flags so that $ matches at the end of each line):
[+-]?\d+(?:\.\d+)?$
/([+-]?(?:[0-9]+\.)?[0-9]+)/g
http://regex101.com/r/mC0yO2/1
You can use this pattern to basically extract all the numbers:
/-?\d+(?:\.\d+)?/g
If, as in your string example, you want to extract these numbers by pairs, you need to use capturing groups:
/^Lat = (-?\d+(?:\.\d+)?)\s*?\r?\nLong = (-?\d+(?:\.\d+)?)\s*?$/mg
For each pair found, the latitude is in capture group 1 and the longitude in capture group 2.
Note that this second pattern assumes that the literal strings "Lat = " and "Long = " are at the start of a line, always in this order, and separated only by a newline (with the m modifier, ^ means "start of the line" and $ means "end of the line"). Trailing whitespace is allowed on each line by \s*?.
Example:
var yourString = "Lat = 38.7970308\nLong = -100.8665928";
var myRe = /^Lat = (-?\d+(?:\.\d+)?)\s*?\r?\nLong = (-?\d+(?:\.\d+)?)\s*?$/mg;
var m;
while ((m = myRe.exec(yourString)) !== null) {
console.log(m[1] + "\t" + m[2]+ "\n\n");
}
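If the values are needed as numbers rather than strings, the pair pattern can feed matchAll. A minimal sketch, assuming the textarea content follows the Lat/Long layout from the question; extractCoordinates and the sample text are made up for illustration.

```javascript
// Sketch: collect every Lat/Long pair and convert both parts to numbers.
function extractCoordinates(text) {
  const pairPattern = /^Lat = (-?\d+(?:\.\d+)?)\s*?\r?\nLong = (-?\d+(?:\.\d+)?)\s*?$/mg;
  return Array.from(text.matchAll(pairPattern), m => ({
    lat: Number(m[1]),
    long: Number(m[2]),
  }));
}

const textareaValue = 'Lat = 38.7970308\nLong = -100.8665928\nLat = 40.5\nLong = -99';
console.log(extractCoordinates(textareaValue));
```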

How to "place-hold" regex matches in array using javascript

If I have a multiple match regex like:
var matches = string.match(/\s*(match1)?\s*(match2)?(match3)?\s*/i);
and if my string that I am testing is like this:
var string = "match1 match3";
is there a way to output this array values:
matches[1] = "match1";
matches[2] = "";
matches[3] = "match3";
Notice: what I would like is for the regex to match the entire thing but "place-hold" in the array the matches it does not find.
Hope this makes sense. Thanks for the help!
There already is a "placeholder". Unmatched groups populate the array index matching the group number with an undefined value, e.g.
var someString = "match2";
var matches = someString.match(/\s*(match1)?\s*(match2)?(match3)?\s*/i);
matches now has
["match2", undefined, "match2", undefined]
Element 0 is the complete regex match and elements 1-3 are the individual groups
So you can do for example...
// check if group1
if (typeof matches[1] !== 'undefined') { /* group 1 matched */ }
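To get empty strings instead of undefined, as asked in the question, the groups can be mapped after matching. A small sketch using the string from the question; the variable names are only for illustration.

```javascript
const source = 'match1 match3';
const found = source.match(/\s*(match1)?\s*(match2)?(match3)?\s*/i);

// Drop element 0 (the full match) and replace undefined groups with ''.
const groups = found === null ? [] : found.slice(1).map(g => (g === undefined ? '' : g));
console.log(groups); // → ['match1', '', 'match3']
```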
When you want to compare the string to the regexes, just do an Array join. Something like,
matches[1] = "match1";
matches[2] = "";
matches[3] = "match3";
var mystring = matches.join("");
The joining character could be anything. You could also do,
var mystring = matches.join(" ");
EDIT:
Not sure from the problem description, but I think you want the regex to output an array. Something like
var text = "First line\nSecond line";
var regex = /(\S+) line\n?/y;
would give,
var match = regex.exec(text);
console.log(match[1]); // prints "First"
console.log(regex.lastIndex); // prints 11
