Extract properties from a string value in GTM


I'm trying to pull information out of an old AWIN tag we have on the site using GTM. We're working on getting this pushed into the dataLayer, but that will take a while, so this is the interim step.
I've managed to pull the information into a string in GTM, which returns the value below (I've manually removed the values for this post):
'/* Do not change / var AWIN = {}; AWIN.Tracking = {};
AWIN.Tracking.Sale = {}; / Set your transaction parameters */
AWIN.Tracking.Sale.amount = "00.00"; AWIN.Tracking.Sale.channel =
"aw"; AWIN.Tracking.Sale.currency = "GBP"; AWIN
.Tracking.Sale.orderRef = "00000"; AWIN.Tracking.Sale.parts =
"DEFAULT:00.00" ; AWIN.Tracking.Sale.test = "0";
AWIN.Tracking.Sale.voucher = "";'
The only part I need is the value of AWIN.Tracking.Sale.parts.
The script we've created to extract this is:
function() {
  var awintrackstr = {{DOM - AWIN Image Full}};
  return awintrackstr.match(/AWIN.Tracking.Sale.parts = \"(.*)\";$/)[1];
}
However, this extracts everything after the value we need as well:
'DEFAULT:00:00"; AWIN.Tracking.Sale.test = "0"; AWIN.Tracking.Sal....
All the tests we've created show that the above should work, but it's not working in GTM.
Does anyone have any idea how to get this working in GTM? Again, all we're looking to extract is the part that says DEFAULT:00.00.
Thanks in advance

This is because of the "(.*)" part in your regular expression.
.* is greedy and matches any character, including other " characters, so it runs on to the last " that is still followed by the rest of your regular expression.
Replace "(.*)" with "([^"]*)"; this matches only characters that are not ", so the capture stops at the first closing quote.
I can also recommend using regex101.com whenever you need to write a regular expression. Using it, you will also notice that the " character has no special meaning in a JavaScript regular expression, so there is no need to escape it.
Edit: here is the modified version of your regular expression at work: https://regex101.com/r/TPUU6z/1
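
As a rough illustration of the fix above, here is a minimal sketch run against a shortened sample string modeled on the one in the question. The helper name extractParts and the sample string are assumptions for illustration; in GTM the input would come from the {{DOM - AWIN Image Full}} variable. The sketch also escapes the literal dots and drops the trailing $, on the assumption that the parts assignment is not the last thing in the string.

```javascript
// Sample input modeled on the string in the question (placeholder values).
var awinTrackStr =
  'AWIN.Tracking.Sale.orderRef = "00000"; ' +
  'AWIN.Tracking.Sale.parts = "DEFAULT:00.00"; ' +
  'AWIN.Tracking.Sale.test = "0"; AWIN.Tracking.Sale.voucher = "";';

// Hypothetical helper: returns the quoted value assigned to
// AWIN.Tracking.Sale.parts, or undefined when nothing matches.
function extractParts(str) {
  // [^"]* cannot cross a double quote, so the capture ends at the
  // first closing quote instead of the last one in the string.
  var m = str.match(/AWIN\.Tracking\.Sale\.parts\s*=\s*"([^"]*)"/);
  return m ? m[1] : undefined;
}

console.log(extractParts(awinTrackStr)); // DEFAULT:00.00
```

Returning undefined on a failed match, rather than indexing [1] directly, avoids a TypeError on pages where the tag is absent.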

Related

How to get a value from page source code from a function tag?

Here is the function from the source code:
function dosubmit()
{
  if (getObj("Frm_Username").value == "")
  {
    getObj("errmsg").innerHTML = "Username cannot be empty.";
    getObj("myLayer").style.visibility = "visible";
    return;
  }
  else
  {
    getObj("LoginId").disabled = true;
    getObj("Frm_Logintoken").value = "3";
    document.fLogin.submit();
  }
}
I want to get the value assigned to getObj("Frm_Logintoken"), because I can't read it from #Frm_Logintoken.
Using document.getElementById("#Frm_Logintoken") gives me null, because Frm_Logintoken only gets its value when I click submit.
<input type="hidden" name="Frm_Logintoken" id="Frm_Logintoken" value="">
I found this online: /getObj\("Frm_Logintoken"\).value = "(.*)";/g, but when I run it, it just gives me the same line back.
I also found another regular expression, but I don't know how to use it.
Example of a regular expression to search:
before_egrep='N1:getObj("Frm_Logintoken").value = "(\w+)"'
Here, N1 is assigned the value of the back reference, that is, the
expression in parentheses. \w+ matches one or more word characters;
\w is a synonym for "[_[:alnum:]]". Once again, pay attention to
the parentheses: this is the back reference. The source code fragment also
contains literal parentheses, and they need to be escaped.
I am trying to make an auto-login script that works in the background, so that the user never sees the login form, only the page after it.
I also found some code online that uses XHR, but I don't understand what it does.
The line that caught my attention is
/getObj\("Frm_Logintoken"\).value = "(.*)";/g
When I run it, it gives me the same line again.
Some notes:
I have tried document.getElementById("Frm_Logintoken").value, but it gives me an empty string "" because Frm_Logintoken only gets its value when I click submit.
The page will not accept even the correct password if the Frm_Logintoken value isn't the same as the one in the page.
Frm_Logintoken is a token generated by the page; it is incremented by one on each successful login.

I'm not sure this expression will solve your problem, but if you want to extract certain attributes and their values from the input tag, you could start with an expression similar to:
name="(.+?)"|id="(.+?)"|value="(.+)?"
which uses alternation to collect the values of several attributes at once, if that is what you want.
If this expression isn't what you want and you wish to modify it, you can test your changes at regex101.com.

To get your value you might use a capturing group ([^"]+) and a negated character class:
\bgetObj\("Frm_Logintoken"\)\.value = "([^"]+)";
For example:
let str = `getObj("Frm_Logintoken").value = "3";`;
let pattern = /\bgetObj\("Frm_Logintoken"\)\.value = "([^"]+)";/;
console.log(str.match(pattern)[1]); //3
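
If dosubmit is defined on the page, one way to obtain the source text to match against is Function.prototype.toString(), which returns the function's source code. The sketch below is only an illustration under that assumption: the trimmed dosubmit is a stand-in for the page's function, and extractLoginToken is a hypothetical helper name.

```javascript
// Stand-in for the page's function; only the relevant line is kept.
// getObj is never called here, so it does not need to exist.
function dosubmit() {
  getObj("Frm_Logintoken").value = "3";
}

// Hypothetical helper: pulls the quoted token out of a source string,
// or returns null when the assignment is not found.
function extractLoginToken(source) {
  var m = source.match(/\bgetObj\("Frm_Logintoken"\)\.value = "([^"]+)";/);
  return m ? m[1] : null;
}

// toString() yields the function's source text, which the regex scans.
console.log(extractLoginToken(dosubmit.toString())); // 3
```

If the page is fetched in the background instead, the same helper could be applied to the response text rather than to dosubmit.toString().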

Use only one of the characters in regular expression javascript

I guess this should be something very easy, but I've been stuck on it for at least 2 hours and I think it's better to ask the question here.
So, I've got a regular expression /&t=(\d*)$/g, and it works fine as long as the URL contains &t rather than ?t. I've tried different combinations such as /\?|&t=(\d*)$/g, /\?t=(\d*)$|/&t=(\d*)$/g, /(&|\?)t=(\d*)$/g and various others, but I haven't got the expected result, which is to match either the ?t=... or the &t=... part of the URL (whichever is entered in the input).
Thanks for the responses. I think I need to add some details here. I'm actually working on this piece of code:
var formValue = $.trim($("#v").val());
var formValueTime = /&t=(\d*)$/g.exec(formValue);
if (formValueTime && formValueTime.length > 1) {
  formValueTime = parseInt(formValueTime[1], 10);
  formValue = formValue.replace(/&t=\d*$/g, "");
}
and I want to get the t value whether it is passed with &t or ?t, in references like youtu.be/hTWKbfoikeg?t=82 or the similar youtu.be/hTWKbfoikeg&t=82.

To replace, you may use
var formValue = "some?some=more&t=1234"; // $.trim($("#v").val());
var formValueTime;
formValue = formValue.replace(/[&?]t=(\d*)$/g, function($0, $1) {
  formValueTime = parseInt($1, 10);
  return '';
});
console.log(formValueTime, formValue);
To grab the value, you may use
/[?&]t=(\d*)$/g.exec(formValue);
Pattern details
[?&] - a character class matching ? or &
t= - t= substring
(\d*) - Group 1 matching zero or more digits
$ - end of string
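
To see the pattern handle both URL forms from the question, here is a small sketch. The helper name splitTimeParam is hypothetical, and the sketch uses \d+ rather than \d* so that a bare "?t=" with no digits is left alone; that is a deliberate deviation from the pattern above, not part of the original answer.

```javascript
// Hypothetical helper: splits a trailing ?t=NN or &t=NN off a URL.
function splitTimeParam(url) {
  var time = null;
  // [?&] accepts either separator; \d+ requires at least one digit.
  var match = /[?&]t=(\d+)$/.exec(url);
  if (match) {
    time = parseInt(match[1], 10);
    url = url.slice(0, match.index); // drop the matched suffix
  }
  return { url: url, time: time };
}

var withQuestionMark = splitTimeParam("youtu.be/hTWKbfoikeg?t=82");
var withAmpersand = splitTimeParam("youtu.be/hTWKbfoikeg&t=82");
console.log(withQuestionMark.url, withQuestionMark.time); // youtu.be/hTWKbfoikeg 82
console.log(withAmpersand.url, withAmpersand.time);       // youtu.be/hTWKbfoikeg 82
```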

/\?t=(\d*)|\&t=(\d*)$/g
You inverted the escape character in the second regex.
http://regexr.com/3gcnu

I want to thank you all for trying to help. Special thanks to @Wiktor Stribiżew, who gave the closest answer.
Now the piece of code I needed looks exactly like this:
/[?&]t=(\d*)$/g.exec(formValue);
So it's the [?&] part that solved the problem.
I use the resulting array later, so /\?t=(\d*)|\&t=(\d*)$/g doesn't help: I get an array like [&t=50,,50] when the reference is of the & type and the expected [?t=50,50] when it is of the ? type, simply because of the order of the alternatives in the RegExp.
So, if you're looking for a RegExp that accepts either of two characters in one position while the rest of the pattern stays the same, you can use a character class such as [?&], where the wanted characters are ? and &.
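
The difference in capture-group positions between the alternation and the character class can be checked directly. This is a minimal sketch using the URLs from the question with t=50; the variable names are only illustrative.

```javascript
// Two capture groups: which one is filled depends on the alternative hit.
var alternation = /\?t=(\d*)|&t=(\d*)$/;
// One capture group shared by both separators.
var charClass = /[?&]t=(\d*)$/;

var q = alternation.exec("youtu.be/hTWKbfoikeg?t=50");
var a = alternation.exec("youtu.be/hTWKbfoikeg&t=50");
console.log(q[1], q[2]); // 50 undefined
console.log(a[1], a[2]); // undefined 50

// With the character class the value is always in group 1.
console.log(charClass.exec("youtu.be/hTWKbfoikeg?t=50")[1]); // 50
console.log(charClass.exec("youtu.be/hTWKbfoikeg&t=50")[1]); // 50
```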

How to strip comments from Javascript using PHP

I want to remove the comments from this kind of script:
var stName = "MyName"; //I WANT THIS COMMENT TO BE REMOVED
var stLink = "http://domain.com/mydomain";
var stCountry = "United State of America";
What is the best way to accomplish this using PHP?

The best way is to use an actual parser or write at least a lexer yourself.
The problem with regex is that it becomes enormously complex once you account for everything you have to.
For example, Cagatay Ulubay's suggested regexes /\/\/[^\n]?/ and /\/\*(.*)\*\// will match comments, but they will also match a lot more, like
var a = '/* the contents of this string will be matched */';
var b = '// and here you will even get a syntax error, because the entire rest of the line is removed';
var c = 'and actually, the regex that matches multiline comments will span across lines, removing everything between the first "/*" and here: */';
/*
this comment, however, will not be matched.
*/
While it is rather unlikely that strings contain such sequences, the problem is real with inline regex:
var regex = /^something.*/; // You see the fake "*/" here?
The current scope matters a lot, and you can't possibly know the current scope unless you parse the script from the beginning, character for character.
So you essentially need to build a lexer.
You need to split the code into three different sections:
Normal code, which you need to output again, and where the start of a comment could be just one character away.
Comments, which you discard.
Literals, which you also need to output, but where a comment cannot start.
Now the only literals I can think of are strings (single- and double-quoted), inline regex and template strings (backticks), but those might not be all.
And of course you also have to take escape sequences inside those literals into account, because you might encounter an inline regex like
/^file:\/\/\/*.+/
in which a single-character based lexer would only see the regex /^file:\/ and incorrectly parse the following /*.+ as the start of a multiline comment.
Therefore upon encountering the second /, you have to look back and check if the last character you passed was a \. The same goes for all kinds of quotes for strings.
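
The three-way split described above (code, comments, literals) can be sketched as a small character-by-character scanner. This is written in JavaScript for illustration rather than PHP, the function name stripComments is hypothetical, and it is deliberately incomplete: it handles quoted strings and template strings with escape sequences, but it does not recognize inline regex literals or ${...} expressions inside template strings, which a real lexer would need.

```javascript
// Hypothetical sketch of a comment-stripping scanner.
// ASSUMPTION: the input contains no regex literals and no nested
// template expressions; those cases are not handled here.
function stripComments(src) {
  var out = "";
  var i = 0;
  while (i < src.length) {
    var ch = src[i];
    var next = src[i + 1];
    if (ch === '"' || ch === "'" || ch === "`") {
      // Literal: copy everything through the matching unescaped quote.
      var start = i;
      i++;
      while (i < src.length && src[i] !== ch) {
        if (src[i] === "\\") i++; // skip the escaped character
        i++;
      }
      i++; // consume the closing quote
      out += src.slice(start, i);
    } else if (ch === "/" && next === "/") {
      // Line comment: discard up to, but not including, the newline.
      while (i < src.length && src[i] !== "\n") i++;
    } else if (ch === "/" && next === "*") {
      // Block comment: discard through the closing */.
      var end = src.indexOf("*/", i + 2);
      i = end === -1 ? src.length : end + 2;
    } else {
      // Normal code: output unchanged.
      out += ch;
      i++;
    }
  }
  return out;
}

var sample = 'var stName = "MyName"; //REMOVE ME\nvar stLink = "http://domain.com/mydomain";';
console.log(stripComments(sample));
```

Because the string branch is checked before the comment branches, the // inside "http://domain.com/mydomain" is copied as part of the literal and never treated as a comment.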

I would go with preg_replace(). Assuming all comments are single line comments (// Comment here) you can start with this:
$JsCode = 'var stName = "MyName isn\'t \"Foobar\""; //I WANT THIS COMMENT TO BE REMOVED
var stLink = "http://domain.com/mydomain"; // Comment
var stLink2 = \'http://domain.com/mydomain\'; // This comment goes as well
var stCountry = "United State of America"; // Comment here';
$RegEx = '/(["\']((?>[^"\']+)|(?R))*?(?<!\\\\)["\'])(.*?)\/\/.*$/m';
echo preg_replace($RegEx, '$1$3', $JsCode);
Output:
var stName = "MyName isn't \"Foobar\"";
var stLink = "http://domain.com/mydomain";
var stLink2 = 'http://domain.com/mydomain';
var stCountry = "United State of America";
This solution is far from perfect and might have issues with strings containing "//" in them.

regular expression not matching on javascript but matches on other languages

I've been working with regular expressions for a while now and have been using RegEx101 to test my patterns, and it has never failed (yet).
I am trying to replace Android Emojicon strings with the appropriate HTML image tag via regex. The code matches without issue on the site above and even works in PHP, but somehow it doesn't match at all in JavaScript. Here is my code:
function loadEmojisInMessage(message) {
  var regExp = /({emoji:(.*?)})/g; //var regExp = new RegExp("({emoji:(.*?)})","g");
  message.replace(regExp, '<img src="emojis/emoji_$2.png" id="$2" class="emojicon" />').toString();
  return message;
}
At first I thought I was doing something wrong, so I changed the code to this, just for testing:
function loadEmojisInMessage(message) {
  var regExp = /({emoji:(.*?)})/g; //var regExp = new RegExp("({emoji:(.*?)})","g");
  message.replace(regExp, 'test?').toString();
  return message;
}
But even this does not replace anything! (My guess is that it has trouble matching the pattern in the string.)
example strings to match :
{emoji:em_1f50f}
What I am trying to do here is replace the entire string (above) with an HTML image tag, using the second capture group (the second pair of parentheses) for the URL string.
Best Regards
UPDATE :
I forgot to add first matching bracket, sorry!
Also, you can test the pattern here

You're not assigning the result of the replace() method call back to the variable message. Strings are immutable, so replace() returns a new string; if you don't assign or return it, message remains unchanged.
message = message.replace(regExp, '<img src="emojis/emoji_$2.png" id="$2" class="emojicon" />');
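
Put together, a corrected version of the function might look like the sketch below. It returns the result of replace() directly instead of reassigning message. The braces are escaped here as a precaution; that is an addition in this sketch, not something the answer requires.

```javascript
function loadEmojisInMessage(message) {
  // Group 1 is the whole {emoji:...} token, group 2 is the emoji id.
  var regExp = /(\{emoji:(.*?)\})/g;
  // replace() returns a new string, so return that result.
  return message.replace(
    regExp,
    '<img src="emojis/emoji_$2.png" id="$2" class="emojicon" />'
  );
}

console.log(loadEmojisInMessage("{emoji:em_1f50f}"));
// <img src="emojis/emoji_em_1f50f.png" id="em_1f50f" class="emojicon" />
```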

JS / RegEx to remove characters grouped within square braces

I hope I can explain myself clearly here and that this is not too much of a specific issue.
I am working on some javascript that needs to take a string, find instances of chars between square brackets, store any returned results and then remove them from the original string.
My code so far is as follows:
parseLine : function(raw)
{
var arr = [];
var regex = /\[(.*?)]/g;
var arr;
while((arr = regex.exec(raw)) !== null)
{
console.log(" ", arr);
arr.push(arr[1]);
raw = raw.replace(/\[(.*?)]/, "");
console.log(" ", raw);
}
return {results:arr, text:raw};
}
This seems to work in most cases. If I pass in the string [id1]It [someChar]found [a#]an [id2]excellent [aa]match then it returns all the chars from within the square brackets and the original string with the bracketed groups removed.
The problem arises when I use the string [id1]It [someChar]found [a#]a [aa]match.
It seems to fail when only a single letter (and space?) follows a bracketed group, and it starts missing groups, as you can see in the log if you try it out. It also breaks if I use groups back to back, like [a][b], which I will need to do.
I'm guessing this is my RegEx - begged and borrowed from various posts here as I know nothing about it really - but I've had no luck fixing it and could use some help if anyone has any to offer. A fix would be great but more than that an explanation of what is actually going on behind the scenes would be awesome.
Thanks in advance all.

You could use the replace method with a function to simplify the code and run the regexp only once:
function parseLine(raw) {
  var results = [];
  var parsed = raw.replace(/\[(.*?)\]/g, function(match, capture) {
    results.push(capture);
    return '';
  });
  return { results : results, text : parsed };
}

The problem is due to the lastIndex property of the regex /\[(.*?)]/g; not resetting, since the regex is declared as global. When the regex has global flag g on, lastIndex property of RegExp is used to mark the position to start the next attempt to search for a match, and it is expected that the same string is fed to the RegExp.exec() function (explicitly, or implicitly via RegExp.test() for example) until no more match can be found. Either that, or you reset the lastIndex to 0 before feeding in a new input.
Since your code is reassigning the variable raw on every loop, you are using the wrong lastIndex to attempt the next match.
The problem will be solved when you remove g flag from your regex. Or you could use the solution proposed by Tibos where you supply a function to String.replace() function to do replacement and extract the capturing group at the same time.
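
The lastIndex behaviour described above can be reproduced in a few lines, followed by a sketch of the loop with the g flag removed. The separate results and m variables are naming choices made for this sketch; the question's code reuses arr for both.

```javascript
// Reproduce the bug: a global regex remembers where the last match ended.
var globalRegex = /\[(.*?)\]/g;
var text = "[a][b]rest";
var first = globalRegex.exec(text);
console.log(first[1], globalRegex.lastIndex); // a 3
text = text.replace(/\[(.*?)\]/, ""); // text is now "[b]rest"
// The next search starts at index 3 of the shortened string, past "[b]".
var second = globalRegex.exec(text);
console.log(second); // null

// Sketch of the fix: without g, every exec() starts at index 0.
function parseLine(raw) {
  var results = [];
  var regex = /\[(.*?)\]/;
  var m;
  while ((m = regex.exec(raw)) !== null) {
    results.push(m[1]);
    raw = raw.replace(regex, ""); // remove the group just captured
  }
  return { results: results, text: raw };
}

var parsed = parseLine("[id1]It [someChar]found [a#]a [aa]match");
console.log(parsed.results.join(","), "|", parsed.text);
// id1,someChar,a#,aa | It found a match
```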

You need to escape the last bracket: \[(.*?)\].
