Custom javascript match not working on specific variables

Custom javascript match not working on specific variables - javascript

I have a very specific issue on custom javascript code in gtm.
I have a regex match that defines the output variable.
It works for 99% of the tests, but bugs on a few ones, and I have no idea why.
The goal is to format output value to use as a content grouping in GA.
Here are 2 of the strings that don't trigger the regex:
Checkout-Login
__SYSTEM__Page-Render
And here is the code :
if({{page_template}}.test(/(__SYSTEM__Page-Render|Blog)/g)){grouping = "Content pages";}
else if({{page_template}}.test(/(Checkout|Shipping|Payment|Order)/g)){grouping = "Checkout";}
Would you have any idea why it doesn't work ?
Thanks

Related

How does elasticsearch find escaped character or reserved character correctly?

Given that I have a string such as 'message-ID: 1394.00 This is Henry.Lin',
I want to use elasticsearch to find all the phrase or word contains '.'. In this case, 1394.00 and Henry.Lin are the words I am looking for. However, when I index my document with standard analyzer is not working. I understand that standard analyzer will escape such character. Therefore, I change the analyzer to ngram. Unfortunately, it is still not working. It would be great if someone can help me out.

You can add a custom character filter for dot. Replace "." with "dot". Just use the custom mapping like this:
"char_filter": {
"&_to_and": {
"type": "mapping",
"mappings": [ ".=>dot"]
}}
Please check this documentation for more details.
Now about, why ngram is not working ?
The question is how are you using ngram - as a tokenizer or token filter with some other analyzer? What's the min_gram, max_gram size? Checkout this example to clear the difference between two.
Also to understand more about how your data is getting indexes in elasticsearch and why not matching your query - try using termvectors api.
Finally I would not suggest you to use ngrams for solving this issue for following reasons - 1) n-grams are going to make your index larger , 2) They have totally different use case.

Partial matching a string against a regex

Suppose that I have this regular expression: /abcd/
Suppose that I wanna check the user input against that regex and disallow entering invalid characters in the input. When user inputs "ab", it fails as an match for the regex, but I can't disallow entering "a" and then "b" as user can't enter all 4 characters at once (except for copy/paste). So what I need here is a partial match which checks if an incomplete string can be potentially a match for a regex.
Java has something for this purpose: .hitEnd() (described here http://glaforge.appspot.com/article/incomplete-string-regex-matching) python doesn't do it natively but has this package that does the job: https://pypi.python.org/pypi/regex.
I didn't find any solution for it in js. It's been asked years ago: Javascript RegEx partial match
and even before that: Check if string is a prefix of a Javascript RegExp
P.S. regex is custom, suppose that the user enters the regex herself and then tries to enter a text that matches that regex. The solution should be a general solution that works for regexes entered at runtime.

Looks like you're lucky, I've already implemented that stuff in JS (which works for most patterns - maybe that'll be enough for you). See my answer here. You'll also find a working demo there.
There's no need to duplicate the full code here, I'll just state the overall process:
Parse the input regex, and perform some replacements. There's no need for error handling as you can't have an invalid pattern in a RegExp object in JS.
Replace abc with (?:a|$)(?:b|$)(?:c|$)
Do the same for any "atoms". For instance, a character group [a-c] would become (?:[a-c]|$)
Keep anchors as-is
Keep negative lookaheads as-is
Had JavaScript have more advanced regex features, this transformation may not have been possible. But with its limited feature set, it can handle most input regexes. It will yield incorrect results on regex with backreferences though if your input string ends in the middle of a backreference match (like matching ^(\w+)\s+\1$ against hello hel).

As many have stated there is no standard library, fortunately I have written a Javascript implementation that does exactly what you require. With some minor limitation it works for regular expressions supported by Javascript.
see: incr-regex-package.
Further there is also a react component that uses this capability to provide some useful capabilities:
Check input as you type
Auto complete where possible
Make suggestions for possible input values
Demo of the capabilities Demo of use

I think that you have to have 2 regex one for typing /a?b?c?d?/ and one for testing at end while paste or leaving input /abcd/
This will test for valid phone number:
const input = document.getElementById('input')
let oldVal = ''
input.addEventListener('keyup', e => {
if (/^\d{0,3}-?\d{0,3}-?\d{0,3}$/.test(e.target.value)){
oldVal = e.target.value
} else {
e.target.value = oldVal
}
})
input.addEventListener('blur', e => {
console.log(/^\d{3}-?\d{3}-?\d{3}-?$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">
And this is case for name surname
const input = document.getElementById('input')
let oldVal = ''
input.addEventListener('keyup', e => {
if (/^[A-Z]?[a-z]*\s*[A-Z]?[a-z]*$/.test(e.target.value)){
oldVal = e.target.value
} else {
e.target.value = oldVal
}
})
input.addEventListener('blur', e => {
console.log(/^[A-Z][a-z]+\s+[A-Z][a-z]+$/.test(e.target.value) ? 'valid' : 'not valid')
})
<input id="input">

This is the hard solution for those who think there's no solution at all: implement the python version (https://bitbucket.org/mrabarnett/mrab-regex/src/4600a157989dc1671e4415ebe57aac53cfda2d8a/regex_3/regex/_regex.c?at=default&fileviewer=file-view-default) in js. So it is possible. If someone has simpler answer he'll win the bounty.
Example using python module (regular expression with back reference):
$ pip install regex
$ python
>>> import regex
>>> regex.Regex(r'^(\w+)\s+\1$').fullmatch('abcd ab',partial=True)
<regex.Match object; span=(0, 7), match='abcd ab', partial=True>

You guys would probably find this page of interest:
(https://github.com/desertnet/pcre)
It was a valiant effort: make a WebAssembly implementation that would support PCRE. I'm still playing with it, but I suspect it's not practical. The WebAssembly binary weighs in at ~300K; and if your JS terminates unexpectedly, you can end up not destroying the module, and consequently leaking significant memory.
The bottom line is: this is clearly something the ECMAscript people should be formalizing, and browser manufacturers should be furnishing (kudos to the WebAssembly developer into possibly shaming them to get on the stick...)
I recently tried using the "pattern" attribute of an input[type='text'] element. I, like so many others, found it to be a letdown that it would not validate until a form was submitted. So a person would be wasting their time typing (or pasting...) numerous characters and jumping on to other fields, only to find out after a form submit that they had entered that field wrong. Ideally, I wanted it to validate field input immediately, as the user types each key (or at the time of a paste...)
The trick to doing a partial regex match (until the ECMAscript people and browser makers get it together with PCRE...) is to not only specify a pattern regex, but associated template value(s) as a data attribute. If your field input is shorter than the pattern (or input.maxLength...), it can use them as a suffix for validation purposes. YES -this will not be practical for regexes with complex case outcomes; but for fixed-position template pattern matching -which is USUALLY what is needed- it's fine (if you happen to need something more complex, you can build on the methods shown in my code...)
The example is for a bitcoin address [ Do I have your attention now? -OK, not the people who don't believe in digital currency tech... ] The key JS function that gets this done is validatePattern. The input element in the HTML markup would be specified like this:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="90"
pattern="^(bc(0([ac-hj-np-z02-9]{39}|[ac-hj-np-z02-9]{59})|1[ac-hj-np-z02-9]{8,87})|[13][a-km-zA-HJ-NP-Z1-9]{25,34})$"
data-entry-templates="['bc099999999999999999999999999999999999999999999999999999999999','bc1999999999999999999999999999999999999999999999999999999999999999999999999999999999999999','19999999999999999999999999999999999']"
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Credit goes to this post: "RegEx to match Bitcoin addresses?
" Note to old-school bitcoin zealots who will decry the use of a zero in the regex here -it's just an example for accomplishing PRELIMINARY validation; the server accepting the address passed off by the browser can do an RPC call after a form post, to validate it much more rigorously. Adjust your regex to suit.]
The exact choice of characters in the data-entry-template was a bit arbitrary; but they had to be ones such that if the input being typed or pasted by the user is still incomplete in length, it will use them as an optimistic stand-in and the input so far will still be considered valid. In the example there, for the last of the data-entry-templates ('19999999999999999999999999999999999'), that was a "1" followed by 39 nines (seeing as how the regex spec "{25,39}" dictates that a maximum of 39 digits in the second character span/group...) Because there were two forms to expect -the "bc" prefix and the older "1"/"3" prefix- I furnished a few stand-in templates for the validator to try (if it passes just one of them, it validates...) In each template case, I furnished the longest possible pattern, so as to insure the most permissive possibility in terms of length.
If you were generating this markup on a dynamic web content server, an example with template variables (a la django...) would be:
<input id="forward_address"
name="forward_address"
type="text"
maxlength="{{MAX_BTC_ADDRESS_LENGTH}}"
pattern="{{BTC_ADDRESS_REGEX}}" {# base58... #}
data-entry-templates="{{BTC_ADDRESS_TEMPLATES}}" {# base58... #}
onkeydown="return validatePattern(event)"
onpaste="return validatePattern(event)"
required
/>
[Keep in mind: I went to the deeper end of the pool here. You could just as well use this for simpler patterns of validation.]
And if you prefer to not use event attributes, but to transparently hook the function to the element's events at document load -knock yourself out.
You will note that we need to specify validatePattern on three events:
The keydown, to intercept delete and backspace keys.
The paste (the clipboard is pasted into the field's value, and if it works, it accepts it as valid; if not, the paste does not transpire...)
Of course, I also took into account when text is partially selected in the field, dictating that a key entry or pasted text will replace the selected text.
And here's a link to the [dependency-free] code that does the magic:
https://gitlab.com/osfda/validatepattern.js
(If it happens to generate interest, I'll integrate constructive and practical suggestions and give it a better readme...)
PS: The incremental-regex package posted above by Lucas Trzesniewski:
Appears not to have been updated? (I saw signs that it was undergoing modification??)
Is not browserified (tried doing that to it, to kick the tires on it -it was a module mess; welcome anyone else here to post a browserified version for testing. If it works, I'll integrate it with my input validation hooks and offer it as an alternative solution...) If you succeed in getting it browserfied, maybe sharing the exact steps that were needed would also edify everyone on this post. I tried using the esm package to fix version incompatibilities faced by browserify, but it was no go...

I strongly suspect (although I'm not 100% sure) that general case of this problem has no solution the same way as famous Turing's "Haltin problem" (see Undecidable problem). And even if there is a solution, it most probably will be not what users actually want and thus depending on your strictness will result in a bad-to-horrible UX.
Example:
Assume "target RegEx" is [a,b]*c[a,b]* also assume that you produced a reasonable at first glance "test RegEx" [a,b]*c?[a,b]* (obviously two c in the string is invalid, yeah?) and assume that the current user input is aabcbb but there is a typo because what the user actually wanted is aacbbb. There are many possible ways to fix this typo:
remove c and add it before first b - will work OK
remove first b and add after c - will work OK
add c before first b and then remove the old one - Oops, we prohibit this input as invalid and the user will go crazy because no normal human can understand such a logic.
Note also that your hitEnd will have the same problem here unless you prohibit user to enter characters in the middle of the input box that will be another way to make a horrible UI.
In the real life there would be many much more complicated examples that any of your smart heuristics will not be able to account for properly and thus will upset users.
So what to do? I think the only thing you can do and still get reasonable UX is the simplest thing you can do i.e. just analyze your "target RegEx" for set of allowed characters and make your "test RegEx" [set of allowed chars]*. And yes, if the "target RegEx" contains . wildcart, you will not be able to do any reasonable filtering at all.

create word search puzzle using jquery for non english characters

I am trying to create a word search puzzle game using this
http://jsfiddle.net/sirmarcio/jtn0mchd/
this is the problem area
$(document).ready( function () {
var words = "Raúl,María,heißt,TOUCH";
$("#cuadricula").wordsearchwidget({"wordlist" : words,"gridsize" : 10});
});
i am not pasting whole code as it is available on JSfiddle, its not my code i am just using it for reference
and referring this
Placing words in table grid in word search puzzle?
as you can see i am trying few non english words, while words like
Raúl,María work properly
heißt
ends up as HEISST ,
does javascript and for that matter jquery support such words , and how can i use heißt as it is, any reference so that these words donot end up splitting the way they are.

The HTTP headers must indicate UTF-8, if you are using UTF-8.
This appears to be a problem with jsfiddle not recognizing that you have UTF-8 in your copy/paste snippet and then sending you the JS responses without the header.
Naturally, if you have problems with headers, you're not going to want to be testing it using third party vendors, definitely not jsfiddle, at least.

How to filter emojis from string jquery/javascript?

I'm using the following to exclude emojis/emoticons from a string in php. How do I do the same with javascript or jQuery?
preg_replace('/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F6FF}][\x{FE00}-\x{FEFF}]?/u', '', $text);
This is what I try to do
$('#edit.popup .btn.save').live('click',function(e) {
var item_id = $(this).attr('id');
var edited_text = $('#edit.popup textarea').val().replace(/([0-9|#][\x{20E3}])|[\x{00ae}|\x{00a9}|\x{203C}|\x{2047}|\x{2048}|\x{2049}|\x{3030}|\x{303D}|\x{2139}|\x{2122}|\x{3297}|\x{3299}][\x{FE00}-\x{FEFF}]?|[\x{2190}-\x{21FF}][\x{FE00}-\x{FEFF}]?|[\x{2300}-\x{23FF}][\x{FE00}-\x{FEFF}]?|[\x{2460}-\x{24FF}][\x{FE00}-\x{FEFF}]?|[\x{25A0}-\x{25FF}][\x{FE00}-\x{FEFF}]?|[\x{2600}-\x{27BF}][\x{FE00}-\x{FEFF}]?|[\x{2900}-\x{297F}][\x{FE00}-\x{FEFF}]?|[\x{2B00}-\x{2BF0}][\x{FE00}-\x{FEFF}]?|[\x{1F000}-\x{1F6FF}][\x{FE00}-\x{FEFF}]?/u, '');
$('#grid li.image#' + item_id + ' img').attr('data-text', edited_text);
});
I found this suggestion in another post on Stack Overflow, but it's not working. It's still allowing emojis from ex ios.
.replace(/([\uE000-\uF8FF]|\uD83C[\uDF00-\uDFFF]|\uD83D[\uDC00-\uDDFF])/g, '')
What I try to achieve is to not allow emojis in textfield, and if an emoji is inserted (from ex ios keyboard) it will be replaced by nothing. It works with php. Someone here who can help me out with this?

Based on the answer from mb21, this regex did the job. No loop required!
/[\uD800-\uDBFF][\uDC00-\uDFFF]/g

As pointed out in this answer, JavaScript doesn't support Unicode code points outside the Basic Multilingual Plane (where iOS emojis lie).
I highly recommend reading The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!). Then you'll understand what was meant with:
So some indirect approach is needed. Cf. to JavaScript strings outside of the BMP.
For example, you could look for code points in the range [\uD800-\uDBFF] (high surrogates), and when you find one, check that the next code point in the string is in the range [\uDC00-\uDFFF] (if not, there is a serious data error), interpret the two as a Unicode character, and replace them by whatever you wish to put there. This looks like a job for a simple loop through the string, rather than a regular expression.

Is AngularJS parsing the value incorrectly?

I have an extremely simple example here: http://jsfiddle.net/daylight/hqHSr/
To try it, just go to the fiddle and click the [Add Item] button to add some rows.
Next, click the check box next to any item and you'll see something similar to the following:
Problem: Only Displays Numeric Part
The problem is that the value should display the entire string shown in the row. In the example that means it should display: 86884-LLMUL.
Notice that it only displays the numeric part of the value.
If you look at the control you'll see that I'm using an input of type="text".
Also, if you look at the model (simpleItem) object you'll see that the one property it has is a string.
The JavaScript for the model class looks like:
function simpleItem(id) {
this.id = id;
}
My Attempt To Force to String Type
When I generate each of the simpleItems I even go so far as to set them to a character when I call the constructor (just to force the id to be set to a string type).
Here's the code that initializes each of the simpleItem ids:
currentItem.id = getRandom(100000).toString() + "-" + getRandomLetters(5).toUpperCase();
You can see the rest of the code in the fiddle, but the thing is I generate a random value and concatenate the value together with a hyphen and 5 letters. It's just a silly little piece of code for this sample.
But now, here is the part where it gets really odd.
If I simply change the hyphen - to another character like an uppercase X I get an error each time I click on the checkbox.
Here's the changed code and the new output, which you can see at the revised fiddle: fiddle version 2
currentItem.id = getRandom(100000).toString() + "X" + getRandomLetters(5).toUpperCase();
Also, now if you open Dev Tools in your browser you'll see in the console that Angular is now reporting an error each time you click the [Add Item] button. It looks like:
Adding Single-quotes ?Fixes? It
If you go up to the HTML and alter the following line from this:
ng-init="itemId ={{l.id.toString()}}"
to this
ng-init="itemId ='{{l.id.toString()}}'"
Now when you run it, the error will go away and it will work as you can see at the updated fiddle here: fiddle Version 3
Angular : Converts Hyphen to Minus Sign?
You see, Angular seems to be converting it to a numeric, attempting to do math on it (parsing the hyphen as a minus sign) and then truncating the string portion. That all seems to work when I use a hyphen, but when I use a X then Angular chokes.
Here's what it looks like when you add the single-quotes - of course the angular errors in Dev Tools console go away too.
Angular Forces to Numeric Type?
Why would this occur in Angular? It's as if it is attempting to force my string value to a numeric even though the INPUT element is type text and the JavaScript var is type string.
Anyone have ideas about this?
What About the Asterisk (multiplication symbol)?
Right as I was completing this I wondered what would happen if I changed the - to a * and ran it again. This time I saw the error below, which is indicative that something is attempting to convert to numeric.

This is the expected behavior. Angular is merely interpolating the text you have in your scope into the ng-init expression using scope.$eval and then executing that expression. This has very little to do with what is the type of the input box of the rest of the surrounding context.
It is definitely not desirable that Angular should wrap any interpolation it does in quotes, it'll break its use in all other places such as class="my-class {{dynamic-class}}".

Replace your ng-init with
ng-init="itemId =l.id.toString()"
In following with the docs, you should only use init in special circumstances anyway, you should rely on your controller for this. http://docs.angularjs.org/api/ng.directive:ngInit

I think we're just getting confused with Angular's weirdness. Basically, you're giving angular a string which it's turning into a javascript expression because it's in a {{}}. It's already, explicitly, a string (between the double-quotes):
ng-init="itemId ={{l.id.toString()}}"
It's apparently ignoring the fact that you're saying "hey, no really, this is a real string" with your l.id.toString(). It doesn't care. It's already a string and is going to evaluate it.
Just use the single quotes?
If you ng-init itemId={{undefined===undefined}}, what would you expect to happen? (it prints "true" in the alert).
Same with this: (undefined === undefined is in quotes) ng-init itemId={{'undefined===undefined'}}; prints true in the alert.

ng-init expects an angular expression. You don't have to use curly brackets there. You can simply write it like this:
ng-init="itemId=l.id" ng-click="checkBoxClicked(itemId)"

Develop Reference

JavaScript is the programming language of the Web.