Backreference each character - javascript

For the sake of simplicity & learning something new, please don't suggest using two separate replace functions. I know that's an option but I would rather also know how to do this (or if it's not possible).
'<test></test>'.replace(/<|>/g,'$&'.charCodeAt(0))
This is what I've got so far. This sample code is, as you can tell, for another piece of code to escape HTML entities while still using innerHTML (because I do intend to include a few HTML entities such as small images, so again please don't suggest textContent).
Since I'm trying to replace both < and >, the problem is converting each individual one to their respective character codes. Since regular expressions allow for this "OR" condition as well as backreferences to each one, I'm hoping there's a way to get the reference of each individual character as they're replaced. $& will return <><> (because they're replaced in that order), but I don't know how to get them as they're replaced and take their character codes for the HTML entities. The problem is, I don't know what to use in this case if anything.
If that explanation wasn't clear enough, I want it to be something like this (this is obviously not going to work, but it'll best convey what I mean):
Assuming x is the index of the character being replaced,
'<test></test>'.replace(/<|>/g,'$&'.charCodeAt(x))
Hopefully that makes more sense. So, is this actually possible in some way?

'<test></test>'.replace(/[<>]/g,function(a) {return '&#'+a.charCodeAt(0)+';';});
I've put the characters in square brackets (a character class). That way you can add whatever other characters you want.
The above will return:
&#60;test&#62;&#60;/test&#62;

Related

Find all urls except markdown link to autolink

I'm looking for a way to find all URLs and replace them with a markdown link. However, my string already contains some URLs wrapped in markdown syntax, so I need to ignore those cases. Is it possible to do that with a JS regex?
Here's what I currently have:
(?:[^\]\)]|^)(htt\ps?:\/\/(www\.)?[-a-zA-Z0-9#:%._\+~#=]{1,256}\.[a-zA-Z0-9]{1,6}\b([-a-zA-Z0-9#:%_\+.~#?&//=]*))
Only the last line of text shouldn't be captured here:
https://regex101.com/r/qgm8jN/1
I truncated your regex a little bit (I was trying to remove variables), feel free to add stuff back as you see fit.
https://regex101.com/r/iVQvEu/1
Primarily, what you need to know about is negative lookbehinds (http://www.regular-expressions.info/lookaround.html)
(?<!not)website
will match website as long as it is not preceded by not.
So in your case, I used (?<!\]\() which looks for ]( and will return the website as long as those markdown symbols are not directly in front of it.
Again feel free to add/subtract to it, but that should at least point you in the right direction.
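A rough sketch of how that lookbehind could be used from JavaScript, assuming an engine with lookbehind support (ES2018 or later). The function name autolink and the deliberately simplified URL pattern are my own assumptions, not the regex from the question:

```javascript
// Hypothetical helper: wraps bare URLs in markdown link syntax.
function autolink(text) {
  // (?<!\]\() rejects a URL that directly follows "](", i.e. one that is
  // already the target of a markdown link. The URL part is simplified:
  // it runs until whitespace, "]" or ")".
  const bareUrl = /(?<!\]\()https?:\/\/[^\s\])]+/g;
  return text.replace(bareUrl, (url) => `[${url}](${url})`);
}

console.log(autolink('See https://example.org and [docs](https://example.com/docs)'));
// → See [https://example.org](https://example.org) and [docs](https://example.com/docs)
```

This sketch does not handle a URL used as the link text inside [ ], or trailing punctuation after a bare URL; add those cases back as you need them.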

Can't figure out regex match, how far off am I?

Super basic question (which I admittedly probably shouldn't post on SO, but here goes): I'm trying to create a regex to match (and exclude from a function) all links that begin with //images, but I cannot for the life of me get it to work.
Right now I have:
$('a:not([href^=/\/\images])').each(function(){
But since that clearly isn't correct, and I really don't know where to begin looking to set it straight, I'm asking for advice. I'm used to escaping strings etc., but this seems to be quite different(?).
Also, I tried using http://regexr.com/ and https://txt2re.com/ (didn't even understand how to work the latter...) but honestly I feel like I have no clue as to what to put where at this point since I'm basically guessing.
Again, sorry for posting such a basic question, but right now this is all gibberish to me, sadly.
This has little to do with jQuery or regular expressions. What you're asking about is a CSS selector which has different behaviour altogether. Your problem is with /\/\ vs \/\/.
Note that you don't actually have to escape the slashes here.
let images = document.querySelectorAll('a:not([href^="//images"])');
console.log(images);
That is not a regex, that is a jQuery selector. This is how you can match them:
$("a:not([href^='//images'])").each(function(){
console.log($(this).attr('href'));
});

RegExp must have \w+ and \s+ characters

I've been trying to create a RegExp that makes sure a user has entered at least one word and at least one space. I tried to use this:
/\w+\s+/
But that only matches when the space comes directly AFTER a word. I just want to make sure there are both in a string. They don't need to be in the order of the above RegExp.
How can I make the RegExp work, but without matching the order?
/(?=.*?\w)(?=.*?\s)/
?= means "look-ahead", and .*? means "any number of characters, as few as possible"
So "find any number of characters then a \w" and "find any number of characters then a \s"
Another thing to note about how this works: look-aheads are zero-width, meaning they don't consume any characters. Both are checked from the same position, which is why this can match in any order.
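A small sketch of that lookahead pattern in use (the variable name hasWordAndSpace and the sample strings are just for illustration):

```javascript
// Both lookaheads start from the same position, so the word character
// and the whitespace may appear in either order.
const hasWordAndSpace = /(?=.*?\w)(?=.*?\s)/;

console.log(hasWordAndSpace.test('hello world')); // → true
console.log(hasWordAndSpace.test(' a'));          // → true (space first)
console.log(hasWordAndSpace.test('hello'));       // → false (no whitespace)
console.log(hasWordAndSpace.test('   '));         // → false (no word character)
```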
You have two things:
Is there a word character?
Is there a space?
Two things.
str.match(/\w/)
str.match(/\s/)
So why are you trying to do them as one step?
if( str.match(/\w/) && str.match(/\s/))
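If more checks get added later, the two-step test above could be sketched as a list of patterns that must all match. The helper name matchesAll is hypothetical, and the patterns are assumed not to carry the g flag, since test() on a global regex keeps state between calls:

```javascript
// Hypothetical helper: true only if every pattern matches somewhere in str.
// Patterns should not use the g flag, because test() would then remember lastIndex.
function matchesAll(str, patterns) {
  return patterns.every((pattern) => pattern.test(str));
}

const checks = [/\w/, /\s/];

console.log(matchesAll('hello world', checks)); // → true
console.log(matchesAll('hello', checks));       // → false
```

Adding a third requirement, such as a digit, is then just a matter of pushing /\d/ onto the array.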
There are a lot of answers to my question. However, I do not want to simply pick the one that is upvoted. Please give a detailed explanation of why your regex works, and maybe why mine doesn't.
My answer provides the simplest solution. It is very clear to anyone reading it that we are checking "if it has a word character, and if it contains a space character". It is also very easy to expand on, such as if you want to add another check.
zyklus' answer (/(?=.*?\w)(?=.*?\s)/) is the fastest when speed-tested on a 50 KB string of input. In more common cases (i.e. 100 characters at most), this speed difference will be practically non-existent. It is twice as fast as my answer, but "2 * very small number = very small number". It's easy enough to add new test cases (just add another (?=.*something) block), but it is less obvious to a human reader what it does.
Jacob's answer ((\w+.*\s+)|(\s+.*\w+)) does quite literally what you asked, checking first if there is a word character and then a space character, then checks the other way around before failing. It works, however it is slower. Furthermore, if you decide to add a new test case, you'd get something like (\w+.*\s+.*\d+)|(\w+.*\d+.*\s)|(\s+.*\w+.*\d+)|(\s+.*\d+.*\w+)|(\d+.*\w+.*\s+)|(\d+.*\s+.*\w+). It only gets worse if you add a fourth test (24 arrangements to check) and is unreadably ugly. Do not use this answer.
Other answers are variants of existing ones.
If you need to do it in one RegEx for some reason:
(\w+.*\s+)|(\s+.*\w+)
Can be handy if you're working with a library that only enables you to use a single regular expression.

Javascript RegExp parse URL make hyperlinks ignore img src

I am not very good with regular expressions; sometimes I can figure them out, but...
I need to parse text strings (for a chat room project).
So, as you would imagine, any pasted URLs need to be converted to clickable hyperlinks.
I use this RegExp for that, cobbled together from examples I have found on the net. It appears to work quite well:
/[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:~;#'#%&.=\]\[\*\$\!\?\/\,]+/g
Now another part of my project has to insert images, in other words:
<img src="http://path/to/image" alt="alt" />
So I need the RegExp to ignore those, and I tried this:
/(?!src=")[A-Za-z]+:\/\/[A-Za-z0-9-_]+\.[A-Za-z0-9-_:~;#'#%&.=\]\[\*\$\!\?\/\,]+/g
But it doesn't work. Perhaps my expression is faulty or I am going about it the wrong way.
I may just mask out 'src="http' and run my expression then reapply what I masked out.
But before I do that I thought I would see if anyone here has any ideas.
Many thanks.
(?!src=")
is a negative lookahead. What you want there is a negative lookbehind, which JavaScript did not support before ES2018. In engines that do support it, the equivalent is (?<!src=").
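A hedged sketch of the lookbehind variant, assuming an ES2018+ engine. The function name linkify and the simplified URL character set are my own assumptions and not the pattern from the question. Note the \b after the lookbehind: without it the regex could skip the "h" and match "ttp://..." inside the src attribute.

```javascript
// Hypothetical helper: turns bare URLs into anchors but leaves src="..." alone.
function linkify(html) {
  // (?<!src=") rejects a URL that starts right after src=".
  // \b stops the match from starting in the middle of the scheme.
  // The URL body is simplified: it runs until whitespace, a quote or an angle bracket.
  const urlOutsideSrc = /(?<!src=")\b[a-z]+:\/\/[^\s"<>]+/gi;
  return html.replace(urlOutsideSrc, (url) => `<a href="${url}">${url}</a>`);
}

console.log(linkify('Look: http://example.com/page <img src="http://example.com/cat.png" alt="cat" />'));
// → Look: <a href="http://example.com/page">http://example.com/page</a> <img src="http://example.com/cat.png" alt="cat" />
```

The masking approach mentioned in the question would still be a reasonable fallback for engines without lookbehind.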

Urdu characters joining problem

I'm trying to write Urdu characters in a div using JavaScript. The problem is that they don't change shape when I write two characters that should take a different shape when written together. For example, ﺝ and ا written together should look like جا. They don't merge with each other. The same problem occurs with other characters. Please help!
I got the answer. I was actually copying characters that looked visually the same but were not the ones I needed. For example, one character from Urdu and another from Arabic will not join properly. So whenever you copy characters, even if they look the same, consider that they may have different Unicode code points in different languages.
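One way to spot such look-alike characters is to print their code points. This is only a sketch: the helper name describeCodePoints is made up, and it assumes the first character in the question is U+FE9D (an isolated presentation form of jeem) rather than the ordinary letter U+062C, which is what the text above appears to contain:

```javascript
// Hypothetical helper: lists the Unicode code point of each character in a string.
function describeCodePoints(text) {
  return Array.from(text, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

const isolatedJeem = '\uFE9D'; // presentation form, assumed to be the copied character
const plainJeem = '\u062C';    // the regular letter jeem, which joins with its neighbours
const alef = '\u0627';

console.log(describeCodePoints(isolatedJeem + alef)); // → [ 'U+FE9D', 'U+0627' ]
console.log(describeCodePoints(plainJeem + alef));    // → [ 'U+062C', 'U+0627' ]

// NFKC normalization maps the presentation form back to the regular letter.
console.log(isolatedJeem.normalize('NFKC') === plainJeem); // → true
```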
