Match words that contain specific symbols - javascript

It seems I'm stuck on something simple, but I was unable to find a similar question on Stack Overflow.
Using JavaScript/jQuery/regexp, I want to match words that contain specific symbols in a string.
E.g. in the given string 'check out mydomain/folder/#something', if I run this kind of search with the symbols 'folder/#', it must return the whole mydomain/folder/#something.
In fact, I want to use this to replace a whole link in a string with some kind of widget button, but as those links are pretty specific (i.e. I know that they will contain folder/#), using a library for this task would be overkill.

Here is the regexp you are looking for: /[\w\/]*\/folder\/#[\w\/]*/
var str = 'check out mydomain/folder/#something';
// returns ["mydomain/folder/#something"]
str.match(/[\w\/]*\/folder\/#[\w\/]*/)
Or, for a more robust version of it: /[\w\/\.]*\/folder\/#(?:[\w\/]|\.\w+)*/
This one accepts dots in the domain and file names but leaves out a trailing dot, such as a full stop at the end of a sentence.
For instance, 'check out my.domain/foo/folder/#some.thing.' will return ["my.domain/foo/folder/#some.thing"]
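Since the stated goal is to swap each link for a widget button, the same pattern can be passed to replace with a callback. This is only a sketch: the toWidget helper, the widget-button class name and the generated markup are assumptions for illustration, not part of the question.

```javascript
// Pattern from the answer above, with the g flag so every link is replaced.
const linkPattern = /[\w\/\.]*\/folder\/#(?:[\w\/]|\.\w+)*/g;

// Hypothetical helper: wraps a matched link in a button-like anchor.
function toWidget(link) {
  return '<a class="widget-button" href="//' + link + '">' + link + '</a>';
}

const text = 'check out my.domain/foo/folder/#some.thing.';
// replace() calls toWidget with the matched text as its first argument.
const result = text.replace(linkPattern, toWidget);
console.log(result);
```

The trailing full stop stays outside the generated markup because the pattern does not consume it.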

Related

How do I replace string within quotes in javascript?

I have this in a JavaScript/jQuery string (the string is grabbed from the value of an HTML element, $('#shortcode'), which can change if the user clicks some buttons):
[csvtohtml_create include_rows="1-10"
debug_mode="no" source_type="visualizer_plugin" path="map"
source_files="bundeslander_staple.csv" include cols="1,2,4" exclude cols="3"]
In a textbox (named incl_sc) I have the value:
include cols="2,4"
I want to replace include_cols="1,2,4" from the above string with the value from the textbox.
so basically:
How do I replace include_cols values here? (include_cols="2,4" instead of include_cols="1,2,4") I'm great at many things but regex is not one of them. I guess regex is the thing to use here?
I'm trying this:
var s = $('#shortcode').html();
//I want to replace include cols="1,2,4" exclude cols="3"
//with include_cols="1,2" exclude_cols="3" for example
s.replace('/([include="])[^]*?\1/g', incl_sc.val() );
but I don't get any replacement at all (the string s is the same string as $("#shortcode").html()). Obviously I'm doing something really dumb. Please help :-)
In short what you will need is
s.replace(/include cols="[^"]+"/g, incl_sc.val());
There were a couple of problems with your code:
To use a regex with String.prototype.replace, you must pass a regex as the first argument, but you were actually passing a string.
This is a regex literal: /regex/, while this is just a string: '/actually a string/'
In the text you supplied in your question, include_cols is written as include cols (with a space).
And your regex was malformed. I recommend testing them on this website, where you can also learn more if you want.
The code above will replace the part include cols="1,2,4" with whatever is in the textbox, regardless of what's between the quotes (as long as it doesn't contain another quote). Note that replace returns a new string rather than modifying s, so assign the result.
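The difference between a quoted pattern and a regex literal can be checked without jQuery. In this sketch the shortcode and textboxValue constants are stand-ins (assumed values) for $('#shortcode').html() and incl_sc.val().

```javascript
// Stand-ins for $('#shortcode').html() and incl_sc.val() (assumed values).
const shortcode = '[csvtohtml_create include_rows="1-10" include cols="1,2,4" exclude cols="3"]';
const textboxValue = 'include cols="2,4"';

// A quoted pattern is a plain string: it is searched for literally and never found.
const unchanged = shortcode.replace('/include cols="[^"]+"/g', textboxValue);

// A regex literal is matched as a pattern; replace() returns a new string.
const updated = shortcode.replace(/include cols="[^"]+"/g, textboxValue);

console.log(unchanged === shortcode); // true
console.log(updated);
// [csvtohtml_create include_rows="1-10" include cols="2,4" exclude cols="3"]
```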
First of all, I think you need to remove the quotes around the regex and fix the pattern a little.
// [^"]* stops at the closing quote; a greedy .* would run on to the last quote in the string
const r = /(include_cols=")([^"]*)(")/g;
s = s.replace(r, `$1${incl_sc.val()}$3`);
Basically, I group the first and last parts in order to include them in the replacement. You can also avoid creating the first and last groups and put them literally in the last argument of the replace function, like this:
const r = /include_cols="[^"]*"/g;
s = s.replace(r, `include_cols="${incl_sc.val()}"`);

JS: Check if word "handover" contains "hand"

I'm working on this simple, straightforward text content filtering mechanism on our post commenting module where people are prohibited from writing foul, expletive words.
So far I'm able to compare (word by word, using .includes()) comment contents against the blacklisted words we have in the database. But to save space, time and effort in entering database entries for each word such as 'Fucking' and 'Fuck', I want to create a mechanism where we check if a word contains a blacklisted word.
This way, we just enter 'Fuck' in the database. And when a visitor's comment contains 'Fucking' or 'Motherfucker', the function will automatically detect that there is a word in the comment that contains 'fuck' and then perform the necessary actions.
I've been thinking of integrating .substring() but I guess that's not what I need.
Btw, I'm using React (in case you know of any built-in functions). As much as possible, I want to avoid using libraries for this mechanism.
Thanks a heap!
"handover".indexOf("hand")
It will return the index of the substring if it exists, otherwise -1.
To ignore case, you can define all your blacklisted words in lower case and then use this:
"HANDOVER".toLowerCase().indexOf("hand")
To detect whether a string has another string inside it, you can simply use the .includes method. It does not work on a word-by-word basis but checks for a sequence of characters, so it should meet your requirements. It returns a boolean indicating whether the string is inside the other string.
var sentence = 'Stackoverflow';
console.log(sentence.includes("flow"));
You were on the right track with .includes()
console.log('handover'.includes('hand'));
Returns true
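Putting the answers together, a case-insensitive check against a whole blacklist might look like the sketch below. The findBlacklisted helper and the placeholder entries are assumptions for illustration; in the real application the list would come from the database.

```javascript
// Hypothetical helper: returns the blacklisted entries found anywhere in the comment.
function findBlacklisted(comment, blacklist) {
  const text = comment.toLowerCase();
  return blacklist.filter((word) => text.includes(word.toLowerCase()));
}

const blacklist = ['hand', 'foo']; // placeholder entries
console.log(findBlacklisted('Please HANDOVER the keys', blacklist)); // ["hand"]
console.log(findBlacklisted('Nothing to see here', blacklist).length); // 0
```

Because this matches substrings, a short entry will also flag harmless longer words that happen to contain it, which is worth keeping in mind when choosing the entries.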

JavaScript split string by specific character string

I have a text box with a bunch of comments, all separated by a specific character string as a means of splitting them to display each comment individually.
The string in question is | but I can change this to accommodate whatever will work. My only requirement is that it is not likely to be a string of characters someone will type in an everyday sentence.
I believe I need to use the split method and possibly some regex but all the other questions I've seen only seem to mention splitting by one character or a number of different characters, not a specific set of characters in a row.
Can anyone point me in the right direction?
.split() should work for that purpose:
var comments = "this is a comment|and here is another comment|and yet another one";
var parsedComments = comments.split('|');
This will give you all comments in an array which you can then loop over or do whatever you have to do.
Keep in mind you could also change | to something like <--NEWCOMMENT--> and it will still work fine inside the split('<--NEWCOMMENT-->') method.
Remember that split() removes the separator it's splitting on, so your resulting array won't contain any instances of <--NEWCOMMENT-->.
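A short sketch of splitting on a multi-character separator follows. The sample text is made up, and the trim and filter steps are optional extras (an assumption about what the display code would want) that tidy up whitespace and drop empty entries.

```javascript
const separator = '<--NEWCOMMENT-->';
const raw = 'first comment<--NEWCOMMENT-->second | with a pipe<--NEWCOMMENT-->third';

// split() treats the whole string as one separator, not as a set of characters.
const parts = raw
  .split(separator)
  .map((comment) => comment.trim())
  .filter((comment) => comment.length > 0);

console.log(parts); // ["first comment", "second | with a pipe", "third"]
```

A longer separator also means a single | typed inside a comment no longer splits it.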

Matching a JS string with regex

I have a long xml raw message that is being stored in a string format. A sample is as below.
<tag1>val</tag><tag2>val</tag2><tagSomeNameXYZ/>
I'm looking to search this string and find out if it contains an empty html tag such as <tagSomeNameXYZ/>. The thing is, the value of SomeName can change depending on context. I've tried using Str.match(/tagSomeNameXYZ/g) and Str.match(/<tag.*.XYZ\/>/g) to find out if it contains exactly that string, but am unable to get it to return anything. I'm having trouble writing a regex that matches something like <tag*XYZ/>, where * is going to be SomeName (which I'm not interested in).
Tl;dr : How do I filter out <tagSomeNameXYZ/> from the string. Format being : <constant variableName constant/>
Example patterns that it should match:
<tagGetIndexXYZ/>
<tagGetAllIndexXYZ/>
<tagGetFooterXYZ/>
The issue you have with Str.match(/<tag.*.XYZ\/>/g) is that .* is greedy: it starts at the first <tag and takes everything up to the last XYZ/>, so the match spans other tags instead of stopping where you wish. So you need a way to stop: [^/]* means keep taking characters until you find a /, and the slice then drops the trailing XYZ.
Does this help?
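const testString = "<tagGetIndexXYZ/>";
// [^/]* captures everything up to the "/"; slice(0, -3) drops the trailing "XYZ"
const res = testString.match(/<tag([^/]*)\/>/)[1].slice(0, -3);
console.log(res); // GetIndex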
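For the full message rather than a single tag, one alternative (a sketch, not the answerer's code) is to restrict the variable part to word characters, so a match can never run across a > or /. The sample xml string below is made up from the patterns listed in the question.

```javascript
const xml = '<tag1>val</tag1><tagGetIndexXYZ/><tag2>val</tag2><tagGetFooterXYZ/>';

// \w* cannot cross ">" or "/", so each match stays inside a single tag.
const hasEmptyTag = /<tag\w*XYZ\/>/.test(xml);

// With the g flag, matchAll yields every match; group 1 is the variable name part.
const names = Array.from(xml.matchAll(/<tag(\w*?)XYZ\/>/g), (m) => m[1]);

console.log(hasEmptyTag); // true
console.log(names); // ["GetIndex", "GetFooter"]
```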

regex replace on JSON is removing an Object from Array

I'm trying to improve my understanding of Regex, but this one has me quite mystified.
I started with some text defined as:
var txt = "{\"columns\":[{\"text\":\"A\",\"value\":80},{\"text\":\"B\",\"renderer\":\"gbpFormat\",\"value\":80},{\"text\":\"C\",\"value\":80}]}";
and do a replace as follows:
txt.replace(/\"renderer\"\:(.*)(?:,)/g,"\"renderer\"\:gbpFormat\,");
which results in:
"{"columns":[{"text":"A","value":80},{"text":"B","renderer":gbpFormat,"value":80}]}"
What I expected was for the renderer attribute value to have its quotes removed, which has happened, but the C column is also completely missing! I'd really love for someone to explain how my regex has removed column C.
As an extra bonus, if you could explain how to remove the quotes around any value for renderer (i.e. so I don't have to hard-code the value gbpFormat in the regex) that'd be fantastic.
You are using a greedy operator while you need a lazy one. Change this:
"renderer":(.*)(?:,)
^---- add here the '?' to make it lazy
To
"renderer":(.*?)(?:,)
Your code should be:
txt.replace(/\"renderer\"\:(.*?)(?:,)/g,"\"renderer\"\:gbpFormat\,");
If you are learning regex, take a look at this documentation to learn more about greediness. A nice extract to understand this is:
Watch Out for The Greediness!
Suppose you want to use a regex to match an HTML tag. You know that
the input will be a valid HTML file, so the regular expression does
not need to exclude any invalid use of sharp brackets. If it sits
between sharp brackets, it is an HTML tag.
Most people new to regular expressions will attempt to use <.+>. They
will be surprised when they test it on a string like This is a
<EM>first</EM> test. You might expect the regex to match <EM> and when
continuing after that match, </EM>.
But it does not. The regex will match <EM>first</EM>. Obviously not
what we wanted. The reason is that the plus is greedy. That is, the
plus causes the regex engine to repeat the preceding token as often as
possible. Only if that causes the entire regex to fail, will the regex
engine backtrack. That is, it will go back to the plus, make it give
up the last iteration, and proceed with the remainder of the regex.
Like the plus, the star and the repetition using curly braces are
greedy.
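The behaviour described in the extract is easy to reproduce. This sketch runs the greedy <.+> and its lazy counterpart <.+?> against a short HTML string chosen as sample input.

```javascript
const html = 'This is a <EM>first</EM> test';

// Greedy: .+ runs to the last ">" in the string.
const greedy = html.match(/<.+>/)[0];
// Lazy: .+? stops at the first ">" that lets the match succeed.
const lazy = html.match(/<.+?>/)[0];
// With the g flag, the lazy pattern finds each tag separately.
const allTags = html.match(/<.+?>/g);

console.log(greedy);  // <EM>first</EM>
console.log(lazy);    // <EM>
console.log(allTags); // ["<EM>", "</EM>"]
```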
Try like this:
txt = txt.replace(/"renderer":"(.*?)"/g,'"renderer":$1');
The issue in the expression you were using was this part:
(.*)(?:,)
By default, the * quantifier is greedy, which means that it gobbles up as much as it can, so it will run up to the last comma in your string. The easiest solution would be to turn that into a non-greedy quantifier by adding a question mark after the asterisk, changing that part of your expression to look like this:
(.*?)(?:,)
For the solution I proposed at the top of this answer, I also removed the part matching the comma, because I think it's easier just to match everything between quotes. As for your bonus question, to replace the matched value instead of having to hardcode gbpFormat, I used a backreference ($1), which will insert the first matched group into the replacement string.
Don't manipulate JSON with regexp. It's too likely that you will break it, as you have found, and more importantly there's no need to.
In addition, once you have changed
'{"columns": [..."renderer": "gbpFormat", ...]}'
into
'{"columns": [..."renderer": gbpFormat, ...]}' // remove quotes from gbpFormat
then this is no longer valid JSON. (JSON requires that property values be numbers, quoted strings, booleans, null, objects, or arrays.) So you will not be able to parse it, or send it anywhere and have it interpreted correctly.
Therefore you should parse it to start with, then manipulate the resulting actual JS object:
var object = JSON.parse(txt);
object.columns.forEach(function(column) {
  // assign the gbpFormat function itself (assumed to be in scope), not its name
  column.renderer = gbpFormat;
});
If you want to replace any quoted value of the renderer property with the value itself, then you could try
column.renderer = window[column.renderer];
This assumes that the value is available in the global namespace.
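Where the renderer functions are not globals, a variation on the same idea is to look them up in an explicit object. This is a sketch under assumptions: the renderers registry and the body of gbpFormat are hypothetical, since the question does not show the real function.

```javascript
const txt = '{"columns":[{"text":"A","value":80},{"text":"B","renderer":"gbpFormat","value":80},{"text":"C","value":80}]}';

// Hypothetical registry of renderer functions, used instead of window[...] lookups.
const renderers = {
  gbpFormat: (value) => 'GBP ' + value.toFixed(2), // placeholder implementation
};

const config = JSON.parse(txt);
config.columns.forEach((column) => {
  // Only swap names that are actually registered; other columns are left untouched.
  if (Object.prototype.hasOwnProperty.call(renderers, column.renderer)) {
    column.renderer = renderers[column.renderer];
  }
});

console.log(config.columns.length); // 3
console.log(config.columns[1].renderer(80)); // GBP 80.00
```

All three columns survive because the structure is parsed rather than pattern-matched.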
This question falls into the category of "I need a regexp, or I wrote one and it's not working, and I'm not really sure why it has to be a regexp, but I heard they can do all kinds of things, so that's just what I imagined I must need." People use regexps to try to do far too many complex matching, splitting, scanning, replacement, and validation tasks, including on complex languages such as HTML, or in this case JSON. There is almost always a better way.
The only time I can imagine wanting to manipulate JSON with regexps is if the JSON is broken somehow, perhaps due to a bug in server code, and it needs to be fixed up in order to be parseable.
