Regex get the middle section of each word javascript - javascript

So essentially what I'm trying to do is loop through every word in an HTML document and replace the first letter of each word with 'A', every letter from the second to the second-to-last with 'b', and the last letter with 'c', completely replacing the word. I'm not sure whether regular expressions are the way to go about this (should I instead use for loops and check each character?), but I'll ask anyway.
Currently I'm doing:
document.body.innerHTML = document.body.innerHTML.replace(/\b(\w)/g, 'A'); to get the first letter of each word
document.body.innerHTML = document.body.innerHTML.replace(/\w\b/g, 'c'); to get the last letter of each word
So if I had the string: Lorem ipsum dolor sit amet I can currently make it Aorec Apsuc Aoloc Aic Amec but I'd like to do Abbbc Abbbc Abbbc Abc Abbc in javascript.
Any help is much appreciated - regular expressions really confuse me.

You almost got it.
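let str = "Lorem ipsum dolor sit amet";
str = str
    .replace(/\w/g, 'b')    // every word character becomes 'b'
    .replace(/\b\w/g, 'A')  // then the first character of each word becomes 'A'
    .replace(/\w\b/g, 'c'); // and the last character of each word becomes 'c'
document.write(str);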
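Fancier replacement rules can be handled with a callback function, e.g.
let str = "Lorem ipsum dolor sit amet";
str = str.replace(/\w+/g, function(word) {
    if (word === "dolor")
        return word; // leave this word untouched
    // Math.max keeps the count non-negative; repeat() throws on one-letter words otherwise
    return 'A' + 'b'.repeat(Math.max(0, word.length - 2)) + 'c';
});
document.write(str);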
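One- and two-letter words are edge cases worth deciding explicitly. Here is a minimal sketch of the callback approach with that decision spelled out. The names maskWord and maskText are made up for illustration, and treating a lone letter as a "first" letter is an assumption, not something the question specifies:

```javascript
// Hypothetical helper (not from the answer above): masks a single word.
function maskWord(word) {
  // Assumption: a one-letter word counts as a "first" letter only.
  if (word.length === 1) return 'A';
  // Two-letter words get 'Ac'; longer words get 'b' for every middle letter.
  return 'A' + 'b'.repeat(word.length - 2) + 'c';
}

// Applies maskWord to every run of word characters in a plain string.
function maskText(text) {
  return text.replace(/\w+/g, (word) => maskWord(word));
}

console.log(maskText("Lorem ipsum dolor sit amet")); // Abbbc Abbbc Abbbc Abc Abbc
console.log(maskText("a to"));                       // A Ac
```

Note that this works on plain text. Running it over innerHTML would also rewrite tag and attribute names, so for a real page you would probably want to apply it to text nodes only.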

Related

Replace a character of a string

I have a string that looks like this: [TITLE|prefix=a].
From that string, the text |prefix=a is dynamic. So it could be anything or empty. I would like to replace (in that case) [TITLE|prefix=a] with [TITLE|prefix=a|suffix=z].
So the idea is to replace ] from a string that starts with [TITLE with |suffix=z].
For instance, if the string is [TITLE|prefix=a], it should be replaced with [TITLE|prefix=a|suffix=z]. If it's [TITLE], it should be replaced with [TITLE|suffix=z] and so on.
How can I do this with RegEx?
I have tried it this way but it gives an error:
let str = 'Lorem ipsum [TITLE|prefix=a] dolor [sit] amet [consectetur]';
const x = 'TITLE';
const regex = new RegExp(`([${x})*]`, 'gi');
str = str.replace(regex, "$1|suffix=z]");
console.log(str);
I have also tried to escape the characters [ and ] with new RegExp(`(\[${x})*\]`, 'gi'); but that didn't help.
You need to remember to use \\ in a regular string literal to define a single literal backslash.
Then, you need a pattern like
/(\[TITLE(?:\|[^\][]*)?)]/gi
Details:
(\[TITLE(?:\|[^\][]*)?) - Capturing group 1:
\[TITLE - the literal text [TITLE
(?:\|[^\][]*)? - an optional occurrence of a | char followed by zero or more chars other than ] and [
] - a ] char.
Inside your JavaScript code, use the following to define the dynamic pattern:
const regex = new RegExp(`(\\[${x}(?:\\|[^\\][]*)?)]`, 'gi');
See JS demo:
let str = 'Lorem ipsum [TITLE|prefix=a] dolor [sit] amet [consectetur] [TITLE]';
const x = 'TITLE';
const regex = new RegExp(`(\\[${x}(?:\\|[^\\][]*)?)]`, 'gi');
str = str.replace(regex, "$1|suffix=z]");
console.log(str);
// => Lorem ipsum [TITLE|prefix=a|suffix=z] dolor [sit] amet [consectetur] [TITLE|suffix=z]
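Since the tag name is interpolated into the pattern, it may be worth escaping it first in case it ever contains regex metacharacters. A rough sketch, assuming the same tag syntax as above; escapeRegExp and addSuffix are illustrative names, not built-in functions:

```javascript
// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: appends "|suffix=<suffix>" to every [name] or [name|...] tag.
function addSuffix(str, name, suffix) {
  const regex = new RegExp(`(\\[${escapeRegExp(name)}(?:\\|[^\\][]*)?)]`, 'gi');
  // A callback avoids surprises if the suffix itself contains "$" sequences.
  return str.replace(regex, (match, head) => `${head}|suffix=${suffix}]`);
}

console.log(addSuffix('Lorem [TITLE|prefix=a] dolor [sit] [TITLE]', 'TITLE', 'z'));
// Lorem [TITLE|prefix=a|suffix=z] dolor [sit] [TITLE|suffix=z]
```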
I think the solution to your problem would look similar to this:
let str = 'Lorem ipsum [TITLE|prefix=a] dolor [sit] amet [consectetur]';
// Anchor the first group on [TITLE so that other tags such as [sit] are left alone.
str = str.replace(/(\[TITLE)(\|[^\]]*)?\]/g, "$1$2|suffix=z]");
console.log(str);

Replace a pattern with a value in javascript [duplicate]

I'm new to JavaScript and I'm working on an application that has strings in a format like this:
"lorem ipsum dolor {#variable#} sit amet {#variable2#}"
How can I remove {# and #} and replace the word variable with value and, for the second placeholder, the word variable2 with value2?
I really appreciate the help.
Thank you in advance
You could use the regex {#(.*?)#} to find the substrings you want to replace. Then, use a map object to get the corresponding value for the captured variable:
let str = `lorem ipsum dolor {#variable#} sit amet {#variable2#}`
let map = {
    variable: "value1",
    variable2: "value2",
}
let replaced = str.replace(/{#(.*?)#}/g, (m, p1) => map[p1])
console.log(replaced)
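One thing to watch for: if a placeholder has no entry in the map, the callback returns undefined and the literal text "undefined" ends up in the result. A small sketch of one way to handle that, leaving unknown placeholders untouched. The name fillTemplate is made up for this example, and keeping the placeholder as-is is just one possible policy:

```javascript
// Hypothetical helper: replaces {#name#} with values[name] when the key exists.
function fillTemplate(template, values) {
  return template.replace(/{#(.*?)#}/g, (match, name) =>
    // Unknown placeholders are returned unchanged instead of becoming "undefined".
    Object.prototype.hasOwnProperty.call(values, name) ? String(values[name]) : match
  );
}

const values = { variable: "value1", variable2: "value2" };
console.log(fillTemplate("lorem {#variable#} sit {#variable2#} {#missing#}", values));
// lorem value1 sit value2 {#missing#}
```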

Regexp to match words two by two (or n by n)

I'm looking for a regexp which is able to match words n by n. Let's say n := 2, it would yield:
Lorem ipsum dolor sit amet, consectetur adipiscing elit
Lorem ipsum, ipsum dolor, dolor sit, sit amet (notice the comma here), consectetur adipiscing, adipiscing elit.
I have tried using \b for word boundaries to no avail. I am really lost trying to find a regex capable of giving me n words... /\b(\w+)\b(\w+)\b/i can't cut it, and even tried multiple combinations.
Regular expressions are not really what you need here, other than to split the input into words. The problem is that this problem involves matching overlapping substrings, which regexp is not very good at, especially the JavaScript flavor. Instead, simply break the input into words, and a quick piece of JavaScript will generate the "n-grams" (which is the correct term for your n-word groups).
const input = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
// From an array of words, generate n-grams.
function ngrams(words, n) {
    const results = [];
    for (let i = 0; i < words.length - n + 1; i++)
        results.push(words.slice(i, i + n));
    return results;
}
// \w+ extracts the bare words, dropping spaces and punctuation.
console.log(ngrams(input.match(/\w+/g), 2));
A word boundary \b does not consume any characters, it is a zero-width assertion, and only asserts the position between a word and non-word chars, and between start of string and a word char and between a word char and end of string.
You need to use \s+ to consume whitespaces between words, and use capturing inside a positive lookahead technique to get overlapping matches:
var n = 2;
var s = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
var re = new RegExp("(?=(\\b\\w+(?:\\s+\\w+){" + (n-1) + "}\\b))", "g");
var res = [], m;
while ((m = re.exec(s)) !== null) { // Iterating through matches
    if (m.index === re.lastIndex) { // This is necessary to avoid
        re.lastIndex++;             // infinite loops with
    }                               // zero-width matches
    res.push(m[1]);                 // Collecting the results (group 1 values)
}
console.log(res);
The final pattern will be built dynamically since you need to pass a variable to the regex, thus you need the RegExp constructor notation. For n = 2 it will look like
/(?=(\b\w+(?:\s+\w+){1}\b))/g
And it will find all locations in the string that are followed with the following sequence:
\b - a word boundary
\w+ - 1 or more word chars
(?:\s+\w+){n-1} - n-1 sequences of:
\s+ - 1 or more whitespaces
\w+ - 1 or more word chars
\b - a trailing word boundary
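The same lookahead pattern can also be wrapped in a function and driven by matchAll, which advances past zero-width matches on its own, so the manual lastIndex bump is not needed. This is only a sketch: wordGroups is a made-up name, and it assumes an environment where String.prototype.matchAll is available:

```javascript
// Hypothetical helper: returns every overlapping group of n whitespace-separated words.
function wordGroups(text, n) {
  // Same pattern as above: a capture inside a lookahead, repeated n - 1 times.
  const re = new RegExp(`(?=(\\b\\w+(?:\\s+\\w+){${n - 1}}\\b))`, 'g');
  // matchAll requires the g flag and yields one match object per position.
  return Array.from(text.matchAll(re), (m) => m[1]);
}

const sentence = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
console.log(wordGroups(sentence, 2));
// [ 'Lorem ipsum', 'ipsum dolor', 'dolor sit', 'sit amet',
//   'consectetur adipiscing', 'adipiscing elit' ]
```

Because \s+ does not match the comma, no group crosses "amet, consectetur", which is the behaviour the question asks for.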
Not a pure regex solution, and it yields consecutive rather than overlapping pairs, but it works and is easy to read and understand:
let input = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit';
let matches = input.match(/(\w+,? \w+)/g)
    .map(str => str.replace(',', ''));
console.log(matches) // ['Lorem ipsum', 'dolor sit', 'amet consectetur', 'adipiscing elit']
Warning: this does not handle the no-match case, where match() returns null.
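For the null case mentioned in the warning, one possible guard is to fall back to an empty array before mapping. A minimal sketch; wordPairs is a hypothetical name and the ?? operator assumes a reasonably recent JavaScript engine:

```javascript
// Hypothetical helper: consecutive (non-overlapping) word pairs, safe on empty input.
function wordPairs(text) {
  // match() returns null when nothing matches, so substitute an empty array.
  const matches = text.match(/\w+,? \w+/g) ?? [];
  return matches.map((pair) => pair.replace(',', ''));
}

console.log(wordPairs('Lorem ipsum dolor sit')); // [ 'Lorem ipsum', 'dolor sit' ]
console.log(wordPairs(''));                      // []
```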

Regex BBcode: ignore (escape) all the markup in special tag (in JavaScript)

I'm parsing some text with a set of tags and replacements. For example, to make text **surrounded by double asterisks** bold I use /\*\*([\s\S]+?)\*\*/gm for the pattern and "<b>$1</b>" for the replacement. But I also want to be able to provide raw text, like I did in this very question. So I need an expression which "matches any character including whitespaces and newlines surrounded by double asterisks but not surrounded by backticks with (optional) characters/whitespaces/newlines in between the backtick and double asterisks"
Example.
Input string: "Lorem ``ipsum **dolor** sit`` amet, **consectetur** adipisicing elit"
Result: "Lorem ipsum **dolor** sit amet, consectetur adipisicing elit"
I tried non-capturing groups and lookaheads but to no avail. I know it can be done by, for example, replacing characters with HTML entities or just using some Markdown parser, but just for the sake of interest, how can this be done via pure Regex magic?
Life would be simpler with lookbehind assertions.
/((`)[\s\S]*?)?\*\*([\s\S]+?)\*\*([\s\S]*?\2)/gm
 
((`)[\s\S]*?)? # capture any characters (or none) preceded by a backtick (itself captured as group 2 for later use in the regex). Optional, non-greedy.
\*\*([\s\S]+?)\*\* # capture any characters surrounded by double asterisks.
([\s\S]*?\2) # capture any characters (including the empty string) followed by capture group 2 (a backtick, or nothing if group 2 did not take part in the match).
If the first group is empty, the last one will match an empty string.
Then we filter our result.
var str = "Lorem `ipsum **dolor** sit` amet, **consectetur** adipisicing elit dolor `**sit amet**` adi";
str = str.replace(/((`)[\s\S]*?)?\*\*([\s\S]+?)\*\*([\s\S]*?\2)/gm, function(m, p1, p2, p3, p4){
    return p1 && p4 ? m : "<b>" + p3 + "</b>";
});
 
return p1 && p4 ? m : "<b>" + p3 + "</b>";
If p1 and p4 are not empty/undefined, that means our matched string starts and ends with backticks. We return it without changes.
This example outputs:
Lorem `ipsum **dolor** sit` amet, <b>consectetur</b> adipisicing elit dolor `**sit amet**` adi
It's a bit tricky, imo. But as you point out, it's just for the sake of interest. ;)
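A different technique that may be easier to follow is to put the "skip" case first in an alternation: match a whole backtick span, or else a bold span, and only rewrite when the bold capture group actually took part. This is a sketch of that alternative, not the answer's method. renderBold is a made-up name, and it assumes raw spans are delimited by single backticks and do not nest:

```javascript
// Hypothetical helper: bolds **text** except inside `backtick` spans.
function renderBold(text) {
  // First alternative consumes a raw span; second captures the bold content.
  return text.replace(/`[^`]*`|\*\*([\s\S]+?)\*\*/g, (match, inner) =>
    // inner is undefined when the backtick alternative matched: keep it as-is.
    inner === undefined ? match : `<b>${inner}</b>`
  );
}

console.log(renderBold("Lorem `ipsum **dolor** sit` amet, **consectetur** adipisicing elit"));
// Lorem `ipsum **dolor** sit` amet, <b>consectetur</b> adipisicing elit
```

Because the regex engine consumes the backtick span as one match, the asterisks inside it are never seen by the second alternative.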

Remove from String (from Array) in Javascript [duplicate]

I have a simple string var mystr = "Lorem ipsum dolor sit amet, consectetur adipiscing elit", and I have an array var lipsums = new Array("dolor","consectetur","elit"); Now, I want a simple function that removes from the string every word that appears in the array.
So, in the above example, it should remove the words "dolor", "consectetur", and "elit", and my string mystr should become "Lorem ipsum sit amet, adipiscing".
This script should be in JavaScript (no jQuery). Any help would be appreciated.
Loop over the array of words to remove, removing all occurrences via split/join:
for (var i = 0; i < lipsums.length; i++) {
    mystr = mystr.split(lipsums[i]).join('');
}
http://jsfiddle.net/9Rgzd/
You may also want to clean up your whitespace afterwards, which you can do with a regex:
// Note: don't do this in the loop!
mystr = mystr.replace(/\s{2,}/g, ' ');
http://jsfiddle.net/9Rgzd/1/
Like this:
for (var i = 0; i < lipsums.length; i++) {
    mystr = mystr.replace(new RegExp(lipsums[i], "g"), "");
}
Add this AFTER the loop to collapse the doubled whitespace:
mystr = mystr.replace(/\s{2,}/g, ' ');
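Both loops above remove substrings, so a word like "elit" would also be cut out of "elite". If whole-word removal is what you want, one option is a single pattern with word boundaries. This is a sketch under that assumption; removeWords and escapeRegExp are illustrative names, not built-ins:

```javascript
// Escapes characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: removes whole words only, then tidies the whitespace.
function removeWords(text, words) {
  if (words.length === 0) return text; // an empty alternation would match everywhere
  const alternatives = words.map((word) => escapeRegExp(word)).join('|');
  const pattern = new RegExp(`\\b(?:${alternatives})\\b`, 'g');
  return text.replace(pattern, '').replace(/\s{2,}/g, ' ').trim();
}

const mystr = "Lorem ipsum dolor sit amet, consectetur adipiscing elit";
console.log(removeWords(mystr, ["dolor", "consectetur", "elit"]));
// Lorem ipsum sit amet, adipiscing
```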
