Remove text between random symbols - JavaScript

I have this input text with a pattern * + [*]*[/*] + *:
Input: [tag]random text EXCLUDE[/tag] text here
Output: text here
Input: [tag_1]another random text EXCLUDE[/tag_1] another text here
Output: another text here
Input: [tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]
Output: another text here text
What I want is to remove the text, between [*] and [/*], like replacing it with ''.
The problem is that the symbols between [] are random, but whenever there is an opening [*], there is also a closing [/*], and tags are never nested.

This should do the trick:
let OUTPUT = INPUT
.split('[')
.filter(s => s.indexOf(']') == -1 || s.startsWith('/'))
.map(s => s
.split(']')
.reverse()[0])
.join('')
The main point is that the text inside [] doesn't actually matter; all we need are the square brackets to act as "anchors".
I tried and failed to write a concise step-by-step explanation... My suggestion is to copy-paste the code into a console, feed it some data and watch what comes out of each step. It's self-evident.
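For anyone who wants to watch those steps, here is a rough sketch that breaks the chain into named intermediate values. The function name stripTagged and the variable names are made up for illustration; the logic is meant to be the same as the one-liner above (pop() takes the last element, just like reverse()[0]).

```javascript
// Hypothetical helper: same steps as the chained version, one at a time.
function stripTagged(input) {
  // 1. Split on '[' so that every tag starts a new chunk.
  const chunks = input.split('[');

  // 2. Keep chunks without ']' (plain text before the first tag) and chunks
  //    starting with '/' (a closing tag plus the text that follows it).
  //    Opening-tag chunks, which carry the text to exclude, are dropped.
  const kept = chunks.filter(s => !s.includes(']') || s.startsWith('/'));

  // 3. From each kept chunk, take only what follows the last ']'.
  const tails = kept.map(s => s.split(']').pop());

  // 4. Glue the surviving pieces back together.
  return tails.join('');
}

console.log(JSON.stringify(stripTagged('[tag]random text EXCLUDE[/tag] text here')));
// prints " text here" (with the leading space preserved)
```

Note that the surrounding whitespace is kept as-is, so you may want to trim() the result.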

This looks like a nice job for regular expressions:
const theStrings = [ '[tag]random text EXCLUDE[/tag] text here',
'[tag_1]another random text EXCLUDE[/tag_1] another text here',
'[random]another random text EXCLUDE[/random] another text here',
'[this]is not to be [/filtered] as they don\'t match',
'[tag_1]another random text EXCLUDE[/tag_1] another text here [tag_1] another random text EXCLUDE[/tag_1] text [tag_3]another random text EXCLUDE[/tag_3]'
]
let replaced = '';
let prev_len = 0;
theStrings.forEach(str => {
replaced = str;
do {
prev_len = replaced.length ;
replaced = replaced.replace(/\[(.*)\].*?\[\/\1\](.*)/,'$2')
} while (replaced.length < prev_len);
console.log("before:",str,"\nafter:", replaced)
} )
https://regex101.com/r/cE6NZm/1
Basically, it matches everything between [this] and [/this] (note that both names must be the same, and the second one has to be preceded by a /) and leaves that portion out of the string.
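As a variation (not the pattern from the answer above), the same idea can probably be done in one pass with the g flag instead of a loop. This is only a sketch: removeTagged is a made-up name, and it assumes tag names never contain ] and that tags are not nested, as the question states.

```javascript
// Hypothetical single-pass variant: remove every [name]...[/name] pair.
function removeTagged(str) {
  // \[([^\]\/][^\]]*)\]  opening tag, name captured (must not start with '/')
  // [\s\S]*?             shortest possible content, newlines included
  // \[\/\1\]             closing tag with the same name
  return str.replace(/\[([^\]\/][^\]]*)\][\s\S]*?\[\/\1\]/g, '');
}

console.log(JSON.stringify(removeTagged('[tag]random text EXCLUDE[/tag] text here')));
// prints " text here"
console.log(removeTagged("[this]is not to be [/filtered] as they don't match"));
// prints the input unchanged, because the tag names differ
```

The leftover whitespace is not cleaned up here; something like .replace(/\s+/g, ' ').trim() on the result would collapse it.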

Related

discord.js string manipulation

if (msg.content.includes("[mid]")) {
let str = msg.content;
// get the unique code for a pokemon: the text between [mid] and [/mid]
let pokeID = str.substring(
str.indexOf("[mid]") + 5,
str.lastIndexOf("[/mid]")
);
msg.channel.send({ content: `Here is your Pokemon:`, embeds: [midData(pokeID)] });
}
This code works fine, but I would like the user to be able to put any text before or after the [mid]code[/mid].
example user inputs "this text can be of any length or even null [mid]unique-code[/mid] this text can also be of any text or null"
the output should be :
this text can be of any length or even null this text can also be of any text or null
[mid]unique-code[/mid] (which is a link)
I have tried this: https://imgur.com/a/uq8CVpn //image of output
I need 3 strings from the user input.
string1 = all text, if any before [mid]unique-code[/mid] // pokemon code
string2 = [mid]unique-code[/mid]
string3 = all text if any behind [mid]unique-code[/mid] //pokemon code
using node v16 and discord v13
thank you
You just need lastIndexOf and substring, as in the code below (I also assumed you meant "after" and not "behind" in string3).
For clarification, substring works like this:
substring(startIndex, endIndex /*defaults to the end of the string if not specified*/ )
const message = "this text can be of any length or even null [mid]unique-code[/mid] this text can also be of any text or null"
const beforeCode = message.substring(0, message.lastIndexOf("[mid]")).trim()
const afterCode = message.substring(message.lastIndexOf("[/mid]") + 6).trim()
const code = message.substring(
message.lastIndexOf("[mid]") + 5,
message.lastIndexOf("[/mid]")
).trim();
console.log("beforeCode:", beforeCode, "\nafterCode:", afterCode, "\ncode:", code)
To get the text before the unique code, we take the substring from index 0 (the start of the message) up to the start index of "[mid]".
To get the text after the unique code, we take the substring from the end index of "[/mid]" to the end of the string.
To get the unique code, we take the substring from the end index of "[mid]" to the start index of "[/mid]".
I also use trim to remove any whitespace from the start and end of each string.
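If you prefer to get all three strings in one go, a regex with three capture groups is one possible alternative to the substring calls. This is just a sketch: splitAroundMid is a made-up name, and it assumes the message contains a single [mid]...[/mid] pair (it splits around the first pair it finds, whereas the code above uses lastIndexOf).

```javascript
// Hypothetical helper: split a message into text before, code, and text after.
function splitAroundMid(message) {
  // group 1: everything before the first [mid]
  // group 2: the code between [mid] and the next [/mid]
  // group 3: everything after that [/mid]
  const m = message.match(/^([\s\S]*?)\[mid\]([\s\S]*?)\[\/mid\]([\s\S]*)$/);
  if (!m) return null; // no [mid]...[/mid] pair in the message
  return { before: m[1].trim(), code: m[2].trim(), after: m[3].trim() };
}

const parts = splitAroundMid("text before [mid]unique-code[/mid] text after");
console.log(parts);
// { before: 'text before', code: 'unique-code', after: 'text after' }
```

When there is no text before or after the tag, the corresponding string is simply empty.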

Javascript RegExp verify string finishes with html open tag + anything but a tag

I would like to verify that a string ends with any open Tag followed or not by a text
for example:
<div>some text here
or
<span>some text here
should match
but
<div>some text here</div>
or
<h1>some text here</h1>
should not
I tried to come up with a solution (sorry I'm not a regex pro)
let anyOpenTag = '<([^/>][^>])*>';
let anyCloseTag = '</[^/>][^>]*>';
let neitherOpenNorCloseTag = `[^(${anyOpenTag}|${anyCloseTag})]`;
let regex = new RegExp(
escapeRegExp(`${anyOpenTag}${neitherOpenNorCloseTag}*$`),
'gi');
I set a variable "anyOpenTag" to a regex that checks for an opening tag (<p>, <div>, <span>, etc.).
I set a variable "anyCloseTag" to a regex that checks for a closing tag (</p>, </div>, etc.).
I set a variable "neitherOpenNorCloseTag" that tries to combine the two and check that the text is not one of them, using [^....].
Finally, I check whether the string matches anyOpenTag + neitherOpenNorCloseTag.
Unfortunately it doesn't work for me, specifically the part that verifies "neitherOpenNorCloseTag".
Your help is appreciated, and if you have a better regex I would be grateful.
Based on your examples, this snippet works:
const regex = /<.*>.*<\/.*>/g
const regexString = (regex) => {
return (s) => {
return !s.match(regex)
}
}
const validateString = regexString(regex)
const strings = [
'<div>some text here',
'<span>some text here',
'<div>some text here</div>',
'<h1>some text here</h1>',
]
const validated = strings.map(validateString)
console.log('Result:', validated)
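A more direct take on "ends with an open tag followed by optional text" could look like the sketch below. It is not the snippet above, just an alternative under some assumptions: the names endsWithOpenTag and endsWithUnclosedTag are made up, the trailing text is assumed to contain no < or > characters, and self-closing or void tags such as <br/> are not treated specially.

```javascript
// <[^\/<>][^<>]*>  an opening tag: '<', a first character that is not '/',
//                  then anything up to the closing '>'
// [^<>]*$          followed only by non-tag text until the end of the string
const endsWithOpenTag = /<[^\/<>][^<>]*>[^<>]*$/;

// Hypothetical wrapper; no g flag, so test() carries no state between calls.
function endsWithUnclosedTag(s) {
  return endsWithOpenTag.test(s);
}

console.log(endsWithUnclosedTag('<div>some text here'));       // true
console.log(endsWithUnclosedTag('<div>some text here</div>')); // false
```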

Regex - Extract string between square bracket tag [duplicate]

This question already has answers here:
Parsing BBCode in Javascript
(2 answers)
Closed 3 years ago.
I have tags in a string like [note]some text[/note], and I want to extract the text between the tags.
Sample Text:
Want to extract data [note]This is the text I want to extract[/note]
but this is not only tag [note]Another text I want to [text:sample]
extract[/note]. Can you do it?
From the given text, extract following:
This is the text I want to extract
Another text I want to [text:sample] extract
We can try matching using the following regex pattern:
\[note\]([\s\S]*?)\[\/note\]
This says to just capture whatever comes in between [note] and the closest closing [/note] tag. Note that we use [\s\S]* to potentially match desired content across newlines, should that be necessary.
var re = /\[note\]([\s\S]*?)\[\/note\]/g;
var s = 'Want to extract data [note]This is the text I want to extract[/note]\nbut this is not only tag [note]Another text I want to [text:sample]\n extract[/note]. Can you do it?';
var m;
do {
m = re.exec(s);
if (m) {
console.log(m[1]);
}
} while (m);
I wrote this while Tim was posting his answer and it's quite similar, but I figured I'd post it anyway since it's extracted into a reusable function you can use for any tag.
const str = `Want to extract data [note]This is the text I want to extract[/note]
but this is not only tag [note]Another text I want to [text:sample]
extract[/note]. Can you do it?`;
function extractTagContent(tag, str) {
const re = new RegExp(`\\[${tag}\\](.*?)\\[\\/${tag}\\]`, "igs");
const matches = [];
let found;
while ((found = re.exec(str)) !== null) {
matches.push(found[1]);
}
return matches;
}
const content = extractTagContent("note", str);
// content now has (the second entry keeps the newline from the source text):
// ['This is the text I want to extract', 'Another text I want to [text:sample]\nextract']
Demo: https://codesandbox.io/s/kw006560oo
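One thing to watch with the function above: the tag name is interpolated into the RegExp as-is, so a tag containing regex metacharacters (a dot, for instance) would be interpreted as a pattern. A possible hardening is sketched below; escapeRegExp and extractTagContentSafe are made-up names, and String.prototype.matchAll needs a reasonably recent runtime.

```javascript
// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical variant of extractTagContent that escapes the tag name first.
function extractTagContentSafe(tag, str) {
  const name = escapeRegExp(tag);
  // [\s\S]*? matches across newlines without needing the s flag.
  const re = new RegExp(`\\[${name}\\]([\\s\\S]*?)\\[\\/${name}\\]`, 'gi');
  return Array.from(str.matchAll(re), m => m[1]);
}

console.log(extractTagContentSafe('note', 'a [note]first[/note] b [note]second[/note]'));
// [ 'first', 'second' ]
```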

regex replace first element

I have the need to replace a HTML string's contents from one <br> to two. But what I can't achieve is when I have one tag following another one:
(<br\s*\/?>)
will match all the tags in this text:
var text = 'text<BR><BR>text text<BR>text;'
and with the replace I get:
text = text.replace(/(<br\s*\/?>)/gi, "<BR\/><BR\/>")
console.log(text); // text<BR/><BR/><BR/><BR/>text text<BR/><BR/>text;
Is there a way to add only one tag to each run with the regex, and get this instead?
console.log(text); // text<BR/><BR/><BR/>text text<BR/><BR/>text;
Or can I only achieve this with a loop?
You may use either
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(<br\s*\/?>)+/gi, "$&$1");
console.log(text); // => text<BR><BR><BR>text text<BR><BR>text;
Here, /(<br\s*\/?>)+/gi matches one or more consecutive <br> tags in a case-insensitive way, capturing each tag as it goes (the group buffer keeps the value from the last iteration), and "$&$1" replaces with the whole match ($&) followed by the last <br> captured in $1.
Or
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(?:<br\s*\/?>)+/gi, function ($0) {
return $0.replace(/<br\s*\/?>/gi, "<BR/>") + "<BR/>";
})
console.log(text); // => text<BR/><BR/><BR/>text text<BR/><BR/>text;
Here, the (?:<br\s*\/?>)+ will also match 1 or more <br>s but without capturing each occurrence, and inside the callback, all <br>s will get normalized as <BR/> and a <BR/> will get appended to the result.
You can use a negative lookahead, /(<br\s*\/?>)(?!<br\s*\/?>)/, to double only the last tag in a run of consecutive tags:
var text = 'text<BR><BR>text text<BR>text;'
text = text.replace(/(<br\s*\/?>)(?!<br\s*\/?>)/gi, "<BR\/><BR\/>")
console.log(text);

How to regex square brackets?

The problem: I want to get the content of all the square brackets and then delete them, but only if the brackets are at the beginning of the string.
For example, a given string [foo][asd][dsa] text text text will return array with all of the three brackets' content (["foo", "asd", "dsa"]), and the string will become text text text.
But if the string looks like that: [foo] text [asd][dsa] text text, it'll take only the [foo], and the string will become: text [asd][dsa] text text.
How can I do that using JavaScript?
The loop checks the start of the string for anything in square brackets, takes the contents of the brackets, and removes the whole lot from the start.
var haystack = "[foo][asd][dsa] text text text";
var needle = /^\[([^\]]+)\](.*)/;
var result = new Array();
while ( needle.test(haystack) ) { /* while it starts with something in [] */
result.push(needle.exec(haystack)[1]); /* get the contents of [] */
haystack = haystack.replace(needle, "$2"); /* remove [] from the start */
}
Something like var newstring = oldstring.replace(/\[\w{3}]/, "");
You could use a while loop: take the first bracket, add its content to an array, remove it, and repeat. That would give this:
var t1 = "[foo][asd][dsa] text text text";
var rule = /^\[([^\]]*)\]/; // no g flag needed: we always match at the start
var arr = [];
var m; // declared explicitly to avoid an implicit global
while ((m = rule.exec(t1))) {
arr.push(m[1]);
t1 = t1.replace(rule, "")
}
alert(arr); // foo,asd,dsa
alert(t1); // text text text
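The loop can also be avoided by matching the whole leading run of brackets once and then pulling the contents out of that run. This is only a sketch of that idea: takeLeadingBrackets is a made-up name, and it trims the whitespace left at the start of the remaining string, which is an assumption based on the expected output in the question.

```javascript
// Hypothetical helper: collect bracket contents from the start of the string.
function takeLeadingBrackets(input) {
  // Match one or more [..] groups, but only at the very start.
  const lead = input.match(/^(?:\[[^\]]*\])+/);
  if (!lead) return { tags: [], rest: input };

  // Pull the content out of each bracket in the leading run.
  const tags = Array.from(lead[0].matchAll(/\[([^\]]*)\]/g), m => m[1]);

  // Whatever follows the leading run is the remaining string.
  const rest = input.slice(lead[0].length).trimStart();
  return { tags, rest };
}

console.log(takeLeadingBrackets('[foo][asd][dsa] text text text'));
// { tags: [ 'foo', 'asd', 'dsa' ], rest: 'text text text' }
console.log(takeLeadingBrackets('[foo] text [asd][dsa] text text'));
// { tags: [ 'foo' ], rest: 'text [asd][dsa] text text' }
```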
