Find and replace string text between two other string texts JS - javascript

I'm trying to find and replace a word in a string
Example:
let string =
`
Title: Hello World
Authors: Michael Dan
`
I need to find the "Hello World" text and replace it with whatever I want. Here is my attempt:
const replace = string.match(new RegExp("Title:" + "(.*)" + "Authors:")).replace("Test")

When you replace some text, it is not necessary to run String#match or RegExp#exec explicitly; String#replace does the matching under the hood.
You can use
let string = "\nTitle: Hello World\nAuthors: Michael Dan\n"
console.log(string.replace(/(Title:).*(?=\nAuthors:)/g, '$1 Test'));
The pattern matches
(Title:) - Group 1: Title: fixed string
.* - the rest of the line, any zero or more chars other than line break chars, CR and LF (we need to consume this text in order to remove it)
(?=\nAuthors:) - a positive lookahead that matches a location that is immediately followed with an LF char and Authors: string.
See the regex demo.
If there can be a CRLF line ending in your string, you will need to replace (?=\nAuthors:) with (?=\r?\nAuthors:) or (?=(?:\r\n?|\n)Authors:).
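A minimal sketch of the CRLF-tolerant variant, wrapped in a hypothetical helper named replaceTitle (the helper name and the replacer-function form are illustrative assumptions, not part of the answer above). A replacer function is used so that a "$" in the new title is not interpreted as a replacement pattern:

```javascript
// Hypothetical helper: swaps the text after "Title:" for newTitle,
// accepting either LF or CRLF before the "Authors:" line.
function replaceTitle(text, newTitle) {
  return text.replace(
    /(Title:).*(?=\r?\nAuthors:)/,
    (match, label) => `${label} ${newTitle}`
  );
}

const lf = "\nTitle: Hello World\nAuthors: Michael Dan\n";
const crlf = "\r\nTitle: Hello World\r\nAuthors: Michael Dan\r\n";

// JSON.stringify makes the line endings visible in the output.
console.log(JSON.stringify(replaceTitle(lf, "Test")));
console.log(JSON.stringify(replaceTitle(crlf, "Test")));
```

Because "." never matches CR or LF in JavaScript, the match stops at the end of the title line in both cases.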

You might be better off converting to an object first and then just defining the title property:
let string =
`
Title: Hello World
Authors: Michael Dan
`
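const stringLines = string.split('\n');
let stringAsObject = {};
stringLines.forEach(
  (line) => {
    // Split on the first colon only, so values that contain ':' stay intact
    const separatorIndex = line.indexOf(':');
    if (separatorIndex !== -1) {
      stringAsObject[line.slice(0, separatorIndex).trim()] = line.slice(separatorIndex + 1).trim();
    }
  }
);
stringAsObject.Title = 'NewValue';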
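The object still has to be turned back into text if a string is what you need in the end. A minimal sketch, assuming every field fits on one "Key: value" line; parseFields and serializeFields are hypothetical helper names, and blank lines from the original string are not preserved:

```javascript
// Hypothetical helper: parse "Key: value" lines into a plain object.
function parseFields(text) {
  const fields = {};
  for (const line of text.split('\n')) {
    const separatorIndex = line.indexOf(':');
    if (separatorIndex !== -1) {
      const key = line.slice(0, separatorIndex).trim();
      fields[key] = line.slice(separatorIndex + 1).trim();
    }
  }
  return fields;
}

// Hypothetical helper: write the fields back out, one per line,
// in the order they were inserted.
function serializeFields(fields) {
  return Object.entries(fields)
    .map(([key, value]) => `${key}: ${value}`)
    .join('\n');
}

const fields = parseFields('\nTitle: Hello World\nAuthors: Michael Dan\n');
fields.Title = 'NewValue';
const rebuilt = serializeFields(fields);
console.log(rebuilt);
```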

You can use replace method like that:
string.replace("Hello World", "Test");
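Two details are worth keeping in mind with this approach: strings are immutable, so replace returns a new string that has to be assigned, and a string pattern replaces only the first occurrence. A short sketch (the sample text with a repeated phrase is made up for illustration; split and join are one way to replace every occurrence without a regex):

```javascript
const original = "Title: Hello World\nSubtitle: Hello World";

// replace() leaves `original` untouched and returns a new string;
// with a string pattern only the first occurrence is replaced.
const firstOnly = original.replace("Hello World", "Test");

// Replacing every occurrence: split on the text, join with the new value.
const all = original.split("Hello World").join("Test");

console.log(firstOnly);
console.log(all);
```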

You can achieve this without a regex. All you need is the index of the string that you want to replace.
var original = `
Title: Hello World
Authors: Michael Dan
`;
var stringToFind = "Hello World";
var indexOf = original.indexOf(stringToFind);
original = original.replace(original.substring(indexOf, indexOf + stringToFind.length), "Hey Universe!");
console.log(original)
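The same index-based idea extends to replacing whatever sits between two known markers, which is what the question title asks for. A minimal sketch, assuming both markers occur in order and only the first pair matters; replaceBetweenMarkers is a hypothetical helper name:

```javascript
// Hypothetical helper: replaces the text between startMarker and endMarker.
// Returns the input unchanged if either marker is missing.
function replaceBetweenMarkers(text, startMarker, endMarker, replacement) {
  const start = text.indexOf(startMarker);
  if (start === -1) return text;
  const from = start + startMarker.length;
  // Search for the end marker only after the start marker.
  const end = text.indexOf(endMarker, from);
  if (end === -1) return text;
  return text.slice(0, from) + replacement + text.slice(end);
}

const source = "\nTitle: Hello World\nAuthors: Michael Dan\n";
const updated = replaceBetweenMarkers(source, "Title:", "\nAuthors:", " Hey Universe!");
console.log(updated);
```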

Related

Regex that allows a pattern to start with an optional, specific character, but no other character

How can I write a regex that allows a pattern to start with a specific character, but that character is optional?
For example, I would like to match all instances of the word "hello" where "hello" is either at the very start of the line or preceded by an "!", in which case it does not have to be at the start of the line. So the first three options here should match, but not the last:
hello
!hello
some other text !hello more text
ahello
I'm specifically interested in JavaScript.
Match it with: /^hello|!hello/g
The ^ anchors "hello" to the start of the input. To make ^ match at the beginning of every line, add the m flag: /^hello|!hello/gm.
The | works as an OR.
var str = "hello\n!hello\n\nsome other text !hello more text\nahello";
var regex = /^hello|!hello/g;
console.log( str.match(regex) );
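The sample string above happens to start with "hello", so the anchor behaviour is easy to miss. A small sketch with a made-up input where "hello" first appears on the second line shows what the m flag changes:

```javascript
// Sample text where "hello" is NOT at the very start of the input.
const text = "ahello\nhello\n!hello";

// Without m, ^ matches only at the start of the whole string.
const withoutM = text.match(/^hello|!hello/g);

// With m, ^ also matches right after each line break.
const withM = text.match(/^hello|!hello/gm);

console.log(withoutM); // [ '!hello' ]
console.log(withM);    // [ 'hello', '!hello' ]
```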
Edit:
If you're trying to match the whole line beginning with "hello" or containing "!hello" as suggested in the comment below, then use the following regex:
/^.*(^hello|!hello).*$/gm
var str = "hello\n!hello\n\nsome other text !hello more text\nahello";
var regex = /^.*(^hello|!hello).*$/gm;
console.log(str.match(regex));
Final solution (hopefully)
It looks like collecting the capture groups of all matches is only available as of ECMAScript 2020. Link 1, Link 2.
As a workaround I've found the following solution:
const str = `hello
!hello
some other text !hello more text
ahello
this is a test hello !hello
JvdV is saying hello
helloing or helloed =).`;
function collectGroups(regExp, str) {
const groups = [];
str.replace(regExp, (fullMatch, group1, group2) => {
groups.push(group1 || group2);
});
return groups;
}
const regex = /^(hello)|(?:!)(hello\b)/g;
const groups = collectGroups(regex, str)
console.log(groups)
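Where String.prototype.matchAll (added in ES2020) is available, the groups can be read directly instead of going through replace. A sketch, assuming a shortened sample text and a pattern with the m flag and a \b added to the first alternative (both are adjustments made for this sketch, not the exact regex above):

```javascript
const text = "hello\n!hello\nsome other text !hello more text\nahello";

// Group 1: "hello" at the start of a line. Group 2: "hello" right after "!".
const pattern = /^(hello\b)|!(hello\b)/gm;

// matchAll yields one match object per hit, including its capture groups.
const groups = Array.from(text.matchAll(pattern), (m) => m[1] || m[2]);

console.log(groups); // [ 'hello', 'hello', 'hello' ]
```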
/(?=!)?(\bhello\b)/g should do it. Playground.
Example:
const regexp = /(?=!)?(\bhello\b)/g;
const str = `
hello
!hello
some other text !hello more text
ahello
`;
const found = str.match(regexp)
console.log(found)
Explanation:
(?=!)?
(?=!) positive lookahead for !
? makes the lookahead optional, so it never prevents a match
(\bhello\b): capturing group
\b word boundary ensures that hello is not directly preceded or followed by a word character (a letter, digit, or underscore)
Note: If you also want to make sure that hello is not followed by !, you can simply add a negative lookahead, like so: /(?=!)?(\bhello\b)(?!!)/g.
Update
Thanks to the hint from @JvdV in the comments, I've adapted the regex, which should now meet your requirements:
/(^hello\b)|(?:!)(hello\b)/gm
Playground: https://regex101.com/r/CXXPHK/4 (The explanation can be found on the page as well).
Update 2:
Note that the non-capturing group (?:!) still consumes the !, and String#match with the g flag returns whole matches rather than groups, so I get a result like ["hello", "!hello", "!hello", "!hello"], where ! is also included. Here is a workaround:
const regex = /(^hello\b)|(?:!)(hello\b)/gm;
const found = (str.match(regex) || []).map(m => m.replace(/^!/, ''));
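In engines that support lookbehind assertions (part of ES2018), the "!" can be checked without being consumed, so no cleanup step is needed. A sketch under that assumption, using a shortened sample text:

```javascript
const text = "hello\n!hello\nsome other text !hello more text\nahello";

// (?<=^|!) asserts that "hello" starts a line or directly follows "!",
// without including the "!" in the match.
const found = text.match(/(?<=^|!)hello\b/gm) || [];

console.log(found); // [ 'hello', 'hello', 'hello' ]
```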

How to keep only the double quote part in a string in JavaScript?

I want to keep only the double quote part of a string in javascript. Suppose this is my string:
const str = 'This is an "example" of js.'
I want my result like that:
output = example
That is, I want to keep only the example part, which is inside the double quotes.
I can remove the double quote from string but I haven't found any good way to keep only the double quote part of a string.
Get the start and last index of " and then use slice.
const str = 'This is an "example" of js';
const startIdx = str.indexOf('"');
const lastIdx = str.lastIndexOf('"');
const output = str.slice(startIdx+1, lastIdx);
console.log(output);
As noted in the comments, this is not a valid string; you need to escape the inner double quotes: const str = "This is an \"example\" of js."
After that you can extract the value inside the quotes with a regex:
const matches = str.match(/"(.*?)"/);
return matches ? matches[1] : str;
you could use a regex capturing group like so:
const captured = str.match(/\"(.*)\"/)
but you'll need to declare the string with single quotes and then double quotes inside like this:
const str = 'This is an "example" of js.'
try it here: https://regexr.com/4hfh3
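The answers above differ in one detail that only shows up when the string has more than one quoted part: the greedy (.*) runs to the last quote. A sketch with a made-up sentence, using matchAll (ES2020) to collect each quoted part separately:

```javascript
const str = 'He said "hello" and then "goodbye".';

// [^"]* cannot cross a quote, so each quoted part is captured on its own.
const quoted = Array.from(str.matchAll(/"([^"]*)"/g), (m) => m[1]);

// The greedy version spans from the first quote to the last one.
const greedy = str.match(/"(.*)"/)[1];

console.log(quoted); // [ 'hello', 'goodbye' ]
console.log(greedy); // hello" and then "goodbye
```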

How to remove underscore from beginning of string

I need to remove underscores from the beginning of a string(but only the beginning),
For example:
__Hello World
Should be converted to :
Hello World
But Hello_World should stay as Hello_World.
The tricky thing is I don't know how many underscores there could be: 1, 2 or 20.
You can pass a regex to replace(). /^_+/ says: find one or more _ at the beginning of the string:
let texts = ["__Hello World", "Hello_World", 'jello world_', '_Hello_World_', '___________Hello World']
let fixed = texts.map(t => t.replace(/^_+/, ''))
console.log(fixed)
Regex is pretty suited for this task:
let str = "__h_e_l_l_o__"
console.log(str.replace(/^_*/, ""));
Method 01:
var str = '__Hello World';
str = str.replace(/^_*/, "");
Method 02:
var str = '__Hello World';
while(str.startsWith('_')){
str = str.replace('_','');
}
console.log(str);
// Hello World
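Method 02 builds a new string for every underscore it removes. A minimal sketch of a single-pass alternative that counts the leading underscores and slices once; stripLeadingUnderscores is a hypothetical helper name:

```javascript
// Hypothetical helper: removes underscores from the start of the string only.
function stripLeadingUnderscores(text) {
  let i = 0;
  // Advance past every underscore at the beginning.
  while (i < text.length && text[i] === '_') {
    i++;
  }
  return text.slice(i);
}

console.log(stripLeadingUnderscores('__Hello World'));
console.log(stripLeadingUnderscores('Hello_World'));
```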

Remove all urls in a string using Javascript

How can I remove all urls within a string regardless of where they appear using Javascript?
For example, for the following tweet-
"...Ready For It?" (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q
I would like to get back
"...Ready For It?" (#BloodPop ® Remix) out now -
To remove all urls from the string, you can use regex to identify all the urls that are there in the string and then use String.prototype.replace to replace all the urls with an empty string.
This is John Gruber's regex, which can be used to match all urls.
/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g
So to replace all the urls just run a replace with the above regex
let originalString = '"...Ready For It?" (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q'
let newString = originalString.replace(/\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))/g,'')
console.log(newString)
If your urls do not contain a literal whitespace, you could use a regex https?.*?(?= |$) to match from http with an optional s to the next whitespace or end of the string:
var str = '...Ready For It?" (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q';
str = str.replace(/https?.*?(?= |$)/g, "");
console.log(str);
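The pattern above starts matching at any "http", even inside a longer word, and stops only at a space character. A slightly stricter sketch, assuming only http:// and https:// URLs need to go and that a URL ends at the next whitespace of any kind; removeHttpUrls is a hypothetical helper name:

```javascript
// Hypothetical helper: removes http:// and https:// URLs up to the next whitespace.
function removeHttpUrls(text) {
  return text.replace(/https?:\/\/\S+/g, '');
}

const tweet = '"...Ready For It?" (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q';
const cleaned = removeHttpUrls(tweet);
console.log(cleaned);
```

Words that merely begin with "http" are left alone because the pattern requires the "://" part.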
Or split on a whitespace and check if the parts start with "http" and if so remove them.
var string = "...Ready For It?\" (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q";
string = string.split(" ");
// Iterate backwards so that removing an element does not skip the next one
for (var i = string.length - 1; i >= 0; i--) {
  if (string[i].substring(0, 4) === "http") {
    string.splice(i, 1);
  }
}
console.log(string.join(" "));
First you can split it by white space
var givenText = '...Ready For It?" https://example2.com/rsKdAQzd2q (#BloodPop ® Remix) out now - https://example.com/rsKdAQzd2q'
var allWords = givenText.split(' ');
Then you can filter out the URL words using your own implementation for checking URLs. Here we check for the index of :// for simplicity:
var allNonUrls = allWords.filter(function(s){
  // Keep the expression on the same line as return; otherwise the function returns undefined
  return s.indexOf('://')===-1; // you can call custom predicate here
});
So your non-URL string will be:
var outputText = allNonUrls.join(' ');
// '...Ready For It?" (#BloodPop ® Remix) out now -'
You can use a regular expression replace on the string to do this, although finding a good expression to match all URLs is awkward. Something like:
str = str.replace(regex, '');
The correct regex to use has been the subject of many Stack Overflow questions. It depends on whether you need to match only http(s)://xxx.yyy.zzz or a more general pattern such as www.xxx.yyy.
See this question for regex patterns to use: What is the best regular expression to check if a string is a valid URL?
function removeUrl(input) {
let regex = /http[%\?#\[\]#!\$&'\(\)\*\+,;=:~_\.-:\/ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789]*/g; // g flag: remove every URL, not only the first
let result = input.replace(regex, '');
return result;
}
let result = removeUrl('abc http://helloWorld" sdfsewr');

JavaScript Regex - Splitting a string into an array by the Regex pattern

Given an input field, I'm trying to use a regex to find all the URLs in the text fields and make them links. I want all the information to be retained, however.
So for example, I have an input of "http://google.com hello this is my content" -> I want to split that by the white space AFTER this regex pattern from another stack overflow question (regexp = /(ftp|http|https)://(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(/|/([\w#!:.?+=&%#!-/]))?/) so that I end up with an array of ['http://google.com', 'hello this is my content'].
Another ex: "hello this is my content http://yahoo.com testing testing http://google.com" -> arr of ['hello this is my content', 'http://yahoo.com', 'testing testing', 'http://google.com']
How can this be done? Any help is much appreciated!
First transform all the groups in your regular expression into non-capturing groups ((?:...)) and then wrap the whole regular expression inside a group, then use it to split the string like this:
var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*#)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%#!-/]))?)/;
var result = str.split(regex);
Example:
var str = "hello this is my content http://yahoo.com testing testing http://google.com";
var regex = /((?:ftp|http|https):\/\/(?:\w+:{0,1}\w*#)?(?:\S+)(?::[0-9]+)?(?:\/|\/(?:[\w#!:.?+=&%#!-/]))?)/;
var result = str.split(regex);
console.log(result);
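Splitting on a capturing group keeps the URLs in the result, but the text pieces keep their surrounding spaces, and an empty string appears when the input starts or ends with a URL. A sketch of a cleanup step, assuming a simplified URL pattern (scheme followed by non-whitespace) rather than the full regex above; splitAroundUrls is a hypothetical helper name:

```javascript
// Simplified pattern (an assumption for this sketch): scheme, "://", then non-whitespace.
const urlPattern = /((?:ftp|https?):\/\/\S+)/;

// Hypothetical helper: split around URLs, trim each piece, drop empty pieces.
function splitAroundUrls(text) {
  return text
    .split(urlPattern)
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

console.log(splitAroundUrls("http://google.com hello this is my content"));
console.log(splitAroundUrls("hello this is my content http://yahoo.com testing testing http://google.com"));
```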
You had a few unescaped forward slashes in your RegExp.
var str = "hello this is my content http://yahoo.com testing testing http://google.com";
// Fall back to an empty array, because match returns null when nothing matches
var captured = str.match(/(ftp|http|https):\/\/(\w+:{0,1}\w*#)?(\S+)(:[0-9]+)?(\/|\/([\w#!:.?+=&%#!-/]))?/g) || [];
// Keep the words that were not matched as URLs
var nonCaptured = str.split(' ').filter(v => captured.indexOf(v) === -1);
console.log(nonCaptured, captured);
