Match both words with a regex lookahead in JavaScript

With this regular expression:
/lorem(?=[\s,;\[\]\(\)]*ipsum)/ig
It matches "lorem" when it is followed by "ipsum", with or without any of the characters " ,;][)(" in between.
Example text: Lorem ipsum dolor sit amet, Lorem; ipsum dolor sit amet, Lorem,; (ipsum) dolor sit amet, Lorem dolor sit amet, Lorem amet.
If I use ?: instead of ?=, it matches the whole text from "lorem" to the end of "ipsum", such as "Lorem ipsum", "Lorem; ipsum", "Lorem,; (ipsum", etc.
Now I want the regex to match both "lorem" and "ipsum" without matching the " ,;][)(" characters. How do I modify the expression to do this?

/lorem(?=[\s,;\[\]\(\)]*(ipsum))/gmi
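A lookahead consumes no characters, so the punctuation never becomes part of the match, but a capturing group placed inside it still records "ipsum". Here is a minimal sketch of reading both words with String.prototype.matchAll; the variable names are illustrative and not part of the original answer:

```javascript
const text = 'Lorem ipsum dolor sit amet, Lorem; ipsum dolor sit amet, ' +
  'Lorem,; (ipsum) dolor sit amet, Lorem dolor sit amet, Lorem amet.';

// The lookahead asserts that "ipsum" follows without consuming anything;
// the capturing group inside it still exposes "ipsum" as group 1.
const pattern = /lorem(?=[\s,;\[\]\(\)]*(ipsum))/gi;

const pairs = [...text.matchAll(pattern)].map((m) => ({
  first: m[0],    // the matched "lorem"
  second: m[1],   // "ipsum", captured inside the lookahead
  index: m.index, // where "lorem" starts in the text
}));

console.log(pairs);
```

Only the first three occurrences of "Lorem" qualify, because the last two are not followed by "ipsum".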

Related

How can I read from a local txt file and check when it changes in javascript?

I want my JS program to read from a txt file whose location is hardcoded. Every time the txt file is updated, I want to store the new data in a variable.
For example, when the txt file changes from blank to the following:
Lorem ipsum dolor sit amet
It will store newInfo = "Lorem ipsum dolor sit amet"
Then if the txt file is updated to the following:
Lorem ipsum dolor sit amet
consectetur adipiscing elit
It will store newInfo = "consectetur adipiscing elit"
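The question does not say whether the program runs in Node.js or in a browser. Here is a minimal sketch, assuming Node.js and a file that only ever has text appended to it. fs.watchFile polls the file, and the helper appendedText returns whatever was added since the previous read. The function names, the polling interval, and the example path are assumptions for illustration:

```javascript
const fs = require('fs');

// Return the text added since the previous read. If the file was rewritten
// rather than appended to, fall back to the whole current content.
function appendedText(previous, current) {
  if (current.startsWith(previous)) {
    return current.slice(previous.length).trim();
  }
  return current.trim();
}

// Poll filePath and call onNewInfo with each newly added chunk of text.
function watchTextFile(filePath, onNewInfo) {
  let previous = fs.existsSync(filePath) ? fs.readFileSync(filePath, 'utf8') : '';
  fs.watchFile(filePath, { interval: 500 }, () => {
    const current = fs.readFileSync(filePath, 'utf8');
    const newInfo = appendedText(previous, current);
    previous = current;
    if (newInfo) onNewInfo(newInfo);
  });
}

// Example wiring (hypothetical path, not executed here):
// watchTextFile('./notes.txt', (newInfo) => console.log(newInfo));
```

In a browser, a page cannot watch an arbitrary local path, so this sketch does not carry over as written.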

Regex Match everything except words of particular flag format

I need a regex that matches everything except the random flags.
All flags have the same format: they start with FLAG: and end with ;
FLAG:random_token;
Example:
hello
hello world
lorem ipsum dolor sit amet
FLAG:xyz6767abcd45xyz; and lorem
lorem ipsum dolor
FLAG:abc123; and hello there,..
hello there....
Output I'm trying to obtain:
hello
hello world
lorem ipsum dolor sit amet
and lorem
lorem ipsum dolor
and hello there,..
hello there....
So far I've tried:
^(?!FLAG:(.*?);).*
and
(?!.*\bFLAG:.*$)^.*$
But these fail to extract the strings after the semicolon in FLAG:random_token;
Any help would be appreciated.
I've also tried deleting all the flags from the block, but I need the token values later, and I thought a regex would be the best fit.
One way to do this is to remove the flags from the input string with String.replace, using a regex that matches FLAG:, the random token (everything up to the next ;), and any spaces that follow. A callback function can then store the tokens as they are found:
let str = `hello
hello world
lorem ipsum dolor sit amet
FLAG:xyz6767abcd45xyz; and lorem
lorem ipsum dolor
FLAG:abc123; and hello there,..
hello there....`;
const tokens = [];
// Group 1 captures the token; [ \t]* also drops the spaces left after the semicolon.
str = str.replace(/FLAG:([^;]+);[ \t]*/g, (_, p1) => {
  tokens.push(p1);
  return '';
});
console.log(str); // the text with every flag removed
console.log(tokens); // the collected token values
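Both lookahead attempts in the question can only accept or reject a whole line, which is why the text after the semicolon is lost. The short sketch below shows that behaviour, and how matchAll can read the tokens without changing the string. The sample input and variable names are illustrative:

```javascript
const sample = [
  'hello',
  'FLAG:abc123; and hello there,..',
  'hello there....',
].join('\n');

// The negative lookahead is checked once at the start of each line, so a line
// that begins with a flag is skipped entirely, including the text after ";".
const wholeLines = sample.match(/^(?!FLAG:(.*?);).*/gm);
console.log(wholeLines); // [ 'hello', 'hello there....' ]

// matchAll leaves the input untouched and exposes each token as group 1.
const flagTokens = [...sample.matchAll(/FLAG:([^;]+);/g)].map((m) => m[1]);
console.log(flagTokens); // [ 'abc123' ]
```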

How to select all multiple spaces except last two before a specific character with a Regex?

Here is an example string:
Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam.
What would be the most elegant regex to select all extra spaces EXCEPT for two spaces before the "-" to make an elegant list?
Here is an example desired result:
Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam.
Here is my best guess: / {2,}(?! {2}-)/g.
Sadly, it also selects the two spaces before the "-".
Edit:
I think I'll go with the following:
let str = ` Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam. `;
str = str.replace(/ {2,}/g, "");
str = str.replace(/-/g, "  -"); // put two spaces before each hyphen
console.log(str);
You could match all spaces or tabs at the start and end of each line and replace them with an empty string, then prefix each line that starts with a hyphen with two spaces.
const regex = /^[\t ]+|[\t ]+$/mg;
const str = ` Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam. `;
const subst = ``;
const result = str.replace(regex, subst).replace(/^-/gm, "  -");
console.log(result);
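Runs of spaces are easy to lose when a snippet is copied, so here is the same trim-then-indent idea with the whitespace written out explicitly. The input lines and the names raw and tidy are made up for illustration:

```javascript
// Each line carries a different amount of leading and trailing whitespace.
const raw = [
  '   Lorem ipsum   ',
  '      - dolor sit amet consectetur  ',
  '  - adipisicing elit. Adipisci, quam.    ',
].join('\n');

const tidy = raw
  .replace(/^[\t ]+|[\t ]+$/gm, '') // strip spaces and tabs at both ends of every line
  .replace(/^-/gm, '  -');          // indent each list item by exactly two spaces

console.log(tidy);
```

The m flag is what makes ^ and $ apply to every line rather than only to the start and end of the whole string.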
You could also use a combination of map and trim:
let str = ` Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam. `;
str = str.split("\n").map(s => s.trim()).map(x => x.replace(/^-/, "  -")).join("\n");
console.log(str);
(^( +)[a-zA-Z])|(( +)(( {2}-)|\n|$))
(^( +)[a-zA-Z]): this group matches the run of spaces at the start of the string, together with the first letter of "Lorem ipsum".
(( +)(( {2}-)|\n|$)): this group matches a run of spaces that is followed by two spaces and a -, by a newline \n, or by the end of the string $.
https://regex101.com/r/i4ppG7/5
You can use a capturing group:
let str = `Lorem ipsum
- dolor sit amet consectetur
- adipisicing elit. Adipisci, quam. `
let finalList = str.replace(/^\s*(\s{2}.*)$/gm, '$1')
console.log('original list\n',str)
console.log('New list\n',finalList)

Javascript reg exp between closing tag to opening tag

How do I use a regular expression to select the text after the </h2> closing tag, up to the next <h2> opening tag?
<h2>my title here</h2>
Lorem ipsum dolor sit amet <b>with more tags</b>
<h2>my title here</h2>
consectetur adipisicing elit quod tempora
In this case I want to select this text: Lorem ipsum dolor sit amet <b>with more tags</b>
Try this: /<\/h2>(.*?)<h2/
This finds a closing </h2> tag, then captures everything before the next opening <h2 tag.
In JS, you'd do this to get just the text:
substr = str.match(/<\/h2>(.*?)<h2/)[1];
var str = '<h2>my title here</h2>Lorem ipsum <b>dolor</b> sit amet<h2>my title here</h2>consectetur adipisicing elit quod tempora';
var substr = str.match(/<\/h2>(.*?)<h2/)[1].replace(/<.*?>/g, '');
console.log(substr);
//returns: Lorem ipsum dolor sit amet
Try
/<\/h2>((?:\s|.)*)<h2/
You can see it in action in the example below.
(function() {
  "use strict";
  var inString, regEx, res, outEl;
  outEl = document.getElementById("output");
  inString = "<h2>my title here</h2>\n" +
    "Lorem ipsum dolor sit amet <b>with more tags</b>\n" +
    "<h2> my title here </h2>\n" +
    "consectetur adipisicing elit quod tempora";
  // (?:\s|.) matches any character, including the newlines that . alone skips
  regEx = /<\/h2>((?:\s|.)*)<h2/;
  res = regEx.exec(inString);
  console.log(res);
  res.slice(1).forEach(function(match) {
    var newEl = document.createElement("pre");
    // escape angle brackets so the captured markup is shown as text
    newEl.innerHTML = match.replace(/</g, "&lt;").replace(/>/g, "&gt;");
    outEl.appendChild(newEl);
  });
}());
<main>
<div id="output"></div>
</main>
I added \n to your example to simulate new lines. No idea why you aren't just selecting the <h2> with a querySelector() and getting the text that way.
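If the input holds more than two headings, the greedy pattern above runs all the way to the last <h2. The sketch below uses a lazy variant that collects the text after every closing </h2>, assuming the markup is simple enough for a regex. The function name textAfterHeadings is hypothetical:

```javascript
// Collect the text that follows each </h2>, up to the next <h2 or the end of input.
function textAfterHeadings(html) {
  // [\s\S] matches any character including newlines; *? keeps each match as short as possible.
  const pattern = /<\/h2>([\s\S]*?)(?=<h2|$)/g;
  return [...html.matchAll(pattern)].map((m) => m[1].trim());
}

const html = '<h2>my title here</h2>\n' +
  'Lorem ipsum dolor sit amet <b>with more tags</b>\n' +
  '<h2>my title here</h2>\n' +
  'consectetur adipisicing elit quod tempora';

const sections = textAfterHeadings(html);
console.log(sections);
```

The first entry is the text the question asks for, with its inner <b> tags left intact.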
Match the tags and remove them with the string replace() function. This solution also removes self-closing tags such as <br/> and <hr/>.
var htmlToParse = document.getElementsByClassName('input')[0].innerHTML;
htmlToParse = htmlToParse.replace(/[\r\n]+/g, ""); // collapse the multi-line HTML string into a single line
var selectedRangeString = htmlToParse.match(/(<h2>.+<h2>)/g); // match everything from the first <h2> up to the last <h2>
var parsedString = selectedRangeString[0].replace(/((<\w+>(.*?)<\/\w+>)|<.*?>)/g, ""); // remove paired tags together with their contents, plus single tags such as <br/> and <hr/>
document.getElementsByClassName('output')[0].innerHTML += parsedString;
<div class='input'>
<i>Input</i>
<h2>my title here</h2>
Lorem ipsum dolor sit amet <br/> <b>with more tags</b>
<hr/>
<h2>my title here</h2>
consectetur adipisicing elit quod tempora
</div>
<hr/>
<div class='output'>
<i>Output</i>
<br/>
</div>
A couple of things to remember about the code:
htmlToParse.match(/(<h2>.+<h2>)/g); returns an array of strings, i.e. all the strings matched by this regex.
selectedRangeString[0]: I am using only the first match for demo purposes. If you want to process every match, loop over the array with the same logic.

Separately processing wrapped lines using jQuery

I am looking for a way to separately process the lines in a <div> that are wrapped due to a narrow width. That is, if my text is "Lorem ipsum dolor sit amet lorem \n ipsum dolor sit amet" and it is seen as below:
Lorem ipsum dolor
sit amet lorem
ipsum dolor sit
amet
Then I should be able to encapsulate each 'line' in a, say, <span> tag, such as:
<span id="line0">Lorem ipsum dolor</span>
<span id="line1">sit amet lorem</span>
... etc.
Edit: We can assume that the width and height of the div is fixed and known.
I couldn't find an existing solution, although there is a good suggestion for counting lines with a fixed line-height: How to check for # of lines using jQuery
Starting with this:
<div class="narrow">Lorem ipsum dolor sit amet lorem ipsum dolor sit amet</div>
css:
.narrow {
width:60px;
}
Insert some placeholders where there are spaces:
$('.narrow').html($('.narrow').html().replace(/ /g,"<span class='count'> </span>"))
Determine the y-position of each placeholder:
$('.narrow .count').each(function() {
  var myPos = $(this).position();
  alert(myPos.top);
});
From there you should be able to work out the start and end points of each line from the y-positions.
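The grouping step can be sketched as a pure function, kept separate from jQuery so it can be run on its own. It assumes each word (rather than each space) has been wrapped and measured, and the top values below are invented for illustration. The name groupIntoLines is hypothetical:

```javascript
// Group measured words into visual lines: a new line starts whenever the
// top offset changes. Each entry is { text, top }, e.g. from $(el).position().top.
function groupIntoLines(words) {
  const lines = [];
  let currentTop = null;
  for (const { text, top } of words) {
    if (top !== currentTop) {
      lines.push([]);
      currentTop = top;
    }
    lines[lines.length - 1].push(text);
  }
  return lines.map((parts) => parts.join(' '));
}

// Hypothetical measurements for the example text in a narrow div.
const measured = [
  { text: 'Lorem', top: 0 }, { text: 'ipsum', top: 0 }, { text: 'dolor', top: 0 },
  { text: 'sit', top: 18 }, { text: 'amet', top: 18 }, { text: 'lorem', top: 18 },
  { text: 'ipsum', top: 36 }, { text: 'dolor', top: 36 }, { text: 'sit', top: 36 },
  { text: 'amet', top: 54 },
];

// Wrap every visual line in its own span, as the question asks.
const wrapped = groupIntoLines(measured).map(
  (line, i) => `<span id="line${i}">${line}</span>`
);
console.log(wrapped.join('\n'));
```

In a real page the offsets may differ by a fraction of a pixel between words on the same line, so comparing rounded values could be safer than strict equality.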
