Advanced lookarounds in regex - javascript

Hi, I have this HTML:
<div class="c-disruption-item c-disruption-item--line">
<h3 class="c-disruption-item__title" id="11e62827-9f9c-48b2-8807-09f6b6ebeec6" name="11e62827-9f9c-48b2-8807-09f6b6ebeec6"> <a>Closure of London Road</a> </h3>
<ul class="c-disruption__affected-entities">
<li>Affected routes:</li>
<li> <a href="/services/RB/X4#disruptions" class="line-block" style="background-color: #A38142; color:#FFFFFF">
<div class="line-block__contents">
X4
</div> </a> </li>
</ul>
<p>The left turn from Wiltshire Road on to London Road will be closed between 10.00pm and 5.00am on the nights of 27/28 and 28/29 April 2020.<br> <br> Lion X4 affected as follows:-<br> <br> Journeys towards Bracknell will be diverted and unable to serve the Seaford Road bus stop. Please use the Three Frogs bus stop instead.<br> <br> Journeys towards Reading are not affected and should follow normal route.<br> <br> We are sorry for the inconvenience caused.</p>
</div>
And I want to select whatever comes before and after the <ul></ul> section, meaning everything except this:
<ul class="c-disruption__affected-entities">
<li>Affected routes:</li>
<li> <a href="/services/RB/X4#disruptions" class="line-block" style="background-color: #A38142; color:#FFFFFF">
<div class="line-block__contents">
X4
</div> </a> </li>
</ul>
But if this section does not exist, I want to select everything.
I tried this pattern: ([\W\w]+(?=\<ul)|(?<=ul>)[\W\w]+) but it doesn't work if the <ul></ul> does not exist.
The selection has to be done with regex alone.
Does anybody have an idea?
Thanks

Regex should be the last resort (at least when using JavaScript). Your objective is better met by traversing the DOM than by scanning a huge string with error-prone patterns.
That means finding the unordered list with the class c-disruption__affected-entities and then excluding that <ul>.
Regex
A string is the only data type regex is equipped to deal with, so all of the HTML (which is much more than just a string) needs to be converted into a string.
let htmlString = document.body.innerHTML;
Valid HTML may use double or single quotes, and may contain runs of whitespace, empty lines, and so on. A regex must either be written to handle such inconsistencies, or be so specific to one pattern that it is useless outside that particular situation. The htmlString will most likely be a hot mess of thickly layered HTML sporting huge attribute values like: "c-disruption-item c-disruption-item--line". Anyway, here's a statement using the string method .replace() with a regex. It's untested, because it's neither efficient nor practical to use, and frankly a waste of time:
let result = htmlString.replace(/<ul\s[^>]*c-disruption__affected-entities[\s\S]*?<\/ul>/i, '');
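For anyone who really is restricted to regex, here is a minimal sketch of the string-based approach on a shortened version of the markup. The helper name stripAffectedList and the sample strings are made up for illustration, and it assumes the class name contains no regex metacharacters and that the list has no nested <ul> inside it. Because .replace() returns the input unchanged when nothing matches, the "select all if the section does not exist" case needs no extra handling.

```javascript
// Hypothetical helper (not from the original answer): removes the first <ul>
// whose opening tag mentions the given class name.
// Assumption: className contains no regex metacharacters.
function stripAffectedList(htmlString, className) {
  // [^>]* keeps the class-name search inside the opening tag;
  // [\s\S]*? is lazy, so the match stops at the first closing </ul>.
  const pattern = new RegExp(
    '<ul\\b[^>]*' + className + '[^>]*>[\\s\\S]*?</ul>',
    'i'
  );
  return htmlString.replace(pattern, '');
}

// Shortened, made-up sample strings
const withList =
  '<div><h3>Title</h3><ul class="c-disruption__affected-entities"><li>X4</li></ul><p>Text</p></div>';
const withoutList = '<div><h3>Title</h3><p>Text</p></div>';

console.log(stripAffectedList(withList, 'c-disruption__affected-entities'));
console.log(stripAffectedList(withoutList, 'c-disruption__affected-entities'));
```

A nested list inside the excluded <ul> would end the lazy match too early, which is one more reason the DOM approach is the safer choice.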
DOM
A value like this: ul.c-disruption__affected-entities has more meaning as HTML, and is accessible as a DOM object in several standard ways. The following demo features a function that easily meets OP's objective.
Demo
Note: Details are commented in demo.
/**
 * Create a documentFragment and move the excluded node
 * (along with any descendants) to it. Although the
 * excluded node is no longer part of the DOM, a
 * documentFragment allows any of its descendant nodes to
 * be reattached to the DOM however and whenever.
 *
 * @param {String} selector - A CSS selector string of the
 *                            tag that needs to be
 *                            returned without the
 *                            excluded tag.
 * @param {String} exclusion - A CSS selector string of the
 *                             tag that needs to be
 *                             removed from the returned
 *                             value.
 * @returns {String} The outerHTML of the selected tag.
 */
const excludeNode = (selector, exclusion) => {
  const frag = document.createDocumentFragment();
  const area = document.querySelector(selector);
  const excl = area.querySelector(exclusion);
  // If the excluded node does not exist, everything is returned as-is
  if (excl) frag.appendChild(excl);
  return area.outerHTML;
};
console.log(excludeNode(".c-disruption-item.c-disruption-item--line", ".c-disruption__affected-entities"));
:root {
overflow-y: scroll;
height: 200vh
}
<div class="c-disruption-item c-disruption-item--line">
<h3 class="c-disruption-item__title" id="11e62827-9f9c-48b2-8807-09f6b6ebeec6" name="11e62827-9f9c-48b2-8807-09f6b6ebeec6"> <a>Closure of London Road</a> </h3>
<ul class="c-disruption__affected-entities">
<li>Affected routes:</li>
<li>
<a href="/services/RB/X4#disruptions" class="line-block" style="background-color: #A38142; color:#FFFFFF">
<div class="line-block__contents">
X4
</div>
</a>
</li>
</ul>
<p>The left turn from Wiltshire Road on to London Road will be closed between 10.00pm and 5.00am on the nights of 27/28 and 28/29 April 2020.<br> <br> Lion X4 affected as follows:-<br> <br> Journeys towards Bracknell will be diverted and unable to serve
the Seaford Road bus stop. Please use the Three Frogs bus stop instead.<br> <br> Journeys towards Reading are not affected and should follow normal route.<br> <br> We are sorry for the inconvenience caused.</p>
</div>

Related

Why using .filter() together with .match() is only returning the first element matching the condition?

I have some HTML code where at the most nested level there is some text I'm interested in:
<div class="main">
<div class="container">
<div class="output_area">
<pre>WHITE 34</pre>
</div>
<div class="output_area">
<pre>RED 05</pre>
</div>
<div class="output_area">
<pre>WHITE 16</pre>
</div>
<div class="output_area">
<pre>BLACK</pre>
</div>
</div>
</div>
What I need to do is I need to return the output_area elements only when their nested <PRE> element contains a word + a number (for example WHITE 05, and not just BLACK).
So this is what I did:
I made an array from all output_area elements:
output_areas = Array.from(document.getElementsByClassName('output_area'));
I filtered the output_areas array to only return those output_area elements whose nested <PRE> satisfies my condition of a word + a number, using a regexp, like so:
output_areas.filter(el => el.textContent.match(/^WHITE \d+$/g));
Now, what happens is this function will only return the first matching result, so I will get an object of length 1 containing just :
<div class="output_area">
<pre>WHITE 34</pre>
</div>
and the output_area element containing <PRE> with "WHITE 16" is not returned.
As you can see at the end of the regular expression I put a "g" to request a global search and not just stop at the first result.
Not understanding why this did not work, I tried to verify what would happen if I would use includes() to perform a search:
output_areas.filter(el => el.textContent.includes('WHITE'))
(let's just forget about the numbers now, it's not important)
And what happens? This will also return only the first output_area...
But why??? What am I doing wrong?
I am not ashamed to say I've been banging my head on this for the last couple of hours... and at this point I just want to understand what is not working.
The only clue I think I got is that if I simplify my search using just a == or !=, for example:
output_areas.filter(el => el.textContent != "") // return all not empty elements
I get back all output_area elements and not just the first one!
So I suspect there must be some kind of problem when using together filter() & match(), or filter() & includes(), but with relation to that my google searches did not take me anywhere useful...
So I hope you can help!
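You should use trim() here to remove the whitespace before and after the text. The textContent of each output_area includes the line breaks and indentation around the <pre>, so the ^ and $ anchors cannot match without it: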
output_areas.filter( el => el.textContent.trim().match( /^WHITE \d+$/g ))
const output_areas = Array.from(document.getElementsByClassName('output_area'));
const result = output_areas.filter(el => el.textContent.trim().match(/^WHITE \d+$/g));
console.log(result);
<div class="main">
<div class="container">
<div class="output_area">
<pre> WHITE 34 </pre>
</div>
<div class="output_area">
<pre> RED 05 </pre>
</div>
<div class="output_area">
<pre> WHITE 16 </pre>
</div>
<div class="output_area">
<pre> BLACK </pre>
</div>
</div>
</div>
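To see why trim() matters here without a browser, the sketch below uses plain strings as stand-ins for what textContent would return for each output_area (an assumption: the exact whitespace depends on how the HTML is indented). The anchored pattern only matches once the surrounding whitespace is gone.

```javascript
// Stand-in for el.textContent: the text of <pre> plus the line breaks and
// indentation that surround it inside the div (assumed values).
const rawText = '\n  WHITE 34\n';

// No g flag: only a yes/no answer is needed, and test() with g keeps
// state in lastIndex between calls.
const pattern = /^WHITE \d+$/;

console.log(pattern.test(rawText));        // anchors fail on the untrimmed text
console.log(pattern.test(rawText.trim())); // trimmed text matches

// The same check applied to all four simulated elements
const texts = ['\n  WHITE 34\n', '\n  RED 05\n', '\n  WHITE 16\n', '\n  BLACK\n'];
const matches = texts.filter(text => pattern.test(text.trim()));
console.log(matches.map(text => text.trim()));
```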
Answering myself, as for some reason it then began to work without any changes from my side... Yes, just one of those typical IT cases we all know... :)
Jokes aside, I think for some reason the webpage (the DOM) got stuck...
Probably the Jupyter Runtime (which was serving the page) had crashed without me noticing, and this somehow caused the kind of inconsistency I was looking at.
Moral of the story: if you see weird behaviour when interacting with a Python Notebook, always go check the Jupyter Runtime status before going crazy trying to fix impossible errors.
I'm not sure what the issue with the Jupyter notebooks is, but generally speaking - based only on the HTML in the question - what I believe you are trying to do can be achieved using xpath instead of css selectors:
const html = `[your html above]
`;
const domdoc = new DOMParser().parseFromString(html, "text/html");
// Select every div whose direct <pre> child contains a space (word + number)
const areas = domdoc.evaluate('//div[contains(./pre," ")]', domdoc, null, XPathResult.ORDERED_NODE_SNAPSHOT_TYPE, null);
for (let i = 0; i < areas.snapshotLength; i++) {
console.log(areas.snapshotItem(i).outerHTML)
}
The output should be the 3 divs meeting the condition.

How to give scope to a html element

<article class=​"product_summary" data-product-id=​"1" data-product-price=​"45.00">​
<img class=​"product_image" src=​"images/​cufflinks.jpg" alt=​"cufflinks">​
<h1 class=​"product_title">​Personalised cufflinks​</h1>​
<span class=​"product_price">​£45.00​</span>​
<a class=​"add_to_basket" href=​"/​add-to-basket?product_id=1">​Add to Basket​</a>​
</article>​
Say I have an element like the one above. How can I give it scope so that I can access the different elements inside it?
I essentially want to be able to get values from within it, i.e. price, title, etc.
You have some zero-width spaces in that HTML snippet, which will complicate things. That character occurs after every = and > and in several other places. As it takes up no horizontal space, it is not easy to spot.
This character badly affects the attribute values, so it is better to remove it from the source.
Here is code that will list some of the content. I have removed those zero-width space characters from the HTML:
for (let article of document.querySelectorAll("article")) {
console.log('Article ID:', article.dataset.productId);
console.log('Price:', article.dataset.productPrice);
console.log('Title:', article.querySelector("h1").textContent);
}
<article class="product_summary" data-product-id="1" data-product-price="45.00">
<img class="product_image" src="images/cufflinks.jpg" alt="cufflinks">
<h1 class="product_title">Personalised cufflinks</h1>
<span class="product_price">£45.00</span>
<a class="add_to_basket" href="/add-to-basket?product_id=1">Add to Basket</a>
</article>
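If the markup arrives as a string (pasted from somewhere, say), the zero-width spaces can be counted and stripped before it is used. This is only a sketch: the helper name stripZeroWidthSpaces and the sample string are made up for illustration, and it targets U+200B only.

```javascript
// Hypothetical helper: removes every zero-width space (U+200B) from a string.
function stripZeroWidthSpaces(text) {
  return text.replace(/\u200B/g, '');
}

// Made-up sample with three zero-width spaces written as explicit escapes
const pasted =
  '<h1 class=\u200B"product_title">\u200BPersonalised cufflinks\u200B</h1>';
const cleaned = stripZeroWidthSpaces(pasted);

console.log(pasted.length - cleaned.length); // how many characters were removed
console.log(cleaned);
```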

Javascript reveal text animations?

I am a bit stumped on this. I just started working with javascript and I am having a little trouble getting this to work. Basically I have links within paragraphs that expand when clicked using javascript. However, I would like to add an effect to this expansion such as fading or scrolling. Previously, I have only added effects to div classes but this isn't a div. Anyway here is my code thanks!
Javascript:
function reveal(a){
var e=document.getElementById(a);
if(!e) return true;
if(e.style.display=="none"){
e.style.display="inline"
} else {
e.style.display="none"
}
return true;
}
html:
<title>Project Star in a Jar</title>
<link rel="stylesheet" type="text/css" href="/../default.css"/>
<script type="text/javascript" src="/jscript/function.js"></script>
</head>
<link rel="shortcut icon" href= "/../images/favicon.png"/>
<link rel="icon" href="/../images/favicon.png" type="image/x-icon">
<body>
<div class="wrapOverall">
<div class="header">
<div class="logo">
<img src="/../images/starJar.png" href="index.php" style="width:100px;height:100px">
<div class="moto">
<h1> Project Star in a Jar</h1>
</div>
</div>
<div class="nav_main">
<ul>
<li>Home</li>
<li>Tutorial</li>
<li>Videos</li>
<li>Other Resources</li>
</ul>
</div>
</div>
<div class="sideContent">
<div class="sideTitle">
<h3>Table of Contents</h3>
</div>
<div class="sideLinks">
<ul>
<li>i. Introduction</li>
<li>1. What is Fusion?</li>
<li>2. Hazards and Safety</li>
<li>3. Vacuum Chamber</li>
<li>4. Inner Grid</li>
<li>5. Outer Grid</li>
<li>6. Vacuum System</li>
<li>7. Electrical System</li>
<li>8. Achieving Fusion </li>
<li>9. Putting it all Together</li>
<li>10. Great, Now What?</li>
</ul>
</div>
</div>
<div class="Content">
<div class="contentTitle">
<h1>Star in a Jar - A How-To Guide</h1>
<h2>Introduction</h3>
</div>
<div class="contentText">
<!--Paragraph One-->
<p>
Why would anyone want to build a <q>star in a jar</q>? Is it because they want to feel like a mad scientist? Because they want to impress their friends or peers? Although these are all possible reasons, the main reason why people have been building and researching these incredible devices is because quite frankly, we are running out of energy solutions. If we don't have a working solution in the next 20-50 years, we either won't have energy or the little energy we produce will be outrageously expensive. The energy that we consume is almost directly matched with the human exponential growth model and we simply cannot be sustained with conventional energy production methods. The more people that are aware or interested in this technology, the faster we will be able to develop fusion based energy solutions.
</p>
<!--Paragraph Two-->
<p>
Since this is an advanced topic and this writing will use highly technical lexis, I will write in such a way as to target multiple audiences. I understand that some of you reading this are doing so because you are planning on building or already built a fusion device and are looking for more useful information to further develop and experiment with your device. On the other hand, some of you may be reading this from a purely academic standpoint and have nor the intention or means to build such a device. That is perfectly fine! If the former, and you are familiar with the terms and already have an understanding of the concepts in this text, then you can read it without expanding the text for a better tailored experience. However, if the latter, and you are unfamiliar with the terms of this field, I have developed the writing on the site to be dynamic and interactive. Every word or phrase you see that is in orange, can be expanded into an explanation when clicked. Try it out with the following phrase!
What is Tritium?.
<a id="para1" style="display:none">
(Tritium is a radioactive isotope of hydrogen with one proton and two neutrons)
</a>
I implemented this feature because I realize that this writing will be read by many different audiences with their own unique purposes for reading. As the writer, I strive to make this writing dynamic to fit their needs independently without compromising convince or enjoyment. Although the main purpose of this tutorial is to give a detailed analysis of the device and its workings in a tutorial based format, it is also to educate and address the needs and wants of the reader and hopefully, in the process raise awareness to a phenomenal technology that will change the world.
</p>
<!--Content Image-->
<div class="contentImage">
<img src="/../images/intro1.jpg" style="width:300px;height:300px">
</div>
<!--Paragraph Three-->
<p>
This tutorial is not going over any new exotic technology. The particular machine described uses very basic principles of classical physics and has been around since the mid 60's. Depending on the materials you have on hand, your results will vary. I can guarantee at the bare minimum, you will have a working demo fusor if you follow this tutorial. A demo fusor essentially does everything a normal fusor does, with the exception that no fusion of atoms is occurring. It is called demo because it is typically much easier to build and operate safely and is used to demonstrate the operation of a fusor. The picture on the right is of a preliminary run of my first demo fusor.
</p>
<!--Paragraph Four-->
<p>
<b>WARNING:</b>
</p>
<!--Paragraph Five-->
<p>
Before we begin the tutorial, I would like to point out that although this machine is of a very simple design and can be built from essentially junk, does NOT mean it is by any means safe. The minimum operating voltage for most demo fusors are 2-6kv and 10-30kv for fusors achieving fusion reactions. High voltages are extremely dangerous and the high voltage supplies discussed can and most likely will kill you if an accident occurs or you misuse them. The Hazards and Safety section will go into more detail about the all possible hazards presented and how to deal with them accordingly.
</p>
<!--Paragraph Six-->
<p>
Understand that this is a dangerous experiment that if done improperly, has the potential to harm or kill you or others who do not follow proper safety measures. I do not take any responsibility for death, injury, property damage, potential outrageous energy bills, blown breakers, glowing in the dark, fecaled pants, becoming a green hulk when angry, or failure of experimenter to hold sufficient health, liability or property insurance.
</p>
</div>
<div class="contentSelector">
>>
</div>
</div>
</div>
<div class="footer">
<a>
Copyright # 2014 Project Star In A Jar. All Rights Reserved.<br>
Website and Content Created by Joshua Hess <a target="blank" href="http://www.youtube.com/s28400"><u>(s28400)</u></a>
</a>
</div>
</body>
Your question isn't entirely clear... but it seems like you're just asking how to do animations in JavaScript.
Here's some code you can use for fading in (taken from http://youmightnotneedjquery.com/):
function fadeIn(el) {
el.style.opacity = 0;
var last = +new Date();
var tick = function() {
el.style.opacity = +el.style.opacity + (new Date() - last) / 400;
last = +new Date();
if (+el.style.opacity < 1) {
(window.requestAnimationFrame && requestAnimationFrame(tick)) || setTimeout(tick, 16)
}
};
tick();
}
fadeIn(el);
or, if you are using jQuery (not sure if the tag was listed correctly or not), the library already provides a similar function for fading in:
$(el).fadeIn();
Do note that fading in this way requires setting opacity values (not display values).
You also mentioned in your question: "Previously, I have only added effects to div classes but this isn't a div". Classes are just one way to apply CSS styles. If you've used CSS animations before, the same animation will work for nearly any element (there are cases where it won't); you just have to give that element a class with the animation you want. As the question wasn't about CSS specifically, you can find more information here: https://developer.mozilla.org/en-US/docs/Web/Guide/CSS/Using_CSS_animations or, for fade-in specifically, in this question: Using CSS for fade-in effect on page load
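The arithmetic behind each tick of the fadeIn function can be looked at on its own, without a DOM. The sketch below is an illustration only: nextOpacity is a made-up helper, the 16 ms frame time is an assumption, and it adds a clamp to 1 that the snippet above does not have.

```javascript
// Hypothetical helper: the opacity after one more frame, given how many
// milliseconds have passed and how long the whole fade should take.
function nextOpacity(current, elapsedMs, durationMs) {
  // Clamp so the value never overshoots full opacity
  return Math.min(1, current + elapsedMs / durationMs);
}

// Simulate a 400 ms fade with an assumed 16 ms between frames
let opacity = 0;
let frames = 0;
while (opacity < 1) {
  opacity = nextOpacity(opacity, 16, 400);
  frames += 1;
}

console.log(opacity, frames);
```

In a browser, the loop body is what tick() does on each requestAnimationFrame call, with the real elapsed time in place of the fixed 16 ms.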

Get javascript accordion to work with 1 ID

I'm using a nice accordion script from TYMPANUS for my website though the difference is that I am using it multiple times on 1 page like:
<div id="st-accordion" class="st-accordion">
<ul>
<li>
Flowers <span class="st-arrow">View <b>Details</b></span>
<div class="st-content">
<p>She packed her seven versalia, put her initial into the belt and made
herself on the way. When she reached the first hills of the Italic Mountains, she had
a last view back on the skyline of her hometown Bookmarksgrove, the headline of
Alphabet Village and the subline of her own road, the Line Lane.</p>
</div>
</li>
</ul>
</div>
<div id="st-accordion" class="st-accordion">
<ul>
<li>
Flowers <span class="st-arrow">View <b>Details</b></span>
<div class="st-content">
<p>She packed her seven versalia, put her initial into the belt and made
herself on the way. When she reached the first hills of the Italic Mountains, she had
a last view back on the skyline of her hometown Bookmarksgrove, the headline of
Alphabet Village and the subline of her own road, the Line Lane.</p>
</div>
</li>
</ul>
</div>
Please see this FIDDLE 01
As you can see I have 2 separate elements (I need it to be separate) using that accordion but because the ID is the same, you can clearly see that the second accordion isn't working.
Javascript for this fiddle:
$(function() {
$('#st-accordion').accordion();
});
So to get it working, I created separate ID's for each element like in this FIDDLE 02
Javascript for this fiddle:
$(function() {
$('#st-accordion-01, #st-accordion-02').accordion();
});
BUT I don't like having to always create an extra / different ID, so is there a way to get FIDDLE 01 working without having to resort to FIDDLE 02? ... Or is this just not possible?
*The Javascripts posted above are at the very bottom of the javascript section in jsfiddle
There can not / must not be elements with the same id name in one page.
http://www.w3.org/TR/html5/elements.html#the-id-attribute
The id attribute specifies its element's unique identifier (ID). The
value must be unique amongst all the IDs in the element's home subtree
and must contain at least one character. The value must not contain
any space characters.
So to fix your problem, just remove the same ids and just leave the class st-accordion
http://jsfiddle.net/5h559dyj/3/
$(function() {
$('.st-accordion').accordion();
});

JavaScript Regular Expression - grouping, one-or-more characters, excluding set character strings

I'm trying to match and replace broken HTML using a regex, but I've done a couple of full circles with grouping and lookbacks and quantifiers. I'm struggling to match every scenario.
JavaScript, because the issue is triggered in a Web client browser HTML editor.
The broken HTML is specific - any text between a closing LI and the closing list UL or OL, that is not properly formed as a list item.
For instance, this piece here, from the greater example underneath:
</li>
bbb<strong>bbbb</strong><strong>bbb <span style="text-decoration: underline;"><em>bbbbb</em></span></strong>=0==
</ul>
Here is the full example of where the issue could exist:
<ul>
<li>1111</li>
<li>Could be anything here</li>
<li>aaaa</li>
bbb<strong>bbbb</strong><strong>bbb <span style="text-decoration: underline;"><em>bbbbb</em></span></strong>=0==
</ul>
<ol>
<li>more?<li>
<li>echo</li>
</ol>
This is what I intend the HTML to look like using a match + replace.
<ul>
<li>1111</li>
<li>Could be anything here</li>
<li>aaaabbb<strong>bbbb</strong><strong>bbb <span style="text-decoration: underline;"><em>bbbbb</em></span></strong>=0==
</ul>
<ol>
<li>more?<li>
<li>echo</li>
</ol>
A few expressions I've tried are the following, but depending on these (or slight variations), I'm matching too much or not correctly or something:
/<\/li>.*?<\/[ou]l>/mig
/<\/li>([\s\n]*[\w!\.?;,<:>&\\\-\{\}\[\]\(\)~#'"=/]+[\s\n]*)+<\/[ou]l>/mig
/<\/li>([\s\n]*[^\s\n]+[\s\n]*)+<\/[ou]l>/i
Searched for a couple of days on and off, no luck.. I realise I'm probably asking something answered hundreds of times before.
It's recommended to use a DOM-based approach to processing HTML.
using jQuery:
$('ul>:not(li)').wrapAll('<li></li>');
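If the editor only hands you a string and a regex replacement is still wanted, something along these lines may work for the specific breakage described in the question. It is a sketch under assumptions: absorbStrayListContent is a made-up name, the stray content is assumed to contain no list tags of its own, and unlike the desired output in the question it keeps a closing </li> after the absorbed content.

```javascript
// Hypothetical helper: moves stray markup that sits between a closing </li>
// and the closing </ul> or </ol> into that preceding list item.
function absorbStrayListContent(html) {
  // Group 1: stray content, matched one character at a time as long as no
  //          <li>, <ul> or <ol> tag (opening or closing) starts there.
  // Group 2: optional whitespace plus the closing list tag.
  const pattern =
    /<\/li>\s*((?:(?!<\/?(?:li|ul|ol)\b)[\s\S])*?)(\s*<\/[ou]l>)/gi;
  return html.replace(pattern, (match, stray, tail) =>
    // Nothing stray: leave a well-formed list ending untouched
    stray.trim() === '' ? match : stray + '</li>' + tail
  );
}

// Shortened, made-up sample strings
const broken = '<ul>\n<li>aaaa</li>\nbbb<strong>bbbb</strong>=0==\n</ul>';
const wellFormed = '<ol>\n<li>echo</li>\n</ol>';

console.log(absorbStrayListContent(broken));
console.log(absorbStrayListContent(wellFormed));
```

The negative lookahead is what stops the match from running across several list items, which is where patterns like .*? tend to match too much.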
