How do I retrieve data from a class in a span? - javascript

I need to retrieve some portion of data from HTML code. Here it is:
<span
class="Z3988" style="display:none;"
title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&
rfr_id=info%3Asid%2Focoins.info%3Agenerator&rft.genre=article&
rft.atitle=Parliamentarism Rationalized&
rft.title=East European Constitutional Review&
rft.stitle=E. Eur. Const. Rev.&rft.date=1993&
rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni&
rft_id=http://heinonline.org/HOL/Page?handle%3Dhein.journals/eeurcr2%26id%3D33%26div%3D%26collection%3D">
</span>
I tried to use e.g.:
document.querySelector("span.Z3988").textContent
document.getElementsbyClassName("Z3988")[0].textContent
My final aim is to get what comes after:
rft.atitle (Parliamentarism Rationalized)
rft.title (East European Constitutional Review)
rft.date
rft.volume
rft.spage
rft.au
How do I do that? I'd like to avoid RegEx.

Get the title text of the span.
Split it at "=", join using a character that will not appear in the string (I chose "^"), do the same for "&", then split at that unique character ("^" in this case) and pick the value at every odd index (the even indices hold the keys). If you need a string, just join the result.
Example snippet:
var spanTitle = document.getElementsByClassName("Z3988")[0].getAttribute("title");
// Normalize both separators to "^", then split once
var data = spanTitle.split("=").join("^").split("&").join("^").split("^");
var finaldata = data.filter(function(d, index) {
return index % 2 === 1; // keys sit at even indices, values at odd ones
});
console.log(finaldata);
<span class="Z3988" style="display:none;" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&
rfr_id=info%3Asid%2Focoins.info%3Agenerator&rft.genre=article&
rft.atitle=Parliamentarism Rationalized&
rft.title=East European Constitutional Review&
rft.stitle=E. Eur. Const. Rev.&rft.date=1993&
rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni&
rft_id=http://heinonline.org/HOL/Page?handle%3Dhein.journals/eeurcr2%26id%3D33%26div%3D%26collection%3D">
</span>
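A variation on the same no-regex idea, sketched here against a plain string so it runs without a page: split on "&" first, then cut each pair at its first "=" and store the result under its key. The sample string is a shortened copy of the title from the question, and parseTitle is a hypothetical helper name chosen for illustration.

```javascript
// Shortened copy of the title attribute from the question (assumed sample data).
const title = "ctx_ver=Z39.88-2004&rft.genre=article&\n" +
  "rft.atitle=Parliamentarism Rationalized&\n" +
  "rft.title=East European Constitutional Review&\n" +
  "rft.date=1993&rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni";

// parseTitle is an illustrative helper, not part of any library.
function parseTitle(text) {
  const result = {};
  text.split("&").forEach(function (pair) {
    const eq = pair.indexOf("=");
    if (eq === -1) return; // skip fragments that are not key=value
    // trim() drops the line breaks found in the multi-line attribute
    const key = pair.slice(0, eq).trim();
    result[key] = pair.slice(eq + 1).trim();
  });
  return result;
}

const fields = parseTitle(title);
console.log(fields["rft.atitle"]);
console.log(fields["rft.date"]);
```

Looking values up by key avoids relying on their position in the string, which matters if the order of the fields ever changes.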

What you have in your title looks to be a url search query...
var elm = document.querySelector('.Z3988')
var params = new URLSearchParams(elm.title) // parse everything
console.log(...params) // list all
console.log(params.get('rft.title')) // getting one example
<span class="Z3988" style="display:none;" title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rfr_id=info%3Asid%2Focoins.info%3Agenerator&rft.genre=article&rft.atitle=Parliamentarism Rationalized&rft.title=East European Constitutional Review&rft.stitle=E. Eur. Const. Rev.&rft.date=1993&rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni&rft_id=http://heinonline.org/HOL/Page?handle%3Dhein.journals/eeurcr2%26id%3D33%26div%3D%26collection%3D"></span>
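To collect just the fields named in the question, a sketch along these lines may help. It uses a hard-coded copy of the title string instead of reading it from the page, and the names wanted and citation are made up for this example. Note that URLSearchParams percent-decodes values, so the rft_id comes back as a readable URL.

```javascript
// Shortened copy of the title attribute (assumed sample data, no DOM needed).
const title = "ctx_ver=Z39.88-2004&rft.genre=article" +
  "&rft.atitle=Parliamentarism Rationalized" +
  "&rft.title=East European Constitutional Review" +
  "&rft.date=1993&rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni" +
  "&rft_id=http://heinonline.org/HOL/Page?handle%3Dhein.journals/eeurcr2%26id%3D33";

const params = new URLSearchParams(title);

// The keys the question asks for.
const wanted = ["rft.atitle", "rft.title", "rft.date", "rft.volume", "rft.spage", "rft.au"];

const citation = {};
for (const key of wanted) {
  citation[key] = params.get(key); // null when the key is missing
}

console.log(citation);
console.log(params.get("rft_id")); // percent-encoded characters are decoded
```

All values come back as strings, so convert rft.date, rft.volume and rft.spage with Number() if numbers are needed.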

If you're trying to grab the title attribute:
document.getElementsByClassName("Z3988")[0].getAttribute("title");

Storing everything in one text string makes it awkward to read back. You could instead write each piece into its own attribute on the element and retrieve each one with element.getAttribute().
Ex:
<span id='whatever' stitle='content' spage='content'></span>
and retrieve each value from the selected element.
With the markup as you have it, you can read the title attribute into a variable and split the values like:
var element_text = document.getElementsByClassName("Z3988")[0].getAttribute("title"); // the data is in the title attribute, not the text content
var element_specifics = element_text.split('&'); // Separate the text into an array of key=value pairs, splitting by the '&'

I'm not sure how this fares across browsers or JavaScript versions, but you can swap the arrow functions for ordinary anonymous functions and "let" for "var" if needed. Otherwise, it meets the no-regex requirement and gives you a convenient way to index by keyword.
My steps:
Grab the attribute block
Split it up into array elements containing the desired keywords and contents
Split up the desired keywords and contents into sub-arrays
Trim down the contents of each keyword block for symbols and non alphanumerics
Construct the objects for convenient indexing
Obviously the last portion is just to print out the array of objects in a nice readable format. Hope this helps you out!
window.onload = function() {
let x = document.getElementsByClassName('Z3988')[0].getAttribute('title')
let a = x.split('rft.').map((y) => y.split('='))
a = a.map((x, i) => {
x = x.map((y) => {
let idx = y.indexOf('&')
return y = (idx > -1) ? y.slice(0, idx) : y
})
let x1 = x[0], x2 = x[1], obj = {}
obj[x1] = x2
return a[i] = obj
})
a.forEach((x) => {
let div = document.createElement('div')
let br = document.createElement('br')
let text = document.createTextNode(JSON.stringify(x))
div.appendChild(text)
div.appendChild(br)
document.body.appendChild(div)
})
}
<span
class="Z3988" style="display:none;"
title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&
rfr_id=info%3Asid%2Focoins.info%3Agenerator&rft.genre=article&
rft.atitle=Parliamentarism Rationalized&
rft.title=East European Constitutional Review&
rft.stitle=E. Eur. Const. Rev.&rft.date=1993&
rft.volume=2&rft.spage=33&rft.au=Tanchev, Evgeni&
rft_id=http://heinonline.org/HOL/Page?handle%3Dhein.journals/eeurcr2%26id%3D33%26div%3D%26collection%3D">
</span>

Related

Find a string in structured HTML and replace it while maintaining the structure

Let's say I have the following text:
...at anyone who did not dress...
, where I need to find and replace "did not" with "didn't". The task is simple until the text has styles and becomes:
...at anyone who <span style='color: #ff0000;'>did</span>not dress...
If I just do
obj.innerText.replace("did not", "didn't"),
then the style will not be saved, and
obj.innerHTML.replace("did not", "didn't")
will not find the phrase.
Is there an elegant solution?
UPD: the start and end positions of the phrase/word in the text are known, as well as its index in case of repetition.
const html = document.getElementsByTagName('p')[0].innerHTML;
const tags = html.match(/<\/?[^>]+(>|$)/g) || [];
const textTrue = html
.replace(/<\/?[^>]+(>|$)/g, '')
.replace('did not', "didn't");
var lastIndex = 0;
const tagsIndexs = tags.map((item) => {
lastIndex = html.indexOf(item, lastIndex);
return lastIndex;
});
const output = tags ? tags.reduce((result, tag, index) => {
return (
result.substr(0, tagsIndexs[index]) +
tag+
result.substr(tagsIndexs[index])
);
}, textTrue): textTrue;
document.getElementById('result').innerHTML = output;
<p>d<span style="color: #FF0000">id </span>not</p>
<div id='result'></div>
If 'not' is not styled (as shown in the example), the best approach I can think of is to find all occurrences of 'did' and then check whether 'not' appears in the neighborhood. If it does, remove the 'not' and replace 'did' with "didn't". This is performance-intensive, though, since you cannot use replace; you have to use indexOf in a while loop and manipulate the HTML string manually. Additionally, if the styling varies (<span>, <b>, <i>, ...), it will be very difficult (if not impossible) to come up with a valid criterion for detecting 'not' in the neighborhood of 'did'. The same approach can be applied to 'not' instead of 'did', but again it really depends on the styling you need to preserve.

Looking for the easiest way to extract an unknown substring from within a string. (terms separated by slashes)

The initial string:
initString = '/digital/collection/music/bunch/of/other/stuff'
What I want: music
Specifically, I want any term (will never include slashes) that would come between collection/ and /bunch
How I'm going about it:
if(initString.includes('/digital/collection/')){
let slicedString = initString.slice(20); //results in 'music/bunch/of/other/stuff'
let indexOfSlash = slicedString.indexOf('/'); //results, in this case, to 5
let desiredString = slicedString.slice(0, indexOfSlash); //results in 'music'
}
Question:
How the heck do I accomplish this in javascript in a more elegant way?
I looked for something like an endIndexOf() that would replace my hardcoded .slice(20)
lastIndexOf() isn't what I'm looking for, because I want the index at the end of the first instance of my substring /digital/collection/
I'm looking to keep the number of lines down, and I couldn't find anything like a .getStringBetween('beginCutoff, endCutoff')
Thank you in advance!
Your title says "index", but your example shows that you want a string. If you do in fact want the string, try this:
var desiredString;
if (initString.includes('/digital/collection/')) {
var components = initString.split('/');
desiredString = components[3]; // components[0] is '' because the string starts with '/'
}
If the path is always the same, and the field you want is the one after the third /, then you can use split.
var initString = '/digital/collection/music/bunch/of/other/stuff';
var collection = initString.split("/")[3]; // index 3, because index 0 is the empty string before the leading slash
In the real world, you will want to check that the index exists before using it.
var collections = initString.split("/");
var collection = "";
if (collections.length > 3) {
collection = collections[3];
}
You can use const desiredString = initString.slice(20, 25); if it's always music you are looking for.
If you need to find the next path param that comes after '/digital/collection/', regardless of where '/digital/collection/' lies in the path:
first use split to get a path array
then use find to return the element whose two prior elements are digital and collection respectively
const initString = '/digital/collection/music/bunch/of/other/stuff'
const pathArray = initString.split('/')
const path = pathArray.length >= 3
? pathArray.find((elm, index)=> pathArray[index-2] === 'digital' && pathArray[index-1] === 'collection')
: 'path is too short'
console.log(path)
Think about this logically: the "end index" is just the "start index" plus the length of the substring, right? So... do that :)
const sub = '/digital/collection/';
const startIndex = initString.indexOf(sub);
if (startIndex >= 0) {
let desiredString = initString.substring(startIndex + sub.length);
}
That'll give you everything from the end of the substring to the end of the full string; you can always split at / and take index 0 to get just the first directory name from what remains.
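Putting both steps together, a minimal sketch might look like this. The helper name segmentAfter is made up for this example, and returning null when the prefix is absent is just one possible choice.

```javascript
// segmentAfter is a hypothetical helper name used only for this sketch.
function segmentAfter(path, prefix) {
  const start = path.indexOf(prefix);
  if (start === -1) return null; // prefix not found
  // "End index" of the prefix = start index + prefix length
  const rest = path.substring(start + prefix.length);
  // Keep only the part before the next slash
  return rest.split("/")[0];
}

const initString = "/digital/collection/music/bunch/of/other/stuff";
console.log(segmentAfter(initString, "/digital/collection/"));
```

Because the prefix length is computed from the prefix itself, nothing has to be hardcoded or recounted when the prefix changes.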
You can also use regular expression for the purpose.
const initString = '/digital/collection/music/bunch/of/other/stuff';
const result = initString.match(/\/digital\/collection\/([a-zA-Z]+)\//)[1];
console.log(result);
The console output is:
music
If you know the initial string and the part that comes before the string you seek, the following snippet returns it. You don't need to calculate any indices.
// remove the known prefix, then take everything up to the next slash
// we should get: music
const initString = '/digital/collection/music/bunch/of/other/stuff'
const firstPart = '/digital/collection/'
const firstSegmentAfter = (s1, s2) => {
return s1.replace(s2, '').split('/')[0]
}
console.log(firstSegmentAfter(initString, firstPart))

How do I extract and split coordinates from jQuery values inside brackets?

I'm getting coordinates via jQuery like this and fill them into a form:
$('#location').val(pos);
The problem is that the value is filled in like this:
(40.00000, 150.00000)
How do I extract them from the brackets and "split" them into latitude & longitude values like:
pos_lat = 40.00000;
pos_long = 150.00000;
https://jsfiddle.net/ojcoj74y/
var pos = "(40.00000, 150.00000)";
var pos_segs = pos.slice(1,-1).split(', ');
var pos_lat = pos_segs[0];
var pos_long = pos_segs[1];
UPDATE, in reply to a comment asking whether this can also run within a function:
https://jsfiddle.net/ojcoj74y/1/
function getPos(strPos) {
var pos_segs = strPos.slice(1, -1).split(', ');
return {
posLat: pos_segs[0],
posLong: pos_segs[1]
};
}
Remove the brackets and whitespace, then split by comma. Finally (if you need to), parse the strings to floats:
positionString.replace(/\(|\)|\s/g, '').split(',').map(parseFloat);
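A slightly more defensive sketch of the same idea, returning numbers and rejecting input that does not parse. The function name parsePosition and the null return value are assumptions made for illustration.

```javascript
// parsePosition is a hypothetical helper name.
function parsePosition(text) {
  // Drop the surrounding brackets, then split into two parts
  const parts = text.trim().slice(1, -1).split(",");
  if (parts.length !== 2) return null;
  // parseFloat ignores leading whitespace, so " 150.00000" is fine
  const lat = parseFloat(parts[0]);
  const lng = parseFloat(parts[1]);
  if (Number.isNaN(lat) || Number.isNaN(lng)) return null;
  return { lat: lat, lng: lng };
}

const pos = parsePosition("(40.00000, 150.00000)");
console.log(pos.lat, pos.lng);
```

Checking for NaN up front keeps a malformed value from silently ending up in the form fields.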

Make a mountain out of a molehill by replacing it with JavaScript

I want to replace multiple words on a website with other words. That is, I am interested in finding all instances of a source word and replacing it with a target word.
Sample Cases:
Source | Target
Molehill => Mountain
Green => Grey
Google => <a href="http://google.com">
Sascha => Monika
Football => Soccer
This is somewhat of a half answer. It shows the basic process, but also illustrates some of the inherent difficulties in a process like this. Detecting capitalization and properly formatting the replacements would be fairly involved (probably handled case by case, using something like the approach in "How can I test if a letter in a string is uppercase or lowercase using JavaScript?"). Also, when dealing with text nodes, innerHTML isn't an option, so the Google replacement comes out as plain text instead of HTML.
TLDR - If you have another way to do this that doesn't involve JavaScript, do it that way.
var body = document.querySelector('body')
function textNodesUnder(el){
var n, a=[], walk=document.createTreeWalker(el,NodeFilter.SHOW_TEXT,null,false);
while(n=walk.nextNode()) a.push(n);
return a;
}
function doReplacements(txt){
txt = txt.replace(/sascha/gi, 'monika')
txt = txt.replace(/mountain/gi, 'molehill')
txt = txt.replace(/football/gi, 'soccer')
txt = txt.replace(/google/gi, 'google')
console.log(txt)
return txt
}
var textnodes = textNodesUnder(body),
len = textnodes.length,
i = -1, node
console.log(textnodes)
while(++i < len){
node = textnodes[i]
node.textContent = doReplacements(node.textContent)
}
<div>Mountains of Sascha</div>
<h1>Playing football, google it.</h1>
<p>Sascha Mountain football google</p>
Here is the JS:
function replaceWords () {
var toReplace = [
["Green","Grey"],
["Google","<a href='http://google.com'>"]
];
var input = document.getElementById("content").innerHTML;
console.log("Input: " + input);
for (var i = 0; i < toReplace.length; i++) {
var reg = new RegExp(toReplace[i][0],"g");
input = input.replace(reg,toReplace[i][1]);
}
document.getElementById("content").innerHTML = input;
};
replaceWords();
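One caveat with building a RegExp from arbitrary words is that characters such as "." or "(" are treated as pattern syntax. A possible sketch that escapes each source word and only matches whole words is shown below, working on a plain string so it runs outside a page. The names escapeRegExp and replaceWordsInText are made up for this example.

```javascript
// Escape characters that have a special meaning in a regular expression.
function escapeRegExp(word) {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Apply each [source, target] pair in order to the given text.
function replaceWordsInText(text, pairs) {
  return pairs.reduce(function (result, pair) {
    // \b restricts the match to whole words
    const pattern = new RegExp("\\b" + escapeRegExp(pair[0]) + "\\b", "g");
    // A function replacement keeps "$" in the target from being interpreted
    return result.replace(pattern, function () { return pair[1]; });
  }, text);
}

const pairs = [["Molehill", "Mountain"], ["Green", "Grey"], ["Football", "Soccer"]];
console.log(replaceWordsInText("Green Molehill, Greenland Football", pairs));
```

Since the pairs are applied one after another, a target that contains a later source word would be replaced again; ordering the pairs carefully, or doing a single pass with one combined pattern, avoids that.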

I would like to get the value of a given token within a string

I am currently working on a project that will allow me to bring in a string that would have a designated token that I will grab, get the designated value and remove the token and push to an array. I have the following condition which I am using split in JavaScript but it is not splitting on the designated ending token.
This is the beginning string
"~~/Document Heading 1~~<div>This is a test <b>JUDO</b> TKD</div>~~end~~<div class="/Document Heading 1">This is a test <b>JUDO</b> TKD</div>"
Current Code Block
var segmentedStyles = [];
var contentToInsert = selectedContent.toString();
var indexValue = selectedContent.toString().search("~~");
if (indexValue <= 0) {
var insertionStyle = contentToInsert.split("~~");
segmentedStyles.push(insertionStyle);
}
The designated token is enclosed by "~~ .... ~~". The code enters the condition, but the string is not split correctly. The following is currently pushed to my array.
This is my current result
[,/Document Heading 1<div>This is a test <b>JUDO</b> TKD</div>end,
<div class="/Document Heading 1">This is a test <b>JUDO</b> TKD</div>]
My Goal
I would like to split a string that is coming in if a token is present. For example I would like to split a string starting from ~~.....~~ through ~~end~~. The array should hold two values like the following
segmentedStyles = [<div>This is a test <b>JUDO</b> TKD</div>],[<div class="/Document Heading 1">This is a test <b>JUDO</b> TKD</div>]
You could split the string at the "~~" delimiter and keep only the parts at even indices (skipping index 0).
var string = '~~/Document Heading 1~~<div>This is a test <b>JUDO</b> TKD</div>~~end~~<div class="/Document Heading 1">This is a test <b>JUDO</b> TKD</div>',
array = string.split('~~').filter(function (_, i) {
return i && !(i % 2); // just get element 2 and 4 or all other even indices
});
console.log(array);
Assuming the string always starts with ~~/ you could use the following regex to get the array you want
~~([^\/].*)~~end~~(.*)
https://regex101.com/r/hJ0vM4/1
I honestly didn't quite understand what you're trying to accomplish, but I think I got the gist of it :)
First, to make one thing clear: if you split() your string using /~~/ as the regular expression, you'll get all the bits separated by "~~" in an array, like you did.
Second, if you change the tokens to ~~START~~ and ~~END~~ (tokens that never change), you can accomplish what you want by simply doing string.split(/~~(?:START|END)~~/), which is much shorter and quicker ;) The group is non-capturing so that the token names themselves are left out of the result.
Third, is the string always in the format ~~<something>~~THE STUFF YOU WANT~~end~~MORE STUFF YOU WANT? If it is, I'd suggest doing this:
function splitTheTokens(str) {
var result = [];
var parts = str.split(/~~end~~/);
for (var i = 0; i < parts.length; i++) {
if (!parts[i]) { continue; } // Skips blanks
if (parts[i].indexOf("~~") == 0) {
// In case you want to do something with the name thing:
var thisPartName = parts[i].substring(2, parts[i].indexOf("~~", 2));
// What (I think) you actually want
var thisPartValue = parts[i].substring(thisPartName.length + 4);
result.push(thisPartValue);
}
else {
result.push(parts[i]);
}
}
return result;
}
Hope this helps :D
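For the fixed-token variant mentioned in the second point, one detail of split is worth seeing in action: a capturing group in the pattern puts the captured token names into the result array, while a non-capturing group leaves them out. The sketch below uses a made-up input string with the hypothetical ~~START~~ and ~~END~~ tokens.

```javascript
// Assumed sample input using the fixed tokens suggested above.
const input = "~~START~~<div>first</div>~~END~~<div>second</div>";

// Capturing group: the matched token names are included in the output.
const withCapture = input.split(/~~(START|END)~~/);
console.log(withCapture);

// Non-capturing group: only the content between tokens remains.
// filter() drops the empty string produced by the leading token.
const parts = input
  .split(/~~(?:START|END)~~/)
  .filter(function (part) { return part !== ""; });
console.log(parts);
```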
