Getting JQuery from given HTML text? - javascript

I have a question about parsing HTML with jQuery.
Let me give a simple example.
When I need to parse ...
<li class="info"> hello </li>
I get the text with
$(".info").text()
My question is: given the full HTML and a piece of text, can I find the query string (selector) that matches it?
For the example above, what I want is:
var queryStr = findQuery(html,"hello") // queryStr = '.info'
I know there might be more than one result, and that the expression can take various forms depending on the DOM hierarchy.
So, in general, if some text (in this example, 'hello') is unique in the whole HTML, I would guess there must be a unique, shortest query string that satisfies $(query).text() == "hello".
If my guess is valid, how can I get a unique (and, if possible, the shortest) query string for each given unique text?
Any hint will be appreciated, and thanks for your help!

I created a little function that may help you:
function findQuery(str) {
  // Only the direct children of <body> are checked.
  $("body").children().each(function() {
    if ($.trim($(this).text()) == $.trim(str)) {
      alert("." + $(this).attr("class"));
    }
  });
}
See working demo

I am not sure what you're actually trying to achieve, but, based on your specific question, you could do the following.
var queryStr = findQuery($('<li class="info"> hello </li>'), "hello"); // queryStr = 'info'
// OR
var queryStr = findQuery('<li class="info"> hello </li>', "hello"); // queryStr = 'info'
alert(queryStr); // The class names come back without the leading dot; prepend "." if needed.
function findQuery(html, str) {
  // Accept either a jQuery object or an HTML string.
  var $html = html instanceof jQuery && html || $(html);
  var content = $html.text();
  // Case-insensitive match; str is treated as a regular expression pattern.
  if (content.match(new RegExp(str, "gi"))) {
    return $html.attr("class");
  }
  return "no matching string found!";
}
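One caveat with the function above: str is passed straight to new RegExp, so text containing characters such as ( ) . or ? is interpreted as a pattern rather than matched literally. A minimal sketch of escaping the text first follows; escapeRegExp and containsText are hypothetical helper names, not part of jQuery.

```javascript
// Escape every character that has a special meaning in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: case-insensitive "contains" that treats needle literally.
function containsText(haystack, needle) {
  return new RegExp(escapeRegExp(needle), "i").test(haystack);
}

console.log(containsText(" Hello (world) ", "hello (world)")); // parentheses matched literally
console.log(containsText("price: 100", "1.0"));                // the dot no longer matches any character
```

Inside findQuery, the same idea would mean building the pattern with new RegExp(escapeRegExp(str), "gi").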

Hope this demo helps you!!
$(document).ready(function() {
  var test = $("li:contains('hello')").attr('class');
  alert(test);
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<ul>
<li class="info">hello</li>
</ul>
This uses the jQuery ":contains()" selector.
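None of the answers above checks that the resulting selector is unique, which is what the question asks for. The following is a rough sketch of one way to do that with the standard DOM API, under several assumptions: root is a document or element exposing querySelectorAll, only leaf elements are considered, and only id, class, tag, and tag.class selectors are tried. The name findQuery is reused from the question; the candidate list is an illustration, not an exhaustive search, and ids or class names with special characters would need escaping.

```javascript
// Return the shortest simple selector that matches exactly one element,
// namely the single leaf element whose trimmed text equals `text`.
// Returns null when the text is missing, not unique, or no candidate is unique.
function findQuery(root, text) {
  const hits = Array.from(root.querySelectorAll("*")).filter(
    (el) => el.children.length === 0 && el.textContent.trim() === text
  );
  if (hits.length !== 1) return null;

  const el = hits[0];
  const tag = el.tagName.toLowerCase();
  const candidates = [tag];
  if (el.id) candidates.push("#" + el.id);
  for (const cls of el.classList) {
    candidates.push("." + cls, tag + "." + cls);
  }

  // Try the shortest candidates first and keep the first unique one.
  candidates.sort((a, b) => a.length - b.length);
  const unique = candidates.find((sel) => root.querySelectorAll(sel).length === 1);
  return unique || null;
}

// In a browser: findQuery(document, "hello") returns a string usable as $(result).
```

Because jQuery accepts standard CSS selectors, the returned string can be passed to $() directly. Text that sits in deeper or mixed content would need ancestor paths or :nth-child, which this sketch leaves out.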

Related

Find a string in structured HTML and replace it while maintaining the structure

Let's say I have the following text:
...at anyone who did not dress...
where I need to find "did not" and replace it with "didn't". The task is simple until the text is styled and becomes:
...at anyone who <span style='color: #ff0000;'>did</span>not dress...
If I just do
obj.innerText.replace("did not", "didn't"),
then the styling is lost, and
obj.innerHTML.replace("did not", "didn't")
will not find the phrase, because the markup splits it.
Is there an elegant solution?
UPD: I know the start and end positions of the phrase/word in the text, as well as its occurrence index in case it repeats.
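const html = document.getElementsByTagName('p')[0].innerHTML;
// Collect every tag in document order.
const tags = html.match(/<\/?[^>]+(>|$)/g) || [];
// Strip the tags, then replace the phrase in the plain text.
const textTrue = html
  .replace(/<\/?[^>]+(>|$)/g, '')
  .replace('did not', "didn't");
// Record where each tag started in the original HTML.
let lastIndex = 0;
const tagsIndexs = tags.map((item) => {
  lastIndex = html.indexOf(item, lastIndex);
  return lastIndex;
});
// Re-insert each tag at its original offset. The offsets come from the
// original HTML, so when the replacement changes the text length, later
// tags can land at shifted positions.
const output = tags.reduce((result, tag, index) => {
  return (
    result.slice(0, tagsIndexs[index]) +
    tag +
    result.slice(tagsIndexs[index])
  );
}, textTrue);
document.getElementById('result').innerHTML = output;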
<p>d<span style="color: #FF0000">id </span>not</p>
<div id='result'></div>
If 'not' is not styled (as shown in the example), the best approach I can think of is to find every occurrence of 'did' and then check whether 'not' follows nearby. If it does, remove the 'not' and replace 'did' with "didn't". This is performance-intensive, however, because you cannot use replace directly; you have to use indexOf in a while loop and manipulate the HTML string manually. Additionally, if the styling varies (<span>, <b>, <i>, ...), it will be very difficult, if not impossible, to come up with a valid criterion for detecting 'not' in the neighborhood of 'did'. The same approach can be applied to 'not' instead of 'did', but again it depends on the styling you need to preserve.
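As an alternative to searching the markup directly, one can search the tag-free text and map the match back onto the individual text segments. The sketch below does this with plain string handling; replaceAcrossTags is a hypothetical name. It assumes the HTML contains no literal "<" inside text or attribute values, it does not decode entities, and it puts the whole replacement into the first segment the phrase touches.

```javascript
// Replace the first occurrence of `phrase` in the visible text of `html`,
// leaving every tag where it was, even when tags split the phrase.
function replaceAcrossTags(html, phrase, replacement) {
  // split() with a capturing group keeps the tags:
  // even indexes are text segments, odd indexes are tags.
  const tokens = html.split(/(<[^>]+>)/);
  const text = tokens.filter((_, i) => i % 2 === 0).join("");
  const start = text.indexOf(phrase);
  if (start === -1) return html;
  const end = start + phrase.length;

  let pos = 0;          // offset of the current text segment within `text`
  let inserted = false; // the replacement is written only once
  return tokens.map((tok, i) => {
    if (i % 2 === 1) return tok;              // tag: copy through unchanged
    const tokStart = pos;
    pos += tok.length;
    const from = Math.max(start, tokStart);
    const to = Math.min(end, pos);
    if (from >= to) return tok;               // segment does not overlap the phrase
    const middle = inserted ? "" : replacement;
    inserted = true;
    return tok.slice(0, from - tokStart) + middle + tok.slice(to - tokStart);
  }).join("");
}

const styled = "...at anyone who <span style='color: #ff0000;'>did</span> not dress...";
console.log(replaceAcrossTags(styled, "did not", "didn't"));
```

Which segment should receive the replacement is a design choice. Here the first one wins, so the styling of "did" carries over to "didn't" and the following segment simply loses " not".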

Finding and replacing string if present in a JSONObj Javascript

I have a JSONObj which contains various elements. I want to run a regex (or some other kind of search) over the text data of this object, find various strings, and replace them with other text. For example, I want to replace the string "/content" with the string "http://www.example.com/content":
description = jsonObj.channel.item[jsonCounter].description.replace(/\/content/g, "http://www.example.com/content");
This works perfectly, but I want to first check whether a string is present and then replace it. I tried:
if (holder.indexOf("about-us") !== -1) {
description = jsonObj.channel.item[jsonCounter].description.replace(/\/about-us/g, "http://www.example.com/about-us");
} else {
description = jsonObj.channel.item[jsonCounter].description.replace(/\/content/g, "http://www.example.com/content");
}
But this doesn't seem to work. Can anyone help me solve this issue?
As you said:
holder is my JSONObj converted to a string:
var holder = jsonObj.toString();
Calling toString() on a plain object does not serialize its contents, so the check can never succeed:
var holderJSON = {url:"http://www.example.com/about-us"}
alert(holderJSON.toString()); // this returns "[object Object]"
if (holder.indexOf("about-us") !== -1) // is never true
Hope this helps!!
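A small sketch of the difference, assuming a feed object shaped like the jsonObj in the question (the sample data here is made up for illustration): JSON.stringify serializes the contents, so a substring check on its result behaves as intended.

```javascript
// Hypothetical data shaped like the jsonObj in the question.
const jsonObj = {
  channel: { item: [{ description: 'Read <a href="/about-us">about us</a>' }] }
};

// toString() on a plain object ignores its contents.
const viaToString = jsonObj.toString();
// JSON.stringify serializes the contents, so a substring search works.
const holder = JSON.stringify(jsonObj);
const hasAboutUs = holder.indexOf("about-us") !== -1;

// Prefix whichever path is present with the site origin.
const path = hasAboutUs ? "/about-us" : "/content";
const description = jsonObj.channel.item[0].description
  .split(path)
  .join("http://www.example.com" + path);

console.log(viaToString);
console.log(hasAboutUs);
console.log(description);
```

Note that replace() returns the string unchanged when nothing matches, so in many cases both replacements can simply be applied without a prior check.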

Proper Use Of YouTube Url Regex

I found this regex on Stack Overflow to get the YouTube video ID.
function ytVidId(youtubeurl) {
var p = /^(?:https?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:embed\/|v\/|watch\?v=|watch\?.+&v=))((\w|-){11})(?:\S+)?$/;
return (url.match(p)) ? RegExp.$1 : false;
}
I feel like I'm missing something very obvious, but I just don't understand how to actually use it. How do I get this to affect my text input field named "youtubeurl" before it's prepared for the database?
Thanks a lot... Any help appreciated!
First you need to get the text from the text box, for example with document.getElementById(...).value. If your text box's id is txt1, you can use this code:
function ytVidId(youtubeurl) {
  var p = /^(?:https?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:embed\/|v\/|watch\?v=|watch\?.+&v=))((\w|-){11})(?:\S+)?$/;
  // Use the parameter name consistently and read the ID from the match result.
  var match = youtubeurl.match(p);
  return match ? match[1] : false;
}
var ytURL = document.getElementById("txt1").value;
var ytID = ytVidId(ytURL);
Now the variable ytID contains the YouTube video ID (or false if the URL did not match), and you can add it to the database however you want.
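Before wiring the function to the form, it can help to see which URL shapes the pattern accepts. The following sketch runs the same regex over a few sample URLs; the sample values are made up for illustration, and the ID is read from the match result rather than from RegExp.$1.

```javascript
function ytVidId(url) {
  const p = /^(?:https?:\/\/)?(?:www\.)?(?:youtu\.be\/|youtube\.com\/(?:embed\/|v\/|watch\?v=|watch\?.+&v=))((\w|-){11})(?:\S+)?$/;
  const match = url.match(p);
  return match ? match[1] : false;
}

// Sample inputs: three YouTube URL shapes and one non-YouTube URL.
const samples = [
  "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
  "youtu.be/dQw4w9WgXcQ",
  "https://www.youtube.com/embed/dQw4w9WgXcQ?rel=0",
  "https://example.com/watch?v=dQw4w9WgXcQ"
];

for (const url of samples) {
  console.log(url, "->", ytVidId(url));
}
```

A false result is the signal to reject the field value instead of storing it.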

How to get first text node of a string while containing bold and italic tags?

The string (s) is dynamic.
It originates from an onclick event when the user clicks anywhere in the DOM.
If the first part of the string (s), that is:
"login<b>user</b>account"
is enclosed in some element, like this:
"<div>login<b>user</b>account</div>",
then I can get it with this:
alert($(s).find('*').andSelf().not('b,i').not(':empty').first().html());
// result is : login<b>user</b>account
But how can I get the same result when it is not enclosed in any element?
I tried the code below, which works fine when the first part does not include any <b></b>, but it only gives "login" when it does include these tags.
var s = $.trim('login<b>user</b> account<tbody> <tr> <td class="translated">Lorem ipsum dummy text</td></tr><tr><td class="translated">This is a new paragraph</td></tr><tr><td class="translated"><b>Email</b></td></tr><tr><td><i>This is yet another text</i></td> </tr></tbody>');
if(s.substring(0, s.indexOf('<')) != ''){
alert(s.substring(0, s.indexOf('<')));
}
Note:
Please suggest a generic solution that is not specific to the string above. It should work both when there are bold tags and when there aren't any.
So it's just a b or a i, heh?
A recursive function is always the way to go. And this time, it's probably the best way to go.
var s = function getEm(elem) {
    var ret = '';
    // TextNode? Great!
    if (elem.nodeType === 3) {
        ret += elem.nodeValue;
    }
    else if (elem.nodeType === 1 && elem.firstChild &&
        (elem.nodeName === 'B' || elem.nodeName === 'I')) {
        // Element? And it's a B or an I? Get his kids!
        // (The firstChild check skips an empty <b></b> or <i></i>.)
        ret += getEm(elem.firstChild);
    }
    // Ain't nobody got time fo' empty stuff.
    if (elem.nextSibling) {
        ret += getEm(elem.nextSibling);
    }
    return ret;
}(elem);
Jsfiddle demonstrating this: http://jsfiddle.net/Ralt/TZKsP/
PS: Parsing HTML with regex or custom tokenizer is bad and shouldn't be done.
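The recursive walk above skips over elements that are not <b> or <i> and keeps collecting text from later siblings. If the goal is only the leading run, a variant can stop at the first other node instead. This is a sketch under that assumption; leadingText is a hypothetical name, and it expects a list of DOM nodes such as container.childNodes.

```javascript
// Collect the text of the leading run of nodes: text nodes plus <b>/<i>
// elements (descending into them), stopping at the first other node.
function leadingText(nodes) {
  let out = "";
  for (const node of Array.from(nodes)) {
    if (node.nodeType === 3) {
      out += node.nodeValue;                 // text node
    } else if (node.nodeType === 1 &&
               (node.nodeName === "B" || node.nodeName === "I")) {
      out += leadingText(node.childNodes);   // inline formatting: include its text
    } else {
      break;                                 // anything else ends the leading run
    }
  }
  return out;
}

// In a browser, for a string s that is not wrapped in any element:
//   const box = document.createElement("div");
//   box.innerHTML = s;
//   leadingText(box.childNodes);
```

Like the answer above, this returns the text only. Keeping the <b> and <i> markup in the result would mean concatenating outerHTML for those elements instead.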
You're trying to retrieve all of the text up to the first element that's not a <b> or <i>, but this text could be wrapped in an element itself. This is SUPER tricky. I feel like there's a better way to implement whatever it is you're trying to accomplish, but here's a solution that works.
function initialText(s){
    var test = s.match(/(<.+?>)?.*?<(?!(b|\/|i))/);
    // No opening tag other than <b>/<i> was found: return the string as is.
    if (!test) return s;
    var match = test[0];
    var prefixed_element = test[1];
    // if the string was prefixed with an element tag
    // remove it (ie '<div> blah blah blah')
    if(prefixed_element) match = match.slice(prefixed_element.length);
    // remove the matching < and return the string
    return match.slice(0,-1);
}
You're lucky I found this problem interesting and challenging because, again, this is ridiculous.
You're welcome ;-)
Try this:
if (s.substring(0, s.indexOf('<')) != '') {
alert(s.substring(0, s.indexOf('<tbody>')));
}

Dojo Toolkit: how to escape an HTML string?

A user of my HTML 5 application can enter his name in a form, and this name will be displayed elsewhere. More specifically, it will become the innerHTML of some HTML element.
The problem is that this can be exploited if one enters valid HTML markup in the form, i.e. some sort of HTML injection, if you will.
The user's name is only stored and displayed on the client side so in the end the user himself is the only one who is affected, but it's still sloppy.
Is there a way to escape a string before I put it in an element's innerHTML in Dojo? I guess that Dojo at one point did in fact have such a function (dojo.string.escape()), but it doesn't exist in version 1.7.
Thanks.
dojox.html.entities.encode(myString);
Dojo has the module dojox/html/entities for HTML escaping. Unfortunately, the official documentation still provides only a pre-1.7, non-AMD example.
Here is an example of how to use that module with AMD:
var str = "<strong>some text</strong>";
require(['dojox/html/entities'], function(entities) {
  var escaped = entities.encode(str);
  console.log(escaped);
});
Output:
&lt;strong&gt;some text&lt;/strong&gt;
As of Dojo 1.10, the escape function is still part of the string module.
http://dojotoolkit.org/api/?qs=1.10/dojo/string
Here's how you can use it as a simple template system.
require(['dojo/string'], function(string){
  var template = '<h1>${title}</h1>';
  var message = {title: 'Hello World!<script>alert("Doing something naughty here...")</script>'};
  // string.escape is applied to each substituted value
  var html = string.substitute(template, message, string.escape);
});
I tried to find out how other libraries implement this function and I stole the idea of the following from MooTools:
var property = (document.createElement('div').textContent == null) ? 'innerText': 'textContent';
elem[property] = "<" + "script" + ">" + "alert('a');" + "</" + "script" + ">";
So according to MooTools, assigning to either innerText or textContent (whichever the browser supports) inserts the string as plain text, which effectively escapes the HTML.
Check this example of dojo.replace:
require(["dojo/_base/lang"], function(lang){
  function safeReplace(tmpl, dict){
    // convert dict to a function, if needed
    var fn = lang.isFunction(dict) ? dict : function(_, name){
      return lang.getObject(name, false, dict);
    };
    // perform the substitution
    return lang.replace(tmpl, function(_, name){
      if(name.charAt(0) == '!'){
        // no escaping
        return fn(_, name.slice(1));
      }
      // escape
      return fn(_, name).
        replace(/&/g, "&amp;").
        replace(/</g, "&lt;").
        replace(/>/g, "&gt;").
        replace(/"/g, "&quot;");
    });
  }
  // that is how we use it:
  var output = safeReplace("<div>{0}</div>",
    ["<script>alert('Let\' break stuff!');</script>"]
  );
});
Source: http://dojotoolkit.org/reference-guide/1.7/dojo/replace.html#escaping-substitutions
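The escaping step itself does not depend on Dojo. As a minimal sketch, assuming only the five characters that matter in HTML text and attribute values need handling, a standalone helper could look like this (escapeHtml is a hypothetical name, not a Dojo function):

```javascript
// Map each HTML-significant character to its entity.
const HTML_ENTITIES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;"
};

// Escape a string so it can be assigned to innerHTML as plain text.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml("<script>alert('x')</script>"));
```

Doing the replacement in a single pass avoids double-escaping the ampersands that the other entities introduce.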
