JavaScript to Parse Reddit Comment Page for Usernames

I'm working on a bookmarklet for a subreddit and I'm trying to grab all the usernames on a comments page so I can parse them, then come back and update info next to them, similar to what RES does. The author of each comment has a class that is prefixed with "author" but has different text at the end of the class name. How would I go about grabbing all the usernames?
Then, once I have the list, how would I update each one with an additional icon?
Any suggestions or tutorials that do similar things would be great.
Edit: I'm not really sure what portions of the markup would be helpful without giving a huge block. Here's the same question I asked in the JavaScript subreddit: http://www.reddit.com/r/javascript/comments/yhp7j/best_way_to_find_all_the_usernames_on_a_reddit/
You should be able to inspect the name elements and see what I'm working with.
Currently working with this: http://net.tutsplus.com/tutorials/javascript-ajax/create-bookmarklets-the-right-way/
So I've got a Hello World-style bookmarklet working that checks for jQuery, loads it if it's not present, and just throws an alert.

From a quick look at the page you linked to in your question, it seems as if the mark-up surrounding user-names is as follows (using, presumably, your user-name as an example):
<a href="http://www.reddit.com/user/DiscontentDisciple" class="author id-t2_4allq" >DiscontentDisciple</a>
If that's the case, and the jQuery library is available (again, from your question), one approach is to simply use:
var authors = [];
$('a.author').html(
function(i, h) {
var authorName = $(this).text();
if ($.inArray(authorName, authors) == -1) {
authors.push(authorName); // an array of author-names
}
return '<img src="path/to/' + encodeURIComponent(authorName) + '-image.png" />' + h;
});
console.log(authors);
JS Fiddle proof-of-concept.
Or, similarly just use the fact that the user-name seems to be predictably the last portion of the URL in the a element's href attribute:
var authors = [];
$('a.author').html(
function(i, h) {
var authorName = this.href.split('/').pop();
if ($.inArray(authorName, authors) == -1) {
authors.push(authorName);
}
return '<img src="http://www.example.com/path/to/' + authorName+ '-image.png" />' + h;
});
console.log(authors);
JS Fiddle proof-of-concept.
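One caveat with split('/').pop(): if the href ends in a trailing slash or carries a query string, the last segment is no longer the bare user-name. Below is a minimal sketch of a more tolerant extraction. The helper names extractUsername and uniqueUsernames are made up for illustration, and it assumes profile links always contain /user/<name>, as in the mark-up shown above.

```javascript
// Hypothetical helper: pull the name that follows "/user/" out of a profile URL.
// Returns null when the URL does not look like a profile link.
function extractUsername(href) {
  var match = /\/user\/([^\/?#]+)/.exec(href);
  return match ? match[1] : null;
}

// Hypothetical helper: de-duplicate names while keeping first-seen order.
function uniqueUsernames(hrefs) {
  var seen = Object.create(null); // no prototype, so any name is a safe key
  var names = [];
  hrefs.forEach(function (href) {
    var name = extractUsername(href);
    if (name !== null && !seen[name]) {
      seen[name] = true;
      names.push(name);
    }
  });
  return names;
}

// In the bookmarklet (assumes jQuery is loaded, as in the question):
// var hrefs = $('a.author').map(function () { return this.href; }).get();
// console.log(uniqueUsernames(hrefs));
```

Keeping the extraction separate from the DOM work also makes it easy to check against a few sample URLs before running it on a live page.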
Both of these approaches put the img within the a element. If you want it before the a element, then simply use:
// creates an 'authors' variable, and sets it to be an array.
var authors = [];
$('a.author').each( // iterates through each element returned by the selector
function() {
var that = this, // caches the this variable, rather than re-examining the DOM.
// takes the href of the current element, splits it on the '/' characters,
// and returns the *last* of the elements from the array formed by split()
authorName = that.href.split('/').pop();
// looks to see if the current authorName is in the authors array, if it *isn't*
// the $.inArray returns -1 (like indexOf())
if ($.inArray(authorName, authors) == -1) {
// if authorName not already in the array it's added to the array using
// push()
authors.push(authorName);
}
// creates an image element, concatenates the authorName variable into the
// src attribute-value
$('<img src="http://www.example.com/path/to/' + authorName+ '-image.png" />')
// inserts the image before the current element (converted to a jQuery
// object in order to use insertBefore())
.insertBefore($(that));
});
console.log(authors);
JS Fiddle proof-of-concept.
References:
each().
$.inArray().
insertBefore().

Related

I am getting empty values for sports-title and third

I am new to JS.
Can you tell me why I am getting empty values for sports-title and third,
even though we have one div with content in it?
sports-title---->{"0":{}}
third---->{}
My code is below.
findStringInsideDiv() {
/*
var str = document.getElementsByClassName("sports-title").innerHTML;
*/
var sportsTitle = document.getElementsByClassName("sports-title");
var third = sportsTitle[0];
var thirdHTML = third.innerHTML
//str = str.split(" ")[4];
console.log("sports-title---->" + JSON.stringify(sportsTitle));
console.log("third---->" + JSON.stringify(third));
console.log("thirdHTML---->" + JSON.stringify(thirdHTML));
if ( thirdHTML === " basketball football swimming " ) {
console.log("matching basketball---->");
var menu = document.querySelector('.sports');
menu.classList.add('sports-with-basketball');
// how to add this class name directly to the first div after body.
// but we are not rendering that div in accordion
//is it possible
}
else{
console.log("not matching");
}
}
When you select an object in the Document Object Model (DOM) using any of the getElement... selectors, you get back an object that represents that HTML element. This object includes much more than just the text inside the element, and JSON.stringify() only serializes an object's own enumerable properties, which a DOM element typically does not have, so the result is {}. In order to access the text of that element, use the .textContent property.
In addition, an HTML class can be assigned to several elements, so getElementsByClassName returns an array-like collection (an HTMLCollection) rather than a single element. You would have to index into it, for example:
console.log("sports-title---->" + JSON.stringify(sportsTitle[0].textContent));
You can find a brief introduction to the DOM on the W3Schools Website. https://www.w3schools.com/js/js_htmldom.asp If you follow along it gives an overview of different aspects of the DOM including elements.
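The output in the question can be reproduced without a browser. The sketch below uses two stand-in classes (FakeElement and FakeCollection are invented for this illustration and are not real DOM types) that mimic the relevant behaviour: the element exposes its text only through a getter on the prototype, while the collection has its indexed entries as own enumerable properties.

```javascript
// Stand-in for a DOM element: as with real elements, the useful
// properties are getters on the prototype, not own enumerable properties.
class FakeElement {
  #text;
  constructor(text) {
    this.#text = text;
  }
  get textContent() {
    return this.#text;
  }
}

// Stand-in for an HTMLCollection: the indexed entries are own enumerable
// properties, while length is a getter on the prototype.
class FakeCollection {
  constructor(items) {
    items.forEach((item, index) => {
      this[index] = item;
    });
  }
  get length() {
    return Object.keys(this).length;
  }
}

const sportsTitle = new FakeCollection([
  new FakeElement(" basketball football swimming "),
]);

const wholeCollection = JSON.stringify(sportsTitle);
const firstElement = JSON.stringify(sportsTitle[0]);
const firstText = JSON.stringify(sportsTitle[0].textContent);

console.log("sports-title---->" + wholeCollection); // {"0":{}}
console.log("third---->" + firstElement);           // {}
console.log("thirdHTML---->" + firstText);          // " basketball football swimming "
```

Only the last call serializes a plain string, which is why reading textContent (or innerHTML) first gives the expected result.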
Maybe this will be helpful.
As you can see, sportsTitle[0].textContent returns the full heading. The 0 is the element's index in the collection, which is why "0" appears as a key when you stringify (serialize) sportsTitle. Why 0? Because you have only one <h1> element. See this fiddle: http://jsfiddle.net/cqj6g7f0/3/
I added a second h1 there; look at the console.log output and you will see two indexes, 0 and 1.
If you want to get a single word from the element, take a substring with the substr() method (now deprecated; substring() and slice() are the non-deprecated alternatives): https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/substr
One way is to change the <h1> class attribute to an id, select the element with document.getElementById(), read its textContent directly,
and use substr() on that string
or
the second way is to keep the class attribute, read sportsTitle[0].textContent,
and use substr() on that string.
The second is the better way.
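As an alternative to counting character offsets, the text can be split into words, which is roughly what the commented-out split(" ")[4] attempt in the question was reaching for. A small sketch, assuming the words are separated by whitespace; wordsOf and hasWord are hypothetical helper names.

```javascript
// Hypothetical helper: split element text into words, ignoring
// leading, trailing, and repeated whitespace.
function wordsOf(text) {
  var trimmed = text.trim();
  return trimmed === "" ? [] : trimmed.split(/\s+/);
}

// Hypothetical helper: true when `word` appears as a whole word in `text`.
function hasWord(text, word) {
  return wordsOf(text).indexOf(word) !== -1;
}

// In the page this could be fed from the element's text, for example:
// var text = document.getElementsByClassName("sports-title")[0].textContent;
// if (hasWord(text, "basketball")) { /* add the class */ }
```

This avoids comparing against the exact string " basketball football swimming ", which breaks as soon as the spacing in the markup changes.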

How can I get back to the original DOM after being affected by javascript

Imagine I have a loaded HTML page which has already been affected by JavaScript adding/deleting dynamic elements or adding new classes/attributes/ids to elements while initializing (e.g., in the original source code the [html] tag has no classes; after JavaScript loads, the [html] tag has class="no-responsive full-with"). Imagine that after that I add/amend some id values manually (through my app). And imagine I need to be able to save the original source code in a database (without any of those amendments) but with the id attributes I added manually.
Basically I need to add a given id attribute to an element within the source code of an HTML, loaded through PHP.
Do you guys have any idea of how to do such a thing?
There's no simple solution here. The exact nature of the complex solution will be determined by your full set of requirements.
Updated Concept
You've said that in addition to changing things, you'll also be adding elements and removing them. So you can't relate the changed elements to the originals purely structurally (e.g., by child index), since those may change.
So here's how I'd probably approach it:
Immediately after the page is loaded, before any modifications are made, give every element in the document a unique identifier. This is really easy with jQuery (and not particularly hard without it):
var uniqueId = 0;
$("*").attr("data-uid", function() {
return ++uniqueId;
});
Now every element on the page has a unique identifier. Next, copy the DOM and get a jQuery wrapper for it:
var clone = $("html").clone();
Now you have a reliable way to relate elements in the DOM with their original versions (our clones), via the unique IDs. Allow the user to make changes.
When you're ready to find out what changes were made, you do this:
// Look for changes
clone.find("*").addBack().each(function() {
// Get this clone's unique identifier
var uid = $(this).attr("data-uid");
// Get the real element corresponding to it, if it's
// still there
var elm = $("[data-uid=" + uid + "]")[0];
// Look for changes
if (!elm) {
// This element was removed
}
else {
if (elm.id !== this.id) {
// This element's id changed
}
if (elm.className !== this.className) {
// This element's className changed
}
// ...and so on...
}
});
That will tell you about removed and changed elements. If you also want to find added elements, just do this:
var added = $(":not([data-uid])");
...since they won't have the attribute.
You can use the information in clone to reconstruct the original DOM's string:
clone.find("[data-uid]").addBack().removeAttr("data-uid");
var stringToSend = clone[0].outerHTML;
(outerHTML is supported by any vaguely modern browser, the latest to add it was Firefox in v11.)
...and of course the information above to record changes.
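The comparison step itself does not depend on the DOM. As a rough sketch of the same idea on plain data, assume each snapshot is an object mapping a data-uid value to the id and className recorded for that element; diffSnapshots and the snapshot shape are invented for this illustration.

```javascript
// Hypothetical helper: compare two snapshots keyed by data-uid.
// Each snapshot maps uid -> { id: string, className: string }.
function diffSnapshots(before, after) {
  var changes = [];
  Object.keys(before).forEach(function (uid) {
    var was = before[uid];
    var now = after[uid];
    if (!now) {
      // The element existed originally but is gone now
      changes.push({ uid: uid, type: "removed" });
      return;
    }
    if (now.id !== was.id) {
      changes.push({ uid: uid, type: "id", was: was.id, now: now.id });
    }
    if (now.className !== was.className) {
      changes.push({ uid: uid, type: "className", was: was.className, now: now.className });
    }
  });
  return changes;
}

// With jQuery, a snapshot could be built like this (not run here):
// var snap = {};
// $("[data-uid]").each(function () {
//   snap[$(this).attr("data-uid")] = { id: this.id, className: this.className };
// });
```

The resulting list of changes is something that could be sent to the server alongside the reconstructed original markup.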
Live proof of concept
HTML:
<p class="content">Some content</p>
<p class="content">Some further content</p>
<p>Final content</p>
<input type="button" id="makeChange" value="Make Change">
<input type="button" id="seeResults" value="See Results">
JavaScript:
// Probably unnecessary, but I wanted a scoping
// function anyway, so we'll give the parser time
// to completely finish up.
setTimeout(function() {
// Assign unique identifier to every element
var uniqueId = 0;
$("*").attr("data-uid", function() {
return ++uniqueId;
});
// Clone the whole thing, get a jQuery object for it
var clone = $("html").clone();
// Allow changes
$("#makeChange").click(function() {
this.disabled = true;
$("p:eq(1)").attr("id", "p1");
$("p:eq(2)").addClass("foo");
alert("Change made, set an id on one element and added a class to another");
});
// See results
$("#seeResults").click(function() {
this.disabled = true;
// Look for changes
clone.find("*").addBack().each(function() {
// Get this clone's unique identifier
var uid = $(this).attr("data-uid");
// Get the real element corresponding to it, if it's
// still there
var elm = $("[data-uid=" + uid + "]")[0];
// Look for changes
if (!elm) {
display("Element with uid " + uid + ": Was removed");
}
else {
if (elm.id !== this.id) {
display("Element with uid " + uid + ": <code>id</code> changed, now '" + elm.id + "', was '" + this.id + "'");
}
if (elm.className !== this.className) {
display("Element with uid " + uid + ": <code>className</code> changed, now '" + elm.className + "', was '" + this.className + "'");
}
}
});
});
function display(msg) {
$("<p>").html(String(msg)).appendTo(document.body);
}
}, 0);
Earlier Answer
Assuming the server gives you the same text for the page every time it's asked, you can get the unaltered text client-side via ajax. That leaves us with the question of how to apply the id attributes to it.
If you need the original contents but not necessarily identical source (e.g., it's okay if tag names change case [div might become DIV], or attributes gain/lose quotes around them), you could use the source from the server (retrieved via ajax) to populate a document fragment, and apply the id values to the fragment at the same time you apply them to the main document. Then send the source of the fragment to the server.
Populating a fragment with the full HTML from your server is not quite as easy as it should be. Assuming html doesn't have any classes or anything on it, then:
var frag, html, prefix, suffix;
frag = document.createDocumentFragment();
html = document.createElement("html");
frag.appendChild(html);
// [\s\S] is used instead of . so that a doctype on its own line is still matched
prefix = stringFromServer.match(/(^[\s\S]*?<html[^>]*>)/);
prefix = prefix ? prefix[1] : "<!doctype html><html>";
suffix = stringFromServer.match(/(<\/html>\s*$)/);
suffix = suffix ? suffix[1] : "</html>";
html.innerHTML = stringFromServer.replace(/^[\s\S]*?<html[^>]*>/, '').replace(/<\/html>\s*$/, '');
There, we take the server's string, grab the outermost HTML parts (or use defaults), and then assign the inner HTML to an html element inside a fragment (although the more I think about it, the less I see the need for a fragment at all — you can probably just drop the fragment part). (Side Note: The part of the regular expressions above that identifies the start tag for the html element, <html[^>]*>, is one of those "good enough" things. It isn't perfect, and in particular will fail if you have a > inside a quoted attribute value, like this: <html data-foo="I have a > in me">, which is perfectly valid. Working around that requires much harder parsing, so I've assumed above that you don't do it, as it's fairly unusual.)
Then you can find elements within it via html.querySelector and html.querySelectorAll in order to apply your id attributes to them. Forming the relevant selectors will be great fun, probably a lot of positional stuff.
When you're done, getting back the HTML to send to the server looks like this:
var stringToSend = prefix + html.innerHTML + suffix;
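The prefix / inner / suffix split can be tried on its own as a pure string function. The sketch below is one way to write it; splitHtmlDocument is a hypothetical name, and it carries the same "good enough" limitation described above (a > inside a quoted attribute on the html tag will confuse it).

```javascript
// Hypothetical helper: split a full HTML source string into the part up to
// and including the <html ...> start tag, the inner markup, and the closing part.
function splitHtmlDocument(source) {
  // [\s\S] matches any character including newlines (a doctype is often on its own line)
  var open = /^[\s\S]*?<html[^>]*>/i.exec(source);
  var close = /<\/html>\s*$/i.exec(source);
  var start = open ? open[0].length : 0;
  var end = close ? close.index : source.length;
  return {
    prefix: open ? open[0] : "<!doctype html><html>",
    inner: source.slice(start, end),
    suffix: close ? close[0] : "</html>"
  };
}

var sampleSource = '<!doctype html>\n<html lang="en"><head></head><body><p>Hi</p></body></html>\n';
var parts = splitHtmlDocument(sampleSource);
// When both tags are present, the three parts concatenate back to the input.
console.log(parts.prefix + parts.inner + parts.suffix === sampleSource);
```

The inner part is what would be assigned to html.innerHTML, and the prefix and suffix are kept aside for rebuilding the string to send.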

JQuery each() function code is running even when there are no elements in the collection

I have the following code in my application. It is supposed to build a comma separated string from a JQuery collection. The collection is retrieved from some xml. I use JQuery each() to iterate. This is standard code that I use all the time. I declare and define the result variable (patientConditions) first and set it to blank. Within the function I add the found string to the result variable along with a comma. I am not bothered by the trailing comma this leaves if there are results. The problem is that with no results the second line within my each() is running - they probably both are. After the loop has completed (with no matching elements in the xml) the value of the result is ','. It should be blank. I think this is something to do with closures, or hoisting, but I am unable to figure out how its happening. I have hacked a solution to this scenario, but am more worried about the hole in my js knowledge :(
var patientConditions = '';
$xml.find('patient>prescription>conditions').each(function() {
var conditionName = $(this).find('condition>name');
patientConditions += conditionName.text() + ',';
});
From what I can understand, there is a match for patient>prescription>conditions but not for condition>name. In that case $(this).find('condition>name') returns an empty set, and .text() on an empty set returns an empty string, so only the comma is appended. Check the length first:
$xml.find('patient>prescription>conditions').each(function() {
var conditionName = $(this).find('condition>name');
if(conditionName.length){
patientConditions += conditionName.text() + ',';
}
});
Whenever a jQuery object is used to find nonexistent nodes, as with $(this).find('condition>name') here, the jQuery object still exists; it just contains no nodes. You can still call every jQuery method on it, which is why conditionName.text() returns an empty string even though no node is present. The solution is to check that the node exists before doing anything with it.
var patientConditions = '';
$xml.find('patient>prescription>conditions').each(function() {
var conditionName = $(this).find('condition>name');
if (conditionName.length > 0) {
patientConditions += conditionName.text() + ',';
} else {
// Do something if the node doesn't exist
}
});
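Another option that avoids both the stray ',' and the trailing comma is to collect the names in an array and join them at the end. A rough sketch with plain arrays standing in for the XML: each inner array holds the names found under one conditions element, and joinConditionNames is a made-up helper name.

```javascript
// Plain-data stand-in for the XML: one inner array of names per
// <conditions> element; an empty inner array means condition>name matched nothing.
function joinConditionNames(conditionsList) {
  var names = [];
  conditionsList.forEach(function (conditionNames) {
    // Like jQuery's .text(): concatenates every match, "" when there are none
    var text = conditionNames.join("");
    if (text !== "") {
      names.push(text);
    }
  });
  // join() only puts commas between entries, so nothing trails
  return names.join(",");
}

console.log(joinConditionNames([["asthma"], [], ["diabetes"]])); // asthma,diabetes
```

In the jQuery version the same shape would be: push conditionName.text() into an array inside each() when conditionName.length is non-zero, then call join(',') after the loop.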

find and replace text , word translation

I have some problems with finding and replacing words in PHP files, especially when there are tons of them. So I thought I would try to use JavaScript / jQuery.
I'd like to create a table with word_to_replace#new_word entries to do so.
This is my code, which doesn't work (and takes a very long time to run); the filter doesn't seem to work.
Any advice?
(function($){
var arr = [ 'photo-board.pl przyjazny portal fotograficzny# ','Upload images from your#Upload zdjęć z ',
'Total number of images# Całkowita liczba zdjęć'];
for(var i in arr)
{
var st_to_replace = arr[i].split('#')[0];
// alert(st_to_replace);
$('*').filter(function() {
return $(this).text() == st_to_replace;
}).html(arr[i].split('#')[1]);
}
}) (jQuery)
You're getting the text() of every page element (which will include the text of child elements) and replacing within it. That means, when you get the 'body' element you replace all the text, and then you get all the elements within body, and replace all the text, etc.
Something like this may work better:
(function($){
var arr = [ 'photo-board.pl przyjazny portal fotograficzny# ','Upload images from your#Upload zdjęć z ',
'Total number of images# Całkowita liczba zdjęć'];
var bodyText = $('body').html();
$.each(arr, function(i, v) {
var words = v.split('#');
var fromTxt = words[0], toTxt = words[1];
bodyText = bodyText.replace(fromTxt, toTxt);
});
$('body').html(bodyText);
})(jQuery);
Demo here.
It's worth noting though that since this destroys and recreates the entire body content, you'll lose event handlers and data set using .data(...).
One of the performance issues of your script is the immediate call to .html() on every iteration of the filter. This causes the browser to repaint the element on every iteration.
You might consider editing the HTML detached from the DOM and then, after the for and filter loops, replacing the HTML in the DOM.
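Note that String.prototype.replace() with a plain string pattern only replaces the first occurrence. A minimal sketch of doing all the replacements on a detached string, replacing every occurrence, is shown below; translate is a hypothetical helper, and it assumes the same "from#to" entry format as the question.

```javascript
// Hypothetical helper: apply a list of "from#to" entries to a string,
// replacing every occurrence of each "from" text literally (no regex).
function translate(text, pairs) {
  return pairs.reduce(function (result, pair) {
    var cut = pair.indexOf("#");
    if (cut === -1) {
      return result; // skip malformed entries without a '#'
    }
    var from = pair.slice(0, cut);
    var to = pair.slice(cut + 1);
    if (from === "") {
      return result; // splitting on "" would break the text into characters
    }
    // split/join replaces all occurrences and treats `from` as literal text
    return result.split(from).join(to);
  }, text);
}

// Possible use on a page (assumes jQuery; touches the DOM only once):
// $('body').html(translate($('body').html(), arr));
```

Because the work happens on a string, the DOM is written a single time at the end, though the earlier caveat about losing event handlers still applies when the whole body is replaced.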

construct a DOM tree from a string without loading resources (specifically images)

So I am grabbing RSS feeds via AJAX. After processing them, I have an HTML string that I want to manipulate using various jQuery functionality. In order to do this, I need a tree of DOM nodes.
I can pass an HTML string into the jQuery() function.
I can add it as innerHTML to some hidden node and use that.
I have even tried using Mozilla's nonstandard range.createContextualFragment().
The problem with all of these solutions is that when my HTML snippet has an <img> tag, Firefox dutifully fetches whatever image is referenced. Since this processing is background stuff that isn't being displayed to the user, I'd like to just get a DOM tree without the browser loading all the images contained in it.
Is this possible with JavaScript? I don't mind if it's Mozilla-only, as I'm already using JavaScript 1.7 features (which seem to be Mozilla-only for now).
The answer is this:
var parser = new DOMParser();
var htmlDoc = parser.parseFromString(htmlString, "text/html");
var jdoc = $(htmlDoc);
console.log(jdoc.find('img'));
If you pay attention to your web requests you'll notice that none are made even though the html string is parsed and wrapped by jquery.
The obvious answer is to parse the string and remove the src attributes from img tags (and similar for other external resources you don't want to load). But you'll have already thought of that and I'm sure you're looking for something less troublesome. I'm also assuming you've already tried removing the src attribute after having jquery parse the string but before appending it to the document, and found that the images are still being requested.
I'm not coming up with anything else, but you may not need to do full parsing; this replacement should do it in Firefox with some caveats:
thestring = thestring.replace(/<img /g, "<img src='' ");
The caveats:
This appears to work in the current Firefox. That doesn't mean that subsequent versions won't choose to handle duplicated src attributes differently.
This assumes the literal string "<img " only ever appears at the start of an image tag. As a general-purpose assumption that doesn't hold: the string could appear in an attribute value on a sufficiently...interesting...page, especially in an inline onclick handler like this: <a href='#' onclick='$("frog").html("<img src=\"spinner.gif\">")'> (Although in that example, the false positive replacement is harmless.)
This is obviously a hack, but in a limited environment with reasonably well-known data...
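A variation on the same string-level hack is to rename src to data-src inside img tags, so the original URL is kept and can be restored later. This is only a sketch under the same limited-environment assumptions: deferImageSources is a made-up name, the regular expression is fooled by a > inside an attribute value, and srcset attributes are not handled.

```javascript
// Hypothetical helper: rename src to data-src inside <img> tags so that
// parsing the string does not trigger image requests for those src URLs.
function deferImageSources(html) {
  return html.replace(/<img\b[^>]*>/gi, function (tag) {
    // Only the first src attribute in the tag is renamed; requiring leading
    // whitespace keeps an existing data-src attribute from matching.
    return tag.replace(/\ssrc\s*=/i, " data-src=");
  });
}

var deferred = deferImageSources('<p><img alt="a" src="a.png"><img src=\'b.png\'></p>');
console.log(deferred);
// <p><img alt="a" data-src="a.png"><img data-src='b.png'></p>
```

The URLs can later be put back by copying each image's data-src attribute to src once the nodes are actually meant to be displayed.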
You can use the DOM parser to manipulate the nodes.
Just replace the src attributes, store their original values and add them back later on.
Sample:
(function () {
var s = "<img src='http://www.google.com/logos/olympics10-skijump-hp.png' /><img src='http://www.google.com/logos/olympics10-skijump-hp.png' />";
var parser = new DOMParser();
var dom = parser.parseFromString("<div id='mydiv' >" + s + "</div>", "text/xml");
var imgs = dom.getElementsByTagName("img");
var stored = [];
for (var i = 0; i < imgs.length; i++) {
var img = imgs[i];
stored.push(img.getAttribute("src"));
img.setAttribute("myindex", i);
img.setAttribute("src", null);
}
$(document.body).append(new XMLSerializer().serializeToString(dom));
alert("Images appended");
window.setTimeout(function () {
alert("loading images");
$("#mydiv img").each(function () {
this.src = stored[$(this).attr("myindex")];
})
alert("images loaded");
}, 2000);
})();
