Remove ads from a Webpage - javascript

I want to make a Chrome extension that removes some advertisements from webpages.
I think that to do that I first have to learn how to remove lines of text from a longer text.
For example, given this text:
This is text that contains some string bla This is text that contains
some string bla This is text DELETEME some string bla This is text
that contains some string bla This is text that contains some string
bla
how can I remove the whole line of text that contains the string DELETEME?
The text will come from document.body.innerHTML; I want to take out some garbage :}

You could split the text into lines:
var lines = text.split('\n');
Then filter out the ones that contained that string:
lines.filter(function(line) {
return line.indexOf('DELETEME') === -1;
})
Then join that back together:
text = lines.filter(function(line) {
return line.indexOf('DELETEME') === -1;
}).join('\n');
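The three steps above can be wrapped in a small helper. This is only a minimal sketch: the name removeLinesContaining and the sample text are made up for illustration and do not come from the question.

```javascript
// Hypothetical helper: drop every line of `text` that contains `marker`.
function removeLinesContaining(text, marker) {
  return text
    .split('\n')
    .filter(function (line) {
      return line.indexOf(marker) === -1; // keep lines without the marker
    })
    .join('\n');
}

var sample = 'first line\nsome DELETEME garbage\nlast line';
var cleaned = removeLinesContaining(sample, 'DELETEME');
console.log(cleaned);
```

Text that does not contain the marker at all comes back unchanged.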

For your project (creating a Chrome extension to remove ads) you do not actually
need to manipulate text; you need to manipulate HTML, or more specifically the DOM, the Document Object Model that represents the HTML in your browser.
You can do this quite conveniently with jQuery.
Here's how you would delete ALL li elements containing DELETEME:
$('li:contains(DELETEME)').remove();
here's a fiddle: http://jsfiddle.net/bjelline/RrCGw/
The best way to try this out is to open the developer tools in your browser and type the commands into the console; then you will see the effect immediately.
For example: google for "learn javascript" - you'll probably see an ad.
Look at the HTML source code to find out that the ad is inside a div with
id "tads". Now open the console and type in
$('#tads').remove();
And the ads will disappear.
You specifically asked about manipulating text. When manipulating text it's a good idea to learn about regular expressions - not just for JavaScript, you can use them in many programming languages.
If your whole text is stored in the variable string, you could do this:
string = string.replace(/.*DELETEME.*/, "XXXX");
to replace the line with XXXX. Just use an empty string as a replacement to empty it completely:
string = string.replace(/.*DELETEME.*/, "");
The ".*" stands for "any character except a line break, repeated as often as necessary", which matches
the text before and after DELETEME on the same line. Because "." never matches a line break, text on other lines is not changed. Without the g flag, only the first line that contains DELETEME is replaced.
See http://jsfiddle.net/bjelline/Wc7ve/ for a working example.
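If several lines may contain DELETEME, the same idea can be extended with the g and m flags. The following is a sketch under that assumption; the sample text and variable names are invented here. The pattern also consumes the line break that ends the matched line, so no empty line is left behind.

```javascript
var text = 'keep this\nad DELETEME one\nkeep that\nDELETEME two\nthe end';

// g: replace every match, not just the first.
// m: let ^ match at the start of each line.
// (?:\r?\n|$) also removes the line break that ends the matched line.
var cleaned = text.replace(/^.*DELETEME.*(?:\r?\n|$)/gm, '');
console.log(cleaned);
```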
But, as stated above: this is not the right tool for your project.

Related

How to make new line from a xml response string

I get my data from an API that returns XML, which I already convert to JSON because I use AngularJS. The field that I need stores song lyrics, and it uses the symbol ↵ wherever the text should go to a new line.
For example:
You shout it loud↵But I can’t hear a word you say↵I’m talking loud, not saying much↵↵I’m criticized but all your bullets ricochet↵You shoot me down, but I get up
The example above is what I get when I use console.log(), but when I show this field in my HTML page, it's just a string with no ↵ in it. I don't know why it does not show in the HTML, and if it is supposed to create a new line, that is not happening.
I was thinking of replacing ↵ with <br />. Is that possible? I would appreciate it if you could help me with that.
UPDATE:
I use AngularJS, fill the model with the lyric, and show it with {{lyric}} in my HTML.
When I use console.log($scope.lyric) the string is formatted well, but when I show the same model in HTML the line breaks are missing.
A simple regex string replace should take care of it:
var str = 'You shout it loud↵But I can’t hear a word you say↵I’m talking loud, not saying much↵↵I’m criticized but all your bullets ricochet↵You shoot me down, but I get up';
var formatted = str.replace(/↵/ig, "<br/>\n");
console.log(formatted);
document.write(formatted);
The regex finds every occurrence of the character between the / signs and replaces it with an HTML line break tag <br/> followed by a standard newline \n.
The i and g flags mean case-insensitive and global, respectively.
Case-insensitive matching has no effect on a symbol like ↵, so the i flag can be dropped. Global means that every match in the string is replaced, not just the first one.
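To see what the g flag changes, compare a replace with and without it. This is a small sketch with a made-up sample string; the variable names are only for illustration.

```javascript
var str = 'line one↵line two↵↵line four';

// Without g, only the first ↵ is replaced.
var firstOnly = str.replace(/↵/, '<br/>');

// With g, every ↵ is replaced. The i flag is not needed for a symbol.
var all = str.replace(/↵/g, '<br/>');

console.log(firstOnly);
console.log(all);
```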
I just figured it out; here is how it works, in case anyone else faces the same problem:
When I show the lyric like this:
<p>{{lyric}}</p>
it ignores my new lines. But when I use this:
<pre>{{lyric}}</pre>
it works, because <pre> preserves the line breaks in the string.

JavaScript RegExp - How to match a word based on conditions

I'm building a search results page (in Angular) and using regular expressions to highlight the searched 'keywords' based on a condition. I'm having problems getting the RegExp to match the correct condition, so apologies if my current syntax is messy; I've been playing about for hours.
Basically, for this test I'm highlighting the word 'midlands', and I want to highlight every 'midlands' except the ones inside the href="" attribute of an 'a' tag. Anything that's a part of the URL should not be highlighted, because I'll be wrapping the keywords in a span and that would break the URL structure. Can anyone help? I think I'm almost there.
Here's the current RegExp I'm using:
/(\b|^|)(\s|\()midlands(\b|$)(|\))/gi
Here's a link to test what I'm after.
https://regex101.com/r/wV4gC3/2
Further info: after the view has rendered, I grab the HTML content of the repeating results and then do a search on the rendered HTML with the condition above, if this helps anyone.
You're going about this all wrong. Don't parse HTML with regular expressions - use the DOM's built-in HTML parser and explicitly run the regex on text nodes.
First we get all the text nodes. With jQuery that's:
var texts = $(elem).contents().get().filter(function(el){
return el.nodeType === 3; // 3 is text
});
Otherwise - see the answer here for code for getting all text nodes in VanillaJS.
Then, iterate them and replace the relevant text only in the text nodes:
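for (var text of texts) { // in old browsers use angular.forEach(texts, function (text) { ... })
// Assigning the markup to textContent would show the tags literally,
// so replace the text node with parsed HTML instead. This assumes the
// text contains no "<" or "&" that would need escaping.
$(text).replaceWith(text.textContent.replace(/midlands/g, function(m){
return "<b>" + m + "</b>"; // wrap each match in <b>
}));
}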
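The same idea can be sketched without jQuery and without building an HTML string, so nothing in the text needs escaping. The function names splitOnKeyword and highlightTextNode are hypothetical, and the DOM part assumes a browser environment.

```javascript
// Split a text node's content into plain pieces and keyword matches.
function splitOnKeyword(text, keyword) {
  // Escape regex metacharacters so the keyword is matched literally.
  var escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  var re = new RegExp('(' + escaped + ')', 'gi');
  // split() with a capturing group keeps the matches in the result.
  return text.split(re).filter(function (piece) {
    return piece !== '';
  });
}

// Replace one text node with a mix of text nodes and <b> elements.
function highlightTextNode(node, keyword) {
  var fragment = document.createDocumentFragment();
  splitOnKeyword(node.textContent, keyword).forEach(function (piece) {
    if (piece.toLowerCase() === keyword.toLowerCase()) {
      var b = document.createElement('b');
      b.textContent = piece; // set as text, never parsed as HTML
      fragment.appendChild(b);
    } else {
      fragment.appendChild(document.createTextNode(piece));
    }
  });
  node.parentNode.replaceChild(fragment, node);
}

console.log(splitOnKeyword('West Midlands and midlands', 'midlands'));
```

Calling highlightTextNode on each collected text node leaves href attributes alone, because attribute values are not text nodes.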

How to manipulate particular character in DOM in JavaScript?

Suppose I have the text "Hello World" inside a DIV in an HTML file. I want to manipulate the character at the sixth position in "Hello World" and put the result back into the DOM, using innerHTML or something like that.
The way I do it is
var text = document.getElementById("divID").innerText;
Now I have the text, manipulate it using charAt for a particular position, and write the result back to the HTML by replacing the whole string, not just the character at that position. What I want to ask is: do we have to replace the whole string every time, or is there a way to extract the character at a particular position and replace only that position rather than the whole text inside the div?
If you just need to insert some text into an already existing string you should use replace(). You won't really gain anything by trying to replace only one character as it will need to make a new string anyway (as strings are immutable).
jsFiddle
var text = document.getElementById("divID").innerText;
// find and replace
document.getElementById("divID").innerText = text.replace('Hello World', 'Hello big World');
var newtext = text.replace(text[6], 'b'); appears to work here, but only by coincidence.
text[6] is just a one-character string, and replace() with a string pattern replaces the first occurrence of that character in the text, which is not necessarily the one at position 6. That is also why it does not replace all instances of the character.
In "Hello World" the character at index 6 is "W", which occurs only once, so the result happens to be correct.
Yes, you have to replace the entire string by another, since strings are immutable in JavaScript. You can in various ways hide this behind a function call, but in the end what happens is construction of a new string that replaces the old one.
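One way to hide this behind a function call is to build the new string from the parts before and after the position. This is a minimal sketch; replaceCharAt is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: return a copy of `str` with the character at `index` replaced.
function replaceCharAt(str, index, replacement) {
  if (index < 0 || index >= str.length) {
    return str; // nothing to replace
  }
  return str.slice(0, index) + replacement + str.slice(index + 1);
}

var text = 'Hello World';
console.log(replaceCharAt(text, 6, 'b')); // position 6 is "W"
console.log(text.replace(text[7], 'b'));  // replaces the first "o", which is at index 4
```

The second call shows why replace() with a single character is not a reliable way to target a position.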
Text within divs is actually stored in text nodes, and hence we have to explicitly manipulate their content by replacing the older content with the newer one.
If you are using jQuery then you can refer to the link below for a possible technique:
Replacing Text Nodes With jQuery: http://www.bennadel.com/blog/2253-Replacing-Text-Nodes-With-jQuery.htm
Behind the scenes, I would guess that jQuery still replaces the entire string for that text node.

Unable to parse the JSON correctly

In a response of type application/x-javascript, I am picking the required JSON portion into a variable. Below is the JSON:
{
"__ra":1,
"payload":null,
"data":[
[
"replace",
"",
true,
{
"__html": "\u003Cspan class=\"highlight fsm\" id=\"u_c_0\">I want this text only\u003C\/span>"
}
]
]
}
From the references I got from Stack Overflow, I am able to pick the content inside data in the following way:
var temp = JSON.parse(resp).data;
But my aim is to get only the text part of the __html value, which is "I want this text only". Can somebody help?
First you have to access the object you targeted:
var html = JSON.parse(resp).data[0][3].__html;
But the output you want is "I want this text only".
The html variable doesn't contain just that text, but some HTML in which the content you're looking for is the text inside a span.
If you accept including jQuery in your project, you can access that content this way:
var text = $(html).text();
To put it all together:
var html = JSON.parse(resp).data[0][3].__html;
var div = document.createElement("div");
div.innerHTML = html;
var text = div.textContent || div.innerText || "";
Kudos to @Tim Down for this answer on cross-browser innerHTML: JavaScript: How to strip HTML tags from string?
First you'll need to be a bit more specific with that data to get to the string of text you want:
var temp = JSON.parse(resp).data[0][3]['__html'];
Next you'll need to search that string to extract the data you want. That will largely depend on the regularity of the response you are getting. In any case, you will probably need to use a regular expression to parse the string you get in the response.
In this case, you are trying to get the text within the <span> element in the string. If that was the case for all your responses, you could do something like:
var text = /<span[^>]*>([^<]*)<\/span>/.exec(temp)[1];
This very specifically looks for text within the opening and closing of one span tag that contains no other HTML tags.
The main part to look at in the expression here is the ([^<]*), which will capture any character that is not an opening angled bracket, <. Everything around this is looking for instances of <span> with optional attributes. The exec is the method you perform on the temp string to return a match and the [1] will give you the first and only capture (e.g. the text between the <span> tags).
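Put together, the steps could look like the sketch below. The resp value is a stand-in built with JSON.stringify so the example is self-contained; in the real code it would be the JSON portion taken from the response. The null check is an addition, since exec returns null when nothing matches.

```javascript
// Stand-in for the response body, built here so the example is self-contained.
var resp = JSON.stringify({
  __ra: 1,
  payload: null,
  data: [['replace', '', true, {
    __html: '<span class="highlight fsm" id="u_c_0">I want this text only</span>'
  }]]
});

var temp = JSON.parse(resp).data[0][3]['__html'];

// exec returns null when nothing matches, so check before reading [1].
var match = /<span[^>]*>([^<]*)<\/span>/.exec(temp);
var text = match ? match[1] : '';
console.log(text);
```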
You would need to read up more about RegExp to find out how to do something more specific (or provide more specific information in your question about the pattern of response you are looking for). But it's generally well worth reading up on regular expressions if you're going to be doing this kind of work (parsing text, looking for patterns and matches), because they are a very concise and powerful way of doing it, if a little confusing at first.

Recommendations on Trimming Large Amounts of Text from a DOM Object

I'm doing some in browser editing, and I have some content that's on the order of around 20k characters long in a <textarea>.
So it looks something like:
<textarea>
Text 1
Text 2
Text 3
Text 4
[...]
Text 20,000
</textarea>
I'd like to use jquery to trim it down when someone hits a button to chop, but I'm having trouble doing it without overloading the browser. Assume I know that the character numbers are at 16,510 - 17,888, and what I'd like to do is trim it.
I was using:
jQuery('#textsection').html(jQuery('textarea').html().substr(range.start));
But browsers seem to enjoy crashing when I do this. Alternatives?
EDIT
Solution from the comments:
var removeTextNode = document.getElementById('textarea').firstChild;
removeTextNode.splitText(indexOfCharacterToRemoveEverythingBefore);
removeTextNode.parentNode.removeChild(removeTextNode);
Not sure about jQuery, but with plain vanilla JavaScript, this can be done by using the splitText() method of the text node object. Your <pre> has a text node child which contains all the text inside of it. (You can get it from the childNodes collection.) Split it at the desired index, then use removeChild() to delete the part you don't need.
What browser are you testing on?
substr takes the starting index and an optional length. If the length is omitted, it extracts up to the end of the string. substring takes the starting and ending indexes of the part to extract, which I think might be the better option since you already have those available.
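The difference is easiest to see on a short string. This is a small sketch; the sample string and the cutRange helper are made up for illustration.

```javascript
var text = '0123456789';

// substr(start, length): 3 characters starting at index 2.
var a = text.substr(2, 3);
// substring(start, end): characters from index 2 up to, but not including, 5.
var b = text.substring(2, 5);

// Hypothetical helper: remove the characters from `start` up to (not including) `end`.
function cutRange(str, start, end) {
  return str.substring(0, start) + str.substring(end);
}

console.log(a, b, cutRange(text, 2, 5));
```

Both calls return the same three characters here, and cutRange returns the rest of the string with that range removed.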
I've created a small example at fiddle using the book Alice's Adventures in Wonderland, by Lewis Carroll. The book is about 160,000 characters in length, and you can try with various starting/ending indexes and see if it crashes the browser. Seems to work fine on my Chrome, Firefox, and Safari. Unfortunately I don't have access to IE. Here's the function that's used:
function chop(start, end) {
var trimmedText = $('#preId').html().substring(start, end);
$('textarea').val(trimmedText);
}
