Paginate long text using php - javascript

I have a long text (more than 10,000 words) containing HTML tags, stored in a string.
I want to wrap every 1000 words in <div class="chunk"></div>, automatically closing any HTML tags that are still open at the end of a chunk and reopening them at the start of the next chunk.
I found many solutions, but they split by number of characters and don't close or reopen HTML tags.
The PHP function wordwrap doesn't fix the HTML tag problem either.
Simulation
<div id="long-text">
Dynamic long text more than 10,000 words (Text contains HTML (img, p, span, i, ...etc) tags)
</div>
Wrong result
<div id="long-text">
<div class="chunk">
<p>Chunk 1 : first approximately 1000 words with their html tags
<img src="image.jpg"> ## Unclosed <p> tag ##
</div>
<div class="chunk">
## The closed <p> tag of the previous chunk ##
</p><p>Chunk 2 : second approximately 1000 words with their html tags
<img src="image.jpg"> </p><p> ## unclosed <p> tag ##
</div>
<div class="chunk">
## Missing open <p> tag because it was cut in the previous chunk ##
Chunk 3 : third approximately 1000 words with their html tags</p>
</div>
</div>
Expected result
<div id="long-text">
<div class="chunk">
<p>Chunk 1 : first approximately 1000 words with their html tags
<img src="image.jpg"> </p>
</div>
<div class="chunk">
<p>Chunk 2 : second approximately 1000 words with their html tags
<img src="image.jpg"> </p>
</div>
<div class="chunk">
<p>Chunk 3 : third approximately 1000 words with their html tags</p>
</div>
</div>
Then I can paginate the result with JavaScript.
After searching, I found the accepted answer here: Shortening text tweet-like without cutting links inside
It cuts the text (from the start only) and automatically closes any open HTML tags.
I tried to modify that code to reopen the closed tags when cutting from the middle of the text, but I couldn't get it to work.
I'm open to other, better solutions that paginate the long text by number of words, using PHP, JavaScript, or both.

So the idea is to use jQuery to chunk the immediate children by cloning them and splitting their inner text. It needs more work for nested HTML (text() flattens any tags inside each child), but it's a start:
function chunkText(length) {
  // Split this element's text into words
  var words = $(this).text().split(" ");
  var res = [$(this)];
  if (words.length > length) {
    // Clone the element so the overflow keeps the same tag and attributes
    var overflow = $(this).clone();
    var keepText = words.slice(0, length);
    $(this).text(keepText.join(" "));
    overflow.text(words.slice(length).join(" "));
    // Recurse until the overflow fits in one chunk
    res = res.concat(chunkText.call(overflow, length));
  }
  return res;
}
var br = 10; // Words per chunk
$("#long-text > *").each(function () {
  var chunks = chunkText.call(this, br);
  $.each(chunks, function (i, v) {
    $("#long-text")
      .append($("<div>").addClass("chunk").append(v))
      .append($("<img>").attr("src", "image.jpg"));
  });
});
Basic demo:
https://jsfiddle.net/o2d8zf4v/
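For nested HTML, another option is to work on the HTML string itself and keep a stack of open tags. The following is only a minimal sketch of that idea, not a full HTML parser: `chunkHtml` and `VOID_TAGS` are hypothetical names, and it assumes reasonably well-formed markup with no `<` or `>` inside text or attribute values. When a chunk is full, it closes the tags that are still open and reopens them (with their original attributes) at the start of the next chunk.

```javascript
// Elements that never take a closing tag (HTML "void" elements).
const VOID_TAGS = new Set([
  "area", "base", "br", "col", "embed", "hr", "img",
  "input", "link", "meta", "source", "track", "wbr",
]);

// Hypothetical helper: split `html` into chunks of at most `wordsPerChunk`
// words, closing open tags at each cut and reopening them in the next chunk.
function chunkHtml(html, wordsPerChunk) {
  const tokens = html.match(/<[^>]+>|[^<]+/g) || []; // tags and text runs
  const chunks = [];
  const open = []; // stack of currently open tags: { name, tag }
  let current = "";
  let count = 0;

  // Close the chunk, then start the next one with the open tags reopened.
  const flush = () => {
    const closers = open.map((t) => `</${t.name}>`).reverse().join("");
    chunks.push(current + closers);
    current = open.map((t) => t.tag).join("");
    count = 0;
  };

  for (const token of tokens) {
    const m = /^<(\/?)([a-zA-Z][\w-]*)/.exec(token);
    if (m && m[1]) {
      // Closing tag: pop back to the matching opener; drop stray closers.
      const idx = open.map((t) => t.name).lastIndexOf(m[2].toLowerCase());
      if (idx !== -1) {
        open.length = idx;
        current += token;
      }
    } else if (m) {
      // Opening or void tag: it starts the next chunk if this one is full.
      if (count >= wordsPerChunk) flush();
      if (!VOID_TAGS.has(m[2].toLowerCase())) {
        open.push({ name: m[2].toLowerCase(), tag: token });
      }
      current += token;
    } else if (token[0] === "<") {
      current += token; // comments, doctype: copied through untouched
    } else {
      // Text run: count words, keep the original whitespace.
      for (const piece of token.split(/(\s+)/)) {
        if (/\S/.test(piece)) {
          if (count >= wordsPerChunk) flush();
          count++;
        }
        current += piece;
      }
    }
  }
  if (current !== "") flush();
  return chunks;
}

// Usage: wrap each chunk in <div class="chunk"> as in the question.
const demoChunks = chunkHtml("<p>one two <b>three four</b> five</p>", 3);
console.log(demoChunks.map((c) => `<div class="chunk">${c}</div>`).join("\n"));
```

Closing tags that follow the last word of a full chunk stay in that chunk, so chunks do not start or end with empty elements. Reopened tags repeat their attributes, so an element with an id would be duplicated across chunks; that would need separate handling.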

Related

Convert HTMLString to HTML using jQuery or Javascript

I have a string which is basically an HTML String like this -
var str = "<section id='section1' class='class1'> <h3>Type 3 Heading</h3><p>Some text goes in <span> here</span> </p></section>"
I'm trying to convert this HTML string to real HTML elements using the jQuery call below, placing it in the div with id sampleSection like this -
$('#sampleSection').html(str);
But it still renders like a plain string: the HTML tags are parsed, but the content is placed on a single line like this -
Type 3 Heading Some text goes in here
I want it like this -
Type 3 Heading
Some text goes in here
Note - I am trying to load this on an iPad InAppWebView with local HTML and script files. Is there something I'm missing, or do I need to do it differently? Thanks for your help.
var str = "<section id='section1' class='class1'> <h3>Type 3 Heading</h3><p>Some text goes in <span> here</span> </p></section>"
$("#sample").html(str);
<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js"></script>
<div id="sample"></div>
The problem is in the closing tag of h3..
You have closed it as ...
var str = "<section id='section1' class='class1'> <h3>Type 3 Heading</h3><p>Some text goes in <span> here</span> </p></section>"

how to remove specific html tags with one line using javascript regex code

I want to keep only these tags: <strong></strong>, <em></em>, <p></p>, <strike></strike>, etc. Right now I am using a JavaScript regex like this:
var s = "<div><p>p tag</p> <strike>Strike</strike> <strong>strong</strong> in <u>underline</u> <em>italic</em> <span>this is span tag</span> <img src=''><br> final words</div>";
console.log(s.replace(/\<(?!strong|br|em|p|u|strike).*?\>/g, ""));
It half works: it keeps the opening tags I listed, but it removes all closing tags. Here is the output I am getting:
Output :
<p>p tag <strike>Strike <strong>strong in <u>underline <em>italic this is span tag <br> final words
But I need output like this:
Required Output:
<p>p tag</p> <strike>Strike</strike> <strong>strong</strong> in <u>underline</u> <em>italic</em> this is span tag <br> final words
Is there a JavaScript expert who could help me with this? I'd really appreciate your help.
Thanks
Match closing tags with an optional / right after the < and use positive lookahead for a word character (to ensure / doesn't get matched):
var s = "<div><p>p tag</p> <strike>Strike</strike> <strong>strong</strong> in <u>underline</u> <em>italic</em> <span>this is span tag</span> <img src=''><br> final words</div>";
console.log(s.replace(/<\/?(?=\w)(?!strong|br|em|p|u|strike).*?>/g, ""));
//                      ^^^^^^^^^
But regular expressions generally shouldn't be used in an attempt to parse anything but the most trivial HTML.
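One limitation of the lookahead approach is that it matches prefixes, so an allowed `p` also lets `<pre>` through. A possible variation, sketched here under the same "trivial HTML only" caveat, captures the tag name and checks it against an allowlist in a replacer callback. `stripTagsExcept` is a hypothetical helper name, not part of any library.

```javascript
// Hypothetical helper: remove every tag whose name is not in `allowed`.
// Tag contents (the text between tags) are always kept.
function stripTagsExcept(html, allowed) {
  const allowedSet = new Set(allowed.map((name) => name.toLowerCase()));
  // Capture the tag name of opening and closing tags alike.
  return html.replace(/<\/?([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>/g, (tag, name) =>
    allowedSet.has(name.toLowerCase()) ? tag : ""
  );
}

const sample = "<div><p>p tag</p> <strike>Strike</strike> <strong>strong</strong> in <u>underline</u> <em>italic</em> <span>this is span tag</span> <img src=''><br> final words</div>";
console.log(stripTagsExcept(sample, ["strong", "br", "em", "p", "u", "strike"]));
```

Because the whole name is compared, `<pre>` or `<param>` would be removed even though `p` is allowed.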

Print multiple pages for different scroll positions via JavaScript

I have an HTML page that is an output of some program.
The program is used a lot to produce evidence, and it would be helpful to export this evidence to a PDF.
The HTML is split into two columns.
I need to print (to PDF) multiple pages such that each page shows the two columns where different parts of the columns are aligned.
I rely on the browser ability to print to PDF instead of to the printer.
I tried to use jsPDF but it does not support <pre> tags properly which are prevalent in my text.
The document is a given HTML with two <div>s side by side with a scrollbar.
I already have the code that successfully aligns the two columns in the required positions, and a button that prints it using window.print(). It looks something like this:
function align(loc) {
$('#0').scrollTop($("#mark0"+loc).offset().top);
$('#1').scrollTop($("#mark1"+loc).offset().top);
}
function print() {
window.print();
}
// All locations are known in advance
function printAll() {
for (i=1; i<=3; i++) {
align(i);
print();
}
}
<div id=0 style='left:0; overflow:scroll; height:100%; width:50%'>
A lot of text...
<span id='mark01' />
A lot of text...
<span id='mark02' />
A lot of text...
<span id='mark03' />
A lot of text...
</div>
<div id=1 style='right:0; overflow:scroll; height:100%; width:50%'>
<pre>
Some text.
<b>
A lot of text...
<span id='mark11' />
Some text.
</b>
Some text.
<b>
A lot of text...
</b>
<span id='mark12' />
A lot of text...
<span id='mark13' />
<b>
A lot of text...
</b>
</pre>
</div>
Currently, I have to align the columns using the script (the user chooses one of the alignments), then print to PDF (the user clicks the print button), then align to a different position, then print again, and so forth.
Is there a way to automate this process?
I want to write a JavaScript function that calls align(), then print(), multiple times, with everything printed into a single PDF.
The printAll() example obviously does not work.
Using JS, I'd construct a new section laid out for print, then move the content into it.
The goal would be to produce reasonable DOM, like this:
<div id="printzone">
<div class="page">
<div class="col"> col1 section1 </div>
<div class="col"> col2 section1 </div>
</div>
<div class="page">
<div class="col"> col1 section2 </div>
<div class="col"> col2 section2 </div>
</div>
....
</div>
Constructing the new section is straightforward, but the original DOM is not well suited to moving the content into it. You'll probably have to work with the raw source to accomplish that second step.
Perhaps something like:
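// grab source (.html() is jQuery's accessor for innerHTML)
var col1Source = $('#0').html();
var col2Source = $('#1').html();
// treat each marker span as a delimiter; innerHTML serializes attributes
// with double quotes, so accept either quote style
var marker = /<span id=['"]mark\d+['"][^>]*>/;
var col1Sections = col1Source.split(marker);
var col2Sections = col2Source.split(marker);
for (var i = 0,
         iMax = Math.max(col1Sections.length, col2Sections.length);
     i < iMax;
     i++
) {
  var newPage = $('<div class="page">');
  // a column with fewer sections gets an empty cell
  newPage.append('<div class="col">' + (col1Sections[i] || '') + '</div>');
  newPage.append('<div class="col">' + (col2Sections[i] || '') + '</div>');
  $('#printzone').append(newPage);
}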
Once you've finished constructing the print zone, open up your dev tools and (1) delete the original #0 and #1 DOM nodes, then (2) play with CSS until the print layout is satisfactory.
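The string handling in that approach can be tried without a browser. The sketch below is an assumption-laden illustration: `MARKER`, `splitAtMarkers`, and `buildPrintPages` are hypothetical names, and the marker pattern assumes ids of the form mark followed by digits, as in the question.

```javascript
// Matches a marker span in either form: <span id='mark01' /> as written in
// the source, or <span id="mark01"></span> as a browser may serialize it.
const MARKER = /<span\s+id=['"]?mark\d+['"]?\s*\/?>(?:\s*<\/span>)?/gi;

// Split one column's HTML source into the sections between markers.
function splitAtMarkers(html) {
  return html.split(MARKER);
}

// Pair up section i of each column into one two-column "page".
function buildPrintPages(col1Html, col2Html) {
  const left = splitAtMarkers(col1Html);
  const right = splitAtMarkers(col2Html);
  const pages = [];
  for (let i = 0; i < Math.max(left.length, right.length); i++) {
    pages.push(
      '<div class="page">' +
        '<div class="col">' + (left[i] || "") + "</div>" +
        '<div class="col">' + (right[i] || "") + "</div>" +
      "</div>"
    );
  }
  return pages.join("\n");
}

console.log(buildPrintPages("A<span id='mark01' />B", "X"));
```

In the browser, the returned string could be assigned to the print zone with jQuery's html(). Two caveats: a marker that sits inside another element (such as the one inside <b> in the second column) leaves unbalanced tags on both sides of the cut, and the second column's sections would each need their own <pre> wrapper. For the CSS step, a rule such as break-after: page on .page is one standard way to start each page on a new sheet.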

How to get count of word in element using javascript?

Hi, I would like to do a word count in my RTE (Rich Text Editor) with JavaScript; jQuery is fine too. It should not count HTML tags or repeated whitespace.
Sample Text:
<h1>Hello World</h1> <p> This is Good!!!</p> answer <h2>thanks! </h2>
The JavaScript should display 7.
Is there JavaScript code for this that is also fast at calculating the word count?
Thanks!
EDIT
What if the sample text is this: <p>11 22 33</p><p>44</p>5<br></div>
The JavaScript should display 5.
First, get the text content of the element with text(). Then remove the extra whitespace: trim() strips leading and trailing whitespace, and replace(/[\s]+/g, " ") collapses each run of whitespace into a single space. Finally, convert the text to words with split().
var length = $(".text").text().trim().replace(/[\s]+/g, " ").split(" ").length;
console.log(length);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div class="text">
<h1>Hello World</h1>
<p> This is Good!!!</p>
answer
<h2>thanks! </h2>
</div>
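Two edge cases are worth noting with the one-liner: an empty element yields 1 (splitting an empty string gives one empty item), and text() joins adjacent blocks without a space, so <p>11 22 33</p><p>44</p>5 from the edit becomes "11 22 33445". A minimal sketch that works on the HTML string instead is shown below; `countWords` is a hypothetical name, and the tag-stripping regex assumes simple markup with no `>` inside attribute values.

```javascript
// Hypothetical helper: count words in an HTML string.
function countWords(html) {
  // Replace each tag with a space so words in adjacent elements stay separate.
  const text = html.replace(/<[^>]*>/g, " ");
  // Each run of non-whitespace characters counts as one word.
  const words = text.match(/\S+/g);
  return words ? words.length : 0;
}

console.log(countWords("<h1>Hello World</h1> <p> This is Good!!!</p> answer <h2>thanks! </h2>")); // 7
console.log(countWords("<p>11 22 33</p><p>44</p>5<br></div>")); // 5
```

The trade-off is that a word split by an inline tag, such as wo<b>rd</b>, is counted as two words, and HTML entities are not decoded.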

javascript: split html by class

I'm creating a simple HTML editor with jQuery.
Say I have this HTML:
<div id="content">
page 1
<div class="pageBreak"></div>
page 2
<div class="pageBreak"></div>
page 3
</div>
I want to split my content by pageBreaks to get this output:
page1 buffer: page 1
page2 buffer: <div class="pageBreak"></div>page 2
page3 buffer: <div class="pageBreak"></div>page 3
ideas?
Since your pages are not enclosed by HTML tags (which would be the DOM-friendly way of solving this problem), you will have to do your own processing of the HTML in the content div to break apart the content into your pages. Here's one way to do that using a regular expression to split that content at each of your pagebreaks.
var t = $("#content").html();
var pages = t.split(/<div\s+class\s*=\s*['"]?pageBreak["']?\s*>\s*<\/div>/i);
At this point, the pages array would have the content between each page break.
And, a working example: http://jsfiddle.net/jfriend00/wpyH2/
P.S. Note that the regular expression tries to be very general here because browsers do not always give you back the same innerHTML that you put in the page.
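The split above discards the page break markup itself, while the question asks for each later buffer to start with its pageBreak div. One way to get that, sketched here with the same pattern wrapped in a lookahead, is to split at the position just before each page break rather than on the page break. `PAGE_BREAK` and `splitPages` are hypothetical names.

```javascript
// Zero-width lookahead: split *before* each page break so the
// <div class="pageBreak"></div> stays at the start of the following page.
const PAGE_BREAK = /(?=<div\s+class\s*=\s*['"]?pageBreak['"]?\s*>\s*<\/div>)/i;

function splitPages(html) {
  return html.split(PAGE_BREAK);
}

const content =
  'page 1<div class="pageBreak"></div>page 2<div class="pageBreak"></div>page 3';
console.log(splitPages(content));
```

With real innerHTML the buffers would also carry the surrounding whitespace and newlines, which can be trimmed per page if needed.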
