Determine if splitting two TextFrames will split a paragraph in InDesign

TextFrame#nextTextFrame tells me if a TextFrame overflows; I can also split the two TextFrames the way StorySplitter does.
What I can't seem to figure out is: does the second TextFrame start with a new paragraph, or does a single paragraph extend between the two?
I need this in order to reconstruct the flow afterwards, externally: I need to know if I have to merge the last paragraph of the first TextFrame and the first paragraph of the second TextFrame, or if instead they are two distinct paragraphs.

They would be considered the same Paragraph object. You can test this yourself with something similar to the following code (assuming there is a paragraph that spans two text frames and the text frames are linked together).
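var doc = app.activeDocument;
var frame1 = doc.pages[0].textFrames[0];
var frame2 = frame1.nextTextFrame;
var para1 = frame1.paragraphs.lastItem();
var para2 = frame2.paragraphs.firstItem();
// Compare with ==: in ExtendScript, === on two DOM references may be false
// even when both refer to the same underlying object.
alert(para1 == para2);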
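
A complementary check is to look at how the text in the first frame ends. InDesign represents a paragraph return as a carriage return (\r) in a contents string, so if the first frame's text ends with \r, the second frame should start a new paragraph. Below is a minimal sketch in plain JavaScript. The helper name frameEndsParagraph is hypothetical, and it assumes you pass it the first frame's text (for example frame1.contents). It does not account for frame-break or column-break characters.

```javascript
// Hypothetical helper, not part of the InDesign DOM.
// Returns true when the text placed in a frame ends with a paragraph return,
// which would mean the next linked frame starts a new paragraph.
// Assumption: paragraph breaks appear as "\r" in the contents string.
function frameEndsParagraph(frameContents) {
  // An empty or non-string value is treated as "no paragraph break".
  if (typeof frameContents !== "string" || frameContents.length === 0) {
    return false;
  }
  return frameContents.charAt(frameContents.length - 1) === "\r";
}
```

If it returns false for the first frame, the last paragraph of that frame and the first paragraph of the next frame would need to be merged when reconstructing the flow externally.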

Related

Javascript - create an array of words and tags from a paragraph

I have this:
<p><strong>test the strong tag</strong> some dummy text test the link some other text Test another link</p>
I need a JS array:
['<strong>Test the strong tag</strong>', 'some', 'dummy', 'text', 'test the link', 'some', 'other', 'text', 'Test another link']
Help, I'm giving up!
P.S. It's part of a bigger idea to distribute the content of a section into horizontal slides (which can be animated left and right). For that I need every word wrapped in a span so I can measure its position and determine whether it belongs to the next "line". Sorry, it's hard to explain the whole thing.

This is a really neat problem and it was really fun to work on!
The solution, assuming you gave your p element the id test, like this:
<p id="test"><strong>test the strong tag</strong> some dummy text test the link some other text Test another link</p>
The javascript would be:
var nodes = document.getElementById("test").childNodes;
var arr = [];
for (var i = 0; i < nodes.length; i++) {
    if (nodes[i].nodeName == "#text") {
        arr.push(nodes[i].nodeValue);
        continue;
    }
    arr.push(nodes[i].outerHTML);
}
This works by walking the element's child nodes, which include both text nodes and element nodes. The first line gets the child nodes of the element with the id test. The second line initializes the array that is filled later. The third line starts iterating over the child nodes. The fourth through seventh lines push the content of text nodes, and the eighth line pushes the outerHTML of element nodes. If you want the text split into words, as in the question, this version does it:
var nodes = document.getElementById("test").childNodes;
var arr = [];
for (var i = 0; i < nodes.length; i++) {
    if (nodes[i].nodeName == "#text") {
        // Split text nodes into individual words
        arr = arr.concat(nodes[i].nodeValue.split(" "));
        continue;
    }
    arr.push(nodes[i].outerHTML);
}
// filter() returns a new array, so assign the result back
arr = arr.filter(v => v != '');

So you want every tag to be its own item, and every word that is not inside a tag to be its own item.
I don't think there is a ready-made utility for that.
What I would do is write a custom parser. Loop over every character and perform these two primary checks:
When you come across a < character, switch the "tag" flag on. Then read the tag name up to the > character and keep it in a variable. Keep adding characters until you find the corresponding closing tag, then switch the "tag" flag off.
When the "tag" flag is off, you are in "word mode", so keep adding characters until you come across a space character.
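
The two checks above can be sketched roughly as follows. This is only an illustration: the function name tokenize is hypothetical, and it assumes well-formed input in which every tag has a matching closing tag and tags of the same name are not nested.

```javascript
// Sketch of the character-by-character parser described above.
// Assumptions: every opening tag has a closing tag, and tags with the
// same name are not nested inside each other.
function tokenize(html) {
  var tokens = [];
  var i = 0;
  while (i < html.length) {
    var ch = html.charAt(i);
    if (ch === "<") {
      // Tag mode: read the tag name up to the first whitespace or ">".
      var nameEnd = i + 1;
      while (nameEnd < html.length && !/[\s>]/.test(html.charAt(nameEnd))) {
        nameEnd++;
      }
      var tagName = html.slice(i + 1, nameEnd);
      var closing = "</" + tagName + ">";
      var closeAt = html.indexOf(closing, nameEnd);
      // If no closing tag is found, take the rest of the input as one token.
      var end = closeAt === -1 ? html.length : closeAt + closing.length;
      tokens.push(html.slice(i, end));
      i = end;
    } else if (/\s/.test(ch)) {
      // Whitespace only separates tokens.
      i++;
    } else {
      // Word mode: accumulate until whitespace or the start of a tag.
      var start = i;
      while (i < html.length && !/[\s<]/.test(html.charAt(i))) {
        i++;
      }
      tokens.push(html.slice(start, i));
    }
  }
  return tokens;
}
```

Each returned item is either a complete tag with its content or a single word, so wrapping the items in spans afterwards is a simple loop over the array.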

Get sub part of HTML code in jquery (match entire paragraphs instead of line by line)

I make an ajax call from jquery to obtain entire html of a page. I now want to extract a sub section of this html code between String 1 and String 2. I tried using regex but it matches line by line hence returns null for :
data.match(new RegExp("String 1(.*)String 2"));
What can I use to match the entire paragraph, since the two strings are on different lines and I want the part between those two lines?

An example of my comment:
var data = '<html><head></head><body><p>1 and it could span multiple lines.</p><p>2 and it could span multiple lines.</p></body>';
// create a container element in memory
var container = document.createElement("div");
// parse the HTML string into the container
container.insertAdjacentHTML("beforeend", data);
// grab the paragraphs using querySelectorAll
var pars = container.querySelectorAll("p");
// loop over the NodeList with Array's forEach
Array.prototype.forEach.call(pars, function(element) {
    console.log(element.textContent);
});
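
If a regular expression is still preferred over parsing the HTML, note that the original pattern fails because . does not match line terminators in JavaScript. A character class such as [\s\S] matches any character, including newlines. A minimal sketch follows; between is a hypothetical helper, and it assumes the two markers contain no regex metacharacters.

```javascript
// Hypothetical helper: returns the text between two markers, even when
// the markers are on different lines, or null when there is no match.
// Assumption: the markers contain no regex metacharacters.
function between(data, startMarker, endMarker) {
  // [\s\S] matches any character including newlines; *? keeps the match lazy
  // so it stops at the first occurrence of endMarker.
  var pattern = new RegExp(startMarker + "([\\s\\S]*?)" + endMarker);
  var match = data.match(pattern);
  return match ? match[1] : null;
}
```

With the names from the question this would be called as between(data, "String 1", "String 2").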

How can I get the next paragraph after my selection in InDesign?

I'm using Adobe InDesign and ExtendScript to find a keyword using app.activeDocument.findGrep(), and I've got this part working well. I know findGrep() returns an array of Text objects. Let's say I want to work with the first result:
var result = app.activeDocument.findGrep()[0];
How can I get the next paragraph following result?

Use the InDesign DOM
var nextParagraph = result.paragraphs[-1].insertionPoints[-1].paragraphs[-1];
The InDesign DOM has different Text objects that you can use to address paragraphs, words, characters, or insertion points (the space between characters where your blinking cursor sits). A group of Text objects is called a collection. Collections in InDesign are similar to arrays, but one significant difference is that they can be addressed from the back by using a negative index (paragraphs[-1]).
result refers to the findGrep() result. It can be any Text object, depending on your search terms.
paragraphs[-1] means the last paragraph of your result (Paragraph A). If the search result is just one word, then this refers to the word's enclosing paragraph, and this collection of paragraphs has just one element.
insertionPoints[-1] refers to the last insertionPoint of Paragraph A. This comes after the paragraph mark and before the first character of the next paragraph (Paragraph B). This insertionPoint belongs to both this paragraph and the following paragraph.
paragraphs[-1] returns the last paragraph of the insertionPoint, which is Paragraph B (the next paragraph).

Although nextItem() seems totally appropriate and efficient, it may become a performance problem, especially if you call it many times in a huge loop. Keep in mind that nextItem() is a function call that creates an internal scope and so on.
An alternative is to navigate within the story and reach the next paragraph through the insertionPoints indices:
var main = function() {
    var doc, found, st, pCurr, pNext, ipNext;
    if (!app.documents.length) return;
    doc = app.activeDocument;
    app.findGrepPreferences = app.changeGrepPreferences = null;
    app.findGrepPreferences.findWhat = "\\A.";
    found = doc.findGrep();
    if (!found.length) return;
    found = found[0];
    st = found.parentStory;
    pCurr = found.paragraphs[0];
    // The story's insertion point at the index just past the current paragraph
    ipNext = st.insertionPoints[pCurr.insertionPoints[-1].index];
    pNext = ipNext.paragraphs[0];
    alert(pNext.contents);
};
main();
Not claiming the absolute truth here. Just advising about possible issues with nextItem().

More simply, use the code below:
result.paragraphs.nextItem(result.paragraphs[0]);
thank you
mg.

How to split an HTML paragraph up into its lines of text with JavaScript

Is it possible to split a paragraph of text up into its respective lines with JavaScript, and if so, how? What I'd like to do is implement hanging punctuation on every line of a paragraph or blockquote, not just the starting line.
Any other ideas would be welcome too!
Clarification: I was asked to be less ambiguous about what "split a paragraph into its respective lines" means.
In HTML a <p> element creates a block of text, as do many other elements. These bodies of text get wrapped to fit the width (whether that width is set by the CSS or assumed by the default setting). I can't seem to detect where the line breaks happen using regex or any other means. So let's say a paragraph ends up being 7 lines long: I'd like to be able to detect that it's seven lines, and where those lines start and end.
Looking for a \n or a \r doesn't seem to yield anything.

A brute force way to do it is to split all the words in your paragraph and make spans of them. You can then measure the offsetTop property of your spans to find which ones end up in different lines.
In the snippet below, getLines() returns an array of arrays, where each inner array contains the span elements for the words in one line. You could then manipulate that as you wish to create your hanging punctuation using some CSS, maybe by inserting absolutely positioned spans with your punctuation.
// So it runs after the animation
setTimeout(function() {
    splitLines();
    showLines();
}, 1000);

function showLines() {
    var lines = getLines();
    console.log(
        lines.map(function(line) {
            return line.map(function(span) {
                return span.innerText;
            }).join(' ');
        }));
}

function splitLines() {
    var p = document.getElementsByTagName('p')[0];
    p.innerHTML = p.innerText.split(/\s/).map(function(word) {
        return '<span>' + word + '</span>';
    }).join(' ');
}

function getLines() {
    var lines = [];
    var line;
    var p = document.getElementsByTagName('p')[0];
    var words = p.getElementsByTagName('span');
    var lastTop;
    for (var i = 0; i < words.length; i++) {
        var word = words[i];
        // A change in offsetTop means this word starts a new line
        if (word.offsetTop != lastTop) {
            lastTop = word.offsetTop;
            line = [];
            lines.push(line);
        }
        line.push(word);
    }
    return lines;
}
<p>Here is a paragraph that we want to track lines for. Here is a paragraph that we want to track lines for. Here is a paragraph that we want to track lines for Here is a paragraph that we want to track lines for Here is a paragraph that we want to track
lines for Here is a paragraph that we want to track lines for</p>
Here's a fiddle that you can resize the window so the paragraph changes size http://jsfiddle.net/4zs71pcd/1/

It looks like the hanging-punctuation CSS property only makes sure that punctuation at the start of the first formatted line of an element hangs. So you would want to dynamically split the text into lines of the correct length, put those into new <p> elements (or blockquotes), and apply hanging-punctuation: first to those new elements. As of right now no major browser supports the hanging-punctuation property.
Normally I would recommend checking where the newline character (\n) is inside the text, but most often no one explicitly puts that in the text they write. Instead they let the browser decide where to break lines depending on the window size (something like word wrap). This gets even trickier when you consider that there could be multiple lines in a given <p> element, and depending on the size of the browser window, the line could be split anywhere. You'd have to grab the text, find the width of its container, and somehow see where in the text string it hits that width. Here's a great blog post that talks about how to implement this in a more general sense though.

Trying to find length of text within div with jquery

I need to find the length (i.e. the number of characters) of the text within a specified div (#post_div), EXCLUDING HTML formatting AND the content of certain spans. Any embedded span that is NOT #span1 or #span2 needs to be excluded from the count.
So far I have the following solution which works, but it adds/removes from the DOM which I would prefer not to do.
var post = $("#post_div");
var post2 = post.html(); //duplicating for later
post.find("span:not(#span1):not(#span2)").remove(); //removing unwanted (only for character count) spans from DOM - YUCK!
post = $.trim(post.text());
console.log(post.length); // The correct length is here.
$("#post_div").html(post2); //replacing butchered DIV with original duplicate in DOM - YUCK!
I would prefer to achieve the same result, but without butchering the DOM/adding/replacing things from it for a simple character count.
Hope that makes sense

Instead of duplicating the HTML then working on the original node, duplicate the node and work on it outside of the main DOM tree.
var post = $("#post_div").clone();
post.find("span:not(.post_tag):not(.post_mentioned)").remove();
post = $.trim(post.text());
console.log(post.length); // The correct length is here.

Actually, the simple
var t = $.trim($("#post_div span.post_tag, #post_div span.post_mentioned").text());
console.log(t.length);
should suffice.
However, if you have text content outside of span elements, you would have to use
var t = $.trim($("#post_div").text());
var t_inner = $("#post_div span:not(.post_tag):not(.post_mentioned)").text();
console.log(t.length - t_inner.length);