I'm using Adobe InDesign and ExtendScript to find a keyword using app.activeDocument.findGrep(), and I've got this part working well. I know findGrep() returns an array of Text objects. Let's say I want to work with the first result:
var result = app.activeDocument.findGrep()[0];
How can I get the next paragraph following result?
Use the InDesign DOM
var nextParagraph = result.paragraphs[-1].insertionPoints[-1].paragraphs[-1];
The InDesign DOM has different Text objects that you can use to address paragraphs, words, characters, or insertion points (the space between characters where your blinking cursor sits). A group of Text objects is called a collection. Collections in InDesign are similar to arrays, but one significant difference is that they can be addressed from the back by using a negative index (paragraphs[-1]).
result refers to the findGrep() result. It can be any Text object, depending on your search terms.
paragraphs[-1] means the last paragraph of your result (Paragraph A). If the search result is just one word, then this refers to the word's enclosing paragraph, and this collection of paragraphs has just one element.
insertionPoints[-1] refers to the last insertionPoint of Paragraph A. This comes after the paragraph mark and before the first character of the next paragraph (Paragraph B). This insertionPoint belongs to both this paragraph and the following paragraph.
paragraphs[-1] returns the last paragraph of the insertionPoint, which is Paragraph B (the next paragraph).
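One case the chain above does not cover is a result that already sits in the last paragraph of its story: there is no Paragraph B, so the chain would be expected to land on Paragraph A again. Here is a minimal sketch of a guard, assuming that `index` on an insertion point is its offset within the parent story. The helper name `getNextParagraph` is hypothetical, not part of the InDesign DOM.

```javascript
// Hypothetical helper: returns the paragraph after `textObj`, or null
// when `textObj` already sits in the last paragraph of its story.
function getNextParagraph(textObj) {
    var current = textObj.paragraphs[-1];
    var candidate = current.insertionPoints[-1].paragraphs[-1];
    // Assumption: insertionPoint.index is an offset within the parent story,
    // so two paragraphs that start at the same offset are the same paragraph.
    var currentStart = current.insertionPoints[0].index;
    var candidateStart = candidate.insertionPoints[0].index;
    return candidateStart === currentStart ? null : candidate;
}
```

It would be called as getNextParagraph(result), with a null check on the return value before reading contents.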
Although nextItem() seems appropriate and efficient, it may become a performance bottleneck, especially if you call it many times in a large loop. Keep in mind that nextItem() is a function call, which creates its own internal scope each time.
An alternative is to navigate within the story and reach the next paragraph through the insertionPoints indices:
var main = function() {
var doc, found, st, pCurr, pNext, ipNext, ps;
if (!app.documents.length) return;
doc = app.activeDocument;
app.findGrepPreferences = app.changeGrepPreferences = null;
app.findGrepPreferences.findWhat = "\\A.";
found = doc.findGrep();
if ( !found.length) return;
found = found[0];
st = found.parentStory;
pCurr = found.paragraphs[0];
ipNext = st.insertionPoints [ pCurr.insertionPoints[-1].index ];
pNext = ipNext.paragraphs[0];
alert( pNext.contents );
};
main();
Not claiming the absolute truth here. Just advising about possible issues with nextItem().
More simply, use the code below:
result.paragraphs.nextItem(result.paragraphs[0]);
thank you
mg.
Update: This is a better way of asking the following question.
Is there an Id like attribute for an Element in a Document which I can use to reach that element at a later time. Let's say I inserted a paragraph to a document as follows:
var myParagraph = 'This should be highlighted when user clicks a button';
body.insertParagraph(0, myParagraph);
Then the user inserts another one at the beginning manually (i.e. by typing or pasting). Now the childIndex of my paragraph changes from 0 to 1. I want to reach that paragraph at a later time and highlight it. But because of the insertion, the childIndex is not valid anymore. There is no Id-like attribute on the Element interface or on any type implementing it. CacheService and PropertiesService only accept String data, so I can't store myParagraph as an Object.
Do you guys have any idea to achieve what I want?
Thanks,
Old version of the same question (Optional Read):
Imagine that the user selects a word and presses the highlight button of my add-on. Then she does the same thing for several more words. Then she edits the document in a way that the start and end indexes of those highlighted words change.
At this point she presses the remove highlighting button. My add-on should disable highlighting on all previously selected words. The problem is that I don't want to scan the entire document and find any highlighted text. I just want direct access to those that previously selected.
Is there a way to do that? I tried caching selected elements. But when I get them back from the cache, I get TypeError: Cannot find function insertText in object Text. error. It seems like the type of the object or something changes in between cache.put() and cache.get().
var elements = selection.getSelectedElements();
for (var i = 0; i < elements.length; ++i) {
if (elements[i].isPartial()) {
Logger.log('partial');
var element = elements[i].getElement().asText();
var cache = CacheService.getDocumentCache();
cache.put('element', element);
var startIndex = elements[i].getStartOffset();
var endIndex = elements[i].getEndOffsetInclusive();
}
// ...
}
When I get back the element I get TypeError: Cannot find function insertText in object Text. error.
var cache = CacheService.getDocumentCache();
cache.get('element').insertText(0, ':)');
I hope I have clearly explained what I want to achieve.
One direct way is to add a bookmark, which is not dependent on subsequent document changes. It has a disadvantage: a bookmark is visible for everyone...
More interesting way is to add a named range with a unique name. Sample code is below:
function setNamedParagraph() {
var doc = DocumentApp.getActiveDocument();
// Suppose you want to remember namely the third paragraph (currently)
var par = doc.getBody().getParagraphs()[2];
Logger.log(par.getText());
var rng = doc.newRange().addElement(par).build();
doc.addNamedRange("My Unique Paragraph", rng);
}
function getParagraphByName() {
var doc = DocumentApp.getActiveDocument();
var rng = doc.getNamedRanges("My Unique Paragraph")[0];
if (rng) {
var par = rng.getRange().getRangeElements()[0].getElement().asParagraph();
Logger.log(par.getText());
} else {
Logger.log("Deleted!");
}
}
The first function "marks" the third paragraph as a named range. The second one retrieves this paragraph by the range name, regardless of subsequent document changes. We also need to handle the case where our "unique paragraph" has been deleted.
Not sure if cache is the best approach. Cache is volatile, so it might happen that the cached value doesn't exist anymore. Probably PropertiesService is a better choice.
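The two suggestions above can be combined: the named range tracks the paragraph, and PropertiesService keeps only the range's string ID. This is a rough sketch under those assumptions. The function names and the `key` label are hypothetical, and `doc` and `props` are passed in so the functions do not depend on a particular document or property store.

```javascript
// Sketch: remember a paragraph across edits by storing a named-range ID.
// `doc` is a Document, `props` a Properties store, `key` a hypothetical label.
function rememberParagraph(doc, props, key, paragraph) {
  var range = doc.newRange().addElement(paragraph).build();
  var id = doc.addNamedRange(key, range).getId();
  props.setProperty(key, id); // only the string ID is stored
  return id;
}

function recallParagraph(doc, props, key) {
  var id = props.getProperty(key);
  var named = id ? doc.getNamedRangeById(id) : null;
  if (!named) return null; // never stored, or the range no longer exists
  var elements = named.getRange().getRangeElements();
  return elements.length ? elements[0].getElement().asParagraph() : null;
}
```

In an add-on, `doc` would typically be DocumentApp.getActiveDocument() and `props` PropertiesService.getDocumentProperties(). A null result means the paragraph can no longer be found.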
Using Indesign CS5.5, I have a vast collection of groups - all with an image and a textframe. The textframe has 3 paragraphs by default.
I need to get the text from the first paragraph of each textframe.
So far I have this:
var textboxes = app.activeDocument.groups.everyItem().textFrames;
for (i = 0; i <= textboxes.length; i++) {
if(textboxes[i] != 'undefined') {
var product = textboxes[i].contents;
$.writeln(product);
}
}
This gives me ALL the text...I really need to get the first paragraph only OR filter it somehow by font size.
I've tried using textboxes[i].paragraphs[0], but this returns the rather vague Object Invalid. It might be a specific group, but it's too vague for me to tell.
Is there a way to skip and continue if an object is invalid? And is there perhaps a way to only look for text with a certain font size?
Any help would be greatly appreciated. I find Indesign's scripting API documentation quite poor.
I suggest using:
var m1stParas = app.activeDocument.groups.everyItem().textFrames.everyItem().paragraphs[0];
which should return a set of paragraphs (each element is the first paragraph of each text frame from each group).
So you will have a set of text objects, and each object's contents is a string.
As for the "invalid object" error: does your document possibly have empty text frames in some groups?
Jarek
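To skip invalid or empty frames and optionally filter by font size, a loop along these lines may help. It is a sketch, not tested against CS5.5: it assumes `frames` is a plain array of text frames (for example, the result of calling getElements() on the collection above), that each frame exposes `isValid`, and that `pointSize` on a paragraph is a plain number. The function name and the `minPointSize` parameter are made up for illustration.

```javascript
// Sketch: collect the first paragraph of each text frame, skipping
// frames that are invalid or contain no text. `frames` is a plain array.
function firstParagraphTexts(frames, minPointSize) {
    var texts = [];
    for (var i = 0; i < frames.length; i++) {
        var frame = frames[i];
        if (!frame.isValid || frame.paragraphs.length === 0) continue;
        var para = frame.paragraphs[0];
        // Optional filter (assumption: pointSize is a number):
        // keep only paragraphs at or above the given size.
        if (minPointSize !== undefined && para.pointSize < minPointSize) continue;
        texts.push(para.contents);
    }
    return texts;
}
```

Note the loop condition is `i < frames.length`, so the index never runs past the last frame.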
I need to find the length of the text (i.e. the number of characters) within a specified div (#post_div), EXCLUDING HTML formatting AND the content of any non-specific span. So any embedded span that is NOT #span1 or #span2 needs to be excluded from the count.
So far I have the following solution which works, but it adds/removes from the DOM which I would prefer not to do.
var post = $("#post_div");
var post2 = post.html(); //duplicating for later
post.find("span:not(#span1):not(#span2)").remove(); //removing unwanted (only for character count) spans from DOM - YUCK!
post = $.trim(post.text());
console.log(post.length); // The correct length is here.
$("#post_div").html(post2); //replacing butchered DIV with original duplicate in DOM - YUCK!
I would prefer to achieve the same result, but without butchering the DOM/adding/replacing things from it for a simple character count.
Hope that makes sense
Instead of duplicating the HTML then working on the original node, duplicate the node and work on it outside of the main DOM tree.
var post = $("#post_div").clone();
post.find("span:not(.post_tag):not(.post_mentioned)").remove();
post = $.trim(post.text());
console.log(post.length); // The correct length is here.
Actually, the simpler
var t = $.trim($("#post_div span.post_tag, #post_div span.post_mentioned").text());
console.log(t.length);
should suffice.
However, if you have textual content outside of span elements, you would have to use
var t = $.trim($("#post_div").text());
var t_inner = $("#post_div span:not(.post_tag):not(.post_mentioned)").text();
console.log(t.length - t_inner.length);
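Another option that leaves the DOM untouched and avoids cloning is to walk the child nodes and collect only the text you want. This is a sketch: `collectText` and the `isExcluded` callback are hypothetical names, and the class names in the usage comment are taken from the answers above.

```javascript
// Sketch: gather text from `node`, skipping any element for which
// `isExcluded(element)` returns true. Nothing in the DOM is modified.
function collectText(node, isExcluded) {
    if (node.nodeType === 3) return node.data;               // text node
    if (node.nodeType !== 1 || isExcluded(node)) return '';  // comment etc., or excluded element
    var text = '';
    for (var i = 0; i < node.childNodes.length; i++) {
        text += collectText(node.childNodes[i], isExcluded);
    }
    return text;
}

// Possible usage with jQuery (not executed here):
// var text = collectText($('#post_div')[0], function (el) {
//     return el.nodeName === 'SPAN' && !$(el).is('.post_tag, .post_mentioned');
// });
// console.log($.trim(text).length);
```

Trimming once at the end mirrors the original approach, so leading and trailing whitespace is not counted.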
TextFrame#nextTextFrame tells me if a TextFrame overflows; I can also split the two TextFrames the way StorySplitter does.
What I can't seem to figure out is: does the second TextFrame start with a new paragraph, or does a single paragraph extend between the two?
I need this in order to reconstruct the flow afterwards, externally: I need to know if I have to merge the last paragraph of the first TextFrame and the first paragraph of the second TextFrame, or if instead they are two distinct paragraphs.
They would be considered the same Paragraph object. You can test this yourself with something similar to the following code (assuming there is a paragraph that spans two text frames and the text frames are linked together).
var doc = app.activeDocument;
var frame1 = doc.pages[0].textFrames[0];
var frame2 = frame1.nextTextFrame;
var para1 = frame1.paragraphs.lastItem();
var para2 = frame2.paragraphs.firstItem();
alert(para1 === para2);
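Since the first paragraph of the second frame is the whole paragraph, another way to check is to compare where that paragraph starts with where the frame starts. This is a sketch that assumes `index` on an insertion point is an offset within the parent story, and that a frame's own insertionPoints collection begins at the frame's first character. The function name is hypothetical.

```javascript
// Sketch: true when the first paragraph of `frame` begins in an earlier frame.
// Assumption: insertionPoints[0].index is an offset within the parent story.
function startsMidParagraph(frame) {
    if (frame.paragraphs.length === 0) return false; // nothing flows into this frame
    var paraStart = frame.paragraphs[0].insertionPoints[0].index;
    var frameStart = frame.insertionPoints[0].index;
    return paraStart < frameStart;
}
```

If startsMidParagraph(frame2) is true, the last paragraph of the first frame and the first paragraph of the second frame would need to be merged when reconstructing the flow.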
I'm trying to use javascript/jQuery to wrap any abbreviations in a paragraph in a <abbr title=""> tag.
For example, in a sentence like, The WHO eLENA clarifies guidance on life-saving nutrition interventions, and assists in scaling up action against malnutrition, WHO and eLENA would both be wrapped in an <abbr> tag. I'd like the title attribute to display the extended version of the abbreviation; i.e. WHO = World Health Organization.
What's the best way of accomplishing this? I'm a bit new to JavaScript/jQuery so I'm fiddling in the dark here. So far I've created a variable that contains all the abbreviations as key/value pairs, and I can replace a specific instance of an abbreviation, but not much else.
First you must decide exactly what criteria you will use for selecting a replacement -- I would suggest doing it on a word boundary, such that "I work with WHO" will wrap "WHO" in an abbr, but "WHOEVER TOUCHED MY BIKE WILL REGRET IT" won't abbreviate "WHO". You should also decide if you are going to be case sensitive (probably you want to be, so that "The guy who just came in" doesn't abbreviate "who".)
Use jQuery to recurse over all of the text in the document. This can be done with .contents(), which (unlike .children()) also returns text nodes, stepping through the elements and reading all the text.
For each text node, split the text into words.
For each word, look it up in your key value store to see if it matches a key. If so, get the value, and construct a new element <abbr title="value">key</abbr>.
Break up the text node into a) the text before the abbreviation (a text node), b) the abbreviation itself (an element), and c) the text after the abbreviation (a text node). Insert all three as child nodes of the original text node's parent, replacing the original text node.
Each of these steps will require a bit of work and looking up some API docs, but that is the basic process.
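The word-matching and splitting steps can be sketched without touching the DOM at all. The function below takes the text of one text node and returns the segments that would become text nodes (title is null) or abbr elements (title is the expansion). The name `splitAbbreviations` and the segment shape are assumptions for illustration. Matching is case sensitive and on word boundaries, as suggested above.

```javascript
// Sketch: split `text` into plain and abbreviation segments.
// `abbrs` maps abbreviation -> expansion.
function splitAbbreviations(text, abbrs) {
    var keys = [];
    for (var key in abbrs) {
        if (Object.prototype.hasOwnProperty.call(abbrs, key)) {
            // Escape regex metacharacters so keys are matched literally.
            keys.push(key.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'));
        }
    }
    if (keys.length === 0) return [{ text: text, title: null }];
    keys.sort(function (a, b) { return b.length - a.length; }); // prefer longer keys
    var pattern = new RegExp('\\b(?:' + keys.join('|') + ')\\b', 'g');
    var segments = [];
    var last = 0;
    var match;
    while ((match = pattern.exec(text)) !== null) {
        if (match.index > last) {
            segments.push({ text: text.slice(last, match.index), title: null });
        }
        segments.push({ text: match[0], title: abbrs[match[0]] });
        last = match.index + match[0].length;
    }
    if (last < text.length) segments.push({ text: text.slice(last), title: null });
    return segments;
}
```

Each segment with a title would then be turned into an abbr element, and the rest into text nodes, before replacing the original text node.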
Firstly, this should really be done on the server, doing it on the client is very inefficient and much more prone to error. But having said that...
You can try processing the innerHTML of the element, but javascript and regular expressions are really bad at that.
The best way is to use DOM methods and parse the text of each element. When a matching word is found, replace it with an abbr element. This requires that where a match is found in a text node, the entire node is replaced because what was one text node will now be two text nodes (or more) either side of an abbr element.
Here is a simple function that goes close, but it likely has foibles that you need to address. It works on simple text strings, but you'll need to test it thoroughly on more complex strings. Naturally it should only ever be run once on a particular node or abbreviations will be doubly wrapped.
var addAbbrHelp = (function() {
var abbrs = {
'WHO': 'World Health Organisation',
'NATO': 'North Atlantic Treaty Organisation'
};
return function(el) {
var node, nodes = Array.prototype.slice.call(el.childNodes); // static copy: the live list changes as nodes are replaced
var word, words;
var adding, text, frag;
var abbr, oAbbr = document.createElement('abbr');
var oFrag = document.createDocumentFragment();
for (var i=0, iLen=nodes.length; i<iLen; i++) {
node = nodes[i];
if (node.nodeType == 3) { // if text node
words = node.data.split(/\b/);
adding = false;
text = '';
frag = oFrag.cloneNode(false);
for (var j=0, jLen=words.length; j<jLen; j++) {
word = words[j];
if (word in abbrs) {
adding = true;
// Add the text gathered so far
frag.appendChild(document.createTextNode(text));
text = '';
// Add the wrapped word
abbr = oAbbr.cloneNode(false);
abbr.title = abbrs[word];
abbr.appendChild(document.createTextNode(word));
frag.appendChild(abbr);
// Otherwise collect the words processed so far
} else {
text += word;
}
}
// If found some abbrs, replace the text
// Otherwise, do nothing
if (adding) {
frag.appendChild(document.createTextNode(text));
node.parentNode.replaceChild(frag, node);
}
// If found another element, add abbreviation help
// to its content too
} else if (node.nodeType == 1) {
addAbbrHelp(node);
}
}
}
}());
For the markup:
<div id="d0">
<p>This is the WHO and NATO string.</p>
<p>Some non-NATO forces were involved.</p>
</div>
and calling:
addAbbrHelp(document.getElementById('d0'));
results in (my formatting):
<div id="d0">
<p>This is the <abbr title="World Health Organisation">WHO</abbr>
and <abbr title="North Atlantic Treaty Organisation">NATO</abbr>
string.</p>
<p>Some non-<abbr title="North Atlantic Treaty Organisation">NATO</abbr> forces were involved.</p>
</div>
Using the word break pattern to split words is interesting because in strings like "with non–NATO forces", the word NATO will still get wrapped but not the "non–" part. However, if the abbreviation is split across a text node or by a hyphen, it will not be recognised unless the same pattern is included as a property name in the abbrs object.
Check out the JavaScript replace() method.
I'd use jQuery to pull out all the text in the paragraph:
var text = $('p#paragraphId').html();
Use a for loop to loop through the list of abbreviations you have and then use the replace() method mentioned above to swap out the abbreviation for the tag you need.
Finally, use jQuery to set the html of the paragraph back to your newly updated string.
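The loop-and-replace idea might look roughly like this. It is a sketch with a hypothetical function name, and it has a real limitation: because it works on the HTML string, it will also rewrite matches that occur inside tags, attributes, or titles inserted by an earlier pass, so it is only safe on simple markup.

```javascript
// Sketch of the replace() approach on an HTML string.
// Caveat: matches inside tags and attributes are rewritten too.
function wrapAbbreviations(html, abbrs) {
    var result = html;
    for (var key in abbrs) {
        if (!Object.prototype.hasOwnProperty.call(abbrs, key)) continue;
        // Escape regex metacharacters so the key is matched literally.
        var escapedKey = key.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
        // Escape characters that would break the title attribute.
        var title = abbrs[key].replace(/&/g, '&amp;').replace(/"/g, '&quot;');
        var tag = '<abbr title="' + title + '">' + key + '</abbr>';
        // A function replacer avoids special handling of "$" in `tag`.
        result = result.replace(new RegExp('\\b' + escapedKey + '\\b', 'g'), (function (t) {
            return function () { return t; };
        }(tag)));
    }
    return result;
}
```

The result would be written back with something like $('p#paragraphId').html(wrapAbbreviations(text, abbrs)).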