Javascript, Text Annotations and Ideas


I am very curious to hear input from others on a problem I've been contemplating for some time now.
Essentially I would like to present a user with a text document and allow him/her to make selections of text and annotate it. Specific to the annotations I aim to achieve the following:
Allow users to make a text selection, annotate it, then save the selection and annotation for reference later
(UI) Support representing overlapped annotations. For example, if the string were: "This is the test sentence for my example test sentence", user1 might have an annotation on "is the test sentence for my example" and user2 might have an annotation on "for my example".
Account for situations where the document's text changes. The annotations would need to be updated, if possible.
How would you tackle this from a technical perspective?
Some ideas I've had are:
Use javascript ranges and store an annotation as a pair of integers something like: (document_start_char, document_end_char). Save this pair in the db.
Alternatively, use JS to get the selected text and save the full text in the db. (Not sure how I would then do overlapping annotations.)
Represent overlapped annotations by applying a css style to highlight the text then darken the "stack" of annotations where they overlap. Smallest annotation would always have to be on the top of the "stack".
What are your thoughts or areas of improvement? How the heck could I support a document's text being updated without breaking all the annotations?
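One way to picture the "stack" idea is to cut the document into flat slices and count how many annotations cover each slice; the count can then drive how dark the highlight is. This is only a sketch: the `{start, end}` shape (end exclusive) and the name `overlapSegments` are assumptions for illustration, not part of any library.

```javascript
// Split a document of `textLength` characters into consecutive slices and
// record how many annotations cover each slice.
// Each annotation is assumed to look like { start, end }, end exclusive.
function overlapSegments(textLength, annotations) {
  // Every annotation boundary is a place where the coverage can change.
  const cuts = new Set([0, textLength]);
  for (const a of annotations) {
    cuts.add(a.start);
    cuts.add(a.end);
  }
  const points = [...cuts].sort((x, y) => x - y);

  const segments = [];
  for (let i = 0; i < points.length - 1; i++) {
    const start = points[i];
    const end = points[i + 1];
    // An annotation covers the slice if it spans it completely.
    const depth = annotations.filter(a => a.start <= start && a.end >= end).length;
    segments.push({ start, end, depth });
  }
  return segments;
}

const sentence = "This is the test sentence for my example test sentence";
const segments = overlapSegments(sentence.length, [
  { start: 5, end: 40 },  // user1: "is the test sentence for my example"
  { start: 26, end: 40 }  // user2: "for my example"
]);
// depths of the four slices: 0, 1, 2, 0
```

Each slice could then be wrapped in its own element with a class based on `depth`, so the smaller annotation naturally ends up as the darkest region.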

I'm researching this same question and personally I favor staying away from rolling my own, in favor of an existing open source library like Annotator.

http://mark.koli.ch/2009/09/use-javascript-and-jquery-to-get-user-selected-text.html (404 response)
http://mark.koli.ch/2009/09/05/get-selected-text-javascript.html- (404 response)
Getting the selected text is really easy. Storing it (or its starting/ending points) is also a joke. But what about your point number 3? What if the text changes?
If the text changes, neither the original text nor the selection coordinates you stored will match the current, modified text. You need to keep track of the annotations present in the document, so that every time it changes, the annotations referring to the changed piece of text are updated or deleted (maybe after a quick comparison between the before and after text: are some words missing, or have some words just been corrected?). But this seems like a really difficult task.
I think storing the entire annotated text in a db is essential, so that it is not lost when the document changes. This way you will still have the complete text you annotated. You should also store the start character of the annotation; if the text changes, you can calculate the difference in characters between the document text before the change and after it, and from that find the new starting point of the original annotation (assuming the annotated part of the document text hasn't changed).
Dividing the text document in as many paragraphs as possible should also help, this way you could separate different pieces of the document and work on one by one.
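The offset-shifting idea above could look roughly like this. It is a sketch under a strong assumption: each save changes one contiguous region of the document, which is found by trimming the common prefix and suffix of the two versions. The names `changedRegion` and `reanchor` and the `{start, end, text}` annotation shape are made up for the example.

```javascript
// Locate the single changed region between two versions of the document
// by skipping the characters they share at the beginning and at the end.
function changedRegion(before, after) {
  const shortest = Math.min(before.length, after.length);
  let prefix = 0;
  while (prefix < shortest && before[prefix] === after[prefix]) prefix++;
  let suffix = 0;
  while (suffix < shortest - prefix &&
         before[before.length - 1 - suffix] === after[after.length - 1 - suffix]) {
    suffix++;
  }
  return {
    start: prefix,                        // first changed offset
    oldEnd: before.length - suffix,       // end of the change in the old text
    delta: after.length - before.length   // how far later text has moved
  };
}

// Returns the annotation with updated offsets, or null if the edit touched
// the annotated text and the annotation needs manual attention.
function reanchor(annotation, before, after) {
  const { start, oldEnd, delta } = changedRegion(before, after);
  let moved;
  if (annotation.end <= start) {
    moved = { ...annotation };            // entirely before the edit
  } else if (annotation.start >= oldEnd) {
    moved = { ...annotation, start: annotation.start + delta, end: annotation.end + delta };
  } else {
    return null;                          // edit overlaps the annotation
  }
  // The stored text acts as a safety check on the computed offsets.
  return after.slice(moved.start, moved.end) === annotation.text ? moved : null;
}

const note = { start: 12, end: 16, text: "test" };
const updated = reanchor(note, "This is the test sentence.", "This really is the test sentence.");
// updated.start === 19, updated.end === 23
```

Running this per paragraph rather than per document, as suggested above, would keep the "one contiguous edit" assumption closer to reality.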
Now I would really like to see it done! :)

Related

Javascript retrieve linebreaks from dom [duplicate]

I need to add line breaks in the positions that the browser naturally adds a newline in a paragraph of text.
For example:
This is some very long text \n that spans a number of lines in the paragraph.
This is a paragraph that the browser chose to break at the position of the \n
I need to find this position and insert a <br /> there.
Does anyone know of any JS libraries or functions that are able to do this?
The only solution that I have found so far is to remove tokens from the paragraph and observe the clientHeight property to detect a change in element height. I don't have time to finish this and would like to find something that's already tested.
Edit:
The reason I need to do this is that I need to accurately convert HTML to PDF. Acrobat renders text narrower than the browser does. This results in text that breaks in different positions. I need an identical ragged edge and the same number of lines in the converted PDF.
Edit:
@dtsazza: Thanks for your considered answer. It's not impossible to produce a layout editor that almost exactly replicates HTML. I've written 99% of one ;)
The app I'm working on allows a user to create a product catalogue by dragging on 'tiles'. The tiles are fixed-width, absolutely positioned divs that contain images and text. All elements are styled so font size is fixed. My solution for finding \n in a paragraph is OK 80% of the time, and when it works with a given paragraph the resulting PDF is so close to the on-screen version that the differences do not matter. Paragraphs are the same height (to the pixel), images are replaced with high-res versions and all bitmap artwork is replaced with SVGs generated server side.
The only slight difference between my HTML and PDF is that Acrobat renders text slightly more narrowly, which results in a slightly shorter line length.
Diodeus's solution of adding span's and finding their coords is a very good one and should give me the location of the BRs. Please remember that the user will never see the HTML with the inserted BRs - these are added so that the PDF conversion produces a paragraph that is exactly the same size.
There are lots of people that seem to think this is impossible. I already have a working app that created extremely accurate HTML->PDF conversion of our docs - I just need a better solution of adding BRs because my solution sometimes misses a BR. BTW when it does work my paragraphs are the same height as the HTML equivalents which is the result we are after.
If anyone is interested in the type of doc I'm converting then you can check out this screencast:
http://www.localsa.com.au/brochure/brochure.html
Edit: Many thanks to Diodeus - your suggestion was spot on.
Solution:
For my situation it made more sense to wrap the words in spans instead of the spaces.
var text = paragraphElement.innerHTML.replace(/ /g, '</span> <span>');
text = "<span>" + text + "</span>"; // wrap first and last words.
This wraps each word in a span. I can now query the document to get all the words, iterate and compare y position. When y pos changes add a br.
This works flawlessly and gives me the results I need - Thank you!

I would suggest wrapping all spaces in a span tag and finding the coordinates of each tag. When the Y-value changes, you're on a new line.
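A rough sketch of that approach, wrapping words rather than spaces as the asker ended up doing. The function names are invented for the example. `measureWords` needs a browser and assumes the paragraph holds plain text only (any inline markup inside it is flattened); `groupIntoLines` is the pure part that turns the measurements into lines.

```javascript
// Pure part: `words` is a list of { text, top } in document order, where
// `top` is the measured vertical position of the word. A new line starts
// whenever `top` differs from the previous word's.
function groupIntoLines(words) {
  const lines = [];
  let currentTop = null;
  for (const word of words) {
    if (lines.length === 0 || word.top !== currentTop) {
      lines.push([]);
      currentTop = word.top;
    }
    lines[lines.length - 1].push(word.text);
  }
  return lines.map(line => line.join(' '));
}

// Browser-only part (defined here but not called): rebuild the paragraph
// with one span per word, then read each span's offsetTop.
function measureWords(paragraphElement) {
  const words = paragraphElement.textContent.trim().split(/\s+/).filter(Boolean);
  paragraphElement.textContent = '';
  const spans = words.map((word, i) => {
    if (i > 0) paragraphElement.appendChild(document.createTextNode(' '));
    const span = document.createElement('span');
    span.textContent = word;
    paragraphElement.appendChild(span);
    return span;
  });
  return spans.map(span => ({ text: span.textContent, top: span.offsetTop }));
}
```

In a page you would call `groupIntoLines(measureWords(p))` and join the result with a `<br />` between lines before handing the HTML to the PDF converter.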

I don't think there's going to be a very clean solution to this one, if any at all. The browser will flow a paragraph to fit the available space, linebreaking where needed. Consider that if a user resizes the browser window, all the paragraphs will be rerendered and almost certainly will change their break positions. If the user changes the size of the text on the page, the paragraphs will be rerendered with different line break points. If you (or some script on your page) changes the size of another element on the page, this will change the amount of space available to a floating paragraph and again - different line break points.
Besides, changing the actual markup of your page to mimic something that the browser does for you (and does very well) seems like the wrong approach to whatever you're doing. What's the actual problem you're trying to solve here? There's probably a better way to achieve it.
Edit: OK, so you want to render to PDF the same as "the screen version". Do you have a specific definitive screen version nominated - in terms of browser window dimensions, user stylesheets, font preferences and adjusted font size? The critical thing about HTML is that it deliberately does not specify a specific layout. It simply describes what is on the page, what they are and where they are in relation to one another.
I've seen several misguided attempts before to produce some HTML that will exactly replicate a printed creative, designed in something like a DTP application where a definitive absolute layout is essential. Those efforts were doomed to failure because of the nature of HTML, and doing it the other way round (as you're trying to) will be even worse because you don't even have a definitive starting point to work from.
On the assumption that this is all out of your hands and you'll have to do it anyway, my suggestion would be to give up on the idea of mangling the HTML. Look at the PDF conversion software - if it's any good it should give you some options for font kerning and similar settings. Playing around with the details here should get you something that approximates the font rendering in the browser and thus breaks lines at the same places.
Failing that, all I can suggest is taking screenshots of the browser and parsing these with OCR to work out where the lines break (it shouldn't require a very accurate OCR since you know what the raw text is anyway, it essentially just has to count spaces). Or perhaps just embed the screenshot in the PDF if text search/selection isn't a big deal.
Finally doing it by hand is likely the only way to make this work definitively and reliably.
But really, this is still just wrong and any attempts to revise the requirements would be better. Keep going up one step in the chain - why does the PDF have to have the exact same ragged edge as some arbitrary browser rendering? Can you achieve that purpose in another (better) way?

Sounds like a bad idea when you account for user-set font sizes, MS Windows accessibility mode, and the hundreds of different mobile devices. Let the browser do its thing - trying to have exact control over the rendering will only cause you hours of frustration.

I don't think you'll be able to do this with any kind of accuracy without embedding Gecko/WebKit/Trident or essentially recreating them.

Maybe an alternative: do all line-breaks yourself, instead of relying on the browser. Place all text in pre tags, and add your own linebreaks. Now at least you don't have to figure out where the browser put them.

How to search for specific text that is changed inside a string in Javascript

I am stuck on one problem that I hardly can describe that easily.
I fetch JSON data from an RESTful API that contains several objects which are then placed as text inside a Textarea field, so it can be edited. After the edit is done, a button is clicked and then that string is saved somewhere else in the DB.
So far so good.
Problem comes in the scenario when an user edits that text in the Textarea field and then triggers the API again (answers another questions from the form on the same website), so that fetches another data into that Textarea, but the edited data should be present as well.
E.g. First time there are 2 sentences inserted inside the Textarea:
The car is painted red. The car has 4 wheels.
So then the user changes the first answer in the form, so the Textarea looks like this:
The car is painted blue. The car has 4 wheels.
I got that figured out with the Javascript replace() function, just find the sentence "The car is painted red." and replace it with the "The car is painted blue."
document.getElementById("myTextarea").value = journalTextareaString.replace(tempPreviousAnswer,tempChangedAnswer);
If the text is edited like before/after the sentence, the sentence is replaced normally with the new one, all the added text from the user stays. For example the user has manually inputed some extra text:
I love my car. The car is painted red. The car is nice. The car has 4
wheels.
Now if he switches the car to color blue on the form, the manually edited text stays and only the sentence with the color is changed:
I love my car. The car is painted blue. The car is nice. The car has 4
wheels.
But how do I do it when a user has edited the text from inside the sentence, for example he puts a word "chrome" in between the sentence, like:
The car is painted chrome red. The car has 4 wheels.
Thanks.

The tough thing here is that you are appending the texts together, which makes it next to impossible to differentiate the different fields. Separate them: keep them in different boxes. That makes it possible for you to roll back and also to tell what was edited.
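To make the "separate boxes" suggestion concrete, here is one possible shape for the data. Everything in it is hypothetical (the `questionId`, `generated` and `text` fields, the function names); the point is only that each sentence remembers what the API produced alongside what the user currently has.

```javascript
// One entry per box: `generated` is what the API produced,
// `text` is what the box currently contains after any user edits.
const parts = [
  { questionId: "color",  generated: "The car is painted red.", text: "The car is painted chrome red." },
  { questionId: "wheels", generated: "The car has 4 wheels.",   text: "The car has 4 wheels." }
];

// The combined string that gets saved is just the boxes joined together.
function renderJournal(list) {
  return list.map(part => part.text).join(" ");
}

// Which boxes did the user change by hand?
function editedIds(list) {
  return list.filter(part => part.text !== part.generated).map(part => part.questionId);
}

// Roll a single box back to what the API generated.
function rollBack(list, questionId) {
  return list.map(part =>
    part.questionId === questionId ? { ...part, text: part.generated } : part
  );
}
```

With this in place, a changed form answer only has to find its box by `questionId`; no searching inside one long string is needed.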

What you want to do is merge the text of a remote party and a local party. This is a complex thing to do, and there is no simple solution. Please let me explain why.
Example scenario
Let's take for example user A, "Allan", and user B, "Bob". They are going to work on a document (or in your case just a string) in collaboration. There is a version 1 that they both start working on, and in the meantime both A and B make changes, so you get version 2 from user A (call it A2) and version 2 from user B (call it B2).
The problem of conflicts
First of all, you should make sure that both these users came from version 1. So you should keep track of versions. Then, an even more complicated problem has to be solved: merge conflicts. Take for example this image:
You are going to have to decide how this conflict is resolved. Is A always going to overrule B? Which one overrules which? Or are you just going to put the edits of user A and B one after the other at the same position in the text? How are you going to handle word spacing, deletions, additions, updates?
See how this is going to give you a lot to think about?
Solution?
You should offer your user a way to resolve their conflicts (keep mine, keep theirs, or interactively show the conflicts), whenever you are going to run into trouble
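For a single sentence, the usual three-way rule can be written down in a few lines. This is a sketch, not a full merge tool: it compares whole strings only, and `mergeThreeWay` plus the shape of its result are assumptions made for the example. `base` is the version both sides started from, `mine` is the local edit, `theirs` is the remote one.

```javascript
// Decide what to keep when two parties may both have changed `base`.
function mergeThreeWay(base, mine, theirs) {
  if (mine === theirs) return { conflict: false, merged: mine };   // same result on both sides
  if (mine === base)   return { conflict: false, merged: theirs }; // only the remote side changed
  if (theirs === base) return { conflict: false, merged: mine };   // only the local side changed
  // Both sides changed it differently: hand the decision to the user.
  return { conflict: true, merged: null, mine, theirs };
}

const result = mergeThreeWay(
  "The car is painted red.",        // version 1
  "The car is painted chrome red.", // typed by the user in the textarea
  "The car is painted blue."        // produced by the changed form answer
);
// result.conflict === true
```

When `conflict` is true, that is the point to show the "keep mine / keep theirs" choice described above.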
Use resource-locking
Another solution is to avoid collaboration altogether: lock your data (make it temporarily read-only for everyone but one person), so that two people can never be editing your data at the same time.
See also
This problem is very well known to people who have used "version control systems" such as Git, SCM and SVN. Here's how one merge application helps programmers to solve their merge conflicts:

Grabbing the sentence that a selected word appears in

Using Javascript, I need to allow a user to double click a word on a page and retrieve the sentence that it appears in. Not just any sentence, but that specific one. I've toyed with retrieving all sentences that that word appears in and somehow choosing the correct sentence, so maybe that's an option.
I've scoured the web looking for this beast, and I've thought a lot about it. Some have recommended using Rangy but I haven't been able to find the functionality I'm looking for, or even functionality that would help me get where I need to be.
Any ideas?

You could turn your page into one or multiple read-only textareas, use clever CSS styling to mask it, then use the onselect event as described here: Detect selected text in a text area with javascript
It depends, of course, on what your page looks like and where it's used.
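Whichever way you detect the selection, once you have the character offset of the word inside its surrounding text, picking out that specific sentence is a small string problem. The sketch below is a different route from the textarea trick and makes simplifying assumptions: sentences end with `.`, `!` or `?` (so abbreviations like "e.g." will split wrongly), and the browser helper only works when the double-clicked word sits inside a single text node. Both function names are invented.

```javascript
// Return the sentence of `text` that contains the character at `offset`.
function sentenceAt(text, offset) {
  const isTerminator = ch => ch === '.' || ch === '!' || ch === '?';
  // Walk left until just after the previous sentence terminator.
  let start = offset;
  while (start > 0 && !isTerminator(text[start - 1])) start--;
  // Walk right up to the next terminator and include it.
  let end = offset;
  while (end < text.length && !isTerminator(text[end])) end++;
  if (end < text.length) end++;
  return text.slice(start, end).trim();
}

// Browser-only helper (defined but not called here): use the current
// selection, e.g. inside a dblclick handler.
function sentenceFromSelection() {
  const selection = window.getSelection();
  const node = selection.anchorNode;
  if (!node || node.nodeType !== 3) return null; // 3 means a text node
  return sentenceAt(node.textContent, selection.anchorOffset);
}

const story = "I love my car. The car is painted red. The car is nice.";
const picked = sentenceAt(story, story.indexOf("painted"));
// picked === "The car is painted red."
```

Because the lookup is driven by the offset, a word that appears in several sentences (like "car" here) still resolves to the one that was actually clicked.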

Mentioning system that mimics Facebook's

I've had a horrible problem that I've been racking my brain over for the past two days, and I have yet to come up with a solution. As such, I think this needs someone smarter than I to accomplish.
What I'm trying to build is a textbox that simulates that of Facebook's; essentially, the tagging function.
Now if you've used Facebook, you'll have noticed that Facebook allows you to tag people in a comment/post, simply by typing in their name and selecting from a dropdown list. The name of the person you've selected then appears in highlighted text in that very textarea. I've successfully managed to create and populate the dropdown list using a combination of jQuery and AJAX, but the tagging process itself is the stumper.
Once a dropdown item has been selected (by Enter or clicking), the query text will be replaced with the tagged name. Now, it's difficult to see how one can give text in a textarea any kind of a highlight, so I've discovered (by inspecting elements in Google Chrome and deleting the textarea node) that the textarea itself is transparent, and there is a white div below "simulating" the text. Highlighted words are placed in a tag with custom CSS, which gives it that blue background. All of this I've found out myself, and I have successfully simulated this - but I can only do one tag.
Now I've investigated further and found an input type="hidden" element, of class "mentionsHidden". This input element has a value attribute, which dynamically populates itself based on the content of the textarea. So if I typed "ABC", the value of the element becomes "ABC". If I included a tag, say "hi [Rei]!" (where the name in [] is the tag), the value of the element becomes "hi #[member_id:Rei]!".
So I HAVE done my homework. But here comes the part I can't figure out.
I can't figure out how exactly to dynamically populate the hidden input element with the value of the textbox. It's obvious that the underlying div giving the blue tag background is populated from the input element. But the input element is giving me a headache.
You see, I can't do the following:
-I can't simply "copy" the entire value of the current textarea and "paste" it into the input element's value, because that would override any previously tagged people in the input element (after all, the textarea can only possess plaintext).
-Even though I CAN locate the current index of the caret (the flashing black line in the textarea that tells you where you're going to be typing into), that's only for the textarea. Index position 10 in the textarea and in the input element's value might be different things, because this way of "tagging" people will result in adding additional characters to the value String.
-I can't simply do a "replace" of the text I am intending to replace, because there might be other instances of that same text in other parts of the value String.
I know it's a very long and confusing post, but I do hope you get what I mean. I really need a solution and I don't want to use contenteditable, because it's only for HTML5 and some older browsers might not support it.
Yours,
Rei

I hope you were able to come up with, or find, a solution to your problem. Since there doesn't seem to be one here, I'd like to offer one for anyone who might stumble upon this (as well as you, if my assumption was incorrect).
You are going to need to maintain explicit locational data of each existing mention in the textarea in the order in which they appear. If, after a modification of the content in textarea, the position of a mention in it is changed, you will need to determine which appearance of its value, if any, will be used to represent it, and appropriately update the locational data of the mention.
With such a collection of data, it becomes trivial to construct the value of mentionsHidden, though the existence of such data makes the element unnecessary.
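As an illustration of what that locational data might look like, here is a minimal sketch. It is not Mentionator's code or Facebook's; the `{start, end, memberId}` shape and both function names are assumptions, and the `#[member_id:Name]` output format is simply the one quoted in the question.

```javascript
// mentions: list of { start, end, memberId }, sorted by start, end exclusive.
// Offsets refer to the plain text shown in the textarea.

// Keep the mention list in step with an edit that replaced `removedLength`
// characters at `pos` with `insertedLength` new characters.
function applyEdit(mentions, pos, removedLength, insertedLength) {
  const delta = insertedLength - removedLength;
  const editEnd = pos + removedLength;
  const kept = [];
  for (const m of mentions) {
    if (m.end <= pos) {
      kept.push(m);                                                     // before the edit
    } else if (m.start >= editEnd) {
      kept.push({ ...m, start: m.start + delta, end: m.end + delta });  // after it: shift
    }
    // Otherwise the edit touched the mention itself, so it is dropped.
  }
  return kept;
}

// Build the hidden value from the plain text plus the mention list.
function buildHiddenValue(text, mentions) {
  let out = '';
  let cursor = 0;
  for (const m of mentions) {
    out += text.slice(cursor, m.start);
    out += '#[' + m.memberId + ':' + text.slice(m.start, m.end) + ']';
    cursor = m.end;
  }
  return out + text.slice(cursor);
}

const tagged = [{ start: 3, end: 6, memberId: 'member_id' }];
const hidden = buildHiddenValue('hi Rei!', tagged);
// hidden === 'hi #[member_id:Rei]!'
```

Because the offsets always refer to the textarea's own plain text, the caret index can be used directly, and no search-and-replace on the hidden value is needed.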
Mentionator is an existing, robust solution which takes this approach in providing the functionality you are trying to recreate. Considering it is well-structured, easy to follow, and copiously commented, it should be of use to you as either an out-of-the-box solution or reference material to consult as you roll your own. It is maintained by yours truly :) .

Ebook in iOS using CSS multi-column (calculate range of text in different column)

My first question here. Correct me if I've done anything wrong.
I've found a source here demonstrating how to paginate HTML using CSS multi-column.
http://groups.google.com/group/leaves-developers/browse_thread/thread/27e4bf5ff3c53113/f137dc01b6d853b7
My question is:
How to calculate the range / location of text in different column (page)?
For example, when changing the font size,
the text in current page will jump to another page.
To solve this, the program should save the current text location,
and move to the correct page (column) after reformatting the web page.
It is also useful for implementing bookmark function.
I think it should be done by javascript, but I'm new to javascript.
Any suggestions and tips are welcomed.

This is a bit of a general question, and very hard to answer, so I'll just try to point you in a direction I might try.
I have no idea how the layout of your page might look or function, but one way you could theoretically do this is by checking the text node of your 'column' whenever you change the font size (presuming this font size change is implemented by a button click). So, for instance, say you have a div with the id #column_1: whenever someone clicks the button UI element, you could read the first several characters of #column_1, then search your string to find that text, load whatever number of characters you have defined as a 'page' around that text, and call your render method to 'turn' to that page. So the flow of your function might look something like this:
zoomControl.addEventListener('click', function () { // do something when you click the zoom control
  var text = findTextOfCurrentPage(); // get the first bit of text of your current 'page'
  renderLargerTextSize(); // re-render your 'page' with larger text (for user feedback purposes)
  renderPageWith(text); // render/navigate to the 'new' page where the text in the variable 'text' can be found
});
Obviously this is a super generic 'idea' of what functions/methods you might use to make this happen, but I think if you dug into JS and say something like JQuery this sort of thing could be done relatively easily.
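If you store the location as a character offset into the whole text instead of a snippet, the "which page is it on now" step becomes a simple lookup. This sketch assumes you can produce, after each re-layout, a sorted list of the character offset at which each page (column) starts; how you measure that list depends on your layout and is not shown. `pageForOffset` is a made-up name.

```javascript
// pageStarts[i] is the character offset of the first character on page i,
// in ascending order, with pageStarts[0] === 0.
// Returns the index of the page that contains `offset`.
function pageForOffset(pageStarts, offset) {
  let lo = 0;
  let hi = pageStarts.length - 1;
  // Binary search for the last page whose start is <= offset.
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (pageStarts[mid] <= offset) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}

// Before zooming: the reader is on page 2, which starts at offset 1000.
const bookmark = 1000;
// After zooming, less text fits per page, so the page starts have changed.
const pageAfterZoom = pageForOffset([0, 300, 600, 900, 1200], bookmark);
// pageAfterZoom === 3
```

The same saved offset doubles as a bookmark, since it stays valid no matter how the text is re-paginated.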
