Find and highlight Arabic text with diacritics in UIWebView - javascript

I'm displaying Arabic text with diacritics (tashkeel) in a UIWebView.
I also have a search view.
I want to give the UIWebView a keyword and have it find and highlight every occurrence,
but the search should ignore the diacritics.
The displayed text should keep its diacritics.
Example:
if the text is " الَلَهُم صَلِ عَلى مُحَمّد و آل مُحمد "
and I tell the UIWebView to search for " محمد "
it should highlight "مُحَمّد " and " مُحمد " regardless of all the diacritics.
I can think of two approaches:
1- Do the find and highlight in JavaScript after the UIWebView loads.
2- Edit the text in Objective-C before loading it into the UIWebView.

You first have to strip all the diacritics from the string; then you can compare the text without them. Use a regular expression to remove the characters you don't want. Check out this fiddle I made: inside the strip() function, add every diacritic you need to remove.
Hope this helps.

Note that the regex below also removes sukoon (\u0652); drop that alternative if sukoon should be kept.
I have converted the eight diacritics (fatha, kasrah, damma, shaddah, sukoon and the three tanween forms) into their Unicode escape sequences. Arabic is much easier to manipulate in a text editor when it is written as escape sequences in Latin characters:
<html>
<head><meta charset="UTF-8"></head>
<body>
<script>document.write("حَرْفٌ".replace(/(\u0652)|(\u0650)|(\u064C)|(\u064E)|(\u064B)|(\u064F)|(\u064D)|(\u0651)/g,""));</script>
</body>
</html>
Resources used:
Arabic Keyboard with diacritics
Unicode Code Converter
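Stripping diacritics from the page itself would change what the reader sees, so one option is to strip them from the keyword only and let the search pattern tolerate diacritics in the text. The following is a minimal sketch under that assumption, not a tested UIWebView solution. The names highlight, stripDiacritics and escapeRegExp and the "highlight" CSS class are hypothetical; the range U+064B to U+0652 covers the same eight marks as the regex above.

```javascript
// U+064B..U+0652: the three tanween forms, fatha, damma, kasra, shadda, sukoon.
var DIACRITIC = "[\\u064B-\\u0652]";

// Remove the diacritics from a string (used on the keyword only).
function stripDiacritics(text) {
  return text.replace(new RegExp(DIACRITIC, "g"), "");
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Wrap every diacritic-insensitive occurrence of keyword in a span.
// The matched text is re-emitted unchanged, so it keeps its diacritics.
function highlight(text, keyword) {
  var letters = stripDiacritics(keyword.trim()).split("");
  if (letters.length === 0) {
    return text;
  }
  // Allow any run of diacritics after each letter of the keyword.
  var pattern = letters
    .map(function (ch) { return escapeRegExp(ch) + DIACRITIC + "*"; })
    .join("");
  return text.replace(new RegExp(pattern, "g"), function (match) {
    return '<span class="highlight">' + match + "</span>";
  });
}
```

This is meant to run on plain text or on individual text nodes rather than on raw HTML, so that the pattern cannot match inside tags. In a UIWebView, a script like this could be run after the page loads via stringByEvaluatingJavaScriptFromString:.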

Related

change color of special characters in HTML

I have a Unicode HTML page (saved in a DB). Is there any way I can programmatically change the color of all "." and ":" characters in the text? Please note that my HTML content also has inline CSS, which may contain "." or ":" characters, but I only want to change the color of those characters in the real text.
What are my options? One way would be to find these characters in the text and wrap them in a tag so that they can be styled. Any other suggestions? (If I use this method, how can I distinguish between HTML/CSS characters and real characters in the text?) I'm using ASP.NET/C#.
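Try String.prototype.replace() with the RegExp /\.|:/g, returning an i element whose style attribute sets the color. Note that this runs over innerHTML, so it also touches any "." and ":" inside tags and inline CSS within the div.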
var div = document.getElementsByTagName("div")[0];
div.innerHTML = div.innerHTML.replace(/\.|:/g, function(match) {
  return "<i style=color:tomato;font-weight:bold>" + match + "</i>";
});
<head>
<meta charset="utf-8" />
</head>
<body>
<div>
I've a Unicode HTML page (saved in DB), is there anyway that I can programmatically change color of all "." and ":" characters in text (please pay attention that my HTML content has also inline CSS which may contain "." or ":" characters, but I just want
to change color of the mentioned characters in real text. what are my options? One way can be finding these characters in the text and put them in tag, so that can be styled, any other suggestion? (if I'm going to use this method, how can I distinguish
between HTML/CSS characters and real characters in the text?) I'm using ASP.NET/C#
</div>
</body>
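The question asks how to leave the "." and ":" inside markup and inline CSS alone. One possible way, sketched here as an assumption rather than a robust HTML parser, is to let the regex match whole tags and style/script blocks first and return them unchanged, so that only characters outside tags get wrapped. The function name colorPunctuation is hypothetical, and the pattern will misfire on attribute values that contain a literal ">".

```javascript
// Wrap "." and ":" in a colored <i>, but only where they occur outside tags.
function colorPunctuation(html, color) {
  // Alternatives are tried left to right at each position:
  //   1. a whole <style> or <script> block,
  //   2. any other single tag,
  //   3. a "." or ":" character in text.
  var pattern = /<(style|script)\b[\s\S]*?<\/\1\s*>|<[^>]*>|[.:]/gi;
  return html.replace(pattern, function (match) {
    if (match.charAt(0) === "<") {
      return match; // markup: leave untouched
    }
    return '<i style="color:' + color + ';font-weight:bold">' + match + "</i>";
  });
}
```

For example, colorPunctuation('<p style="color:red;">a.b</p>', "tomato") leaves the style attribute as it is and wraps only the "." between "a" and "b". The same alternation idea should carry over to a server-side regex, although a real HTML parser is the safer tool there.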
This is a simple way to change the color of any character in HTML:
"Special Character"

Regexp to read HTML and match only specific words in text

I have this String:
<body>
<span class="open crack-opener o_open i_opens ng-open" style='open'>Open opens openes "Open opens openes" clopened</span>
</body>
I need to select only the words OPEN or OPENS or OPENES only inside the text. I tried the following RegExp, but it only selects the tags. I need to negate this and select the words.
/(<\/?\w+((\s+\w+(\s*=\s*(?:\".*?"|'.*?'|[^'\">\s]+))?)+\s*|\s*)?>)/ig
How do I negate this match and insert the word open?
Thanks in advance
To begin with: do not use regex to parse HTML. It is not a good idea, because HTML is not a regular language and a regex cannot parse it reliably :)
But back to your question:
var str="<body><span class=\"open crack-opener o_open i_opens ng-open\" style='open'>Open opens openes \"Open opens openes\" clopened</span></body>";
var words=str.match(/(\bopen\b|\bopens\b|\bopenes\b)(?=[^>]*<)/ig);
This matches your words only where they are followed by a run of characters other than > and then a <, which means the word sits in the text between tags rather than inside a tag. The solution is not the best, but you cannot expect regex to do something it was not designed for.
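To make the behavior concrete, here is a small sketch that runs the lookahead approach on the string from the question and compares it with a simpler alternative: remove the tags first, then match the words in what is left. The variable names are illustrative, and the tag-stripping regex is an assumption that only holds for simple markup like this sample.

```javascript
var str = "<body><span class=\"open crack-opener o_open i_opens ng-open\" style='open'>Open opens openes \"Open opens openes\" clopened</span></body>";

// The lookahead version from the answer above (same alternatives, grouped).
var viaLookahead = str.match(/\b(?:open|opens|openes)\b(?=[^>]*<)/ig);

// Alternative: replace every tag with a space, then match the words directly.
var textOnly = str.replace(/<[^>]*>/g, " ");
var viaStripping = textOnly.match(/\b(?:open|opens|openes)\b/ig);

console.log(viaLookahead);
console.log(viaStripping);
```

Both calls log the same six words (Open, opens, openes, twice over). The attribute values and "clopened" are skipped, because \b requires a word boundary and the attribute text is never followed by a < before the closing >.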

Regular Expression find match in text not after word

I'm trying to find all occurrences of the smiley text ":D" that do not come right after "mailto".
The problem is that I'm detecting smileys in htmlText, and sometimes the match is incorrect.
For example, in "mailto:Denis#email.com" the smiley pattern ":D" should not be matched, because it appears inside an email link.
This is the incorrect attempt I'm trying to fix: http://regexr.com/3a3at
Could somebody please help?
JavaScript (what a shame!) didn't support lookbehinds when this was written (they were added in ES2018), so the most realistic option would be something like
input
.replace(/(mailto|https?):/g, "$1\x00")
.replace(/:D/g, "smile")
.replace(/\x00/g, ":")
Basically, replace "unwanted" colons with \x00 (or some other character that is unlikely to appear in the input), process the smileys, and then replace the \x00s back with colons.
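In engines that implement ES2018, the same effect can be had with a negative lookbehind. The sketch below shows both variants side by side; the function names are made up for illustration, and the lookbehind version assumes a runtime with lookbehind support.

```javascript
// Replace ":D" unless it is directly preceded by "mailto", "http" or "https".
// Requires an engine with ES2018 lookbehind support.
function replaceSmileys(input) {
  return input.replace(/(?<!mailto|https?):D/g, "smile");
}

// The placeholder approach from the answer above, for comparison.
function replaceSmileysWithPlaceholder(input) {
  return input
    .replace(/(mailto|https?):/g, "$1\x00") // hide the protected colons
    .replace(/:D/g, "smile")                // replace the remaining smileys
    .replace(/\x00/g, ":");                 // restore the hidden colons
}
```

For an input such as 'Hi :D mailto:Denis#email.com', both functions replace the standalone ":D" and leave the one inside the mailto link alone.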

Invalid location of <script> tag within a HTML <pre> tag

I am going through the example given in JavaScript The Complete Reference 3rd Edition.
The output, as given by the author, can be seen here.
<body>
<h1>Standard Whitespace Handling</h1>
<script>
// STRINGS AND (X)HTML
document.write("Welcome to JavaScript strings.\n");
document.write("This example illustrates nested quotes 'like this.'\n");
document.write("Note how newlines (\\n's) and ");
document.write("escape sequences are used.\n");
document.write("You might wonder, \"Will this nested quoting work?\"");
document.write(" It will.\n");
document.write("Here's an example of some formatted data:\n\n");
document.write("\tCode\tValue\n");
document.write("\t\\n\tnewline\n");
document.write("\t\\\\\tbackslash\n");
document.write("\t\\\"\tdouble quote\n\n");
</script>
<h1>Preserved Whitespace</h1>
<pre>
<script> // in Eclipse IDE, at this line invalid location of tag(script)
// STRINGS AND (X)HTML
document.write("Welcome to JavaScript strings.\n");
document.write("This example illustrates nested quotes 'like this.'\n");
document.write("Note how newlines (\\n's) and ");
document.write("escape sequences are used.\n");
document.write("You might wonder, \"Will this nested quoting work?\"");
document.write(" It will.\n");
document.write("Here's an example of some formatted data:\n\n");
document.write("\tCode\tValue\n");
document.write("\t\\n\tnewline\n");
document.write("\t\\\\\tbackslash\n");
document.write("\t\\\"\tdouble quote\n\n");
</script>
</pre>
</body>
(X)HTML automatically “collapses” multiple whitespace characters down to one whitespace. So, for example, including multiple consecutive tabs in your HTML shows up as only one space character. In this example, the pre tag is used to tell the browser that the
text is preformatted and that it should not collapse the white space inside of it. Similarly, we could use the CSS white-space property to modify standard white space handling. Using pre allows the tabs in the example to be displayed correctly in the output.
So, how do I get rid of this warning, and do I really need to be concerned about it? I think I am missing something, since I doubt the authors are wrong.
There is nothing wrong with having a script inside a pre tag. It is just an Eclipse IDE validation issue. If you open this HTML in a browser, everything works fine and no warnings are displayed.
Also, if you want to show a script tag as text content inside a pre tag, have a look at this question: script in pre
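For the second case, showing a script tag as text inside pre, the usual idea is to escape the markup characters before inserting the source into the page. A minimal sketch, where escapeHtml is a hypothetical helper name:

```javascript
// Escape the characters that would otherwise be parsed as markup.
// "&" must be replaced first so the other entities are not double-escaped.
function escapeHtml(source) {
  return source
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}
```

Writing the result of escapeHtml('<script>...</script>') inside a pre element makes the browser display the tag as text instead of executing it.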

how to display whitespace characters.. but omit when text is selected

Setup:
I'd like to output some text that shows visible spaces, line breaks, etc.
(for debugging strings, or for, say, a rich-text editor),
i.e., I'd like to make the following kinds of substitutions:
" " -> "<span class="whitespace">·</span>"
"\r" -> "<span class="whitespace">\\r</span>"
"\n" -> "<span class="whitespace">\\n</span>"
Perhaps the following CSS rule could be defined:
/*display whitespace chars as a light grey*/
.whitespace { color:#CCC; }
so that
this two lined
string
would be displayed as
this·two·lined\n
\t string
The Question:
Is it possible to make it so that when the above "visual-whitespace" text is selected or copied to the clipboard, it is copied without the whitespace markup?
Is there some CSS property to display x, but copy y?
javascript hack?
special whitespace-font?
other?
<style>.paragraph-marker:after { content: "\B6" }</style>
<p>Foo<span class="paragraph-marker"></span></p>
<p>Bar<span class="paragraph-marker"></span></p>
The :after selector targets a pseudo-element: a generated node that is rendered as the last child of the matched element, after its content.
The content property specifies the text of such pseudo-nodes. It comes in handy for quotation marks before and after quoted sections, or for list separators such as commas in a semantic HTML <ol> that you don't want displayed as a bulleted list.
It should suit your use case, since browsers don't include pseudo-nodes when they convert a DOM selection to plain text for the clipboard.
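Applied to the substitutions in the question, the pseudo-element idea could look like the sketch below: an empty marker span is inserted before each whitespace character, the real character stays in place, and a stylesheet draws the visible symbol through :after. The function name markWhitespace and the ws-* class names are assumptions made for this example, and the copy behavior should be checked in the browsers you target.

```javascript
// Assumed companion CSS (class names are made up for this sketch):
//   .ws:after       { color: #CCC; }
//   .ws-space:after { content: "\B7"; }   /* middle dot */
//   .ws-tab:after   { content: "\\t"; }
//   .ws-cr:after    { content: "\\r"; }
//   .ws-lf:after    { content: "\\n"; }

var WS_CLASSES = { " ": "ws-space", "\t": "ws-tab", "\r": "ws-cr", "\n": "ws-lf" };
var HTML_ESCAPES = { "&": "&amp;", "<": "&lt;", ">": "&gt;" };

// Escape markup characters and put an empty marker span before each
// whitespace character. The whitespace itself is kept, so copied text
// still contains the original characters.
function markWhitespace(text) {
  return text.replace(/[&<> \t\r\n]/g, function (ch) {
    if (HTML_ESCAPES[ch]) {
      return HTML_ESCAPES[ch];
    }
    return '<span class="ws ' + WS_CLASSES[ch] + '"></span>' + ch;
  });
}
```

The output is meant to be placed in an element styled with white-space: pre, so that the real whitespace characters keep their layout next to the markers.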
http://codepen.io/msvbg/pen/ebgrj
Works fine in the latest version of Chrome. Flip the showWhitespace variable to try it both ways. It works by sticking a visible whitespace layer underneath the text layer, and only the top-most layer is copied by default.
