How can I convert a string into a unicode character? - javascript

In JavaScript, '\uXXXX' returns a Unicode character. But how can I get that Unicode character when the XXXX part is a variable?
For example:
var input = '2122';
console.log('\\u' + input); // returns a string: "\u2122"
console.log(new String('\\u' + input)); // returns a string: "\u2122"
The only way I can think of to make this work is to use eval; I hope there's a better solution:
var input = '2122';
var char = '\\u' + input;
console.log(eval("'" + char + "'")); // returns a character: "™"

Use String.fromCharCode() like this: String.fromCharCode(parseInt(input, 16)). When you put a Unicode value in a string using \u, it is interpreted as a hexadecimal value, so you need to specify the base (16) when using parseInt.
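Applied to the original example, that looks like:
var input = '2122';
console.log(String.fromCharCode(parseInt(input, 16))); // "™"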

String.fromCharCode("0x" + input)
or
String.fromCharCode(parseInt(input, 16)), as they are 16-bit numbers (UTF-16)

JavaScript uses UCS-2 internally.
Thus, String.fromCharCode(codePoint) won’t work for supplementary Unicode characters, for example when codePoint is 119558 (0x1D306, the '𝌆' character).
If you want to create a string based on a non-BMP Unicode code point, you could use Punycode.js’s utility functions to convert between UCS-2 strings and UTF-16 code points:
// `String.fromCharCode` replacement that doesn’t make you enter the surrogate halves separately
punycode.ucs2.encode([0x1d306]); // '𝌆'
punycode.ucs2.encode([119558]); // '𝌆'
punycode.ucs2.encode([97, 98, 99]); // 'abc'
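For reference, here is a minimal sketch of the surrogate-pair math such a conversion performs for a single astral code point (an illustration only, not Punycode.js’s actual implementation):
function encodeAstral(codePoint) {
  // assumes codePoint is above 0xFFFF (outside the BMP)
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10);  // high surrogate
  var low = 0xDC00 + (offset & 0x3FF); // low surrogate
  return String.fromCharCode(high, low);
}
encodeAstral(0x1D306); // '𝌆'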

Since ES6 (ES2015) you can use
String.fromCodePoint(number)
to get Unicode values bigger than 0xFFFF.
So, in any modern browser, you can write it this way:
var input = '2122';
console.log(String.fromCodePoint(input));
or if it is a hex number:
var input = '2122';
console.log(String.fromCodePoint(parseInt(input, 16)));
More info:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/fromCodePoint
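To see how this compares with the earlier String.fromCharCode answers on an astral code point, note that fromCharCode only takes UTF-16 code units, so it needs the surrogate halves:
console.log(String.fromCodePoint(0x1D306));       // '𝌆'
console.log(String.fromCharCode(0xD834, 0xDF06)); // '𝌆' (surrogate halves)
console.log(String.fromCharCode(0x1D306));        // not '𝌆' (the value is truncated to 16 bits)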
Edit (2021):
fromCodePoint is not just used for bigger numbers, but also to combine Unicode emojis.
For example, to draw a waving hand, you have to write:
String.fromCodePoint(0x1F44B);
But if you want a waving hand with a skin tone, you have to combine it with a skin tone modifier:
String.fromCodePoint(0x1F44B, 0x1F3FC);
You can even combine two emoji to create a new one, for example a heart and a fire to form a burning heart:
String.fromCodePoint(0x2764, 0xFE0F, 0x200D, 0x1F525);
32-bit number:
<script>
document.write(String.fromCodePoint(0x1F44B));
</script>
<br>
32-bit number + skin:
<script>
document.write(String.fromCodePoint(0x1F44B, 0x1F3FE));
</script>
<br>
32-bit number + another emoji:
<script>
document.write(String.fromCodePoint(0x2764, 0xFE0F, 0x200D, 0x1F525));
</script>
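If you need to go the other way and inspect which code points make up one of these combined emoji, a short sketch using the string iterator (which steps through code points rather than UTF-16 code units):
var burningHeart = String.fromCodePoint(0x2764, 0xFE0F, 0x200D, 0x1F525);
for (var symbol of burningHeart) {
  console.log(symbol.codePointAt(0).toString(16)); // 2764, fe0f, 200d, 1f525
}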

var hex = '2122';
var char = unescape('%u' + hex);
console.log(char);
will return "™"
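Note that unescape() is deprecated and may not be available in all environments; the same result can be obtained with String.fromCharCode, as in the other answers:
var hex = '2122';
console.log(String.fromCharCode(parseInt(hex, 16))); // "™"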

How to get unicode name from a string character in JavaScript [duplicate]

I have the following:
function showUnicode()
{
    var text = prompt( 'Enter the wanted text', 'Unicode' ),
        unicode = 0,
        ntext,
        temp,
        i = 0
    ;
    // got the text now transform it in unicode
    for(i; i < text.length; i++)
    {
        unicode += text.charCodeAt(i)
    }
    // now do an alert
    alert( 'Here is the unicode:\n' + unicode + '\nof:\n' + text )
}
Thanks for the idea to initialize unicode, but now the unicode variable gets the Unicode of the last character. Why does it?
JavaScript uses UCS-2 internally.
This means that supplementary Unicode symbols are exposed as two separate code units (the surrogate halves). For example, '𝌆'.length == 2, even though it’s only one Unicode character.
Because of this, if you want to get the Unicode code point for every character in a string, you’ll need to convert the UCS-2 string into an array of UTF-16 code points (where each surrogate pair forms a single code point). You could use Punycode.js’s utility functions for this:
punycode.ucs2.decode('abc'); // [97, 98, 99]
punycode.ucs2.decode('𝌆'); // [119558]
You should initialize the unicode variable to something, or you're adding the char codes to undefined, which gives NaN (Not a Number). You need to initialize unicode as a number:
var unicode = 0
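If the goal is to report a code for every character rather than a single running total, a minimal sketch (keeping the same prompt/alert flow as the question) could collect the codes into an array instead:
function showUnicode() {
    var text = prompt('Enter the wanted text', 'Unicode');
    var codes = [];
    for (var i = 0; i < text.length; i++) {
        codes.push(text.charCodeAt(i)); // one UTF-16 code unit per character
    }
    alert('Here are the character codes:\n' + codes.join(', ') + '\nof:\n' + text);
}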

Converting unicode to currency symbol in javascript

I am working with currency symbols in Appcelerator, building apps for Android and iOS. I want to make many parameters dynamic, so I pass this value (u20b9) to the app through an API. I can't pass the value as \u20b9 for certain reasons, so I pass it without the backslash.
When I use below code it works proper:-
var unicode = '\u20b9';
alert(unicode);
Output:- ₹
When I use below code:-
var unicode = '\\'+'u20b9';
alert(unicode);
Output:- \u20b9
Because of this, instead of ₹ it prints \u20b9 everywhere, which I don't want.
Thanks in advance.
The following works for me:
console.log(String.fromCharCode(0x20aa)); // ₪ - Israeli Shekel
console.log(String.fromCharCode(0x24)); // $ - US Dollar
console.log(String.fromCharCode(0x20b9)); // ₹ - Indian Rupee
alert(String.fromCharCode(0x20aa) + "\n" + String.fromCharCode(0x24) + "\n" + String.fromCharCode(0x20b9));
As far as I understand, you need to pass string values of Unicode characters via the API. Obviously you can't use the string code without the slash, because that makes it an invalid Unicode escape, and if you include the slash the value gets converted to the character before you can send it. So what you can do here is pass the string without the slash and the 'u' character, and then parse the remaining characters as hexadecimal.
See following code snippet:
// this won't work as you have included 'u' which is not a hexadecimal character
var unicode = 'u20b9';
String.fromCharCode(parseInt(unicode, 16));
// It WORKS! as the string now has only hexadecimal characters
var unicode = '20b9';
String.fromCharCode( parseInt(unicode, 16) ); // prints rupee character by using 16 as parsing format which is hexadecimal
I hope this solves your query!
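If the API might send the value either with or without the leading 'u' (for example 'u20b9' or '20b9'), a small sketch of a helper that strips the prefix before parsing (the function name here is just illustrative):
function symbolFromHex(code) {
    // drop a leading 'u' if present, then parse the rest as hexadecimal
    return String.fromCharCode(parseInt(code.replace(/^u/i, ''), 16));
}
symbolFromHex('u20b9'); // '₹'
symbolFromHex('20b9');  // '₹'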

replacing a unicode character

I can't figure out how to remove a Unicode character using .replace(...) here's what I've tried
$(elem).click(function () {
    var display = $("." + target).css('display');
    var lastChar = display == 'none' ? 'a' : '8';
    $("." + target).slideToggle(500);
    alert('index: ' + $(this).text().indexOf('&#x21c8;'));
    $(this).html($(this).html().replace('&#x21c8;','').replace('&#x21ca;','') + ' &#x21c' + lastChar + ';');
});
It is adding my double arrow up and down, but indexOf is always -1, and my replace calls are not removing the Unicode character. I'm looking at this now and thinking that if I start off with one of these Unicode chars I could just replace it with the other one... If I could get replace to work at all ;-)
What am I doing wrong? Thanks!
HTML entities get converted to actual unicode characters. Example:
document.body.innerHTML = "&#x21c8;";
console.log(document.body.innerHTML === "\u21c8"); // true
// Instead of a unicode escape sequence, you can write the
// actual unicode character. This is safe as long as you
// specify the correct encoding for your JavaScript files:
console.log(document.body.innerHTML === "⇈"); // true
So when you read the innerHTML via $(this).html() or the textContent via $(this).text(), you need to look for the actual unicode character given by its unicode escape sequence "\u21c8" or directly "⇈", and not its entity "&#x21c8;".
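Applied to the click handler from the question, a sketch of the replace calls using the characters (or their escape sequences) instead of the entities, with the rest of the handler unchanged:
$(this).html($(this).html().replace('\u21c8', '').replace('\u21ca', '') + ' &#x21c' + lastChar + ';');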
Try using String.fromCharCode()
The static String.fromCharCode() method returns a string created by
using the specified sequence of Unicode values.
Syntax
String.fromCharCode(num1[, ...[, numN]]);
Examples
myString.replace(String.fromCharCode(8648), ''); // for ⇈
myString.replace(String.fromCharCode(8650), ''); // for ⇊
myString.indexOf(String.fromCharCode(84, 69, 83, 84)); // "TEST"
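Keep in mind that .replace() with a string pattern only replaces the first occurrence; if the arrow can appear more than once, use a regular expression with the g flag instead, for example:
myString.replace(/\u21c8/g, ''); // removes every ⇈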

Converting ## to number regex

I am holding in a field the validation format that I would need.
I need to convert different ## into a regex validation.
Is there a simple replace that can do this for me.
For example, I need to validate the account number.
Sometimes it might need to be ###-###, or I'll get ####### or ##-####,
depending on what is in the id="validationrule" field.
I'm looking for
regex = $('#validationrule').replace("#", "[0/9]");
It also has to take into consideration that sometimes there is a dash in there.
Your question seems to be about creating regexes from a string variable (which you get from an input field that specifies the validation format).
"###-###" might turn into /^\d{3}\-\d{3}$/
"#######" might turn into /^\d{7}$/
If your validation format is built from the 2 characters # and -, this would work:
function createValidationRegEx(format){
    format = format
        .replace(/[^#\-]/g, '') // remove other chars
        .replace(/#/g, '\\d')   // convert # to \d
        .replace(/\-/g, '\\-'); // convert - to \-
    return new RegExp('^' + format + '$', 'g');
}
//create regexes
var format1 = createValidationRegEx('###-###');
var format2 = createValidationRegEx('#######');
//test regexes
console.log(format1.test('123-456')); // true
console.log(format2.test('123-456')); // false
console.log(format1.test('1234567')); // false
console.log(format2.test('1234567')); // true
Please note that you need to pay attention to which characters needs to be escaped when creating regexes from strings. This answer provides more details about how to solve this more generally, if you want to build more complex solutions.
If you are trying to replace the .value of an <input> element, you can use .val(function), return the replacement string from .replace() inside the function, and chain .val() to assign the result to regex. Use the RegExp constructor with the g flag as the pattern passed to .replace() if you need to replace every # in the string.
var regex = $("#validationrule").val(function(_, val) {
return val.replace("#", "[0/9]");
}).val();
console.log(regex);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js">
</script>
<input id="validationrule" value="#">

What is an easy way to call Asc() and Chr() in JavaScript for Unicode values?

I am not that familiar with Javascript, and am looking for the function that returns the UNICODE value of a character, and given the UNICODE value, returns the string equivalent. I'm sure there is something simple, but I don't see it.
Example:
ASC("A") = 65
CHR(65) = "A"
ASC("ਔ") = 2580
CHR(2580) = "ਔ"
Have a look at:
String.fromCharCode(64)
and
String.charCodeAt(0)
The first must be called on the String class (literally String.fromCharCode...) and will return "@" (for 64). The second should be run on a String instance (e.g., "@@@".charCodeAt...) and returns the Unicode code of the first character (the '0' is a position within the string; you can get the codes for other characters in the string by changing it to another number).
The script snippet:
document.write("Unicode for character ਔ is: " + "ਔ".charCodeAt(0) + "<br />");
document.write("Character 2580 is " + String.fromCharCode(2580) + "<br />");
gives:
Unicode for character ਔ is: 2580
Character 2580 is ਔ
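If you want Asc()/Chr() spellings like in the question, a small sketch of wrapper functions (the names are just illustrative, they are not built-ins), valid for BMP characters:
function Asc(character) {
    return character.charCodeAt(0);
}
function Chr(code) {
    return String.fromCharCode(code);
}
Asc('A');   // 65
Chr(65);    // "A"
Asc('ਔ');  // 2580
Chr(2580);  // "ਔ"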
Because JavaScript uses UCS-2 internally, String.fromCharCode(codePoint) won’t work for supplementary Unicode characters, for example when codePoint is 119558 (0x1D306, the '𝌆' character).
If you want to create a string based on a non-BMP Unicode code point, you could use Punycode.js’s utility functions to convert between UCS-2 strings and UTF-16 code points:
// `String.fromCharCode` replacement that doesn’t make you enter the surrogate halves separately
punycode.ucs2.encode([0x1d306]); // '𝌆'
punycode.ucs2.encode([119558]); // '𝌆'
punycode.ucs2.encode([97, 98, 99]); // 'abc'
If you want to get the Unicode code point for every character in a string, you’ll need to convert the UCS-2 string into an array of UTF-16 code points (where each surrogate pair forms a single code point). You could use Punycode.js’s utility functions for this:
punycode.ucs2.decode('abc'); // [97, 98, 99]
punycode.ucs2.decode('𝌆'); // [119558]
Example of generating an alphabet array:
const arr = [];
for (let i = 0; i < 26; i++) {
    arr.push(String.fromCharCode('A'.charCodeAt(0) + i)); // 'A' through 'Z'
}
