Building JavaScript Array in C#, apostrophes changing


I have done this to build JavaScript Arrays from int, double and string lists.
public string listToJsArray<T>(List<T> cslist)
{
bool numeric = true;
if(
!(typeof(T)==typeof(int)
|| typeof(T) == typeof(string)
|| typeof(T) == typeof(double))
)
{
throw (new ArgumentException(message: "Only int, double and string are supported"));
}
if(typeof(T)==typeof(string))
{
numeric = false;
}
string JsArray = "[";
for(int i=0;i<cslist.Count;i++)
{
string dataWithSurrendings = cslist[i].ToString();
if(!numeric)
{
dataWithSurrendings = "'" + cslist[i].ToString() + "'";
}
if(i !=0)
{
dataWithSurrendings = "," + dataWithSurrendings;
}
if(i +1==cslist.Count)
{
dataWithSurrendings = dataWithSurrendings + "]";
}
JsArray += dataWithSurrendings;
}
return JsArray;
}
My problem is that when a list of strings is passed, the apostrophes turn into &#39;.
For example, a list of {"1","2","3","4","5","6","7"} becomes this:
[&#39;1&#39;,&#39;2&#39;,&#39;3&#39;,&#39;4&#39;,&#39;5&#39;,&#39;6&#39;,&#39;7&#39;]
What modification is needed in this function, to return a correct array in JavaScript?
None of the solutions solved the problem. With JsonConvert I get almost the same result. The problem is that single and double quotes in the view editor do not have the same encoding as in a C# string.

I'm assuming that you are doing this to drop into a webpage somewhere, something like:
<script>
@{
var output = listToJsArray(Model.SomeList);
}
var myArray = @Html.Raw(output);
// some Javascript using that array
</script>
Don't waste your time trying to do it yourself. It's a pain and you are reinventing the wheel. JSON is valid Javascript, and a serialization of an array into JSON is identical to a Javascript array literal. So use a JSON serializer. JSON.Net is really useful here:
<script>
@{
var output = Newtonsoft.Json.JsonConvert.SerializeObject(Model.SomeList);
}
var myArray = @Html.Raw(output);
// some Javascript using that array
</script>
The serializer will handle all the annoying escaping, special characters and edge cases for you.
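To illustrate why hand-built quoting is fragile, here is a minimal JavaScript sketch (not the C# from the question). The helper name naiveJsArray is hypothetical; it only mimics the wrap-in-apostrophes approach so it can be compared with a JSON serializer.

```javascript
// Hypothetical helper mirroring the hand-rolled approach:
// wrap every item in apostrophes and join with commas.
function naiveJsArray(items) {
  return "[" + items.map(function (s) { return "'" + s + "'"; }).join(",") + "]";
}

var items = ["plain", "it's", 'say "hi"'];

// The apostrophe inside "it's" ends the literal early,
// so this text is not a valid JavaScript array literal.
var naive = naiveJsArray(items);

// A JSON serializer escapes quotes, backslashes and control characters.
var json = JSON.stringify(items);

console.log(naive);
console.log(json);
```

The serialized text can be parsed back into the original values, which the hand-built string cannot guarantee.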

Related

string containing equals "=" and no quotes convert to JSON

I'm currently stuck trying to convert a string into JSON in JavaScript.
The string I'm getting from the server is:
"{knee=true, centered=true}"
The outcome I'm looking for is something like this:
{ knee: true, centered: true}
But since the string uses equals signs and the quotes are missing, JSON.parse isn't working, and I don't know how to solve this. Any help will be appreciated, thank you!

The best I could do was this. It seems to work, although the values in the resulting object are strings rather than booleans. (This one challenged me, so I had to do it.) :-)
let str = "{knee = true, centered = true}";
str = str.replaceAll('{', '')
str = str.replaceAll('}', '')
str = str.split(",")
str = Object.assign({}, str);
let key_value;
let key;
let val;
for (var i = 0; i < Object.keys(str).length; i++) {
key_value = str[i].split("=");
key = key_value[0].trim(); // trim so the key is "knee", not "knee "
val = key_value[1].trim(); // still a string ("true"), not a boolean
str[i] = val;
delete Object.assign(str, {[key]: str[i]
})[i];
}
console.log(str)

Assuming you don't have nested things or strings with commas or brackets in them, you could replace all { with {", = with ":, and , with , ":
const str = "{knee=true, centered=true}"
console.log(
JSON.parse(str.split('{').join('{"').split('=').join('":').split(', ').join(', "'))
)

Without more specifics it's impossible to verify how correct this is, but if I were to make some assumptions:
An object is a set of key/value pairs surrounded by { and }
Key/value pairs are separated by ,
Any arbitrary whitespace is allowed around key/value pairs
A key and value are separated by a =
Values can only hold the value true or false which should be translated to a JavaScript boolean
...then parsing can be done through some regular expressions and string manipulations.
const objectRegExp = /^\{(.*)}$/;
function parseNJson(str) { // notJSON
const match = objectRegExp.exec(str);
if (!match) {
throw new Error('This is not NJson');
}
const [, keyValuesBlock] = match;
const keyValueStatements = keyValuesBlock.split(',');
const keyValues = keyValueStatements.map(statement => statement.split('='));
return keyValues.reduce((result, [keyStr, valueStr]) => {
const key = keyStr.trim();
const trimmedValue = valueStr.trim();
let value;
if (trimmedValue === 'true') {
value = true;
} else if (trimmedValue === 'false') {
value = false;
} else {
throw new Error(`Unsupported value ${trimmedValue}`);
}
return Object.assign(result, { [key]: value });
}, {});
}
This will easily fall apart if any assumptions were incorrect, like "what if values can be strings? What if strings can be quoted with double quotes? What if they can also be surrounded by single quotes? What if numbers are supported? What if hexadecimal numbers are supported?"
If the data being sent on the server is a standard format, they should be able to tell you "this was formatted as X" so you can find a spec-compliant X parser. Or you could insist data is sent as JSON instead, since that's a super common exchange format. The best thing is that the server and client are using a common, well-defined message formatting spec so you don't accidentally break things whenever receiving or sending data that has characteristics you didn't account for.

Concatenation of string in javascript

I'm looking to add to a string's value based on the output for multiple if statements but I don't seem to be having much success. I've declared comp_string="" at the beginning of the script then tried += so that for each condition that is true it adds a section on.
For the code example below if I submitted the value of www.facebook.com and www.twitter.com I would like comp_string to return 'fb=www.facebook.com&tw=www.twitter.com'
How would I go about concatenating the strings, and how do I add the & if more than one link is provided? I could append it to each string for any value that's not blank, but would a trailing & at the end of the URL with nothing following it mess things up?
if (facebook_url != "") {
comp_string += "fb="+facebook_url;
}
if (twitter_url != "") {
comp_string += "tw="+twitter_url;
}
alert(comp_string);

A simple approach would be to add each string to an array, then join the array elements to produce the end result you are looking for.
var params = [];
if (facebook_url !== "") {
params.push("fb=" + facebook_url);
}
if (twitter_url !== "") {
params.push("tw=" + twitter_url);
}
alert(params.join("&"));
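If the values can contain characters such as & or =, it may be worth encoding them before joining. The sketch below wraps the same array-and-join idea in a function; the name buildQuery and its parameters are hypothetical, and encodeURIComponent is an addition that the original snippet does not use.

```javascript
// Hypothetical helper: builds "fb=...&tw=..." from whichever values are non-empty.
function buildQuery(facebookUrl, twitterUrl) {
  var params = [];
  if (facebookUrl !== "") {
    // encodeURIComponent escapes &, = and spaces inside the value
    params.push("fb=" + encodeURIComponent(facebookUrl));
  }
  if (twitterUrl !== "") {
    params.push("tw=" + encodeURIComponent(twitterUrl));
  }
  // join only inserts "&" between items, so there is never a trailing "&"
  return params.join("&");
}

console.log(buildQuery("www.facebook.com", "www.twitter.com")); // fb=www.facebook.com&tw=www.twitter.com
console.log(buildQuery("", "www.twitter.com"));                 // tw=www.twitter.com
```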

javascript and string manipulation w/ utf-16 surrogate pairs

I'm working on a twitter app and just stumbled into the world of utf-8(16). It seems the majority of javascript string functions are as blind to surrogate pairs as I was. I've got to recode some stuff to make it wide character aware.
I've got this function to parse strings into arrays while preserving the surrogate pairs. Then I'll recode several functions to deal with the arrays rather than strings.
function sortSurrogates(str){
var cp = []; // array to hold code points
while(str.length){ // loop till we've done the whole string
if(/[\uD800-\uDFFF]/.test(str.substr(0,1))){ // test the first character
// High surrogate found low surrogate follows
cp.push(str.substr(0,2)); // push the two onto array
str = str.substr(2); // clip the two off the string
}else{ // else BMP code point
cp.push(str.substr(0,1)); // push one onto array
str = str.substr(1); // clip one from string
}
} // loop
return cp; // return the array
}
My question is, is there something simpler I'm missing? I see so many people reiterating that javascript deals with utf-16 natively, yet my testing leads me to believe, that may be the data format, but the functions don't know it yet. Am I missing something simple?
EDIT:
To help illustrate the issue:
var a = "0123456789"; // U+0030 - U+0039 2 bytes each
var b = "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"; // U+1D7D8 - U+1D7E1 4 bytes each
alert(a.length); // javascript shows 10
alert(b.length); // javascript shows 20
Twitter sees and counts both of those as being 10 characters long.

Javascript uses UCS-2 internally, which is not UTF-16. It is very difficult to handle Unicode in Javascript because of this, and I do not suggest attempting to do so.
As for what Twitter does, you seem to be saying that it is sanely counting by code point not insanely by code unit.
Unless you have no choice, you should use a programming language that actually supports Unicode, and which has a code-point interface, not a code-unit interface. Javascript isn't good enough for that as you have discovered.
It has The UCS-2 Curse, which is even worse than The UTF-16 Curse, which is already bad enough. I talk about all this in my OSCON talk, 🔫 Unicode Support Shootout: 👍 The Good, the Bad, & the (mostly) Ugly 👎.
Due to its horrible Curse, you have to hand-simulate UTF-16 with UCS-2 in Javascript, which is simply nuts.
Javascript suffers from all kinds of other terrible Unicode troubles, too. It has no support for graphemes or normalization or collation, all of which you really need. And its regexes are broken, sometimes due to the Curse, sometimes just because people got it wrong. For example, Javascript is incapable of expressing regexes like [𝒜-𝒵]. Javascript doesn’t even support casefolding, so you can’t write a pattern like /ΣΤΙΓΜΑΣ/i and have it correctly match στιγμας.
You can try to use the XRegExp plugin, but you won’t banish the Curse that way. Only changing to a language with Unicode support will do that, and 𝒥𝒶𝓋𝒶𝓈𝒸𝓇𝒾𝓅𝓉 just isn’t one of those.

I've knocked together the starting point for a Unicode string handling object. It creates a function called UnicodeString() that accepts either a JavaScript string or an array of integers representing Unicode code points and provides length and codePoints properties and toString() and slice() methods. Adding regular expression support would be very complicated, but things like indexOf() and split() (without regex support) should be pretty easy to implement.
var UnicodeString = (function() {
function surrogatePairToCodePoint(charCode1, charCode2) {
return ((charCode1 & 0x3FF) << 10) + (charCode2 & 0x3FF) + 0x10000;
}
function stringToCodePointArray(str) {
var codePoints = [], i = 0, charCode;
while (i < str.length) {
charCode = str.charCodeAt(i);
if ((charCode & 0xF800) == 0xD800) {
codePoints.push(surrogatePairToCodePoint(charCode, str.charCodeAt(++i)));
} else {
codePoints.push(charCode);
}
++i;
}
return codePoints;
}
function codePointArrayToString(codePoints) {
var stringParts = [];
for (var i = 0, len = codePoints.length, codePoint, offset, codePointCharCodes; i < len; ++i) {
codePoint = codePoints[i];
if (codePoint > 0xFFFF) {
offset = codePoint - 0x10000;
codePointCharCodes = [0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)];
} else {
codePointCharCodes = [codePoint];
}
stringParts.push(String.fromCharCode.apply(String, codePointCharCodes));
}
return stringParts.join("");
}
function UnicodeString(arg) {
if (this instanceof UnicodeString) {
this.codePoints = (typeof arg == "string") ? stringToCodePointArray(arg) : arg;
this.length = this.codePoints.length;
} else {
return new UnicodeString(arg);
}
}
UnicodeString.prototype = {
slice: function(start, end) {
return new UnicodeString(this.codePoints.slice(start, end));
},
toString: function() {
return codePointArrayToString(this.codePoints);
}
};
return UnicodeString;
})();
var ustr = UnicodeString("f𝌆𝌆bar");
document.getElementById("output").textContent = "String: '" + ustr + "', length: " + ustr.length + ", slice(2, 4): " + ustr.slice(2, 4);
<div id="output"></div>

Here are a couple scripts that might be helpful when dealing with surrogate pairs in JavaScript:
ES6 Unicode shims for ES3+ adds the String.fromCodePoint and String.prototype.codePointAt methods from ECMAScript 6. The ES3/5 fromCharCode and charCodeAt methods do not account for surrogate pairs and therefore give wrong results.
Full 21-bit Unicode code point matching in XRegExp with \u{10FFFF} allows matching any individual code point in XRegExp regexes.

Javascript string iterators (ES2015 and later) can give you the actual characters instead of the individual surrogate code units:
>>> [..."0123456789"]
["0", "1", "2", "3", "4", "5", "6", "7", "8", "9"]
>>> [..."𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"]
["𝟘", "𝟙", "𝟚", "𝟛", "𝟜", "𝟝", "𝟞", "𝟟", "𝟠", "𝟡"]
>>> [..."0123456789"].length
10
>>> [..."𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡"].length
10
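Building on the iterator behaviour shown above, a code-point-aware slice can be sketched in a few lines. This assumes an ES2015 environment; the helper name codePointSlice is hypothetical and is not part of any answer here.

```javascript
// Hypothetical helper: slice by code points instead of UTF-16 code units.
function codePointSlice(str, start, end) {
  // Array.from uses the string iterator, so each surrogate pair stays together
  return Array.from(str).slice(start, end).join("");
}

var digits = "𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡";

console.log(digits.length);                      // 20 (code units)
console.log(Array.from(digits).length);          // 10 (code points)
console.log(codePointSlice(digits, 4, 9));       // 𝟜𝟝𝟞𝟟𝟠
console.log(digits.codePointAt(0).toString(16)); // 1d7d8
```

Note that this still counts code points, not graphemes, so combining sequences are split into their parts.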

This is along the lines of what I was looking for. It needs better support for the different string functions. As I add to it I will update this answer.
function wString(str){
var T = this; //makes 'this' visible in functions
T.cp = []; //code point array
T.length = 0; //length attribute
T.wString = true; // (item.wString) tests for wString object
//member functions
var sortSurrogates = function(s){ //returns array of characters, keeping surrogate pairs together (var avoids an implicit global)
var chrs = [];
while(s.length){ // loop till we've done the whole string
if(/[\uD800-\uDFFF]/.test(s.substr(0,1))){ // test the first character
// High surrogate found low surrogate follows
chrs.push(s.substr(0,2)); // push the two onto array
s = s.substr(2); // clip the two off the string
}else{ // else BMP code point
chrs.push(s.substr(0,1)); // push one onto array
s = s.substr(1); // clip one from string
}
} // loop
return chrs;
};
//end member functions
//prototype functions
T.substr = function(start,len){
if(len){
return T.cp.slice(start,start+len).join('');
}else{
return T.cp.slice(start).join('');
}
};
T.substring = function(start,end){
return T.cp.slice(start,end).join('');
};
T.replace = function(target,str){
//allow wStrings as parameters
if(str.wString) str = str.cp.join('');
if(target.wString) target = target.cp.join('');
return T.toString().replace(target,str);
};
T.equals = function(s){
if(!s.wString){
s = sortSurrogates(s);
T.cp = s;
}else{
T.cp = s.cp;
}
T.length = T.cp.length;
};
T.toString = function(){return T.cp.join('');};
//end prototype functions
T.equals(str)
};
Test results:
// plain string
var x = "0123456789";
alert(x); // 0123456789
alert(x.substr(4,5)) // 45678
alert(x.substring(2,4)) // 23
alert(x.replace("456","x")); // 0123x789
alert(x.length); // 10
// wString object
x = new wString("𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡");
alert(x); // 𝟘𝟙𝟚𝟛𝟜𝟝𝟞𝟟𝟠𝟡
alert(x.substr(4,5)) // 𝟜𝟝𝟞𝟟𝟠
alert(x.substring(2,4)) // 𝟚𝟛
alert(x.replace("𝟜𝟝𝟞","x")); // 𝟘𝟙𝟚𝟛x𝟟𝟠𝟡
alert(x.length); // 10

How do I get the unicode/hex representation of a symbol out of the HTML using JavaScript/jQuery?

Say I have an element like this...
<math xmlns="http://www.w3.org/1998/Math/MathML">
<mo class="symbol">α</mo>
</math>
Is there a way to get the unicode/hex value of alpha α, &#x03B1, using JavaScript/jQuery? Something like...
$('.symbol').text().unicode(); // I know unicode() doesn't exist
$('.symbol').text().hex(); // I know hex() doesn't exist
I need &#x03B1 instead of α and it seems like anytime I insert &#x03B1 into the DOM and try to retrieve it right away, it gets rendered and I can't get &#x03B1 back; I just get α.

Using mostly plain JavaScript, you should be able to do:
function entityForSymbolInContainer(selector) {
var code = $(selector).text().charCodeAt(0);
var codeHex = code.toString(16).toUpperCase();
while (codeHex.length < 4) {
codeHex = "0" + codeHex;
}
return "&#x" + codeHex + ";";
}
Here's an example: http://jsfiddle.net/btWur/

charCodeAt returns the numeric (UTF-16) code of the character at the given index:
"α".charCodeAt(0); //returns 945
0x03b1 === 945; //returns true
toString(16) then converts that number to a hex string:
(945).toString(16); // returns "3b1"
(Confirmed to work in IE9 and Chrome)

If you try to convert a Unicode character outside the BMP (Basic Multilingual Plane) using the approaches above, you are in for a nasty surprise. Characters outside the BMP are encoded as two UTF-16 code units (a surrogate pair). For example:
"🔒".length === 2 (one part for the shackle, one part for the lock base :) )
So "🔒".charCodeAt(0) gives you 55357, which is only half of the character, while "🔒".charCodeAt(1) gives you 56594, which is the other half.
To get the full code point for such characters, you might want to use the following string extension function:
String.prototype.charCodeUTF32 = function(){
return ((((this.charCodeAt(0)-0xD800)*0x400) + (this.charCodeAt(1)-0xDC00) + 0x10000));
};
You can also use it like this
"&#x"+("🔒".charCodeUTF32()).toString(16)+";"
to get HTML hex codes.
Hope this saves you some time.
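In environments that support ES2015, the built-in String.prototype.codePointAt does the surrogate arithmetic for you. The following is a minimal sketch under that assumption; the helper name toHexEntity is hypothetical.

```javascript
// Hypothetical helper: hex character reference for the first code point of a string.
function toHexEntity(str) {
  // codePointAt reads a full surrogate pair when one starts at index 0
  var codePoint = str.codePointAt(0);
  var hex = codePoint.toString(16).toUpperCase();
  // pad to at least four hex digits, as in the earlier answer
  while (hex.length < 4) {
    hex = "0" + hex;
  }
  return "&#x" + hex + ";";
}

console.log(toHexEntity("α"));  // &#x03B1;
console.log(toHexEntity("🔒")); // &#x1F512;
```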

For example, suppose you need to convert this hex string (UTF-8 bytes) to Unicode text:
e68891e4bda0e4bb96
Take the hex digits two at a time.
If the decimal value of a byte is over 127, prefix the pair with a %.
Then return the URL-decoded string.
function hex2a(hex) {
var str = '';
for (var i = 0; i < hex.length; i += 2){
var dec = parseInt(hex.substr(i, 2), 16);
var character = String.fromCharCode(dec); // declared with var to avoid an implicit global
if (dec > 127)
character = "%"+hex.substr(i,2);
str += character;
}
return decodeURI(str);
}

Detect difference between & and %26 in location.hash

Analyzing the location.hash with this simple javascript code:
<script type="text/javascript">alert(location.hash);</script>
I have a difficult time separating out GET variables that contain a & (encoded as %26) and a & used to separate variables.
Example one:
code=php&age=15d
Example two:
code=php%20%26%20code&age=15d
As you can see, example 1 has no problems, but getting javascript to know that "code=php & code" in example two is beyond my abilities:
(Note: I'm not really using these variable names, and changing them to something else will only work so long as a search term does not match a search key, so I wouldn't consider that a valid solution.)

There is no difference between %26 and & in a fragment identifier (‘hash’). ‘&’ is only a reserved character with special meaning in a query (‘search’) segment of a URI. Escaping ‘&’ to ‘%26’ need be given no more application-level visibility than escaping ‘a’ to ‘%61’.
Since there is no standard encoding scheme for hiding structured data within a fragment identifier, you could make your own. For example, use ‘+XX’ hex-encoding to encode a character in a component:
hxxp://www.example.com/page#code=php+20+2B+20php&age=15d
function encodeHashComponent(x) {
return encodeURIComponent(x).split('%').join('+');
}
function decodeHashComponent(x) {
return decodeURIComponent(x.split('+').join('%'));
}
function getHashParameters() {
var parts= location.hash.substring(1).split('&');
var pars= {};
for (var i= parts.length; i-->0;) {
var kv= parts[i].split('=');
var k= kv[0];
var v= kv.slice(1).join('=');
pars[decodeHashComponent(k)]= decodeHashComponent(v);
}
return pars;
}

Testing on Firefox 3.1, it looks as if the browser converts hex codes to the appropriate characters when populating the location.hash variable, so there is no way for JavaScript to know whether the original was a single character or a hex code.
If you're trying to encode a character like & inside of your hash variables, I would suggest replacing it with another string.
You can also parse the string in weird ways, like (JS 1.6 here):
function pairs(xs) {
return xs.length > 1 ? [[xs[0], xs[1]]].concat(pairs(xs.slice(2))) : []
}
function union(xss) {
return xss.length == 0 ? [] : xss[0].concat(union(xss.slice(1)));
}
function splitOnLast(s, sub) {
return s.indexOf(sub) == -1 ? [s] :
[s.substr(0, s.lastIndexOf(sub)),
s.substr(s.lastIndexOf(sub) + sub.length)];
}
function objFromPairs(ps) {
var o = {};
for (var i = 0; i < ps.length; i++) {
o[ps[i][0]] = ps[i][1];
}
return o;
}
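function parseHash(hash) {
return objFromPairs(
pairs(
union(
hash // use the argument rather than reading location.hash again
.substr(1)
.split("=")
.map(
// standard function syntax; expression closures were Firefox-only
function (s) { return splitOnLast(s, '&'); }))));
}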
>>> location.hash
"#code=php & code&age=15d"
>>> parseHash(location.hash)
{ "code": "php & code", "age": "15d" }

Just do the same as you do with the first example, but after you have split on the & then call unescape() to convert the %26 to & and the %20 to a space.
Edit:
Looks like I'm a bit out of date and you should be using decodeURIComponent() now, though I don't see any clear explanation of what it does differently from unescape(), apart from a suggestion that unescape() doesn't handle Unicode properly.
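The split-then-decode approach described here can be sketched as follows. The function name parseHashString is hypothetical, and it takes the hash as a plain string argument. It only works if that string is still percent-encoded; another answer in this thread reports that Firefox decodes location.hash before script sees it, in which case reading the fragment from location.href may be needed.

```javascript
// Hypothetical helper: split on "&" first, then decode each key and value.
function parseHashString(hash) {
  var result = {};
  var body = hash.charAt(0) === "#" ? hash.slice(1) : hash;
  if (body === "") {
    return result;
  }
  body.split("&").forEach(function (part) {
    // split on the first "=" only, so values may contain "="
    var eq = part.indexOf("=");
    var key = eq === -1 ? part : part.slice(0, eq);
    var value = eq === -1 ? "" : part.slice(eq + 1);
    result[decodeURIComponent(key)] = decodeURIComponent(value);
  });
  return result;
}

console.log(parseHashString("#code=php%20%26%20code&age=15d"));
```

Because the separator is split before decoding, an encoded %26 inside a value is never mistaken for a separator.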

This worked fine for me:
var hash = [];
if (location.hash) {
hash = location.href.split('#')[1].split('&');
}
