Retain subscript string in javascript code - javascript

I have JavaScript code containing subscript strings in an array, e.g. ["C₋₂", "D₋₂", etc.]. In my local environment the script works fine. However, on the live web server the strings become ["C?2", "D?2"], which stops my whole script from executing properly. The subscript characters are not retained. Is there any way I could 'escape' these characters?

Finally, I found the answer. Thanks, Jacob, for giving me the hint.
As @Domorodec suggests, Notepad is the answer. Copy and paste your code into Notepad and save it with UTF-8 encoding.
Here's the link for the answer.
Please note that the characters may not look the way you expect (odd characters will appear in the JavaScript file); however, the code works fine and displays the expected HTML text normally.
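If re-saving the file is not an option, a different approach (not the one used above, only a sketch) is to keep the source file pure ASCII by writing each subscript as a \u escape, so the encoding the server advertises no longer matters. The variable name labels is hypothetical; U+208B is SUBSCRIPT MINUS and U+2082 is SUBSCRIPT TWO.

```javascript
// Pure-ASCII source: each subscript character is written as a \u escape,
// so the file reads the same whatever encoding it is served with.
const labels = ['C\u208B\u2082', 'D\u208B\u2082'];

// Each label is three UTF-16 code units: the letter, subscript minus, subscript two.
console.log(labels.map((label) => label.length));
```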

Related

How to retrieve a txt file from raw.githubusercontent.com and find and replace characters?

For a miscellaneous project, I need to use a long text file containing lots of words separated by newlines (\n), and I've tried to replace all the newlines with spaces. I've tried online tools as well as Vim, Sublime, and Atom to find and replace, but they all froze (except for Vim, where there was no replace). Note that my text file contains over 370,000 words.
I'm thinking about retrieving this text file (located at https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt) with a JS script, but I have no idea how to 1: retrieve the file, 2: put it into a variable (or something), 3: find and replace \n with a space character, and 4: get that file.
I believe that you're looking for something along the lines of this. If you'd like to learn more about how I got the data, take a look at the fetch docs. To convert the data to a string, I simply used the text function, which is documented here.
Finally, I replaced all of the \n characters with a regular space. If you'd like to learn more about the replace method, you can read about it here.
const url = 'https://raw.githubusercontent.com/dwyl/english-words/master/words_alpha.txt';
fetch(url).then(d => d.text()).then(d => console.log(d.replace(/\n/g, ' ')));
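A slightly longer sketch of the same idea, assuming a runtime with a global fetch (a browser, or a recent Node.js). The function names joinLines and fetchWordsOnOneLine are hypothetical. It also handles Windows-style \r\n line endings, in case the file ever contains them, and reports a failed request instead of silently processing an error page.

```javascript
// Replace every line break (Unix \n or Windows \r\n) with a single space.
function joinLines(text) {
  return text.replace(/\r?\n/g, ' ');
}

// Hypothetical wrapper: download a text file and return it as one line.
async function fetchWordsOnOneLine(url) {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return joinLines(await response.text());
}
```

Calling fetchWordsOnOneLine with the URL above returns a promise for the whole word list as a single string, which can then be logged or stored in a variable.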
I don't know why you want to use JS for that.
You can do the entire operation in Bash or PowerShell using built-in tools.
For Windows there is a good text editor named Notepad++.
You can open the file, press Ctrl+A, then choose Edit > Line Operations > Join Lines.

Unicode -- What's going on here?

This code:
console.log('😀');
console.log('\uD83D\uDE00');
From HTML script tag:
ðŸ˜€
😀
Pasted and run in the browser console (same browser):
😀
😀
What's going on here that causes the first console.log('😀'); to fail when it's included with a script tag, but work fine when run in the browser console? The obvious problem seems to be that it isn't being converted to a surrogate pair, since the second line works as expected.
Your HTML file is not saved in the same encoding that the HTTP headers or HTML meta tags advertise. The file is interpreted in the wrong encoding, resulting in the wrong characters. That doesn't matter for the Unicode escape sequence, which is pure ASCII, but it does matter for the non-ASCII literal.
Concrete guess: the file is saved as UTF-8 but advertised as ISO-8859-1.
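The mismatch can be reproduced in a few lines. This is only a sketch of the guess above, assuming a runtime that provides TextEncoder and TextDecoder with the windows-1252 label (which is how browsers treat content labelled ISO-8859-1). The variable names are hypothetical.

```javascript
// U+1F600 written as an escape so this snippet itself is pure ASCII.
const grin = '\u{1F600}';

// The four UTF-8 bytes a UTF-8 editor writes to disk for that character.
const utf8Bytes = new TextEncoder().encode(grin);

// Decoding those bytes with the wrong charset yields four unrelated characters.
const misread = new TextDecoder('windows-1252').decode(utf8Bytes);

// Decoding them with the right charset gives the original character back.
const correct = new TextDecoder('utf-8').decode(utf8Bytes);

console.log(Array.from(utf8Bytes, (b) => b.toString(16)).join(' ')); // f0 9f 98 80
console.log(misread.length, correct === '\uD83D\uDE00');
```

The escape sequence survives because its bytes are the same in both encodings, while the literal character depends on the decoder picking UTF-8.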

Cross-Browser - Newline characters in textareas

In my web application (JSP, jQuery...) there is a form which, along with other fields, has a textarea where the user can input notes freely. The value is saved to the database as is.
The problem happens when the value has newline characters and is loaded back into the textarea; it sometimes "breaks" the jQuery code. Explaining further:
The value is loaded into the textarea using jQuery:
$('#p_notas').text("value_from_db");
When the user hits Enter to insert a new paragraph, the resulting value will include a newline character (or more than one char). This char is the problem as it varies from browser to browser and I haven't found out which one is causing the problem.
The error I get is a console error: SyntaxError: unterminated string literal. The page doesn't load correctly.
I'm not able to reproduce the problem. I tried with Chrome, Firefox and IE Edge (with several combinations of user agent and document mode).
We advise our users to use IE8+, Firefox or Chrome but we can't control it.
What I wanted to know is which character is causing the problem and how can I solve it.
Thanks
EDIT: Summing up - What are the differences in newline characters for the different browsers? Can I do anything to make them uniform?
EDIT 2: Looking at the page in the debugger, what I get is:
Case 1 (No problem)
$('#p_notas').text("This is the text I inserted \r\n More text");
Case 2 (Problem)
$('#p_notas').text("This is the text I inserted
More text");
In case 2 I get the JavaScript error "SyntaxError: unterminated string literal." because it is interpreted as two lines of code.
EDIT 3: @m02ph3u5 I tried using '\r', '\n', '\r\n', and '\n\r' and I couldn't reproduce the problem.
EDIT 4: I'm going to try replacing all line breaks with '\n\r'.
EDIT 5: In case it is of interest, what I did was treat the value before it was saved:
value.replace(/(?:\r\n|\r(?=\n)|\n(?=\r))/g, '\n\r')
The problem isn't the browser but the operating system. Quoting from this post:
So, using \r\n will ensure linebreaks on all major operating systems without issue.
Here's a nice read on the why: why do operating systems implement line breaks differently?
The problem you might be experiencing is saving the value of the textarea and then returning that value including any newlines. What you could do is "normalize" the value before saving, so that you don't have to change the output. In other words: get the value from the textarea, do a find-and-replace, and replace every possible occurrence of a newline (\r, \n) with a value that works on all operating systems: \r\n. Then, when you get the value from the database later on, it'll always be correct.
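A minimal sketch of that normalization, assuming it runs in JavaScript before the value is saved. The function name normalizeNewlines is hypothetical. The alternation tries \r\n first so an existing pair is not doubled.

```javascript
// Convert every line break variant (\r\n, lone \r, lone \n) to \r\n.
function normalizeNewlines(value) {
  return value.replace(/\r\n|\r|\n/g, '\r\n');
}

// Example with one of each variant in the same string.
const mixed = 'first\rsecond\nthird\r\nfourth';
console.log(JSON.stringify(normalizeNewlines(mixed)));
```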
I suspect your problem is actually that any newline in the entered input causes an issue. It looks like on the server you have a templated page, something like:
$('#p_notas').text("<%=db.value%>");
So what you end up with client side is:
$('#p_notas').text("some notes that
were entered by the user");
or some other characters that break the JS. Embedded quotes would do it too.
You need to escape the user-entered values somehow. The preferred "modern" way is to return the data via AJAX. If you are embedding the value within a template, what I might do is:
<div style="display:none" id="userdata"><%=db.value%></div>
<script>$('#p_notas').text($("#userdata").text());</script>
Of course, if it were exactly this, you could just embed the data in the textarea: <textarea><%=db.value%></textarea>
When you output data to the response, you always need to encode it using the appropriate encoding for the context it appears in.
You haven't mentioned which server-side technology you're using. In ASP.NET, for example, the HttpUtility class contains various encoding methods for different contexts:
HtmlEncode for general HTML output;
HtmlAttributeEncode for HTML attributes;
JavaScriptStringEncode for javascript strings;
UrlEncode for values passed in the query-string of a URL;
In some cases, you might need to encode the value more than once. For example, if you're passing a value in a URL via a javascript string, you'd need to UrlEncode the raw value, then JavaScriptStringEncode the result.
Assuming that you're using ASP.NET, and your code currently looks something like this:
$('#p_notas').text("<%# Eval("SomeField") %>");
change it to:
$('#p_notas').text("<%# HttpUtility.JavaScriptStringEncode(Eval("SomeField", "{0}")) %>");
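The same principle, sketched in JavaScript for the case where the page is rendered by JavaScript on the server (an assumption; it is neither the asker's JSP nor the ASP.NET above). The helper name toJsStringLiteral is hypothetical. It leans on JSON.stringify, which escapes quotes, backslashes and line breaks, and then escapes "<" so the value cannot close the surrounding script block.

```javascript
// Hypothetical helper: turn any value into a JavaScript string literal
// that can be placed inside an inline <script> block.
function toJsStringLiteral(value) {
  return JSON.stringify(String(value))
    .replace(/</g, '\\u003c')       // stops "</script>" from ending the script block
    .replace(/\u2028/g, '\\u2028')  // line separators that older engines reject in literals
    .replace(/\u2029/g, '\\u2029');
}

// A note containing a line break and embedded quotes, as a user might type it.
const note = 'This is the text I inserted\r\nMore "text"';

// The generated statement stays on one line, so the literal is never unterminated.
const line = `$('#p_notas').text(${toJsStringLiteral(note)});`;
console.log(line);
```

The helper already adds the surrounding double quotes, so the template must not add its own.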

Writing Javascript using UTF-16 character encoding

Here is what I am trying but am not sure how to get this working or if it is even possible -
I have an HTML page MyHTMLPage.htm and I want to src a Javascript from this HTML file. This is pretty straightforward. I plan to include a <script src = "MyJavascript.js"></script> tag in my HTML file and that should take care of it.
However, I want to create my Javascript file using UTF-16 encoding. So, I plan to use the following tag <script charset="UTF-16" src="MyJavascript.js"></script> in my HTML file to take care of that
Now the problem I am really stuck at is how do I create the Javascript using UTF-16 encoding - E.g. let's say my Javascript code is alert(1); I created my Javascript file with the contents as \u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0029\u003b but that does not seem to execute as valid Javascript at runtime.
To summarize, here is what I have -
MyHTMLPage.html
...
...
...
<script charset="UTF-16" src="MyJavascript.js"></script>
...
...
...
MyJavascript.js
\u0061\u006c\u0065\u0072\u0074\u0028\u0031\u0029\u003b
When I open the HTML page in Firefox, I get the error - "Syntax error - Illegal character" right at the beginning of the MyJavascript.js file. I have also tried adding the BOM character "\ufeff" at the beginning of the above Javascript but I still get the same error.
I know I could write my JavaScript file as "alert(1);" and save it with UTF-16 encoding in a text editor, and the browser then runs it fine. However, is there a way I could use "\u" notation (or an alternate escape character) and still get the JavaScript to execute?
Thanks,
You are misunderstanding character encoding. Character encoding is a scheme of how characters are represented as bits behind the scenes.
You would not write \u004a in your file to "make it UTF-16", as that is literally a sequence of 6 characters:
\, u, 0, 0, 4, a
And if you saved the above as UTF-16, it would be represented as the following bytes (shown in hex):
005C0075
00300030
00340061
Had you saved it as utf-8 it would be:
5C753030
3461
This takes 50% of the space and bandwidth. It takes even less to write that character literally ("J"): just one byte (4A) in UTF-8.
The "\u" notation is a way to reference any BMP character using only a small set of ASCII characters. If you were working with a text editor with no Unicode support, you could write "\u2665" instead of literally writing "♥", and the browser would show it properly.
If, for some reason, you still want to use UTF-16, simply write the code normally, save the file as UTF-16, and serve it with the proper charset header.
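The difference between an escape and an encoding can be checked directly. This is a small sketch with hypothetical variable names, assuming a runtime that provides TextEncoder. It also shows that JavaScript accepts \u escapes inside string literals and identifiers, but not in place of punctuation such as ( ) ; which is why a file made only of escapes is rejected.

```javascript
// Inside a string literal, the escape and the literal character are the same value.
const isSameChar = '\u004a' === 'J';

// The six characters \, u, 0, 0, 4, a as plain text (backslash doubled to keep it literal).
const escapeText = '\\u004a';

// UTF-8 uses one byte per ASCII character; UTF-16 uses two.
const utf8Size = new TextEncoder().encode(escapeText).length;
const utf16Size = escapeText.length * 2;

// An escape is also allowed inside an identifier: this declares the name "answer".
const \u0061nswer = 42;

console.log(isSameChar, utf8Size, utf16Size, answer); // true 6 12 42
```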

Parse ANSI escape code colors in the browser?

I have a source that attaches ANSI escape color codes to the strings it emits. These strings are sent to the browser. I want to parse these ANSI escape codes with JavaScript in the browser so that the output looks like it would in a terminal window.
The goal: ANSI strings -> html spans with styling
Is this possible? First I need to know how to parse ANSI strings in JS.
Thanks!!
https://github.com/mmalecki/ansispan
It's been done before. A quick Google search finds escapes.js as one example.
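For a sense of what such libraries do, here is a minimal sketch that is not taken from either project. It handles only the eight basic foreground colors (codes 30 to 37) and the reset codes 0 and 39, and ignores everything else. The names ANSI_COLORS, escapeHtml and ansiToHtml are hypothetical.

```javascript
// Basic SGR foreground color codes mapped to CSS color names.
const ANSI_COLORS = {
  30: 'black', 31: 'red', 32: 'green', 33: 'yellow',
  34: 'blue', 35: 'magenta', 36: 'cyan', 37: 'white',
};

// Escape characters that would otherwise be interpreted as HTML markup.
function escapeHtml(text) {
  return text.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Convert a string with ANSI color sequences into HTML spans.
function ansiToHtml(input) {
  const sgr = /\x1b\[([0-9;]*)m/g; // matches ESC [ <codes> m
  let html = '';
  let spanOpen = false;
  let last = 0;
  let match;
  while ((match = sgr.exec(input)) !== null) {
    // Copy the plain text between the previous sequence and this one.
    html += escapeHtml(input.slice(last, match.index));
    last = sgr.lastIndex;
    // An empty parameter list means "reset", the same as code 0.
    const codes = match[1] === '' ? [0] : match[1].split(';').map(Number);
    for (const code of codes) {
      if (code === 0 || code === 39) {
        if (spanOpen) { html += '</span>'; spanOpen = false; }
      } else if (ANSI_COLORS[code]) {
        if (spanOpen) html += '</span>';
        html += `<span style="color:${ANSI_COLORS[code]}">`;
        spanOpen = true;
      }
    }
  }
  html += escapeHtml(input.slice(last));
  if (spanOpen) html += '</span>';
  return html;
}

console.log(ansiToHtml('\x1b[31mred\x1b[0m plain'));
```

Bold, background colors and 256-color sequences would need extra cases, which is where an existing library saves effort.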
