Selecting element by unescaped data attribute - javascript

Without going into specifics why I'm doing this... (it should be encoded to begin with, but it's not for reasons outside my control)
Say I have a bit of HTML that looks like this
<tr data-path="files/kissjake's files"">...</tr> so the actual data-path is files/kissjake's files"
How do I go about selecting that <tr> by its data path?
The best I can currently do is when I bring the variables into JS and do any manipulation, I URLEncode it so that I'm always working with the encoded version. jQuery seems smart enough to determine the data-path properly so I'm not worried about that.
The problem is on one step of the code I need to read from a data-path of another location, and then compare them.
Actually selecting this <tr> is what's confusing me.
Here is my coffeescript
oldPriority = $("tr[data-path='#{path}']").attr('data-priority')
If I interpolate the URLEncoded version of the path, it doesn't find the TR. And I can't URLDecode it because then jQuery breaks as there are multiple ' and " conflicting in the path.
I need some way to select any <tr> that matches a particular data attribute, even if it's not encoded in the HTML to begin with.

First, did you mean to have the extra " in there? You will have to escape that, as it's not valid HTML.
<tr data-path="files/kissjake's files"">...</tr>
To select it, you need to escape inside the selector. Here's an example of how that would look:
$("tr[data-path='files/kissjake\\'s files\"']")
Explanation:
\\' is used to escape the ' inside the CSS selector. Since ' is inside other single quotes, it must be escaped at the CSS level. There are two backslashes (\\) because the backslash itself must be escaped at the JS level so that a single backslash makes it into the selector string.
Simpler example: "John\\'s" yields the string John\'s.
\" is used to escape the double quote which is contained inside the other double quotes. This one is being escaped on the JS level (not the CSS level), so only one slash is used because we don't need a slash to actually be inside the string contents.
Simpler example: 'Hello \"World\"' yields the string Hello "World".
Update
Since you don't have control over how the HTML is output, and you are doomed to deal with invalid HTML, that means the extra double quote should be ignored. So you can instead do:
$("tr[data-path='files/kissjake\\'s files']")
Just the \\' part to deal with the single quote. The extra double quote should be handled by the browser's lenient HTML parser.
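When the path comes from a variable rather than a literal, the same escaping can be applied programmatically. This is only a minimal sketch: attrSelector is a hypothetical helper name (not part of jQuery), and it assumes the value contains no line breaks. It wraps the value in double quotes inside the selector, so only backslashes and double quotes need a backslash in front of them:

```javascript
// Hypothetical helper: builds tag[attr="value"] with the value escaped
// for use inside a double-quoted CSS string.
// Assumption: the value contains no line breaks.
function attrSelector(tag, attr, value) {
  // Put a backslash in front of every backslash and double quote.
  const escaped = String(value).replace(/["\\]/g, "\\$&");
  return `${tag}[${attr}="${escaped}"]`;
}

const pathSelector = attrSelector("tr", "data-path", "files/kissjake's files");
console.log(pathSelector); // tr[data-path="files/kissjake's files"]
```

The resulting string could then be passed to $(...) or document.querySelectorAll(...). In browsers, CSS.escape() (and jQuery.escapeSelector() in jQuery 3.0+) covers more edge cases than this sketch.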

Building off of @Nathan Wall's answer, this will select all <tr> tags with a data-path attribute on them.
$("tr[data-path]");

Related

Despite single quotes being encoded using htmlspecialchars, JavaScript is still complaining that these quotes need to be escaped in the function call

Something strange is occurring and I'm stumped.
I have a link that looks basically like this:
Link
As you can see, I'm calling function uploadVariantPicture with parameter "size:&#039;test2&#039;".
However, when I actually click the link, JavaScript complains that the two encoded single quotes aren't being escaped. I'm getting the following error:
SyntaxError: Unexpected identifier 'test2'. Expected ')' to end an argument list.
If I decode the two encoded single quotes and escape them using a backslash, then the function call succeeds. But the problem is I need it encoded. I cannot leave it unencoded and escape the quotes. This won't work for my situation.
Any help is greatly appreciated. I'm super confused.
HTML character entities and escapes are replaced by the HTML parser when parsing source. For quotation marks, it allows inclusion of the same kind of quotation mark in an HTML attribute that is being used to quote the attribute value in source.
E.G.
<element attribute="&quot;">
<element attribute='&#39;'>
in source would produce attribute values of " (double quote) and ' (single quote) respectively, despite being the delimiters used to quote the attribute value in HTML source.
Hence
Link
will produce an href attribute value of
javascript:uploadVariantPicture('size:'test'');
after removal of the outer double quotes by the HTML parser.
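The effect can be reproduced outside the browser. The sketch below is an illustration only: decodeEntities is a hypothetical, deliberately tiny stand-in for the HTML parser, and the attribute text is an assumption reconstructed from the question. It decodes the character references first, then checks whether the result still parses as JavaScript:

```javascript
// Hypothetical mini-decoder covering only the references used here;
// a real HTML parser handles many more.
function decodeEntities(text) {
  return text
    .replace(/&#(\d+);/g, (_, dec) => String.fromCodePoint(Number(dec)))
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&");
}

// True if `code` parses as a JavaScript function body (it is never called).
function parses(code) {
  try {
    new Function(code);
    return true;
  } catch (err) {
    return false;
  }
}

// Assumed attribute text, as it would appear in the HTML source.
const attrSource = "uploadVariantPicture('size:&#039;test2&#039;');";
const seenByJs = decodeEntities(attrSource);

console.log(seenByJs);         // uploadVariantPicture('size:'test2'');
console.log(parses(seenByJs)); // false
console.log(parses("uploadVariantPicture('size:\\'test2\\'');")); // true
```

The entity encoding protects the HTML attribute, not the JavaScript string inside it, which is why the backslash-escaped version parses and the entity-encoded one does not.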
Options could include escaping double quotes (HTML &quot;) inside the href value appropriately (it depends on the syntax accepted by uploadVariantPicture), including backslash escapes before the single quotes as mentioned in the post, or not using the javascript: pseudo-protocol at all, in favor of adding an event listener in JavaScript.
Not using the javascript: pseudo-protocol is highly recommended; it's basically a holdover from HTML3.
Consider attaching an event handler properly using JavaScript instead so you don't have to worry about escaping issues, and so that you don't have to rely on the pollution of the global object for the script to work:
const uploadVariantPicture = (arg) => console.log(arg);
document.querySelector('a').addEventListener('click', () => {
  uploadVariantPicture("size:'test2'");
});
<a>Link</a>
I can't think of any situations in which an inline handler would be preferable to addEventListener, unless you were deliberately trying to exploit an XSS vulnerability.

jQuery Escaping Special Characters Fails

I am trying to make a jQuery selector to select an HTML element by an arbitrary id. The ids may contain special characters that need to be escaped. An example is test_E///_AAAAA
I am basically doing exactly what is going on in this working fiddle (which uses v 1.11.0, where I am using v 1.11.3 and have also tested with 2.1.3)
However, in my scaled up environment, it doesn't work. I get Syntax error, unrecognized expression: #test_E\\/\\/\\/_AAAAA
There must be some obscure factoid about jQuery that is the difference between this working and not working. I, being a novice, have no hope of identifying it.
I notice that I am not alone though. A commentator on this thread had the same issue.
The code files are thousands of lines long, and I'm probably prohibited from posting more than a couple lines by my employer. I'm just looking for a hint, a clue, a shot in the dark about what would cause a perfectly reasonable selection string to be rejected.
You just need enough backslashes :)
ID:
The ID of the element is test_E\\/\\/\\/_AAAAA. Note that backslashes don't have any special meaning in HTML, so there really are six backslashes in the ID.
jQuery selector: Backslashes, forward slashes, and several other characters have special meaning in jQuery selectors, so we need to escape them with a backslash. The selector therefore needs to be #test_E\\\\\/\\\\\/\\\\\/_AAAAA. This tells jQuery to look for an element whose ID contains test_E, then two backslashes, then one forward slash, and so on.
JavaScript string literal: To represent that selector using a JavaScript string literal, each backslash needs to be escaped. So the string literal would be "#test_E\\\\\\\\\\/\\\\\\\\\\/\\\\\\\\\\/_AAAAA".
var selectionString = "#test_E\\\\\\\\\\/\\\\\\\\\\/\\\\\\\\\\/_AAAAA";
snippet.log("actual id: " + $("p")[0].id);
snippet.log("selection string given to jQuery: " + selectionString);
snippet.log("text: " + $(selectionString).text());
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.0/jquery.min.js"></script>
<!-- Provides the `snippet` object, see http://meta.stackexchange.com/a/242144 -->
<script src="http://tjcrowder.github.io/simple-snippets-console/snippet.js"></script>
<p id="test_E\\/\\/\\/_AAAAA">This is a test :)</p>
As you can see, this is extremely ugly, hard to understand, and hard to get right. I highly recommend avoiding such IDs. Another option is to use good old document.getElementById(), which only requires the string literal escapes:
$(document.getElementById('test_E\\\\/\\\\/\\\\/_AAAAA')).text()
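If the selector route is still needed, the two escaping layers can be generated instead of typed by hand. A rough sketch, with assumptions: idSelector is a hypothetical helper, and it assumes the id does not start with a digit and contains no control characters. It adds the selector-level backslashes; JSON.stringify then shows what the string literal would look like:

```javascript
// Hypothetical helper: backslash-escape every character that is not a
// letter, digit, underscore, or hyphen.
// Assumptions: the id does not start with a digit and has no control characters.
function idSelector(id) {
  return "#" + id.replace(/[^\w-]/g, "\\$&");
}

// String.raw keeps the backslashes exactly as they appear in the HTML.
const rawId = String.raw`test_E\\/\\/\\/_AAAAA`;
const idSel = idSelector(rawId);

console.log(idSel);                 // #test_E\\\\\/\\\\\/\\\\\/_AAAAA
console.log(JSON.stringify(idSel)); // the same selector written as a string literal
```

In browsers, CSS.escape(id) (or jQuery.escapeSelector(id) in jQuery 3.0+) does the selector-level step and handles the cases this sketch leaves out.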
The code in the fiddle doesn't work either. I have tried it in IE, Firefox and Chrome, and none of them finds the element.
You need to escape a slash to use it in a # selector. If you use a backslash, you have to escape it twice, once to put it in a string, and once for the selector.
To match the id test\A you need the selector #test\\A which as a string is "#test\\\\A".
To match the id test/A you need the selector #test\/A which as a string is "#test\\/A".
To match the id test_E\\/\\/\\/_AAAAA you need the selector #test_E\\\\\/\\\\\/\\\\\/_AAAAA which as a string is "#test_E\\\\\\\\\\/\\\\\\\\\\/\\\\\\\\\\/_AAAAA".
Demo: https://jsfiddle.net/Guffa/463849xj/4/
Generally you should avoid unusual characters in an id. Even if you can make it work, there is still a risk that some browser handles it differently.
Update:
The error message is shown with the selector unescaped, so as the error message shows the selector #test_E\\/\\/\\/_AAAAA it means that you actually use the string "#test_E\\\\/\\\\/\\\\/_AAAAA". That leaves the slashes unescaped, which causes the syntax error.
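One way to check a selector string before handing it to jQuery is to walk it and treat every backslash as consuming the character after it. This is only a rough sketch: findUnescaped is a hypothetical helper, and by default it looks only for the forward slash:

```javascript
// Hypothetical helper: returns the positions of characters from `special`
// that are NOT preceded by an escaping backslash.
function findUnescaped(selector, special = "/") {
  const positions = [];
  for (let i = 0; i < selector.length; i++) {
    if (selector[i] === "\\") {
      i++; // the backslash escapes the next character, so skip it
      continue;
    }
    if (special.includes(selector[i])) positions.push(i);
  }
  return positions;
}

// The string from the error case: each slash is left unescaped.
console.log(findUnescaped("#test_E\\\\/\\\\/\\\\/_AAAAA")); // → [9, 12, 15]

// The corrected string: every slash is consumed by a backslash.
console.log(findUnescaped("#test_E\\\\\\\\\\/\\\\\\\\\\/\\\\\\\\\\/_AAAAA")); // → []
```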

Match attribute value of XML string in JS

I've searched Stack Overflow and found similar questions, but none of them is quite what I want.
Given an xml string: "<a b=\"c\"></a>" in javascript context, I want to create a regex that will capture the attribute value including the quotation marks.
NOTE: this is similar if you're using single quotation marks.
Currently I have a regular expression tailored to the XML specification:
[_A-Za-z][\w\.\-]*(?:=\"[^\"]*\")?
[_A-Za-z][\w\.\-]* //This will match the attribute name.
(?:=\"[^\"]*\")? //This will match the attribute value.
\"[^\"]*\" //This part concerns me.
My question now is, what if the xml string looks like this:
<shout statement="Hi! \"Richeve\"."></shout>
I know this is a dumb question to ask, but I want to handle the rare cases where this scenario might happen. I know the coder can use single quotes here, but there are cases where we don't know the current value of the attribute because it changes dynamically at runtime.
So to make this clearer, the result of that using the correct regex should be:
"Hi! \"Richeve\"."
I hope my question is clear. Thanks for all the help!
PS: Note that the language context is Javascript and I know it is tempting to use lookbehinds but currently lookbehinds are not supported.
PS: I know it is really hard to parse XML, but I have an elegant solution to this :) so I just need this small problem to be solved. The only focus of this question is capturing quoted string tokens that contain quotation marks inside the token.
The standard pattern for content with matching delimiters and embedded escaped delimiters goes like this:
"[^"\\]*(?:\\.[^"\\]*)*"
Ignoring the obvious first and last characters in the pattern, here's how the rest of the pattern works:
[^"\\]*: Consume all characters until a delimiter OR backslash (matching Hi! in your example)
(?:\\.[^"\\]*)* Try to consume a single escaped character \\. followed by a series of non delimiter/backslash characters, repeatedly (matching \"Richeve first and then \". next in your example)
That's it.
You can try to use a more generic delimiter approach using (['"]) and back references, or you can just allow for an alternate pattern with single quotes like so:
("[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*')
Here's another description of this technique that might also help (see the section called Strings): http://www.regular-expressions.info/examplesprogrammer.html
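Applied to the question's input in JavaScript, the pattern can be tried like this. A small sketch; the variable names are illustrative, and String.raw is used only so the backslashes in the sample text do not need doubling:

```javascript
// Double-quoted token that may contain backslash-escaped characters.
const doubleQuoted = /"[^"\\]*(?:\\.[^"\\]*)*"/;

// Variant that also accepts single-quoted tokens.
const eitherQuoted = /"[^"\\]*(?:\\.[^"\\]*)*"|'[^'\\]*(?:\\.[^'\\]*)*'/g;

const xml = String.raw`<shout statement="Hi! \"Richeve\"."></shout>`;
console.log(xml.match(doubleQuoted)[0]); // "Hi! \"Richeve\"."

const mixed = String.raw`<a b="c" d='it\'s'></a>`;
console.log(mixed.match(eitherQuoted)); // two tokens: "c" and 'it\'s'
```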
Description
I'm pretty sure embedding double quotes inside a double-quoted attribute value is not legal. You could use the unicode equivalent of a double quote \x22 inside the value.
However to answer the question, this expression will:
allow escaped quotes inside attribute values
capture the statement attribute's value
allow attributes to appear in any order inside the tag
will avoid many of the edge cases which will trip up pattern matching inside html text
doesn't use lookbehinds
<shout\b(?=\s)(?=(?:[^>=]|='(?:[^']|\\')*'|="(?:[^"]|\\")*"|=[^'"][^\s>]*)*?\sstatement=(['"])((?:\\['"]|.)*?)\1(?:\s|\/>|>))(?:[^>=]|='(?:[^']|\\')*'|="(?:[^"]|\\")*"|=[^'"][^\s>]*)*>.*?<\/shout>
Sample Text
Note the difficult edge case in the first attribute :)
<shout onmouseover=' statement="He said \"I am Inside the onMouseOver\" " ; if ( 6 > a ) { funRotate(statement) } ; ' statement="Hi! \"Richeve\"." title="sometitle">SomeString</shout>
Matches
Group 0 gets the entire tag from open to close
Group 1 gets the quote surrounding the statement attribute value, this is used to match the closing quote correctly
Group 2 gets the statement attribute value which may include escaped quotes like \" but not including the surrounding quotes
[0][0] = <shout onmouseover=' statement="He said \"I am Inside the onMouseOver\" " ; if ( 6 > a ) { funRotate(statement) } ; ' statement="Hi! \"Richeve\"." title="sometitle">SomeString</shout>
[0][1] = "
[0][2] = Hi! \"Richeve\".

Jquery embedded quote in attribute

I have a custom attribute that is being filled from a database. This attribute can contain an embedded single quote like this,
MYATT='Tony\'s Test'
At some point in my code I use jQuery to copy this attribute to a field like this,
$('#MY_DESC').val($(recdata).attr('MYATT'));
MY_DESC is a text field in a dialog box. When I display the dialog box all I see in the field is
Tony\
What I need to see is,
Tony's Test
How can I fix this so I can see the entire string?
Try:
MYATT='Tony&#x27;s Test'
I didn't bother verifying this with the HTML spec, but the wikipedia entry says:
The ability to "escape" characters in this way allows for the characters < and & (when written as &lt; and &amp;, respectively) to be interpreted as character data, rather than markup. For example, a literal < normally indicates the start of a tag, and & normally indicates the start of a character entity reference or numeric character reference; writing it as &amp; or &#x26; or &#38; allows & to be included in the content of elements or the values of attributes. The double-quote character ("), when used to quote an attribute value, must also be escaped as &quot; or &#x22; or &#34; when it appears within the attribute value itself. The single-quote character ('), when used to quote an attribute value, must also be escaped as &#x27; or &#39; (should NOT be escaped as &apos; except in XHTML documents) when it appears within the attribute value itself. However, since document authors often overlook the need to escape these characters, browsers tend to be very forgiving, treating them as markup only when subsequent text appears to confirm that intent.
If the value will never contain double quotes, wrap your custom attribute's value in double quotes instead :)
Otherwise, I suggest escaping the value.
Before setting the value of your text field, you might try running a regular expression against the string to remove all backslashes from the string.
If you do this:
alert($(recdata).attr('MYATT'));
You will see the same result of "Tony\" meaning that the value isn't being properly consumed by the browser. The escaped \' value isn't working in this case.
Do you have the means to edit these values as they are being produced? Can you parse them to include escape values before being rendered?
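If the values can be processed before rendering, the escaping could look roughly like the sketch below. It is written in JavaScript for illustration only; escapeAttr is a hypothetical helper, and the same replacements would apply in whatever language actually generates the page:

```javascript
// Hypothetical helper: escape a value for use inside a quoted HTML attribute.
// The ampersand is replaced first so later replacements are not double-escaped.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

const markup = `<div MYATT='${escapeAttr("Tony's Test")}'></div>`;
console.log(markup); // <div MYATT='Tony&#39;s Test'></div>
```

With the attribute written this way, the browser decodes the reference while parsing, so reading the attribute back should give Tony's Test with no backslash involved.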

Using JavaScript single and double quotes for href's

I am having problem with escaping the single and double quotes inside the hrefs JavaScript function.
I have this JavaScript code inside href. It's like -
click this
Now, since double quotes inside double quote is not valid, I need to escape the inner double quotes for it to be treated as part of the string -
so, I need to do this -
click this
The problem is, even the above code is not working. The JavaScript code is getting truncated at -- myFunc(
I tried with the single quote variation too - but even that doesn't seem to work (meaning that if I have a single quote inside my string literal then the code gets truncated).
This is what I did with a single quote:
<a href = 'javascript:myFunc("fileDir/fileName.doc" , true)'> click this </a>
This works, but if I have a single quote inside the string then the code gets truncated in the same way as that of double quotes one.
Using backslashes to escape quotes is how it works in JavaScript, but you're not actually writing JavaScript code there: you're writing HTML. You can do it by using the HTML escaping method: character entities.
&quot; // "
&#39; // '
For example:
...
In case anyone needs to escape some thing like this:
<a href="www.google.com/search?q="how+to+escape+quotes+in+href""</a>
You can use the percent-encoded form of the double quote, %22:
<a href="www.google.com/search?q=%22how+to+escape+quotes+in+href%22"></a>
It is especially useful if you pass the link to JavaScript from PHP
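When the link is built in JavaScript, the built-in encodeURIComponent produces the same %22 encoding. A short sketch; the https:// scheme is added here as an assumption, and note that this function encodes spaces as %20 rather than +:

```javascript
// The query text, including its literal double quotes.
const query = '"how to escape quotes in href"';

// encodeURIComponent percent-encodes the quotes and the spaces.
const searchHref = "https://www.google.com/search?q=" + encodeURIComponent(query);

console.log(searchHref);
// https://www.google.com/search?q=%22how%20to%20escape%20quotes%20in%20href%22
```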
As a general best practice, use double-quotes in HTML and single-quotes in JavaScript. That will solve most of your problems. If you need a single-quote in a JavaScript string, you can just escape it using \' - and you probably shouldn't be nesting literal strings any deeper than that.
As noted elsewhere, HTML entities are a possibility if the code is embedded in HTML. But you'll still have to deal with escaping quotes in strings in your JavaScript source files, so it's best to just have a consistent strategy for dealing with JavaScript.
If you are following this strategy and end up with a double-quote embedded in your JavaScript embedded in your HTML, just use the HTML entity &quot;.
Normally, this kind of code works without problems:
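The two layers can also be applied mechanically. In this sketch toInlineCall is a hypothetical helper, and it assumes the result is placed inside a double-quoted HTML attribute such as onclick; a javascript: href may add a URL-decoding layer that is not handled here. JSON.stringify produces the JavaScript literals, and the HTML escaping is applied afterwards:

```javascript
// Hypothetical helper: build the text of a function call for use inside
// a double-quoted HTML attribute.
function toInlineCall(fnName, ...args) {
  // Layer 1: JavaScript. JSON.stringify yields valid JS literals.
  const js = `${fnName}(${args.map((arg) => JSON.stringify(arg)).join(", ")})`;
  // Layer 2: HTML. Escape what would end or confuse the attribute value.
  return js.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
}

console.log(toInlineCall("myFunc", "fileDir/fileName.doc", true));
// myFunc(&quot;fileDir/fileName.doc&quot;, true)
```

A single quote in an argument passes through unchanged, which is safe only because the attribute itself is assumed to be delimited by double quotes.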
Click this
With this code, do you have any problem?
