
Deobfuscating Imperva's utmvc Anti-Bot Script

March 4, 2023

yog

This article may have been republished from another source and might not have been originally written for this site.

⚠️ Some information, tools, or techniques discussed may have changed or evolved since the publishing of this article.

Originally published at https://yoghurtbot.github.io/2023/03/04/Deobfuscating-Incapsula-s-UTMVC-Anti-Bot-Part-2/

Part 2 - Control Flow Flattening

Flattening Control Flow

In part 1 we successfully revealed all of the strings in the script. In this part, we are going to tackle the control flow obfuscation. Control flow flattening is an obfuscation technique that jumbles up the order of the code to make it harder to follow: the statements are scattered across the cases of a switch block inside a loop, and a separate dispatcher value decides which case runs next. I’ve included a visual representation of what that might look like below.

[Image: diagram of control flow flattening, from a Virus Bulletin tweet about Sophos' introduction to control flow flattening in Emotet]

We can see examples of this in the script we’ve partially deobfuscated so far:

var _0x3100a2 = "4|2|3|1|6|5|7|0|9|8"["split"]("|"),
  _0x6753ae = 0x0;
while (!![]) {
  switch (_0x3100a2[_0x6753ae++]) {
    case "0":
      _0x4e96ba["Pgl"](_0x14d4e9, _0x4e96ba["uLq"](_0x2bcd36, _0x4bc649));
      continue;
    case "1":
      _0x4bc649["push"](["'v8b33affa616d7e2343cc7cd58fb6cd20c99ad6b16413b7c5af014cfde3a957ad'.toString()", "value"]);
      continue;
    case "2":
      if (!_0x530384["btoa"]) _0x530384["btoa"] = _0x48c673;
      continue;
    case "3":
      _0x4e96ba["gGv"](_0x3abf6a);
      continue;
    case "4":
      if (_0x1e92fd) {
        try {
          _0x2e9a92["log"] = _0x4e96ba["uLq"](_0x19195e, _0x1e92fd);
        } catch (_0x3fc75d) {}
      }
      continue;
    case "5":
      var _0x290d1c = _0x20a23a["substr"](0x0, 0x2);
      continue;
    case "6":
      var _0x20a23a = "bO+/vQxkbu+/vXQgXu+/ve+/vUHvv73vv71d77+9";
      continue;
    case "7":
      var _0x5a434c = _0x20a23a["substr"](0x2);
      continue;
    case "8":
      _0x4fd281["createElement"]("img")["src"] = _0x4e96ba["LZO"]("/_Incapsula_Resource?SWKMTFSR=1&e=", _0x530384["Math"]["random"]());
      continue;
    case "9":
      if (_0x163515) {
        _0x4bc649["push"]([_0x163515, "value"]);
        _0x4e96ba["uLq"](_0x14d4e9, _0x4e96ba["FRS"](_0x2bcd36, _0x4bc649));
      }
      continue;
  }
  break;
}

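We can see that _0x3100a2 is a variable holding a string literal that is split on | into an array of case labels. The while loop then steps through that array using the counter _0x6753ae, and the switch statement runs the code under the matching case each time, so the cases execute in the order 4, 2, 3, 1, 6, 5, 7, 0, 9, 8. The end goal here is to simplify the code to something like this: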

//Code from switch "4"
if (_0x1e92fd) {
  try {
    _0x2e9a92["log"] = _0x4e96ba["uLq"](_0x19195e, _0x1e92fd);
  } catch (_0x3fc75d) {}
}

//Code from switch "2"
if (!_0x530384["btoa"]) _0x530384["btoa"] = _0x48c673;

//Code from switch "3"
_0x4e96ba["gGv"](_0x3abf6a);

//Code from switch "1"
_0x4bc649["push"](["'v8b33affa616d7e2343cc7cd58fb6cd20c99ad6b16413b7c5af014cfde3a957ad'.toString()", "value"]);

//... etc
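To make the equivalence concrete, here is a small self-contained illustration. It is not taken from the Imperva script and every name in it is made up; it only shows that a dispatcher loop runs its cases in the order given by the split string, so reading the cases off in that order gives the same behaviour.

```javascript
// Toy example: every name here is hypothetical, not from the utmvc script.
function flattened() {
  const log = [];
  const order = "2|0|1".split("|");
  let i = 0;
  while (true) {
    switch (order[i++]) {
      case "0":
        log.push("second");
        continue;
      case "1":
        log.push("third");
        continue;
      case "2":
        log.push("first");
        continue;
    }
    break; // reached once order[i++] is undefined and no case matches
  }
  return log;
}

// The same statements, written in the order the dispatcher visits them.
function unflattened() {
  const log = [];
  log.push("first");  // case "2"
  log.push("second"); // case "0"
  log.push("third");  // case "1"
  return log;
}

console.log(flattened().join(",") === unflattened().join(",")); // true
```

Note that each continue applies to the enclosing while loop, not the switch, which is why the trailing break only runs after the order array is exhausted.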

And here is our resulting Babel plugin:

const t = require("@babel/types");
const parser = require("@babel/parser");
const traverse = require("@babel/traverse").default;

//`ast` is assumed to be the AST we have been working on since part 1
//(i.e. the result of parsing the script with @babel/parser)

const flattenControlFlowVisitor = {
    SwitchStatement(path){
        const { node } = path;
        if(t.isMemberExpression(node.discriminant) &&
            t.isIdentifier(node.discriminant.object) &&
            t.isUpdateExpression(node.discriminant.property) &&
            node.discriminant.property.operator === "++" &&
            node.discriminant.property.prefix === false)
        {
            //We're in the right switch statement

            //Get the switch order variable name
            //e.g.     var _0x48d663 = "3|6|2|0|5|1|4"["split"]("|"), ---> _0x48d663
            const switchOrderVar = node.discriminant.object.name;

            //Get the bindings of the variable and get the switch order into an array
            const switchOrder = path.scope.getBinding(switchOrderVar).path.node.init.callee.object.value.split("|")

            let orderedNodes = []

            //Loop through the switch order
            for(const sw of switchOrder){
                //Get the switch case that matches this label
                //(c.test is null for a default case, so guard against it)
                const switchCase = path.node.cases.find(c => c.test && c.test.value === sw);

                //Get the nodes under the case excluding the continue statement
                const nodesInSwitchCase = switchCase.consequent.filter(c => !t.isContinueStatement(c))

                //Drop them into an array
                //cloneDeepWithoutLoc to avoid issues!
                orderedNodes.push(...nodesInSwitchCase.map(n => t.cloneDeepWithoutLoc(n)))
            }

            //Replace the parent while statement
            //(switch -> block statement -> while statement)
            const whileStatement = path.parentPath.parentPath;
            whileStatement.replaceWithMultiple(orderedNodes);
        }
    }
}

traverse(ast, flattenControlFlowVisitor);
traverse(ast, flattenControlFlowVisitor); //Run it twice!

You’ll notice that we need to run this visitor twice. This is because there are nested switch statements. One improvement that could be made to this visitor is to make it recursive, which would make it more efficient and mean that we don’t have to traverse the AST twice.

We’ve now simplified the ugly control flow code into this:

function _0x4d748b(_0x1b013d) {
  var _0x3a900f = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  var _0x2026be, _0x30a9be, _0x105a8e;
  var _0x337f59, _0x32ce20, _0x52c47a;
  _0x105a8e = _0x1b013d["length"];
  _0x30a9be = 0x0;
  _0x2026be = "";
  while (_0x388c13["wag"](_0x30a9be, _0x105a8e)) {
    var _0x4a9695 = "6|8|7|2|0|1|4|3|5"["split"]("|"),
      _0x1a9326 = 0x0;
    _0x337f59 = _0x388c13["YnZ"](_0x1b013d["charCodeAt"](_0x30a9be++), 0xff);
    if (_0x388c13["PNJ"](_0x30a9be, _0x105a8e)) {
      _0x2026be += _0x3a900f["charAt"](_0x337f59 >> 0x2);
      _0x2026be += _0x3a900f["charAt"](_0x388c13["fza"](_0x388c13["yuG"](_0x337f59, 0x3), 0x4));
      _0x2026be += "==";
      break;
    }
    _0x32ce20 = _0x1b013d["charCodeAt"](_0x30a9be++);
    if (_0x388c13["PNJ"](_0x30a9be, _0x105a8e)) {
      _0x2026be += _0x3a900f["charAt"](_0x388c13["ydV"](_0x337f59, 0x2));
      _0x2026be += _0x3a900f["charAt"](_0x388c13["AfV"](_0x388c13["XCm"](_0x337f59, 0x3) << 0x4, (_0x32ce20 & 0xf0) >> 0x4));
      _0x2026be += _0x3a900f["charAt"](_0x388c13["pYD"](_0x32ce20, 0xf) << 0x2);
      _0x2026be += "=";
      break;
    }
    _0x52c47a = _0x1b013d["charCodeAt"](_0x30a9be++);
    _0x2026be += _0x3a900f["charAt"](_0x388c13["Oed"](_0x337f59, 0x2));
    _0x2026be += _0x3a900f["charAt"](_0x388c13["VuS"](_0x388c13["jpt"](_0x337f59, 0x3) << 0x4, _0x388c13["ydV"](_0x388c13["EML"](_0x32ce20, 0xf0), 0x4)));
    _0x2026be += _0x3a900f["charAt"](_0x388c13["sWH"](_0x32ce20 & 0xf, 0x2) | _0x388c13["ydV"](_0x388c13["uQC"](_0x52c47a, 0xc0), 0x6));
    _0x2026be += _0x3a900f["charAt"](_0x388c13["EML"](_0x52c47a, 0x3f));
  }
  return _0x2026be;
}
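The idea of handling nested switch statements recursively, rather than running the visitor twice, can be illustrated on a deliberately simplified model. This is not Babel code and not the author's implementation: a plain string stands in for an ordinary statement, and a hypothetical { order, cases } object stands in for a dispatcher loop.

```javascript
// Simplified model, not a Babel AST: a string is an ordinary statement,
// and { order, cases } is a dispatcher loop (both shapes are made up here).
function unflatten(stmt) {
  if (typeof stmt === "string") return [stmt];
  // Walk the labels in dispatcher order and unflatten each case body,
  // which resolves nested dispatchers in the same pass.
  return stmt.order
    .split("|")
    .flatMap((label) => stmt.cases[label].flatMap((s) => unflatten(s)));
}

const nested = {
  order: "1|0",
  cases: {
    "0": ["c"],
    "1": ["a", { order: "0", cases: { "0": ["b"] } }],
  },
};

console.log(unflatten(nested)); // [ 'a', 'b', 'c' ]
```

In a real Babel visitor, a similar effect could probably be achieved by handling SwitchStatement in an exit handler instead of on entry, so that inner dispatchers are rewritten before the outer one is cloned. That variant is a suggestion only and has not been tested against the utmvc script.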

Removing Proxy References

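The next obfuscation technique we’re going to tackle is proxy references. Proxy references are calls to small wrapper functions that do nothing except perform a single operation (such as an addition or a comparison) or call another function on our behalf. The end goal here is to replace any calls to proxy functions with the operation or function call they stand for.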

We can see examples of proxy calls in the code, e.g:

"Pjt": function _0x1068f3(_0x1dcc0d, _0x4fccf2) {
  return _0x1dcc0d + _0x4fccf2;
},

Here we have a function stored under the key Pjt, and all it does is add its two parameters together.
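As a quick illustration of why these calls can be safely inlined, here is a toy proxy object. Pjt mirrors the function shown above; callOne is a made-up name for the second kind of proxy, one that simply calls the function it is handed.

```javascript
// Toy proxy object. "Pjt" mirrors the example above; "callOne" is hypothetical.
const proxies = {
  "Pjt": function (a, b) {
    return a + b; // binary-operation proxy
  },
  "callOne": function (fn, arg) {
    return fn(arg); // call proxy: the first argument is the real function
  },
};

// What the obfuscated code does...
const viaProxy = proxies["callOne"](String, proxies["Pjt"](1, 2));
// ...and what it means once the proxies are removed.
const direct = String(1 + 2);

console.log(viaProxy === direct); // true
```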

The first thing we need to do is traverse the script and identify any objects that define proxy functions. You’ll see that the proxy functions are contained in an object like so:

var _0x388c13 = {
  "wag": function _0x67d457(_0x2f20be, _0x4a591a) {
    return _0x2f20be < _0x4a591a;
  },
  // ...

The goal is to build a lookup table of the object variable name (_0x388c13) and all the proxy functions it contains. We can achieve this with this visitor:

this.proxyFuncVars = {}
path.traverse({
    ObjectProperty(path){
        const { node } = path;
        if (t.isFunctionExpression(node.value) && t.isReturnStatement(node.value.body.body[0])){
            //Found a proxy expression
            const varDecl = path.getStatementParent()
            //Only handle objects declared with var/let/const
            if (!varDecl.isVariableDeclaration()) return;

            //Name of the object that holds the proxy functions, e.g. _0x388c13
            const objName = varDecl.node.declarations[0].id.name;
            if (!this.proxyFuncVars[objName]) {
                this.proxyFuncVars[objName] = [];
            }

            //Store a [key, FunctionExpression] pair
            this.proxyFuncVars[objName].push([node.key.value, node.value]);
        }
    }
}, this) //Pass `this` as the traversal state so the visitor sees the same lookup table

We now have a lookup table that looks like this:

[Image: screenshot of the resulting lookup table]
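Since the screenshot is not reproduced here, the following sketch shows the shape the visitor builds, as far as it can be inferred from the code above: each object variable name maps to a list of [key, function node] pairs. The node below is a hand-written stand-in for the real Babel FunctionExpression, with only a few fields filled in.

```javascript
// Hand-written stand-in for a Babel FunctionExpression node (partial, for illustration).
const wagNode = {
  type: "FunctionExpression",
  params: [
    { type: "Identifier", name: "_0x2f20be" },
    { type: "Identifier", name: "_0x4a591a" },
  ],
  body: {
    type: "BlockStatement",
    body: [{
      type: "ReturnStatement",
      argument: {
        type: "BinaryExpression",
        operator: "<",
        left: { type: "Identifier", name: "_0x2f20be" },
        right: { type: "Identifier", name: "_0x4a591a" },
      },
    }],
  },
};

// Shape built by the visitor: object variable name -> list of [key, function node].
const proxyFuncVars = {
  "_0x388c13": [["wag", wagNode]],
};

const [key, fn] = proxyFuncVars["_0x388c13"][0];
console.log(key, fn.body.body[0].argument.operator); // wag <
```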

The next thing that we need to do is traverse every CallExpression and see if it refers to an entry in our lookup table. If it does, we can simply replace the CallExpression with the expression that the proxy function returns. Here is a visitor that does just that:

function findProxyFunction(lookupTable, varName, funcName){
    //Only look at names we actually recorded (avoids inherited keys like "toString")
    if (!Object.prototype.hasOwnProperty.call(lookupTable, varName)) return null

    //Each entry is a [key, FunctionExpression] pair
    for(const func of lookupTable[varName]){
        if (func[0] === funcName){
            return func[1]
        }
    }
    return null
}

//This method belongs inside the visitor object
CallExpression(path){
    const { node } = path;
    if (!t.isMemberExpression(node.callee) || !t.isIdentifier(node.callee.object)) return;

    const varName = node.callee.object.name;
    //Handle both obj.key and obj["key"] style calls
    const funcName = node.callee.computed ? node.callee.property.value : node.callee.property.name;

    const proxyFunc = findProxyFunction(this.proxyFuncVars, varName, funcName);
    if (proxyFunc){
        //We found a proxy function, so do a replacement
        const returned = proxyFunc.body.body[0].argument;
        if (t.isBinaryExpression(returned)){
            path.replaceWith(t.binaryExpression(returned.operator, node.arguments[0], node.arguments[1]))
        } else if (t.isCallExpression(returned)){
            //The first argument is the real function, the rest are its arguments
            const callArgs = node.arguments.slice(1);
            path.replaceWith(t.callExpression(node.arguments[0], callArgs))
        }
    }
},

Caution: The above code doesn’t consider the ordering of the arguments. For the utmvc script this isn’t a problem since they’re always in order, but you should take argument positions into account when working with other obfuscated code.
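One way to handle argument ordering, sketched here under stated assumptions rather than taken from the original plugin, is to map each parameter name to the argument passed in that position and substitute into the returned expression. The nodes are plain-object stand-ins for Babel nodes, and the helper names substitute and inlineProxy are made up for this example.

```javascript
// Plain-object stand-ins for AST nodes; in a real plugin these would be Babel nodes.
const id = (name) => ({ type: "Identifier", name });

// Return a copy of `expr` with each parameter identifier replaced by its argument.
function substitute(expr, bindings) {
  if (expr === null || typeof expr !== "object") return expr;
  if (Array.isArray(expr)) return expr.map((e) => substitute(e, bindings));
  if (expr.type === "Identifier" && bindings.has(expr.name)) {
    return bindings.get(expr.name);
  }
  const copy = {};
  for (const [k, v] of Object.entries(expr)) copy[k] = substitute(v, bindings);
  return copy;
}

// Inline a single-return proxy function at a call site with arguments `args`.
function inlineProxy(proxyFunc, args) {
  const bindings = new Map(proxyFunc.params.map((p, i) => [p.name, args[i]]));
  return substitute(proxyFunc.body.body[0].argument, bindings);
}

// Hypothetical proxy with swapped operands: function (a, b) { return b - a; }
const swapped = {
  type: "FunctionExpression",
  params: [id("a"), id("b")],
  body: {
    type: "BlockStatement",
    body: [{
      type: "ReturnStatement",
      argument: { type: "BinaryExpression", operator: "-", left: id("b"), right: id("a") },
    }],
  },
};

const result = inlineProxy(swapped, [id("x"), id("y")]);
console.log(result.left.name, result.operator, result.right.name); // y - x
```

This sketch treats every identifier with a parameter's name as a reference to that parameter, so it would also rewrite a non-computed property name that happens to match. It also reuses the argument nodes rather than cloning them. A real Babel version would need to deal with both.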

We’ve now transformed the code and removed all the proxy references!

Before:

_0x25c494[_0x25c494.length] = _0x5cf4da.bfY(_0x4ce1f7, _0x5cf4da.szJ(_0x412285, "=undefined"));

After:

_0x25c494[_0x25c494.length] = _0x4ce1f7(_0x412285 + "=undefined");

yog

Blog: https://yoghurtbot.github.io

Twitter: https://twitter.com/yoghurtbot

GitHub: https://github.com/yoghurtbot