
Babel Traverse - Part 1 - Taking a look at how @babel/traverse works

February 7, 2023

Nero


Originally published at https://nerodesu017.github.io/posts/2023-02-07-babel-traverse-part-1

Today we will take a look at how the @babel/traverse (or babel-traverse, we'll use them interchangeably) package works. In part 1, we'll take a look at the helper functions from the entry point file. After this, we shall get into the actual traversal, Scopes, NodePaths, how renaming works and more.

For those of you that don't yet know what the babel suite is: Babel is a toolchain that is mainly used to convert ECMAScript 2015+ code into a backwards compatible version of JavaScript in current and older browsers or environments.

Of course, it isn't limited to that: we used it on this blog in the past to reverse engineer a good part of Incapsula, and it can be used for anything that involves changing how a JS (or TS) file looks.

Now, the big picture of how babel works looks like this:

1. Parse a JavaScript file into an AST (Abstract Syntax Tree) using the @babel/parser package
2. Traverse this AST using @babel/traverse and apply the wanted transformations
3. Generate the code from this AST using @babel/generator

Most of the magic happens at the @babel/traverse level, so that's what is of most use to us at the moment.

Babel-traverse Structure

.
└── babel-traverse/
    ├── cache.ts
    ├── context.ts
    ├── hub.ts
    ├── index.ts
    ├── traverse-node.ts
    ├── types.ts
    ├── path/
    │   └── ...
    └── scope/
        ├── binding.ts
        ├── index.ts
        └── lib/
            └── renamer.ts

index.ts - entry point

Taking a look at index.ts, we have the following functions (methods) and properties attached to the exported traverse:

└── traverse/
    ├── visitors/
    │   ├── explode()
    │   ├── verify()
    │   └── merge()
    ├── verify()
    ├── explode()
    ├── cheap()
    ├── node()
    ├── clearNode()
    ├── removeProperties()
    ├── hasType()
    └── cache/
        ├── path
        ├── scope
        ├── clear()
        ├── clearPath()
        └── clearScope()

1. cache

The cache tracks two main things, path and scope, each of them being a WeakMap.

- We'll understand more about these later, when we get to the actual traversal and modification of the AST, because we'll see that babel doesn't deal with just the AST's nodes: it also creates its own wrapper around the AST (so to speak), which is a Path-Tree.
- It also takes the scopes into account (so it can tell where variables are referenced and such).

Clearing the cache is useful when you want to recompute the whole Path-Tree/Scopes without generating and reparsing the whole script. Example:

    traverse.cache.clear();
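To make the idea of a WeakMap-backed cache more concrete, here is a minimal, dependency-free sketch with the same shape as the tree above (path, scope, clear(), clearPath(), clearScope()). It is an illustration under simplifying assumptions, not Babel's actual source: the cacheSketch object, the getOrCreatePath helper and the wrapper object it stores are all hypothetical names made up for this example.

```javascript
// Simplified sketch of a WeakMap-backed cache; NOT Babel's real implementation.
const cacheSketch = {
  path: new WeakMap(), // AST node -> path wrapper
  scope: new WeakMap(), // AST node -> scope information
  clearPath() {
    this.path = new WeakMap();
  },
  clearScope() {
    this.scope = new WeakMap();
  },
  clear() {
    this.clearPath();
    this.clearScope();
  },
};

// Hypothetical helper: return the cached wrapper for a node, or create one.
function getOrCreatePath(node) {
  let wrapper = cacheSketch.path.get(node);
  if (!wrapper) {
    wrapper = { node };
    cacheSketch.path.set(node, wrapper);
  }
  return wrapper;
}

const node = { type: "Identifier", name: "x" };
const first = getOrCreatePath(node);
const second = getOrCreatePath(node); // served from the cache
cacheSketch.clear();
const third = getOrCreatePath(node); // rebuilt after the cache was cleared

console.log(first === second); // true
console.log(first === third); // false
```

Because the keys are the node objects themselves, nothing has to be reparsed: dropping the maps is enough to force every wrapper to be rebuilt the next time it is requested.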

2. hasType()

Useful just to see if a tree (or subtree) contains a specific NodeType. We can also pass denylistTypes: Array<string>, in which case nodes whose NodeType is denylisted are not searched. Example:

Program {
  body: [
    ExpressionStatement {
      expression: CallExpression {
        callee: MemberExpression {
          object: Identifier {
            name: "console"
          },
          computed: false,
          property: Identifier {
            name: "log"
          }
        },
        arguments: [
          StringLiteral {
            value: "Hello World"
          }
        ]
      }
    }
  ]
}

// This Program AST contains the following NodeTypes: ["Program", "ExpressionStatement", "CallExpression", "MemberExpression", "Identifier", "StringLiteral"]
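As a rough illustration of the behaviour described above, the following is a dependency-free sketch of a hasType-style check over a plain object version of that AST. The function name hasTypeSketch and its recursive walk are assumptions made for this example; Babel's real hasType() relies on its own traversal machinery.

```javascript
// Sketch only: walks plain objects and arrays, not Babel's actual traversal.
function hasTypeSketch(tree, type, denylistTypes = []) {
  if (tree === null || typeof tree !== "object") return false;
  if (Array.isArray(tree)) {
    return tree.some((child) => hasTypeSketch(child, type, denylistTypes));
  }
  // Do not look at (or below) denylisted nodes.
  if (denylistTypes.includes(tree.type)) return false;
  if (tree.type === type) return true;
  return Object.values(tree).some((child) =>
    hasTypeSketch(child, type, denylistTypes)
  );
}

// The console.log("Hello World") AST, written as plain objects.
const program = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "console" },
          computed: false,
          property: { type: "Identifier", name: "log" },
        },
        arguments: [{ type: "StringLiteral", value: "Hello World" }],
      },
    },
  ],
};

const foundLiteral = hasTypeSketch(program, "StringLiteral");
const foundNumeric = hasTypeSketch(program, "NumericLiteral");
// Both Identifiers live under the MemberExpression, which is denylisted here.
const foundIdentifier = hasTypeSketch(program, "Identifier", ["MemberExpression"]);

console.log(foundLiteral, foundNumeric, foundIdentifier); // true false false
```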

3. removeProperties()

It passes the node itself and each of its subnodes to clearNode().

4. clearNode()

Removes all properties of the passed node that start with "_", along with additional metadata properties such as location data and raw token data, and also deletes the node's entry from the traverse.cache.path WeakMap from earlier.
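The relationship between the two functions can be sketched without any dependencies. This is only an approximation of the behaviour described above: the names clearNodeSketch, removePropertiesSketch and pathCacheSketch are hypothetical, and the list of metadata keys is an assumption chosen to match "location data and raw token data", not a list copied from Babel.

```javascript
// Sketch only; METADATA_KEYS is an assumed list, not taken from Babel's source.
const METADATA_KEYS = ["loc", "start", "end", "tokens", "raw", "rawValue"];
const pathCacheSketch = new WeakMap(); // stand-in for traverse.cache.path

// Clears a single node: "_"-prefixed keys, metadata keys, and its cache entry.
function clearNodeSketch(node) {
  for (const key of Object.keys(node)) {
    if (key.startsWith("_") || METADATA_KEYS.includes(key)) delete node[key];
  }
  pathCacheSketch.delete(node);
}

// Clears a node and everything below it by calling clearNodeSketch on each node.
function removePropertiesSketch(tree) {
  if (tree === null || typeof tree !== "object") return tree;
  if (Array.isArray(tree)) {
    tree.forEach(removePropertiesSketch);
    return tree;
  }
  if (typeof tree.type === "string") clearNodeSketch(tree);
  Object.values(tree).forEach(removePropertiesSketch);
  return tree;
}

const declarator = {
  type: "VariableDeclarator",
  start: 4,
  end: 10,
  _scopeInfo: {},
  id: { type: "Identifier", name: "x", start: 4, end: 5 },
  init: { type: "NumericLiteral", value: 33, start: 8, end: 10 },
};
pathCacheSketch.set(declarator.init, { cached: true });

removePropertiesSketch(declarator);
console.log(JSON.stringify(declarator));
// {"type":"VariableDeclarator","id":{"type":"Identifier","name":"x"},"init":{"type":"NumericLiteral","value":33}}
```

Calling clearNodeSketch(declarator) alone would have stripped only the top-level node and left start and end on id and init, which is the difference between the two functions.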

5. node()

TO BE DOCUMENTED IN FUTURE PARTS - actual traversal

6. cheap()

TO BE DOCUMENTED IN FUTURE PARTS - actual traversal

7. explode()

As per the actual documentation, explode() does the following:

explode() will take a visitor object with all of the various shorthands
that we support, and validates & normalizes it into a common format, ready
to be used in traversal

The various shorthands are:
- `Identifier() { ... }` -> `Identifier: { enter() { ... } }`
- `"Identifier|NumericLiteral": { ... }` -> `Identifier: { ... }, NumericLiteral: { ... }`
- Aliases in `@babel/types`: e.g. `Property: { ... }` -> `ObjectProperty: { ... }, ClassProperty: { ... }`

Other normalizations are:
- Visitors of virtual types are wrapped, so that they are only visited when
  their dynamic check passes
- `enter` and `exit` functions are wrapped in arrays, to ease merging of
  visitors

In short, explode() normalizes the visitors you pass to the traverse() function: a middleman that makes your work easier.
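To see what that normalization looks like in practice, here is a small dependency-free sketch covering only the function shorthand, the "A|B" shorthand and the wrapping of enter/exit in arrays. The name explodeSketch is hypothetical, it returns a new object instead of modifying its argument, and it deliberately ignores aliases, virtual types and validation, so treat it as an illustration rather than a substitute for the real explode().

```javascript
// Sketch only: handles the function and "A|B" shorthands plus array wrapping.
// Aliases, virtual types and validation are intentionally left out.
function explodeSketch(visitor) {
  const normalized = {};
  for (const [key, value] of Object.entries(visitor)) {
    // `Identifier() {}` -> `Identifier: { enter() {} }`
    const handlers = typeof value === "function" ? { enter: value } : value;
    // `"Identifier|NumericLiteral"` -> one entry per type
    for (const type of key.split("|")) {
      const entry = (normalized[type] = normalized[type] || {});
      for (const phase of ["enter", "exit"]) {
        if (!handlers[phase]) continue;
        // Store handlers in arrays so several visitors can be merged later.
        entry[phase] = (entry[phase] || []).concat(handlers[phase]);
      }
    }
  }
  return normalized;
}

const onLiteral = () => "literal";
const onExit = () => "exit";

const exploded = explodeSketch({
  "Identifier|NumericLiteral": { exit: onExit },
  NumericLiteral: onLiteral,
});

console.log(Object.keys(exploded)); // [ 'Identifier', 'NumericLiteral' ]
console.log(exploded.NumericLiteral.enter.length); // 1
console.log(exploded.NumericLiteral.exit.length); // 1
```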

8. verify()

Again a middleman function; it gets called by explode(), so we won't look at it for now (if it is actually of interest, let me know and I could dive deeper in the next posts!).

9. visitors.merge()

This function, as the name implies, is used for merging visitors. It is not documented or anything, but let's take a look at how it is used in some plugins:

1. babel-helper-module-transforms/src/rewrite-this.ts > rewriteThisVisitor()

let environmentVisitor = {
  [skipKey]: (path) => path.skip(),

  "Method|ClassProperty"(path: NodePath<t.Method | t.ClassProperty>) {
    skipAllButComputedKey(path);
  },
};
const rewriteThisVisitor: Visitor = traverse.visitors.merge([
  environmentVisitor,
  {
    ThisExpression(path: NodePath<t.ThisExpression>) {
      path.replaceWith(unaryExpression("void", numericLiteral(0), true));
    },
  },
]);

2. babel-helper-create-class-features-plugin/src/misc.ts > findBareSupers()

const findBareSupers = traverse.visitors.merge<NodePath<t.CallExpression>[]>([
  {
    Super(path: NodePath<t.Super>) {
      const { node, parentPath } = path;
      if (parentPath.isCallExpression({ callee: node })) {
        this.push(parentPath);
      }
    },
  },
  environmentVisitor,
]);

From what we can see, this really is just for merging visitors: you can have a base visitor and extend it without actually modifying it. But how exactly is that different from just doing it like this:

const visitor1 = {
  // ...
};
const visitor2 = {
  // ...
};
const finalVisitor = {
  ...visitor1,
  ...visitor2,
};

Well, let's look at a basic example (I'm not sure it covers all the cases, but it will show you that visitors.merge() is, indeed, a little better than just the Spread Syntax):

const parser = require("@babel/parser");
const traverse = require("@babel/traverse");

const file = `function a() {
  let x = 33;
}`;

// Parse the same source twice so that each traversal gets a fresh AST
const AST1 = parser.parse(file);
const AST2 = parser.parse(file);

let visitor1 = {
  NumericLiteral() {
    console.log("Inside visitor1");
  },
};
let visitor2 = {
  NumericLiteral() {
    console.log("Inside visitor2");
  },
};

// Spread Syntax: visitor2's NumericLiteral key overwrites visitor1's
traverse.default(AST1, {
  ...visitor1,
  ...visitor2,
});

// Prints an empty line to separate the two runs
console.log();

// visitors.merge(): both NumericLiteral handlers are kept
traverse.default(AST2, {
  ...traverse.visitors.merge([visitor1, visitor2]),
});

This will output the following (the empty line comes from the bare console.log() call between the two traversals):

Inside visitor2

Inside visitor1
Inside visitor2

So we can see that we can have multiple visitors for a given NodeType: visitors.merge() keeps all of them, in the order you feed them to it, whereas the Spread Syntax keeps only the last one. All in all, it isn't a bad thing to use over the Spread Syntax.
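The difference can also be reproduced without Babel at all. The sketch below is a deliberately tiny stand-in for the merging idea, assuming every handler is written in the `Type() { ... }` shorthand; mergeSketch is a hypothetical name and does far less than the real visitors.merge() (no exit handlers, no state, no validation).

```javascript
// Sketch only: collects same-named handlers into an `enter` array, in order.
function mergeSketch(visitors) {
  const merged = {};
  for (const visitor of visitors) {
    for (const [type, handler] of Object.entries(visitor)) {
      merged[type] = merged[type] || { enter: [] };
      merged[type].enter.push(handler);
    }
  }
  return merged;
}

const calls = [];
const visitorA = {
  NumericLiteral() {
    calls.push("A");
  },
};
const visitorB = {
  NumericLiteral() {
    calls.push("B");
  },
};

// Spread Syntax: the later key silently replaces the earlier one.
const spreadVisitor = { ...visitorA, ...visitorB };
spreadVisitor.NumericLiteral();

// Merging: both handlers survive and run in the order they were passed.
const mergedVisitor = mergeSketch([visitorA, visitorB]);
mergedVisitor.NumericLiteral.enter.forEach((handler) => handler());

console.log(calls); // [ 'B', 'A', 'B' ]
```

This mirrors the output of the Babel example: one call from the spread visitor, then both calls from the merged one.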

As a quick recap, here's what we need to know:

- cache - to be used when we want to recompute the Program's NodePaths and Scopes
- hasType() - tells us whether the node contains a specific NodeType inside itself, while also letting us blacklist certain subnodes (by their NodeType) from being traversed
- removeProperties() - clears the node and its subnodes by making use of clearNode(). It doesn't seem to recompute Scopes, and we are not sure whether it is okay to delete location data if we later want to access it and get raw strings from the main file. We haven't yet gone into how @babel/traverse computes NodePaths, Scopes and such, so we don't know if it will recompute location data
- clearNode() - clears a single node; use removeProperties() instead to clear the node recursively
- node() - about traversal, not yet researched
- cheap() - about traversal, not yet researched
- explode() - normalizes a visitor
- verify() - called by explode(), checks a given visitor against its own standards/rules
- visitors.merge() - merges multiple objects containing visitors, better than the Spread Syntax


Nero

Blog: https://nerodesu017.github.io

GitHub: https://github.com/nerodesu017