Babel Traverse - Part 1 - Taking a look at how @babel/traverse works
February 7, 2023
• Nero
Originally published at https://nerodesu017.github.io/posts/2023-02-07-babel-traverse-part-1
Today we will take a look at how the @babel/traverse package (or babel-traverse; we'll use the two names interchangeably) works. In part 1, we'll look at the helper functions exposed by the entry point file. After this, we'll get into the actual traversal, Scopes, NodePaths, how renaming works, and more.
For those of you that don't yet know what the babel suite is: Babel is a toolchain that is mainly used to convert ECMAScript 2015+ code into a backwards compatible version of JavaScript in current and older browsers or environments.
Of course, we have also used it on this blog in the past to reverse engineer a good part of Incapsula, so it can be used for anything that involves changing how a JS (or TS) file looks.
Now, the big picture of how Babel works looks like this:
- Parse a JavaScript file into an AST (Abstract Syntax Tree) using the @babel/parser package
- Traverse this AST using @babel/traverse and do the wanted transformations
- Generate the code from this AST using @babel/generator
Most of the magic happens at the @babel/traverse level, so that's what is of most use to us at the moment.
Babel-traverse Structure
.
└── babel-traverse/
    ├── cache.ts
    ├── context.ts
    ├── hub.ts
    ├── index.ts
    ├── traverse-node.ts
    ├── types.ts
    ├── path/
    │   └── ...
    └── scope/
        ├── binding.ts
        ├── index.ts
        └── lib/
            └── renamer.ts

index.ts - entry point
Taking a look at index.ts, we have the following functions (methods) and properties related to the exported traverse:
└── traverse/
    ├── visitors/
    │   ├── explode()
    │   ├── verify()
    │   └── merge()
    ├── verify()
    ├── explode()
    ├── cheap()
    ├── node()
    ├── clearNode()
    ├── removeProperties()
    ├── hasType()
    └── cache/
        ├── path
        ├── scope
        ├── clear()
        ├── clearPath()
        └── clearScope()

1. cache
The cache tracks two main things, the path and the scope, each of them being a WeakMap.
- We'll understand more about these later, when we get to the actual traversal and modification of the AST, because we'll see that Babel doesn't deal just with the AST's nodes but also creates its own AST wrapper (so to speak), which is a Path-Tree.
- It also takes the scopes into account (so it can tell where variables are referenced and such).
- It is useful when you want to recompute the whole Path-Tree/Scopes without generating and reparsing the whole script.
- Example: traverse.cache.clear();
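To make the idea of a WeakMap-backed cache more concrete, here is a minimal sketch of the pattern. It is not Babel's code: `pathCache`, `getOrCreatePath` and `clearCache` are hypothetical names, and the wrapper object is only a stand-in for a real NodePath.

```javascript
// Simplified model of a node -> wrapper cache (NOT Babel's implementation).
let pathCache = new WeakMap();
let wrappersCreated = 0;

// Returns the cached wrapper for a node, creating it on first access.
function getOrCreatePath(node) {
  let wrapper = pathCache.get(node);
  if (wrapper === undefined) {
    wrappersCreated += 1;
    wrapper = { node, id: wrappersCreated };
    pathCache.set(node, wrapper);
  }
  return wrapper;
}

// Dropping the whole map forces every wrapper to be rebuilt on next access.
function clearCache() {
  pathCache = new WeakMap();
}

const node = { type: "Identifier", name: "x" };
const first = getOrCreatePath(node);
const second = getOrCreatePath(node);
clearCache();
const third = getOrCreatePath(node);

console.log(first === second); // true: the wrapper was reused
console.log(first === third); // false: the wrapper was recomputed after clearing
```

Because the keys are held weakly, wrappers for AST nodes that are no longer referenced anywhere can be garbage collected without any explicit cleanup.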
2. hasType()
Useful just to see if a tree (or subtree) contains a specific NodeType.
We can also specify denylistTypes: Array<string>, which makes the search skip denylisted nodes (denylisted by their NodeType).
Example:
Program {
  body: [
    ExpressionStatement {
      expression: CallExpression {
        callee: MemberExpression {
          object: Identifier {
            name: "console"
          },
          computed: false,
          property: Identifier {
            name: "log"
          }
        },
        arguments: [
          StringLiteral {
            value: "Hello World"
          }
        ]
      }
    }
  ]
}

// This Program AST has the following NodeTypes: ["Program", "ExpressionStatement", "CallExpression", "MemberExpression", "Identifier", "StringLiteral"]

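The behaviour described above can be approximated on plain objects. This is only a sketch of the idea: `hasTypeSketch` is a hypothetical helper, it walks every object property instead of Babel's real visitor keys, and the `program` object is a hand-written version of the `console.log("Hello World")` tree.

```javascript
// Hypothetical stand-in for traverse.hasType(), operating on plain objects.
function hasTypeSketch(node, type, denylistTypes = []) {
  if (node === null || typeof node !== "object") return false;
  if (Array.isArray(node)) {
    return node.some((child) => hasTypeSketch(child, type, denylistTypes));
  }
  // A denylisted node is skipped together with everything below it.
  if (denylistTypes.includes(node.type)) return false;
  if (node.type === type) return true;
  return Object.values(node).some((child) =>
    hasTypeSketch(child, type, denylistTypes)
  );
}

const program = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: {
          type: "MemberExpression",
          object: { type: "Identifier", name: "console" },
          computed: false,
          property: { type: "Identifier", name: "log" },
        },
        arguments: [{ type: "StringLiteral", value: "Hello World" }],
      },
    },
  ],
};

console.log(hasTypeSketch(program, "StringLiteral")); // true
console.log(hasTypeSketch(program, "NumericLiteral")); // false
// Both identifiers live inside the MemberExpression, so denylisting it hides them.
console.log(hasTypeSketch(program, "Identifier", ["MemberExpression"])); // false
```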
3. removeProperties()
It passes each subnode (and the node itself) to clearNode().
4. clearNode()
Removes all properties from the passed node that start with "_", along with additional metadata properties like location data and raw token data, while also deleting the associated node from the traverse.cache.path WeakMap from earlier.
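The relationship between the two functions can be sketched like this. It is a simplified model on plain objects, not Babel's code: `clearNodeSketch`, `removePropertiesSketch` and the `METADATA_KEYS` list are assumptions made for illustration, and the cache deletion step is left out.

```javascript
// Assumed list of metadata keys; the real list may differ.
const METADATA_KEYS = ["start", "end", "loc", "raw", "tokens"];

// Strips "_"-prefixed and metadata properties from a single node.
function clearNodeSketch(node) {
  for (const key of Object.keys(node)) {
    if (key.startsWith("_") || METADATA_KEYS.includes(key)) {
      delete node[key];
    }
  }
}

// Applies clearNodeSketch to the node itself and to every subnode.
function removePropertiesSketch(tree) {
  if (tree === null || typeof tree !== "object") return tree;
  if (Array.isArray(tree)) {
    tree.forEach(removePropertiesSketch);
    return tree;
  }
  clearNodeSketch(tree);
  Object.values(tree).forEach(removePropertiesSketch);
  return tree;
}

const statement = {
  type: "ExpressionStatement",
  start: 0,
  end: 2,
  loc: { start: { line: 1, column: 0 }, end: { line: 1, column: 2 } },
  _scopeInfo: {},
  expression: { type: "Identifier", name: "x", start: 0, end: 1, _cached: true },
};

removePropertiesSketch(statement);
console.log(JSON.stringify(statement));
// {"type":"ExpressionStatement","expression":{"type":"Identifier","name":"x"}}
```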
5. node()
TO BE DOCUMENTED IN FUTURE PARTS - actual traversal
6. cheap()
TO BE DOCUMENTED IN FUTURE PARTS - actual traversal
7. explode()
As per the actual documentation, explode() does the following:
explode() will take a visitor object with all of the various shorthands that we support, and validates & normalizes it into a common format, ready to be used in traversal.
The various shorthands are:
- `Identifier() { ... }` -> `Identifier: { enter() { ... } }`
- `"Identifier|NumericLiteral": { ... }` -> `Identifier: { ... }, NumericLiteral: { ... }`
- Aliases in `@babel/types`: e.g. `Property: { ... }` -> `ObjectProperty: { ... }, ClassProperty: { ... }`
Other normalizations are:
- Visitors of virtual types are wrapped, so that they are only visited when their dynamic check passes
- `enter` and `exit` functions are wrapped in arrays, to ease merging of visitors
In short, explode() normalizes the visitors you pass to the traverse() function; it is a middleman that makes your work easier.
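Two of the shorthands, plus the array wrapping, are easy to model in a few lines. The sketch below is a deliberately reduced version: `explodeSketch` is a hypothetical helper, and it skips alias expansion, virtual types and all validation.

```javascript
// Hypothetical, reduced model of explode(): handles function shorthands,
// "A|B" keys, and wraps enter/exit handlers in arrays. Nothing else.
function explodeSketch(visitor) {
  const exploded = {};
  for (const [key, value] of Object.entries(visitor)) {
    // `Identifier() {}` becomes `Identifier: { enter() {} }`
    const handlers = typeof value === "function" ? { enter: value } : value;
    // `"A|B": {}` becomes separate `A` and `B` entries
    for (const type of key.split("|")) {
      if (!exploded[type]) exploded[type] = {};
      for (const phase of ["enter", "exit"]) {
        if (handlers[phase] === undefined) continue;
        // Wrap single functions in arrays; append to any earlier handlers.
        const fns = [].concat(handlers[phase]);
        exploded[type][phase] = (exploded[type][phase] || []).concat(fns);
      }
    }
  }
  return exploded;
}

const onEnter = () => "enter";
const onExit = () => "exit";
const normalized = explodeSketch({
  "Identifier|NumericLiteral": onEnter,
  StringLiteral: { exit: onExit },
});

console.log(Object.keys(normalized)); // [ 'Identifier', 'NumericLiteral', 'StringLiteral' ]
console.log(normalized.Identifier.enter[0] === onEnter); // true
console.log(Array.isArray(normalized.StringLiteral.exit)); // true
```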
8. verify()
Again, a middleman function. It gets called by explode(), so we won't look into it (at least for the moment; if it is actually of interest, let me know and I could dive deeper in the next posts!).
9. visitors.merge()
This function, as the name implies, is used for merging visitors. It is not documented or anything, but let's take a look at how it is used in some plugins:
1. babel-helper-module-transforms/src/rewrite-this.ts > rewriteThisVisitor()
let environmentVisitor = {
  [skipKey]: (path) => path.skip(),

  "Method|ClassProperty"(path: NodePath<t.Method | t.ClassProperty>) {
    skipAllButComputedKey(path);
  },
};
const rewriteThisVisitor: Visitor = traverse.visitors.merge([
  environmentVisitor,
  {
    ThisExpression(path: NodePath<t.ThisExpression>) {
      path.replaceWith(unaryExpression("void", numericLiteral(0), true));
    },
  },
]);

2. babel-helper-create-class-features-plugin/src/misc.ts > findBareSupers()
const findBareSupers = traverse.visitors.merge<NodePath<t.CallExpression>[]>([
  {
    Super(path: NodePath<t.Super>) {
      const { node, parentPath } = path;
      if (parentPath.isCallExpression({ callee: node })) {
        this.push(parentPath);
      }
    },
  },
  environmentVisitor,
]);

From what we can see, this really is just for merging visitors: you can have a base visitor and expand on it without actually modifying it. But how exactly is it different from just doing it like this:
const visitor1 = {
  // ...
};
const visitor2 = {
  // ...
};
const finalVisitor = {
  ...visitor1,
  ...visitor2,
};

Well, let's look at a basic example (I'm not sure it covers all the cases, but it will show you that visitors.merge() is, indeed, a little better than just the Spread Syntax):
const parser = require("@babel/parser");
const traverse = require("@babel/traverse");
const generate = require("@babel/generator");

const file = `function a() {
  let x = 33;
}`;

const AST1 = parser.parse(file);
const AST2 = parser.parse(file);

let visitor1 = {
  NumericLiteral() {
    console.log("Inside visitor1");
  },
};
let visitor2 = {
  NumericLiteral() {
    console.log("Inside visitor2");
  },
};
traverse.default(AST1, {
  ...visitor1,
  ...visitor2,
});

console.log();

traverse.default(AST2, {
  ...traverse.visitors.merge([visitor1, visitor2]),
});

This will output:
Inside visitor2

Inside visitor1
Inside visitor2

So we can see that we can have multiple visitors for a given NodeType, and visitors.merge() keeps all of them, in the order you feed them to it, whereas the Spread Syntax keeps only the last one. All in all, it isn't a bad thing to use over the Spread Syntax.
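The difference comes down to overwriting versus concatenating. Here is a minimal sketch of that idea without Babel involved at all; `mergeSketch` and `toHandlerList` are hypothetical helpers and only model `enter` handlers, so this is not how visitors.merge() is actually implemented.

```javascript
// Turns a visitor entry (function or { enter }) into a list of enter handlers.
function toHandlerList(value) {
  if (typeof value === "function") return [value];
  return [].concat(value.enter || []);
}

// Hypothetical merge: handlers for the same NodeType are appended, in order.
function mergeSketch(visitors) {
  const merged = {};
  for (const visitor of visitors) {
    for (const [type, value] of Object.entries(visitor)) {
      if (!merged[type]) merged[type] = { enter: [] };
      merged[type].enter.push(...toHandlerList(value));
    }
  }
  return merged;
}

const calls = [];
const visitor1 = { NumericLiteral() { calls.push("visitor1"); } };
const visitor2 = { NumericLiteral() { calls.push("visitor2"); } };

// Spread Syntax: the second NumericLiteral key overwrites the first.
const spread = { ...visitor1, ...visitor2 };
spread.NumericLiteral();

// Merge: both handlers survive and run in the order they were given.
const merged = mergeSketch([visitor1, visitor2]);
merged.NumericLiteral.enter.forEach((handler) => handler());

console.log(calls); // [ 'visitor2', 'visitor1', 'visitor2' ]
```

The recorded calls line up with the output of the real example above: one call from the spread visitor, two from the merged one.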
As a quick recap, here's what we need to know:
- cache - to be used when we want to recompute the Program's NodePaths and Scopes
- hasType() - if we want to see whether the node contains a specific NodeType inside itself, while also being able to denylist certain subnodes (with a specific NodeType) from being traversed
- removeProperties() - clears the node and its subnodes by making use of clearNode(); it doesn't seem to recompute Scopes, and we are not sure whether it is okay to delete location data if we later want to access it and get raw strings from the main file. We haven't yet gone into how @babel/traverse computes NodePaths, Scopes and such, so we don't know if it will recompute location data
- clearNode() - we should use removeProperties() to clear the node recursively
- node() - about traversal, not yet researched
- cheap() - about traversal, not yet researched
- explode() - normalizes the visitor
- verify() - called by explode(); checks a visitor against Babel's own standards/rules
- visitors.merge() - merges multiple objects containing visitors, better than the Spread Syntax