🐧 Ciel, Yiwei Gong

Recreate deno from scratch (#re-deno) #1

Sun Jun 24, 2018

Note: the code in this article was compiled and tested on CentOS 7, based on commit 246 on the master branch.

At JSConf EU 2018, Ryan Dahl gave a talk, Design Mistakes in Node, and introduced his next-generation server-side TypeScript runtime, deno, to the public. At the moment, deno is still at a very early stage. It has no stable API, and not even a basic runnable binary is provided. Re-deno is a project that aims to recreate deno from scratch in order to understand the underlying technology of deno’s implementation.

Since deno was created, it has already been refactored and rewritten a few times. In the first version of deno, Dahl used a Go V8 binding to handle the communication between TypeScript and the native API. In recent commits, Dahl dropped Go to avoid the double-GC problem (Go and V8 each running their own garbage collector) and started using a C++ and Rust backend. Currently, the communication between TypeScript and V8 is handled by libdeno, a library written in C and C++, while the language logic, the compiler driver and deno’s native APIs are to be implemented in Rust.

Architecture and design

There are three layers of API to consider:

* L1: the low-level message passing API exported by libdeno,
* L2: the protobuf messages used internally,
* L3: the final deno namespace exported to users.

To be more specific, the L1 API is implemented in C and C++, the L2 API is going to be implemented in Rust and the L3 API is going to be implemented in TypeScript.

L1 API

deno.d.ts

pub(channel: string, msg: ArrayBuffer): null | ArrayBuffer;

pub is the only interface for making calls out of V8. In the TypeScript environment, every call to a deno native API must go through the pub function. The caller sends an ArrayBuffer and synchronously receives an ArrayBuffer back. The channel parameter specifies the purpose of the message.

The other interface receives messages from the native library.

sub(channel: string, cb: (msg: ArrayBuffer) => void): void;

The user can subscribe to a channel and receive an asynchronous callback when data is available. sub() is not strictly necessary to implement deno. All communication could be done through pub if there were a message to poll the event loop. For this reason, the sub API may be removed in the future.
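To illustrate the point that polling through pub could replace sub, here is a minimal sketch. It does not use the real libdeno: fakePub, the "poll" channel name and the event queue are all assumptions made for illustration.

```typescript
// Hypothetical native side: a queue of events waiting to be picked up.
const pendingEvents: ArrayBuffer[] = [];

// Stand-in for pub (assumption): the "poll" channel hands out one queued
// event per call and returns null when nothing is ready.
function fakePub(channel: string, _msg: ArrayBuffer): null | ArrayBuffer {
  if (channel === "poll") {
    return pendingEvents.shift() ?? null;
  }
  return null;
}

// sub-like delivery built only from pub: drain the queue, calling cb per event.
function pollAll(cb: (msg: ArrayBuffer) => void): number {
  let delivered = 0;
  for (;;) {
    const next = fakePub("poll", new ArrayBuffer(0));
    if (next === null) {
      break;
    }
    cb(next);
    delivered++;
  }
  return delivered;
}

// Usage: the native side queues two one-byte events, then TypeScript polls.
for (const value of [7, 8]) {
  const buf = new ArrayBuffer(1);
  new Uint8Array(buf)[0] = value;
  pendingEvents.push(buf);
}
const seen: number[] = [];
const deliveredCount = pollAll((msg) => {
  seen.push(new Uint8Array(msg)[0]);
});
```

The trade-off is that the TypeScript side has to decide when to poll, whereas with sub the native side pushes messages as they arrive.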

The L1 API is the bare minimum required to run TypeScript in deno. All native APIs in deno will use the L1 API to communicate with the native world. Both pub and sub are implemented in C and C++ and bound to the V8 JavaScript VM.

function print(x: string): void;

A way to print to stdout. Although this could easily be implemented through pub(), it is an important debugging tool because it avoids any intermediate infrastructure.

L2 API

msg.proto

The L2 API defines the language logic and is implemented on top of the L1 API. For example, native APIs such as setTimeout, clearTimeout and readFileSync are all defined in the L2 API. More precisely, the L2 API uses Protocol Buffers to exchange binary data by calling the pub function.

The L2 API will be implemented in Rust to handle all native functions, such as reading a file from disk or starting an HTTP client. Moreover, module resolution and code preparation will also be handled in the L2 layer.
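As a rough sketch of how an L2-style call might travel over pub from the TypeScript side: a request message is encoded to bytes, sent through pub, and the reply is decoded. To stay dependency-free, JSON over UTF-8 stands in for Protocol Buffers here. The message shape, the "os" channel, the CWD command and fakePub are all hypothetical and are not taken from msg.proto.

```typescript
// Hypothetical message shape; the real fields live in msg.proto.
interface L2Msg {
  command: string;
  payload: string;
}

// JSON over UTF-8 stands in for Protocol Buffers in this sketch.
function encodeMsg(msg: L2Msg): ArrayBuffer {
  const encoded = new TextEncoder().encode(JSON.stringify(msg));
  const buf = new ArrayBuffer(encoded.byteLength);
  new Uint8Array(buf).set(encoded);
  return buf;
}

function decodeMsg(buf: ArrayBuffer): L2Msg {
  return JSON.parse(new TextDecoder().decode(buf)) as L2Msg;
}

// Fake native side (assumption): answers a CWD request, returns null otherwise.
function fakePub(channel: string, msg: ArrayBuffer): null | ArrayBuffer {
  const request = decodeMsg(msg);
  if (channel === "os" && request.command === "CWD") {
    return encodeMsg({ command: "CWD_RESULT", payload: "/home/user" });
  }
  return null;
}

// What a wrapper around such a message exchange could look like.
function currentDir(): string {
  const reply = fakePub("os", encodeMsg({ command: "CWD", payload: "" }));
  if (reply === null) {
    throw new Error("native side returned no response");
  }
  return decodeMsg(reply).payload;
}

const dir = currentDir();
```

Whatever the wire format, the pattern is the same: only opaque bytes cross the pub boundary, and both sides agree on how to interpret them.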

L3 API

This is the high-level API provided by deno. It defines a set of TypeScript interfaces (for example, deno.readFileSync) for the deno environment. However, the intention is to expose functionality as simply as possible. There should be few or no “ergonomics” APIs. (For example, deno.readFileSync only deals with ArrayBuffers and does not have an encoding parameter to return strings.) The intention is to make it very easy to extend deno and to link in external modules, which can then add this functionality.
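The following sketch shows how an external module might add a string-returning variant on top of an ArrayBuffer-only readFileSync. The fakeDeno object and its in-memory files are stand-ins, and readTextFileSync is a hypothetical helper, not part of deno.

```typescript
// Stand-in for the deno namespace (assumption): serves file contents from memory.
const fakeFiles = new Map<string, string>([["/tmp/hello.txt", "hello deno"]]);

const fakeDeno = {
  // Bytes only, with no encoding parameter, as described above.
  readFileSync(filename: string): ArrayBuffer {
    const text = fakeFiles.get(filename);
    if (text === undefined) {
      throw new Error(`no such file: ${filename}`);
    }
    const encoded = new TextEncoder().encode(text);
    const buf = new ArrayBuffer(encoded.byteLength);
    new Uint8Array(buf).set(encoded);
    return buf;
  },
};

// Hypothetical user-land helper that adds the "ergonomic" string variant.
function readTextFileSync(filename: string, encoding: string = "utf-8"): string {
  return new TextDecoder(encoding).decode(fakeDeno.readFileSync(filename));
}

const rawBytes = fakeDeno.readFileSync("/tmp/hello.txt");
const text = readTextFileSync("/tmp/hello.txt");
```

Keeping the core function byte-oriented means decoding policy (encoding names, error handling) lives in user-land code rather than in the runtime.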

And one of the most important things:

> Deno does not aim to be API compatible with Node in any respect. Deno will export a single flat namespace “deno” under which all core functions are defined. We leave it up to users to wrap Deno’s namespace to provide some compatibility with Node.

Dive into details - an overview of the L1 API and its implementation

We will start with the L1 API. In fact, all the heavy work is handled by the V8 engine; the L1 API only builds a communication channel between V8 and the deno environment, and provides an access point for the L2 API.

Let’s start with the public interface of libdeno, include/deno.h.

First of all, neither Rust nor Go supports calling directly into C++ functions, so the public interface of libdeno is written in C, although the implementation is still C++.

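// Callback type for the handler that receives pub calls
typedef void (*deno_sub_cb)(Deno* d, const char* channel, deno_buf buf);

// Create a deno instance and register the callback
Deno* deno_new(void* data, deno_sub_cb cb);

// The pub API
int deno_pub(Deno* d, const char* channel, deno_buf buf);

// Set the return value of the pub call being handled
void deno_set_response(Deno* d, deno_buf buf);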

Note that the pub API is not provided by a single function. First, deno_pub is the access point that links to the TypeScript pub function; internally, the call is routed to a callback of type deno_sub_cb, registered through deno_new, which handles the pub call. deno_set_response is then used to set the return value of the pub function.
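To make this flow concrete, here is a small TypeScript model of the dispatch: a callback is registered at creation time, pub invokes it synchronously, and whatever the callback passes to setResponse becomes the return value of pub. The class and method names only mirror the C functions for readability. Returning null when no response was set is an assumption based on the null | ArrayBuffer return type of pub.

```typescript
type SubCb = (d: DenoModel, channel: string, buf: ArrayBuffer) => void;

class DenoModel {
  private readonly cb: SubCb;
  private currentResponse: ArrayBuffer | null = null;

  // Mirrors deno_new(data, cb): the callback is fixed at creation time.
  constructor(cb: SubCb) {
    this.cb = cb;
  }

  // Mirrors deno_set_response: meant to be called from inside the callback.
  setResponse(buf: ArrayBuffer): void {
    this.currentResponse = buf;
  }

  // Models what happens when TypeScript calls pub(channel, msg).
  pub(channel: string, msg: ArrayBuffer): null | ArrayBuffer {
    this.cb(this, channel, msg); // run the handler synchronously
    const response = this.currentResponse;
    this.currentResponse = null; // do not leak into the next call
    return response;
  }
}

// Hypothetical handler: doubles every byte on the "double" channel.
const model = new DenoModel((d, channel, buf) => {
  if (channel === "double") {
    const out = new ArrayBuffer(buf.byteLength);
    new Uint8Array(out).set(new Uint8Array(buf).map((b) => b * 2));
    d.setResponse(out);
  }
  // Other channels set no response, so pub returns null.
});

const input = new ArrayBuffer(3);
new Uint8Array(input).set([1, 2, 3]);
const doubled = model.pub("double", input);
const nothing = model.pub("other", input);
```

Using a response slot instead of a return value keeps the callback signature void, which matches the shape of deno_sub_cb.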

Knowing this, let’s take a look at the current main function in main.cc:

Note: at this moment, the Rust API is not ready (actually, even the C API is not ready either :|), so the deno binary and its deno_sub_cb handler, which are supposed to belong to the L2 API and be written in Rust, are currently written in C++. Dahl has created another binary, deno_rs, written in Rust, and most likely it will become the main entry point of deno in the future.

int main(int argc, char** argv) {
    // Set V8 environment, this will be discussed later
    // when we start looking at the communication between libdeno and V8.
    deno_init();
    deno_set_flags(&argc, argv);
    global_argv = argv;
    global_argc = argc;
    
    // Register MessagesFromJS as the pub handler. When pub is called,
    // MessagesFromJS will be invoked, and the channel and deno_buf
    // will be passed to it.
    Deno* d = deno_new(NULL, MessagesFromJS);

    bool r = deno_execute(d, "deno_main.js", "denoMain();");
    if (!r) {
        printf("Error! %s\n", deno_last_exception(d));
        exit(1);
    }
    
    // Free deno from the memory
    deno_delete(d);
}

At the moment, MessagesFromJS does no real work and only prints some information about the deno environment. But we can still take a peek to understand what happens inside MessagesFromJS.

void MessagesFromJS(Deno* d, const char* channel, deno_buf buf) {
    // Log channel name, we will see this later in the output
    printf("MessagesFromJS %s\n", channel);

    // The channel is ignored at this moment; we perform a START event for any channel.
    // In the future, this will be replaced by a router that finds the corresponding handler.
    deno::Msg response;
    response.set_command(deno::Msg_Command_START);
    char cwdbuf[1024];
    std::string cwd(getcwd(cwdbuf, sizeof(cwdbuf)));
    response.set_start_cwd(cwd);

    // Now we push the command line argument argv to response data, so later in TypeScript,
    // we will receive this data back.
    for (int i = 0; i < global_argc; ++i) {
        printf("arg %d %s\n", i, global_argv[i]);
        response.add_start_argv(global_argv[i]);
    }
    printf("response.start_argv_size %d \n", response.start_argv_size());

    std::string output;
    CHECK(response.SerializeToString(&output));

    deno_buf bufout{output.c_str(), output.length()};
    // Return data
    deno_set_response(d, bufout);
}

So MessagesFromJS is not complete yet, but it is the handler for all pub function calls. For now, it only logs the channel name and returns some information, such as argv, to the TypeScript environment.
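The router mentioned in the code comment does not exist yet. One possible shape for it, sketched in TypeScript for brevity, is a map from channel name to handler. ChannelRouter, its methods and the "byteLength" channel are all hypothetical; the real router would live on the native side.

```typescript
type ChannelHandler = (msg: ArrayBuffer) => ArrayBuffer | null;

class ChannelRouter {
  private readonly handlers = new Map<string, ChannelHandler>();

  register(channel: string, handler: ChannelHandler): void {
    if (this.handlers.has(channel)) {
      throw new Error(`channel already registered: ${channel}`);
    }
    this.handlers.set(channel, handler);
  }

  // Plays the role MessagesFromJS has today, but picks a handler per channel.
  dispatch(channel: string, msg: ArrayBuffer): ArrayBuffer | null {
    const handler = this.handlers.get(channel);
    return handler === undefined ? null : handler(msg);
  }
}

const router = new ChannelRouter();

// Hypothetical channel: replies with one byte holding the request length.
router.register("byteLength", (msg) => {
  const out = new ArrayBuffer(1);
  new Uint8Array(out)[0] = msg.byteLength;
  return out;
});

const routed = router.dispatch("byteLength", new ArrayBuffer(5));
const unrouted = router.dispatch("startDeno2", new ArrayBuffer(0));
```

In this sketch an unregistered channel yields null rather than an error; a real implementation might prefer to report unknown channels explicitly.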

Now it’s time to take a look at the TypeScript entry point, main.ts.

window["denoMain"] = () => {
    // First of all, it prints the version of TypeScript
    deno.print(`ts.version: ${ts.version}`);

    // Then it sends a pub function call to the deno environment. Now
    // we know MessagesFromJS will be invoked. res is the data returned
    // from MessagesFromJS.

    // Remember that MessagesFromJS contains some printf calls. During the pub call,
    // control is transferred to the C++ side of deno, so those messages are
    // printed to stdout before pub returns.
    const res = deno.pub("startDeno2", emptyArrayBuffer());
    deno.print(`before`);

    // Now control returns to the TypeScript world. The code is executed by the V8 engine
    // and we can write logic in ordinary TypeScript syntax.
    const resUi8 = new Uint8Array(res);
    // Remember we use protobuf to send and receive data. Now we are able to decode
    // the data that we received from deno environment.
    const msg = pb.Msg.decode(resUi8);
    deno.print(`after`);
    const {
        startCwd: cwd,
        startArgv: argv,
        startDebugFlag: debugFlag,
        startMainJs: mainJs,
        startMainMap: mainMap
    } = msg;

    deno.print(`cwd: ${cwd}`);
    deno.print(`debugFlag: ${debugFlag}`);

    for (let i = 0; i < argv.length; i++) {
        deno.print(`argv[${i}] ${argv[i]}`);
    }
};

Now if we build and execute the deno binary, we will get this output:

$ ./deno
ts.version: 2.9.1
MessagesFromJS startDeno2
arg 0 ./deno
response.start_argv_size 1
before
after
cwd: /home/ciel/out
debugFlag: false
argv[0] ./deno

$ ./deno hello-world
ts.version: 2.9.1
MessagesFromJS startDeno2
arg 0 ./deno
arg 1 hello-world
response.start_argv_size 2
before
after
cwd: /home/ciel/out
debugFlag: false
argv[0] ./deno
argv[1] hello-world

So, apart from the first line (ts.version), everything before the “before” line is printed by printf in MessagesFromJS, while “before”, “after” and everything that follows are printed by deno.print in TypeScript (although deno.print is itself a native function that uses printf in C as its underlying implementation).

In conclusion, at this moment we have a very basic L1 API that allows us to expose native code to the TypeScript environment. Moreover, deno plans to provide only the pub API to handle the communication. Native APIs like readFileSync will be implemented on top of the pub message passing API, most likely in Rust.