Previously we had two encodings for JS files:
1. If a file contains only ASCII characters, encode it as a one-byte
string (interpreted as uint8_t array during loading).
2. If a file contains any characters with code point above 127,
encode it as a two-byte string (interpreted as uint16_t array
during loading).
This was done because V8 only supports Latin-1 and UTF16 encoding
as underlying representation for strings. To store the JS code
as external strings to save encoding cost and memory overhead
we need to follow the representations supported by V8.
Notice that there is a gap in the Latin1 range (128-255) that we
encoded as two-byte, which was an undocumented TODO for a long
time. That was fine previously because then files that contained
code points beyond the 0-127 range contained code points >255.
Now we have undici which contains code points in the range 0-255
(minus a replaceable code point >255). So this patch adds handling
for the 128-255 range to reduce the size overhead caused by encoding
them as two-byte. This could reduce the size of the binary by
~500KB and helps future files with this kind of code points.
Drive-by: replace `’` with `'` in undici.js to make it a Latin-1
only string. That could be removed if undici updates itself to
replace this character in the comment.
PR-URL: https://github.com/nodejs/node/pull/51605
Reviewed-By: Daniel Lemire <daniel@lemire.me>
Reviewed-By: Ethan Arrowood <ethan@arrowood.dev>
Commit 938212f added -msign-return-address=all to _all_ cflags but that
is wrong when cross-compiling, it should only be added to the target's
cflags.
The flag being deprecated, it is also changed to
`-mbranch-protection=standard`.
Fixes: https://github.com/nodejs/node/issues/42888
Co-Authored-By: Michaël Zasso <targos@protonmail.com>
PR-URL: https://github.com/nodejs/node/pull/51256
Fixes: https://github.com/nodejs/build/issues/3319
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Co-authored-by: Daniel Lemire <daniel@lemire.me>
PR-URL: https://github.com/nodejs/node/pull/50322
Reviewed-By: Jacob Smith <jacob@frende.me>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
PR-URL: https://github.com/nodejs/node/pull/50322
Reviewed-By: Jacob Smith <jacob@frende.me>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Antoine du Hamel <duhamelantoine1995@gmail.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Geoffrey Booth <webadmin@geoffreybooth.com>
This makes it easier to locate indeterminism in the snapshot, with
the following command:
$ ./configure --write-snapshot-as-array-literals
$ make V=
$ mv out/Release/obj/gen/node_snapshot.cc ./node_snapshot.cc
$ make V=
$ diff out/Release/obj/gen/node_snapshot.cc ./node_snapshot.cc
PR-URL: https://github.com/nodejs/node/pull/49312
Refs: https://github.com/nodejs/build/issues/3043
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
- Build a static table of octal strings and use it instead of
building octal strings repeatedly during printing.
- Print a newline and an offset for every 64 bytes in the case
of printing array literals so it's easier to locate
variation in snapshot blobs.
- Rework the printing routines so that the differences are only
made in a WriteByteVectorLiteral routine. We can update this
for compression support in the future.
- Rename Snapshot::Generate() that write the data as C++ source
instead of a blob as Snaphost::GenerateAsSource() for clarity,
and move the file stream operations into it to streamline
error handling.
PR-URL: https://github.com/nodejs/node/pull/48851
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Previously we hard-code a wrapper id to be used in BaseObjects
to avoid accidentally triggering cppgc on these non-cppgc-managed
objects, but hard-coding can be be hacky and result in mismatch
when we start to create CppHeap ourselves. This patch makes it
more robust by deducing non-cppgc id from the effective cppgc id,
if there is one.
PR-URL: https://github.com/nodejs/node/pull/48660
Refs: 9327503128
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
Incremental compilation of Node.js is slow. Currently on a powerful
Linux machine, it takes about 9 seconds and 830 MB of memory to compile
`gen/node_javascript.cc` with g++. This is the longest step when
recompiling a small change to a Javascript file.
`gen/node_javascript.cc` contains a lot of large binary literals of our
Javascript source code. It is well-known that embedding large binary
literals as C/C++ arrays is slow. One workaround is to include the data
as string literals instead. This is particularly nice for the Javascript
included via js2c, which look better as string literals anyway.
Add a build flag `NODE_JS2C_USE_STRING_LITERALS` to js2c. When this flag
is set, we emit string literals instead of array literals, i.e.:
```c++
// old: static const uint8_t X[] = { ... };
static const uint8_t *X = R"JS2C1b732aee(...)JS2C1b732aee";
// old: static const uint16_t Y[] = { ... };
static const uint16_t *Y = uR"JS2C1b732aee(...)JS2C1b732aee";
```
This requires some modest refactoring in order to deal with the flag
being on or off, but the new code itself is actually shorter.
I only enabled the new flag on Linux/macOS, since those are systems that
I have available for testing. On my Linux system with gcc, it speeds up
compilation by 5.5s (9.0s -> 3.5s). On my Mac system with clang, it
speeds up compilation by 2.2s (3.7s -> 1.5s). (I don't think this flag
will work with MSVC, but it'd probably speed up clang on windows.)
The long-term goal here is probably to allow this to occur incrementally
per Javascript file & in parallel, to avoid recompiling all of
`gen/node_javascript.cc`. Unfortunately the necessary gyp incantations
seem impossible (or at least, far beyond me). Anyway, a 60% speedup is a
nice enough win.
Refs: https://github.com/nodejs/node/issues/47984
PR-URL: https://github.com/nodejs/node/pull/48160
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Incremental compilation of Node.js is slow. Currently on a powerful
Linux machine, it takes about 5.8 seconds to compile
`gen/node_snapshot.cc` with g++.
As in the previous PR which dealt with `node_js2c`, we add a new build
define `NODE_MKSNAPSHOT_USE_STRING_LITERALS` which is used by
`node_mksnapshot`. When this flag is set, we emit string literals
instead of array literals for the snapshot blob and for the code cache,
i.e.:
```c++
// old: static const uint8_t X[] = { ... };
static const uint8_t *X = "...";
```
I only enabled the new flag on Linux/macOS, since those are systems that
I have available for testing. On my Linux system with gcc, it speeds up
compilation of this file by 3.7s (5.8s -> 2.1s). On my Mac system with
clang, it speeds up compilation by 1.7s (3.4s -> 1.7s).
Again, the right thing here is probably to generate separate files for
the snapshot blob and for each code cache output, but this is a nice
intermediate speedup.
Refs: https://github.com/nodejs/node/issues/47984
Refs: https://github.com/nodejs/node/pull/48160
PR-URL: https://github.com/nodejs/node/pull/48162
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
This makes it easier to use third-party dependencies in this tool
(e.g. adding compression using algorithms not available in Python).
It is also much faster - locally js2c.py takes ~1.5s to generate the
output whereas this version takes ~0.1s - and consumes less memory
(~110MB v.s. 66MB).
This also modifies the js2c.py a bit to simplify the output, making
it easier to compare with one generated by the C++ version. Locally
the output from the two are identical. We'll remove js2c.py in a
subsequent commit when the C++ version is used by default.
PR-URL: https://github.com/nodejs/node/pull/46997
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
NodeBIO's memory buffer structure does not support BIO_C_FILE_SEEK and B
IO_C_FILE_TELL. This prevents OpenSSL PEM_read_bio_PrivateKey from readi
ng some private keys. So I switched to OpenSSL'w own protected memory bu
ffers.
Fixes: https://github.com/nodejs/node/issues/47008
PR-URL: https://github.com/nodejs/node/pull/47160
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Don't link intermediate executables with LTO in order to speed up
overall build time.
Signed-off-by: Konstantin Demin <rockdrilla@gmail.com>
PR-URL: https://github.com/nodejs/node/pull/47313
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Move the bindings used by TextEncoder to a new binding for
more self-contained code.
PR-URL: https://github.com/nodejs/node/pull/46658
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Drive-by: Replace the SFINAE with a static_assert because we don't
have (or need) an implementation for non-scalar AliasedBufferBase
otherwise. Add forward declarations to memory_tracker.h now that
aliased-buffer.h no longer includes util-inl.h.
PR-URL: https://github.com/nodejs/node/pull/46817
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
PR-URL: https://github.com/nodejs/node/pull/46579
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Locally the hashing of the binding names sometimes has significant
presence in the profile of bindings, because there can be collisions,
which makes the cost of adding a new binding data non-trivial,
but it's wasteful to spend time on hashing them or dealing with
collisions at all, since we can just use the EmbedderObjectType
enum as the key, as the string names are not actually used beyond
debugging purposes and can be easily matched with a macro.
And since we can just use the enum as the key, we do not even
need the map and can just use an array with the enum as indices
for the lookup.
PR-URL: https://github.com/nodejs/node/pull/46620
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Chengzhong Wu <legendecas@gmail.com>
Python 3.9 on IBM i now properly returns "os400" for sys.platform
instead of claiming to be AIX as it did previously. While the IBM i PASE
environment is compatible with AIX, it is a subset and has numerous
differences which makes it beneficial to distinguish, however this means
that it now needs explicit support here.
PR-URL: https://github.com/nodejs/node/pull/46739
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
See documentation in dataqueue/queue.h for details
Co-authored-by: flakey5 <73616808+flakey5@users.noreply.github.com>
PR-URL: https://github.com/nodejs/node/pull/45258
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
PR-URL: https://github.com/nodejs/node/pull/46410
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
PR-URL: https://github.com/nodejs/node/pull/46410
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Benjamin Gruenbaum <benjamingr@gmail.com>
Reviewed-By: Tiancheng "Timothy" Gu <timothygu99@gmail.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Reviewed-By: Robert Nagy <ronagy@icloud.com>
PR-URL: https://github.com/nodejs/node/pull/45486
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Jiawen Geng <technicalcute@gmail.com>
PR-URL: https://github.com/nodejs/node/pull/46194
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Tobias Nießen <tniessen@tnie.de>
Reviewed-By: Luigi Pinca <luigipinca@gmail.com>
Reviewed-By: Richard Lau <rlau@redhat.com>
Reviewed-By: Colin Ihrig <cjihrig@gmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
Remove the dependency on ICU for this part, as well as the
hacky way of converting embedder main sources to UTF-8 via
V8 APIs. Allow `UnionBytes` to own the memory its pointing
to in order to simplify the code on the `BuiltinLoader` side.
PR-URL: https://github.com/nodejs/node/pull/46119
Reviewed-By: Yagiz Nizipli <yagiz@nizipli.com>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Rafael Gonzaga <rafael.nunu@hotmail.com>
Reviewed-By: Darshan Sen <raisinten@gmail.com>
PR-URL: https://github.com/nodejs/node/pull/45803
Reviewed-By: Robert Nagy <ronagy@icloud.com>
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: Michael Dawson <midawson@redhat.com>
Generalize the finalizer's second pass callback to make it
cancellable and simplify the code around the second pass callback.
With this change, it is determined that Reference::Finalize or
RefBase::Finalize are called once, either from the env's shutdown,
or from the env's second pass callback.
All existing node-api js tests should pass without a touch. The
js_native_api cctest is no longer applicable with this change,
just removing it.
PR-URL: https://github.com/nodejs/node/pull/44141
Refs: https://github.com/nodejs/node/issues/44071
Reviewed-By: Michael Dawson <midawson@redhat.com>