No, node_modules are not heavy

Posted 2020-07-04

JavaScript became the most popular language in the world, and with that popularity "hardcore" engineers started to harass JS developers and the ecosystem. I have no interest in finding out where this toxicity came from, but one of the stated issues is the size of the node_modules directory. I have always wondered how this is any different from other toolchains: I have a lot of experience with C# (backend), Java/Kotlin (Android), Go (backend), Python (ML), Objective-C/Swift (iOS), Scala (backend) and some experience with OCaml/F#/Haskell, and my experience with NodeJS is so far the most predictable, and its dependency tree is actually very compact.

TL;DR: installation sizes of the toolchains needed to build a UI app in various languages:

[Image: node_modules meme]

Main difference from other dependency systems

One of the main strengths of NodeJS is the ability to use multiple versions of a single library within the same build. The JS ecosystem is huge mostly because of how easy it is to publish anything, the low bar for starting to code, and the fact that it is the most used language in the world and can run almost anywhere. Since JS is quite old and lacks a large stdlib, you need to use a lot of small libraries that each solve one specific problem. There have been many attempts to build a stdlib, but the community is simply too diverse for a one-size-fits-all solution. You can think of it as something like the Scala ecosystem, with a lot of ideas and experimental libraries that are sometimes not that practical. Diversity is a huge win. For example, because of React we now see declarative paradigms of UI programming in almost every ecosystem. Because there are so many kinds of libraries, you can choose between performance and ease of use. You can pick native bindings for SQL or choose a JS implementation; you can pick one of a dozen different ORMs or write your own. All this makes the JS ecosystem very strong and diverse.
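
To make the "multiple versions in one build" point concrete, here is a rough sketch of the on-disk layout that allows it. Node resolves a require by looking in the nearest node_modules directory first and then walking up the parent directories, so a nested copy shadows the top-level one. The package names (lib-a, lib-x) and the helper functions are made up for illustration; nothing here touches a real registry.

```shell
#!/bin/sh
# Build a throwaway project layout in a temp dir.
# lib-a and lib-x are hypothetical package names, not real npm packages.
demo=$(mktemp -d)

# write_pkg DIR NAME VERSION: create a minimal package.json for a fake package.
write_pkg() {
  mkdir -p "$1"
  printf '{ "name": "%s", "version": "%s" }\n' "$2" "$3" > "$1/package.json"
}

# The app itself depends on lib-x 2.0.0, installed at the top level...
write_pkg "$demo/node_modules/lib-x" lib-x 2.0.0
# ...while lib-a needs the older lib-x 1.0.0 and keeps it in its own nested node_modules.
write_pkg "$demo/node_modules/lib-a" lib-a 1.0.0
write_pkg "$demo/node_modules/lib-a/node_modules/lib-x" lib-x 1.0.0

# list_versions ROOT NAME: print every installed copy of NAME together with its version.
list_versions() {
  find "$1" -path "*/node_modules/$2/package.json" | sort | while read -r f; do
    version=$(sed -n 's/.*"version": "\([^"]*\)".*/\1/p' "$f")
    echo "${f#"$1"/} $version"
  done
}

list_versions "$demo" lib-x
# prints:
# node_modules/lib-a/node_modules/lib-x/package.json 1.0.0
# node_modules/lib-x/package.json 2.0.0
```

In a real project, `npm ls <package>` shows the same kind of information: every place in the tree where a given package is installed and which version sits there.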

NodeJS benchmark application

We are going to measure the size of NodeJS using Docker with the base image ubuntu:20.04. The base image size is ~74MB, and we will subtract it to get the net size of the NodeJS setup.

Using docker

FROM ubuntu:20.04
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && apt install -y nodejs npm

We get size:

REPOSITORY          SIZE
test-node           601MB

So far our plain Node installation is 527MB (601MB - 74MB). This is not the smallest installation possible: there are slim packages available with a total size of ~55MB, but we are here to test a dev environment, not a production deployment.

Let's add some npm packages:

npm install express typescript webpack next

We get image size:

REPOSITORY          SIZE
test-node           783MB

The resulting installation of NodeJS with some frameworks is 709MB (783MB - 74MB). This makes node_modules 182MB (709MB - 527MB), of which TypeScript is ~80MB and NextJS is another ~80MB.
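
The subtraction above is simple, but it is easy to mix up which number is which, so here is a small sketch that redoes it. The sizes are the ones reported by Docker in this post; the helper name net_mb is my own and purely illustrative.

```shell
#!/bin/sh
# Sizes in MB, as reported by `docker images` in this post.
base=74          # ubuntu:20.04
node_image=601   # base + nodejs + npm
full_image=783   # previous image + express, typescript, webpack, next

# net_mb TOTAL UNDERLYING: size left after subtracting what was already there.
net_mb() { echo $(( $1 - $2 )); }

node_net=$(net_mb "$node_image" "$base")    # plain NodeJS install
full_net=$(net_mb "$full_image" "$base")    # NodeJS plus the four packages
modules=$(net_mb "$full_net" "$node_net")   # node_modules alone

echo "node: ${node_net}MB, node + packages: ${full_net}MB, node_modules: ${modules}MB"
# prints: node: 527MB, node + packages: 709MB, node_modules: 182MB
```

On a real checkout you do not need Docker for the last number: `du -sh node_modules` reports the directory size directly.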

Android toolchain

You could say that comparing NodeJS to Android is unfair, but I would say no, it is completely adequate: you are going to build an app for the most used OS in the world, which makes this toolchain the most popular one.

To measure the actual size of the Android SDK on disk I am going to use Docker images for predictable results. I found that a very minimal and clean Dockerfile for the Android SDK already exists. It is built on top of ubuntu:20.04. To measure the Android SDK size we simply subtract the size of the base image from the size of the resulting one.

REPOSITORY             SIZE
thyrlian/android-sdk   988MB

So, just to compile your Android app you need a minimum of a whopping 914MB (988MB - 74MB). Also, this Docker image DOES NOT have an actual SDK in it, just Java, Kotlin, the SDK manager and the needed dependencies. Adding the required platforms would add another couple of megabytes. We can also count emulators, since you will probably need a couple of them instead of a dozen hardware devices. Each emulator takes at least 1GB, whereas an additional browser for web testing requires only about 100MB of free disk space.

And you would say: this disk space is reused! Yes and no. The SDK can be reused, that's true, but Android development is a combination of Android Studio, Gradle and Java/Kotlin. This trio is deeply integrated, and if you want to upgrade one part you have to upgrade another. Often you just stick to specific versions for a while until something requires an upgrade, and if that means upgrading Kotlin, you may have to update your code, some libraries and whatnot. Sometimes the latest versions simply do not work. Sometimes libraries just disappear from Google repositories and you have to upgrade, and once in a while it happens right on the day of a release.

iOS toolchain

Measuring the size of the iOS toolchain is tricky. But Xcode requires you to have at least 20GB of disk space. Each version of iOS eats another gigabyte. This list could go on and on. Any fat binary is usually 100MB+ in size. CocoaPods and friends download the sources of your project's dependencies, and you have to wait for minutes while they compile. My current app takes 10+ minutes to build on a top-spec MBP right now. During a build they produce a large number of binary files that can become quite large.

As for the dependency tree, it is much flatter, but that is mostly due to the difficulty of reusing libraries. For example, it is almost impossible to mix two versions of a single library: React Native uses one version of Yoga and Texture uses another, and they are incompatible. Most libraries that need something just vendor it or implement the small feature themselves instead of relying on the community. Dependencies between projects mostly stay within a single entity (organization, community) and are almost never reused across them. We all wish it could work, but in practice it is simply impossible to have any deep dependency tree. As a result, every library usually has its own LRU cache, its own collections, and so on. All of them have bugs, and they are not developed within a single community.

Golang

Installing Go in Docker is quite simple, but we also need to install gcc since a lot of packages require it.

For an empty installation we get:

REPOSITORY          SIZE
test-node           733MB

So our bare Go installation is 659MB (733MB - 74MB).

But this is not fair: we are benchmarking building UI apps, web or desktop. Google has a UI library for Go (gxui) that requires these dependencies:

libxi-dev libxcursor-dev libxrandr-dev libxinerama-dev mesa-common-dev libgl1-mesa-dev libxxf86vm-dev

Let's install them:

REPOSITORY          SIZE
test-node           858MB

So an installation that is ready to use, without any application dependencies, is 784MB (858MB - 74MB).

Before Go modules, dependencies were quite fragile in the Go world, and not everyone has moved to modules yet. In Go an actual dependency tree can go very deep, just like in NodeJS.

Results

If you pick any general-purpose ecosystem that wasn't built for a small development environment footprint, you will always end up using a lot of disk space.
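
To put the Docker-based measurements from this post side by side, here is a small sketch that subtracts the ~74MB base image from each reported image size and sorts the result. It only restates numbers already given above; the labels and the report helper are my own naming, and iOS is left out because it was not measured with Docker.

```shell
#!/bin/sh
base=74  # ubuntu:20.04 image size in MB

# report: read "label image_size_mb" pairs on stdin and
# print the net size of each setup, smallest first.
report() {
  while read -r label size; do
    echo "$(( size - base )) $label"
  done | sort -n | while read -r net label; do
    printf '%-22s %4sMB\n' "$label" "$net"
  done
}

# Image sizes as reported by `docker images` earlier in this post.
report <<EOF
node 601
node+npm-packages 783
go+gcc 733
go+gcc+ui-libs 858
android-sdk-tools 988
EOF
```

The output lists plain NodeJS first (527MB), then Go with gcc (659MB), NodeJS with the four npm packages (709MB), Go with the UI libraries (784MB) and the Android SDK tools last (914MB).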

Everything is in Public Domain