Writing a Dockerfile with yarn@berry

We have a very simple Node app written in TypeScript that we’d like to Docker-ize. It uses npm and has the following directory structure:

./
├─ src/
│  └─ index.ts
├─ package.json
├─ package-lock.json
└─ tsconfig.json

For production, src/index.ts is built to dist/index.js, which is specified as main in package.json.

The Dockerfile could be as simple as:

FROM docker.io/node

WORKDIR /usr/src/app

COPY . .
RUN npm install && npm run build

USER node:node

CMD ["node", "."]

Optimizing the Dockerfile

Obviously, this Dockerfile isn’t optimized; it’s best to extract out a “builder” stage and copy only the production files from it into the final image. The benefit grows with the app, since development dependencies and source files never reach the final image.

Additionally, the default node image on Docker Hub is Debian-based; the alpine variant gives us a considerably smaller image.

FROM docker.io/node:alpine AS builder

WORKDIR /usr/src/app

COPY ./package.json ./package-lock.json ./
RUN npm install

COPY ./src ./src
COPY ./tsconfig.json ./

RUN npm run build && \
    npm prune --omit dev

FROM docker.io/node:alpine

WORKDIR /usr/src/app

COPY --from=builder /usr/src/app/dist ./dist
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY --from=builder /usr/src/app/package.json ./

USER node:node

CMD ["node", "."]

As an added bonus, if we don’t change anything in package.json or package-lock.json, subsequent builds will be much quicker because Docker can reuse cached layers all the way up to the COPY ./src ./src line.

Speaking of, notice that COPY . . was changed:

diff --git a/Dockerfile b/Dockerfile

- COPY . .
+ COPY ./src ./src
+ COPY ./tsconfig.json ./

This, again, helps with build time. Since the COPY doesn’t include any other miscellaneous configuration files (linter configs, CI configs, .gitignores, etc.), we’re able to iterate freely on those without incurring a Docker cache miss.

This could also be accomplished with liberal use of .dockerignore.

diff --git a/.dockerignore b/.dockerignore

+ .eslintrc
+ .git
+ .gitignore
+ .prettierrc
...

I recommend using both strategies together: the Docker daemon never even receives anything listed in .dockerignore, so ignoring large directories like .git will speed up the “Sending build context to Docker daemon” step, too.
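
For an even stricter setup, the ignore file can be inverted into an allowlist: ignore everything, then re-include only what the build needs. The sketch below writes such a file from the shell. write_dockerignore is a hypothetical helper name, and the file list assumes the npm layout shown earlier, so adjust it to your own project.

```shell
#!/bin/sh
# Hypothetical helper: writes an allowlist-style .dockerignore into the
# given directory (defaults to the current one).
write_dockerignore() {
    dir="${1:-.}"
    cat > "$dir/.dockerignore" <<'EOF'
# Ignore everything by default...
*
# ...then re-include only what the Dockerfile actually COPYs.
!src
!package.json
!package-lock.json
!tsconfig.json
EOF
}
```

Running write_dockerignore . in the project root produces the file. The upside of this style is that any config file added to the repo later is ignored automatically; the downside is that a new COPY source has to be added to the list by hand.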

Challenges with yarn@berry

The node Docker image doesn’t include yarn@berry; instead, each project is expected to keep its own copy of yarn@berry inside its .yarn/releases directory. The reason is that lockfiles and the like may differ between versions of yarn, so pinning yarn itself guarantees compatibility.

# FROM docker.io/node:alpine AS builder

COPY ./.yarn/releases ./.yarn/releases

Another option is to use Corepack, a built-in Node feature that reads the packageManager field from package.json; it just has to be enabled.

# FROM docker.io/node:alpine AS builder

RUN corepack enable

diff --git a/package.json b/package.json

+ "packageManager": "yarn@3.4.1"

Pruning devDependencies

yarn@berry doesn’t have a command analogous to npm prune, meaning there’s no built-in way to remove development dependencies. jq to the rescue!

# FROM docker.io/node:alpine AS builder

RUN yarn run build && \
    apk add jq && \
    jq -r '.devDependencies | keys | .[]' package.json \
        | xargs yarn remove

This does modify the package.json and yarn.lock, but it’s not the end of the world since version control isn’t a concern in this context.
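
The pipeline above assumes package.json has a non-empty devDependencies object. A slightly more defensive variant is sketched here, assuming jq is installed; dev_dependency_names and remove_dev_dependencies are hypothetical helper names. It treats a missing devDependencies field as empty, and uses xargs -r (a GNU extension that, as far as I know, BusyBox’s xargs on alpine also supports) so that yarn remove isn’t invoked with no arguments.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every devDependency, one per line.
# `// {}` substitutes an empty object when the field is missing, so jq
# does not fail on `null | keys`.
dev_dependency_names() {
    jq -r '(.devDependencies // {}) | keys | .[]' "${1:-package.json}"
}

# Hypothetical helper: remove all devDependencies with yarn.
# `xargs -r` skips the command entirely when there is nothing to remove.
remove_dev_dependencies() {
    dev_dependency_names "${1:-package.json}" | xargs -r yarn remove
}
```

Inlined into the Dockerfile, this amounts to replacing the original filter with '(.devDependencies // {}) | keys | .[]' and xargs with xargs -r.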

Plug’n’Play

Plug’n’Play (PnP) is a “hack” for node’s module resolution that provides safe and deterministic package resolution. I highly recommend checking out yarn’s documentation to learn more if you’re not familiar — it’s super cool!

PnP uses a couple of additional files and directories to work. (yarn auto-generates these during yarn install.) These need to be COPY-d from the builder into the final image instead of the node_modules directory.

# FROM docker.io/node:alpine AS builder

COPY ./.yarnrc.yml ./package.json ./yarn.lock ./

RUN yarn install

# FROM docker.io/node:alpine

COPY --from=builder /usr/src/app/.yarn/cache ./.yarn/cache
COPY --from=builder /usr/src/app/.yarn/install-state.gz ./.yarn/
COPY --from=builder \
    /usr/src/app/.pnp.cjs \
    /usr/src/app/.pnp.loader.mjs \
    /usr/src/app/.yarnrc.yml \
    /usr/src/app/package.json \
    /usr/src/app/yarn.lock \
    ./

In this implementation, .pnp.cjs (and .pnp.loader.mjs) need to be integrated into the node invocation. yarn@berry provides a shortcut for this: yarn node.

CMD ["yarn", "node", "."]

Zero-Installs

Additionally, PnP enables a feature called zero-installs. In a project using zero-installs, .zip files of each dependency are stored in .yarn/cache and are intended to be committed. This makes installs reproducible and removes the need to re-download packages, which speeds up yarn install significantly.

That directory needs to be COPY-d from our host into the image; then, we can use yarn install --immutable instead, which fails the build if yarn.lock would need to change. (Adding --immutable-cache makes it fail if the cache itself would need to change, which is the stricter check that zero-installs is really in effect.)

# FROM docker.io/node:alpine AS builder

COPY ./.yarn/cache ./.yarn/cache
COPY ./.yarnrc.yml ./package.json ./yarn.lock ./

RUN yarn install --immutable

Putting it all together, here’s the final Dockerfile:

FROM docker.io/node:alpine AS builder

RUN corepack enable

WORKDIR /usr/src/app

COPY ./.yarn/cache ./.yarn/cache
COPY ./.yarnrc.yml ./package.json ./yarn.lock ./

RUN yarn install --immutable

COPY ./src ./src
COPY ./tsconfig.json ./

RUN yarn run build && \
    apk add jq && \
    jq -r '.devDependencies | keys | .[]' package.json \
        | xargs yarn remove

FROM docker.io/node:alpine

RUN corepack enable

WORKDIR /usr/src/app

COPY --from=builder /usr/src/app/.yarn/cache ./.yarn/cache
COPY --from=builder /usr/src/app/.yarn/install-state.gz ./.yarn/

COPY --from=builder /usr/src/app/dist ./dist
COPY --from=builder \
    /usr/src/app/.pnp.cjs \
    /usr/src/app/.pnp.loader.mjs \
    /usr/src/app/.yarnrc.yml \
    /usr/src/app/package.json \
    /usr/src/app/yarn.lock \
    ./

USER node:node

CMD ["yarn", "node", "."]

Definitely more complex than the npm version, but we have all the benefits of yarn@berry: PnP for safety and determinism, zero-installs for speed, and so much more.