# Writing a Dockerfile with `yarn@berry`
We have a very simple Node.js app, written in TypeScript, that we'd like to Dockerize. It uses `npm` and has the following directory structure:
```
./
├─ src/
│  └─ index.ts
├─ package.json
├─ package-lock.json
└─ tsconfig.json
```
For production, `src/index.ts` is built to `dist/index.js`, which is specified as `main` in `package.json`.
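For reference, the relevant bits of such a `package.json` might look like the following sketch (the `build` script and the TypeScript version are assumptions, not taken from the project):

```json
{
  "name": "app",
  "main": "dist/index.js",
  "scripts": {
    "build": "tsc"
  },
  "devDependencies": {
    "typescript": "^4.9.0"
  }
}
```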
The `Dockerfile` could be as simple as:
```dockerfile
FROM docker.io/node
WORKDIR /usr/src/app
COPY . .
RUN npm install && npm run build
USER node:node
CMD ["node", "."]
```
## Optimizing the `Dockerfile`
Obviously, this `Dockerfile` isn't optimized; it's best to extract a "builder" stage and copy only the production files from that stage into the final image. This benefit is amplified as our app grows, since development dependencies and source files won't end up in the final image.
Additionally, the default `node` image on Docker Hub uses a `debian` base; we can do better by using the `alpine` variant.
```dockerfile
FROM docker.io/node:alpine AS builder
WORKDIR /usr/src/app
COPY ./package.json ./package-lock.json ./
RUN npm install
COPY ./src ./src
COPY ./tsconfig.json ./
RUN npm run build && \
    npm prune --omit=dev

FROM docker.io/node:alpine
WORKDIR /usr/src/app
COPY --from=builder /usr/src/app/dist ./dist
COPY --from=builder /usr/src/app/node_modules ./node_modules
COPY --from=builder /usr/src/app/package.json ./
USER node:node
CMD ["node", "."]
```
As an added bonus, if we don't change anything in `package.json` or `package-lock.json`, subsequent builds will be much quicker, because Docker is able to use cached versions of layers all the way until the `COPY ./src ./src` line.
Speaking of caching, notice that `COPY . .` was changed:
```diff
diff --git a/Dockerfile b/Dockerfile
- COPY . .
+ COPY ./src ./src
+ COPY ./tsconfig.json ./
```
This, again, helps with build time. Since the `COPY` doesn't include any other miscellaneous configuration files (linter configs, CI configs, `.gitignore`s, etc.), we're able to iterate freely on those without incurring a Docker cache miss.
This could also be accomplished with liberal use of `.dockerignore`:
```diff
diff --git a/.dockerignore b/.dockerignore
+ .eslintrc
+ .git
+ .gitignore
+ .prettierrc
...
```
I recommend using both strategies together; the Docker daemon can't even see anything excluded by `.dockerignore`, so ignoring large directories like `.git` will speed up the "Sending build context to Docker daemon" step, too.
## Challenges with `yarn@berry`
The `node` Docker image doesn't include `yarn@berry`; a copy of `yarn@berry` is actually intended to be stored in each project, inside the `.yarn/releases` directory. This restriction exists because lockfiles and the like may differ between versions of `yarn`, so pinning `yarn` itself guarantees compatibility.
```dockerfile
# FROM docker.io/node:alpine AS builder
COPY ./.yarn/releases ./.yarn/releases
```
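For the pinned copy to actually be used, `.yarnrc.yml` has to point at it via `yarnPath`; a minimal sketch (the release filename, and therefore the version, is an assumption):

```yaml
# .yarnrc.yml: tells the global `yarn` shim which release to run.
# The filename below is an example version.
yarnPath: .yarn/releases/yarn-3.4.1.cjs
```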
Another option is to use Corepack, a built-in Node.js feature that reads the `packageManager` field from `package.json`; it just has to be enabled.
```dockerfile
# FROM docker.io/node:alpine AS builder
RUN corepack enable
```
```diff
diff --git a/package.json b/package.json
+ "packageManager": "yarn@3.4.1"
```
## Pruning `devDependencies`
`yarn@berry` doesn't have a command analogous to `npm prune`, meaning there's no built-in way to remove development dependencies. `jq` to the rescue!
```dockerfile
# FROM docker.io/node:alpine AS builder
RUN yarn run build && \
    apk add --no-cache jq && \
    jq -r '.devDependencies | keys | .[]' package.json \
        | xargs yarn remove
```
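To make the pipeline concrete: if `devDependencies` contained, say, `typescript` and `@types/node` (hypothetical entries for illustration), the `jq` invocation prints one package name per line, and `xargs` folds them into a single `yarn remove` call:

```console
$ jq -r '.devDependencies | keys | .[]' package.json
@types/node
typescript
$ jq -r '.devDependencies | keys | .[]' package.json | xargs yarn remove
```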
This does modify `package.json` and `yarn.lock`, but it's not the end of the world since version control isn't a concern in this context.
## Plug’n’Play
Plug’n’Play (PnP) is a "hack" for `node`'s module resolution that provides safe and deterministic package resolution. I highly recommend checking out `yarn`'s documentation to learn more if you're not familiar; it's super cool!
PnP uses a couple of additional files and directories to work. (`yarn` auto-generates these during `yarn install`.) These additional files need to be `COPY`-d from the builder into the final image instead of the `node_modules` directory.
```dockerfile
# FROM docker.io/node:alpine AS builder
COPY ./.yarnrc.yml ./package.json ./yarn.lock ./
RUN yarn install

# FROM docker.io/node:alpine
COPY --from=builder /usr/src/app/.yarn/cache ./.yarn/cache
COPY --from=builder /usr/src/app/.yarn/install-state.gz ./.yarn/
COPY --from=builder \
    /usr/src/app/.pnp.cjs \
    /usr/src/app/.pnp.loader.mjs \
    /usr/src/app/.yarnrc.yml \
    /usr/src/app/package.json \
    /usr/src/app/yarn.lock \
    ./
```
In this implementation, `.pnp.cjs` (and `.pnp.loader.mjs`) need to be integrated into the `node` invocation. `yarn@berry` provides a shortcut for this: `yarn node`.
CMD ["yarn", "node", "."]
## Zero-Installs
Additionally, PnP enables a feature called zero-installs. In a project using zero-installs, `.zip` files of each dependency are stored in `.yarn/cache` and intended to be committed, providing absolute reproducibility and eliminating the need to re-download packages, speeding up `yarn install` significantly.
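Because the cache is committed, the repository's `.gitignore` keeps `.yarn/cache` while ignoring yarn's other internal directories; a sketch trimmed from yarn's recommended zero-install setup:

```diff
diff --git a/.gitignore b/.gitignore
+ .yarn/*
+ !.yarn/cache
+ !.yarn/releases
...
```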
That directory needs to be `COPY`-d from our host into the image; then, we can use `yarn install --immutable` instead to ensure we're using zero-installs.
```dockerfile
# FROM docker.io/node:alpine AS builder
COPY ./.yarn/cache ./.yarn/cache
COPY ./.yarnrc.yml ./package.json ./yarn.lock ./
RUN yarn install --immutable
```
Putting it all together, here's the final `Dockerfile`:
```dockerfile
FROM docker.io/node:alpine AS builder
RUN corepack enable
WORKDIR /usr/src/app

COPY ./.yarn/cache ./.yarn/cache
COPY ./.yarnrc.yml ./package.json ./yarn.lock ./
RUN yarn install --immutable

COPY ./src ./src
COPY ./tsconfig.json ./
RUN yarn run build && \
    apk add --no-cache jq && \
    jq -r '.devDependencies | keys | .[]' package.json \
        | xargs yarn remove

FROM docker.io/node:alpine
RUN corepack enable
WORKDIR /usr/src/app

COPY --from=builder /usr/src/app/.yarn/cache ./.yarn/cache
COPY --from=builder /usr/src/app/.yarn/install-state.gz ./.yarn/
COPY --from=builder /usr/src/app/dist ./dist
COPY --from=builder \
    /usr/src/app/.pnp.cjs \
    /usr/src/app/.pnp.loader.mjs \
    /usr/src/app/.yarnrc.yml \
    /usr/src/app/package.json \
    /usr/src/app/yarn.lock \
    ./

USER node:node
CMD ["yarn", "node", "."]
```
Definitely more complex than the `npm` version, but we have all the benefits of `yarn@berry`: PnP for safety and determinism, zero-installs for speed, and so much more.
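To try it out end to end, a build-and-run session might look like this (the `my-app` tag is just a placeholder):

```console
$ docker build -t my-app .
$ docker run --rm my-app
```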