Better Continuous Deployment with GitLab CI/CD
This article follows up on a previous article, which details the bare minimum for a CI/CD pipeline from GitLab to the npmjs.com package repository. It's not a bad start for learning how to deploy to npmjs.com from a pipeline, but as a pipeline itself, it's... well, it was my first attempt. This article will detail a better pipeline in terms of maintainability, build-safety, and testing. (NPM will still be used as an example, but the broader concepts will be applicable to other stacks.)
Requirements for a good GitLab CI/CD pipeline
A good pipeline should be able to do more than just authenticate and push to production.
I want to protect the production product from ever being in a non-compiling state.
I want to protect the production product from ever being in a state where some tests fail.
I want to deploy to production whenever my code compiles and my tests succeed without manual intervention.
I want my "main" branch to always be representative of the production code.
Therefore, I'd like my process to look something like this:
Push code to a "dev" branch whenever I fix a bug or complete a feature.
Run the compilation.
Run the test suite.
If both of those are successful, merge to "main."
Deploy from "main."
Authenticating our CI/CD pipeline with environment variables
Obviously, you can't put passwords or authentication tokens in a script in a publicly-visible open source project. Fortunately, GitLab allows secure storage and use of environment variables in CI/CD pipelines with these two protections:
Masking an environment variable protects the variable from being seen in the console output. It is easy to imagine a scenario where an error message (or just a simple scripting mistake) could lead to this kind of information being printed to the console, and once the toothpaste is out of the tube and on the internet, there's no putting it back in--you have to revoke that token and generate a new one. Masking prevents this easy-to-make security mistake.
Protecting an environment variable is a kind of access control. A protected environment variable can only be used in protected branches or on protected tags, and it can't be seen by all contributors.
A critically sensitive authentication token like an NPM publish token or a GitLab personal access token should be both protected and masked.
Generating a token for GitLab CI/CD
GitLab CI/CD pipelines do come with a CI_JOB_TOKEN environment variable, but it's a bit of a blunt instrument in terms of permissions--it doesn't have many of them, and you can't edit them, so the most secure and least annoying practice is to go ahead and create a fresh GitLab personal access token and give it exactly the permissions it needs and no more.
To create a GitLab personal access token:
Log into GitLab on the web.
Click on your profile photo on the top right of the screen to open the menu.
Click on preferences in the open menu.
Under "User Settings" on the left, select "Access Tokens" near the middle of the vertical navigation menu.
Give your token a meaningful name. Mine is named "merge-token" because it will only be used to merge dev branches into main branches in automated pipelines. For this purpose, it's probably impractical to set an expiration date, and that's okay.
I would recommend only giving the token read and write access to repositories, so that if the token is leaked the attacker at least won't have access to the whole GitLab API.
Once the token is created, save it in a password manager.
Generating an automation token in npm
The second token we'll need is from npm. The npm team has made this straightforward.
Go to npmjs.com and log in if you haven't already.
Click on your profile picture at the top right.
Select the fifth item, "Access Tokens."
Click "Generate New Token" on the top right of the page.
Select the middle option, "automation," for the right security settings.
Click "Generate Token."
Save the token in a password manager.
Storing the tokens in GitLab
Both tokens need to be available as environment variables in the pipeline. To add them to the pipeline's context:
Log into GitLab and open the project you intend to automate.
Select "Settings" at the bottom of the menu on the left. This will open a submenu.
Select "CI/CD."
Find the "Variables" section of the CI/CD menu and click "expand" on the right.
Then, for both variables:
Click the green "Add variable" button at the bottom.
Fill in the "Key" text box with "NPM_TOKEN" and "MERGE_TOKEN" respectively.
Fill in the "Value" box with the token from your password manager.
Make sure the "Type" is set to "variable" instead of "file."
Make sure both checkboxes are checked to protect and mask the variable.
(Again: Protecting the variable, while important for security-sensitive information like authentication tokens, makes the variable unavailable on unprotected branches or unprotected tags. Consult the GitLab documentation on protected variables if you are having trouble accessing your variables from the pipeline.)
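Because a protected variable silently turns up empty on an unprotected branch, it can help to fail early with a clear message instead of letting a later command fail in a confusing way. The following is a minimal sketch, not part of the original pipeline: `require_var` is a hypothetical helper name, and it only reports the variable's name, never its value.

```shell
# Hypothetical helper: fail fast when a CI/CD variable is empty or unset.
# Only the NAME is ever printed, so nothing sensitive reaches the job log.
require_var() {
  # $1 holds the variable's name; eval reads its value indirectly
  # (POSIX sh has no ${!name} expansion).
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "CI/CD variable $1 is empty or unset; is this branch or tag protected?" >&2
    return 1
  fi
  unset _value
}

# Possible usage at the top of a job's script:
#   require_var NPM_TOKEN && require_var MERGE_TOKEN
```

Called at the start of a job, this turns a missing token into an immediate, readable failure.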
Build and test automation in the dev branch
By default, GitLab CI/CD comes with three "stages"--build, test, and deploy--which will run in order whenever a commit is pushed. Let's go ahead and implement the first couple of stages.
image: node:latest
compile: # arbitrary name to identify the script
  stage: build # indicates its chronological order in the pipeline
  script:
    - npm ci # the recommended best practice for CI/CD (as opposed to npm i)
    - npm run build
  only:
    - dev # only run this script for the dev branch
test:
  stage: test
  script:
    - npm ci
    - npm run build
    - npm run test
  only:
    - dev
Understanding the default state of the repository in GitLab CI/CD
The way that GitLab sets up the repository inside the CI/CD runner by default is optimized to be fast, but not necessarily intuitive.
When it fetches a copy of the code, it doesn't clone the whole repository because the whole git history and the various branches often aren't needed in a CI/CD pipeline. The working copy also rests in a "detached" state rather than on any particular branch. Finally, its default origin URL authenticates with the CI_JOB_TOKEN, which does not have permission to push code.
These are three problems, each solvable in one step:
Swap out the job token for the GitLab personal access token by running the git remote set-url origin... command.
Get the main branch by running git pull origin main.
Check out the main branch using the git checkout command.
(...or you could just clone a fresh copy of the repository with a sensible origin and not bother figuring out how to make the existing pipeline work, but where is the fun in that?)
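For comparison, the fresh-clone alternative mentioned in the parenthetical might look roughly like this. It is a sketch under assumptions rather than what my pipeline does: `merge_dev_into_main` is a hypothetical function name, the caller supplies the authenticated URL, and a merge that is not a fast-forward would additionally need `user.name` and `user.email` configured on the runner.

```shell
# Hypothetical alternative: ignore the runner's checkout and work in a fresh clone.
# $1 = authenticated HTTPS URL of the project, $2 = scratch directory for the clone.
merge_dev_into_main() {
  remote_url=$1
  workdir=$2
  # A full clone already has origin/dev as a remote-tracking branch,
  # so no separate pull is needed before merging.
  git clone --branch main "$remote_url" "$workdir" &&
    git -C "$workdir" merge origin/dev &&
    git -C "$workdir" push origin main
}
```

The trade-off is a second download of the repository in exchange for not having to reason about the runner's shallow, detached checkout.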
Automating a merge in a GitLab pipeline
With that in mind, we end up with a CI/CD stage that looks like this:
merge:
  only:
    - dev
  script:
    - git remote set-url origin https://merge-token:${MERGE_TOKEN}@gitlab.com/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}.git
    - git pull origin main
    - git checkout main
    - git merge origin/dev
    - git push origin main
  stage: deploy
By the way, CI_PROJECT_NAMESPACE and CI_PROJECT_NAME aren't just placeholders--they're real environment variables provided to you automatically by GitLab, which is a nice feature because it means you can reuse this pipeline in similar projects. MERGE_TOKEN, of course, is the personal access token we created earlier.
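If you want the same job to work on a self-hosted GitLab instance as well, the URL can be assembled entirely from predefined variables. The sketch below assumes two that the pipeline above does not use: CI_SERVER_HOST (the instance's host name) and CI_PROJECT_PATH (namespace and project name joined by a slash). The function name `merge_remote_url` and the fallback to gitlab.com are my own choices for illustration.

```shell
# Hypothetical helper: build the authenticated remote URL from GitLab's
# predefined variables instead of hard-coding gitlab.com.
merge_remote_url() {
  # Falls back to gitlab.com when CI_SERVER_HOST is not set (an assumption
  # made so the function also works outside a runner).
  printf 'https://merge-token:%s@%s/%s.git\n' \
    "$MERGE_TOKEN" "${CI_SERVER_HOST:-gitlab.com}" "$CI_PROJECT_PATH"
}

# Possible usage inside the merge job:
#   git remote set-url origin "$(merge_remote_url)"
```

Since the result contains the token, it is best passed straight to git through command substitution rather than echoed to the log.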
Automating the deployment to npm
This is straightforward. To deploy to npmjs.com, authenticate by including your token in the .npmrc, recalling the $NPM_TOKEN environment variable we created earlier.
deploy:
  only:
    - main # importantly, deploy only from the main branch
  stage: deploy
  script:
    - echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" >> .npmrc
    - npm publish
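One variation worth knowing about: npm expands ${VAR} placeholders when it reads an .npmrc file, so the file can hold the placeholder rather than the token itself. The sketch below is an optional alternative, not what the script above does, and `write_npmrc` is a hypothetical helper name. The single quotes are what keep the shell from substituting the token before it is written.

```shell
# Hypothetical helper: append an auth line that references NPM_TOKEN by name.
# $1 = path of the .npmrc file to append to.
write_npmrc() {
  # Single quotes keep ${NPM_TOKEN} literal in the file; npm substitutes the
  # environment variable when it reads the file at publish time.
  printf '%s\n' '//registry.npmjs.org/:_authToken=${NPM_TOKEN}' >> "$1"
}

# Possible usage in the deploy job:
#   write_npmrc .npmrc && npm publish
```

With this approach the token never lands on the runner's disk, which is a small extra margin if the file is ever cached or uploaded as an artifact by mistake.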
Putting it all together
This is my full-length CI/CD script, which I am applying to an increasing number of projects such as rescript-notifications.
image: node:latest
compile: # arbitrary name to identify the script
  stage: build # indicates its chronological order in the pipeline
  script:
    - npm ci # the recommended best practice for CI/CD (as opposed to npm i)
    - npm run build
  only:
    - dev # only run this script for the dev branch
test:
  stage: test
  script:
    - npm ci
    - npm run build
    - npm run test
  only:
    - dev
merge:
  only:
    - dev
  script:
    - git remote set-url origin https://merge-token:${MERGE_TOKEN}@gitlab.com/${CI_PROJECT_NAMESPACE}/${CI_PROJECT_NAME}.git
    - git pull origin main
    - git checkout main
    - git merge origin/dev
    - git push origin main
  stage: deploy
deploy:
  only:
    - main
  stage: deploy
  script:
    - echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" >> .npmrc
    - npm publish
Handling NPM version numbers in a CI/CD pipeline
There's one small, annoying, potential issue you might bump up against: version numbers. NPM doesn't allow new code to be deployed under an existing version number, so every time you push, you will need to remember to update the version number in your package.json.
There's a somewhat cumbersome way to manage this automatically. You could create a version number in a GitLab environment variable and then use the GitLab API to update that version number within the pipeline.
However, I personally don't do this and don't recommend it because requiring you to think about version numbers is good. I don't want to autoincrement a patch number that should be a minor version or a minor version that should be a major version. A big part of the point of CI/CD is more quickly delivering value to users, so you don't want to burn off that goodwill by delivering breaking changes in a patch.
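A middle ground that keeps the decision in human hands is to have the pipeline check the version rather than change it. The sketch below is an assumption-laden illustration, not something in my pipeline: `version_is_unpublished` is a hypothetical helper that asks the registry through `npm view` whether a given version already exists. A network failure also produces no output and is therefore treated as "unpublished," in which case `npm publish` itself remains the final gatekeeper.

```shell
# Hypothetical guard: succeed (exit 0) when the registry reports no published
# copy of name@version. $1 = package name, $2 = version from package.json.
version_is_unpublished() {
  # npm view prints the version when it exists; an unknown version (or a
  # failed lookup) leaves the output empty.
  published=$(npm view "$1@$2" version 2>/dev/null) || true
  [ -z "$published" ]
}

# Possible usage early in the pipeline, reading the fields with node:
#   name=$(node -p "require('./package.json').name")
#   version=$(node -p "require('./package.json').version")
#   version_is_unpublished "$name" "$version" || {
#     echo "Version $version is already published; bump package.json first" >&2
#     exit 1
#   }
```

Run on the dev branch, a check like this would stop a forgotten version bump before the merge to main instead of at publish time.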
Looking forward to more fun with GitLab CI/CD
It feels good to have this process documented for myself, and I hope someone else will be able to get some value out of it as well. My next article will address dual-deployment to npmjs.com and GitLab's own npm registry.