Getting Started with Gulp, Browserify, and npm

I started using Gulp and Browserify lately and, quite frankly, they’re great. However, transitioning to a new build system is not without its pain points, whether you’re moving from something like Grunt or adding your first build tool to an existing project. The documentation is good and Gulp itself is easy to use, but most of the Browserify examples are geared toward simpler projects where only one bundle is needed.

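For example, if you’re creating a website of any significant size, there are likely to be multiple pages with varying functionality. You don’t want to create one giant bundle that contains your entire site’s JavaScript. Monoliths are bad, mmmkay? With that in mind, here are the main points that we’ll be walking through: a quick intro to Gulp, the project structure, npm setup and dependencies, building a cacheable common library, bundling and distributing multiple applications with one build script, automatically rebuilding applications, simplifying require() paths, transforms for templates, unit tests and linting, and housekeeping.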

That’s a lot of stuff to cover.  As the title of this article implies, I’m assuming that you’re starting from zero.  If you already know a little about npm, gulp, or browserify, you can probably safely skip ahead.  TL;DR: If you’re already decently acquainted with all of these tools, then you can just take a look at my gulpfile.js.

The entire project, as well as a high-level summary, is available on GitHub. Let’s get started!

Intro to Gulp

Gulp is a “streaming build system.” Since it uses streams, you’re going to actually do some programming. This is code over configuration, which I find delightful. When you run Gulp, it looks for a file called gulpfile.js. In this file, you will define the tasks that Gulp will execute for you. To run a gulp build, simply run the gulp command with no arguments from your project folder.

Couldn’t be easier, right? Since we specified no tasks on the command line, Gulp will execute the default task, which you define in gulpfile.js like any other task.

A task can do all kinds of things. It can be a function, or it can simply run a set of other tasks.

For instance, if the default task lists task1 and task2, both of them will be executed whenever you run the default task. However, if you just wanted one of these tasks to be executed, you could name it on the command line (for example, gulp task1).

Tasks can also have task dependencies. That is, a task can name other tasks that must finish before it is executed itself. In this way, you can force some tasks to run in sequence while still letting other tasks run concurrently.
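Putting those pieces together, a gulpfile along these lines would cover a default task, a task that only runs other tasks, and a task with a dependency. Treat it as a minimal sketch that assumes the gulp 3.x API, where gulp.task() accepts an optional array of dependencies. The package task name is made up for illustration, and the definitions are wrapped in a function that receives the gulp instance so the shape is easy to read on its own; in a real gulpfile.js you would finish with defineTasks(require('gulp')).

```javascript
// Sketch only: assumes the gulp 3.x signature gulp.task(name, [deps], fn).
function defineTasks(gulp) {
  // A task can be a plain function; calling done() signals completion.
  gulp.task('task1', function (done) {
    console.log('task1 finished');
    done();
  });

  gulp.task('task2', function (done) {
    console.log('task2 finished');
    done();
  });

  // A task can also be nothing more than a list of other tasks to run.
  gulp.task('default', ['task1', 'task2']);

  // 'package' (hypothetical name) depends on task1, so task1 must
  // complete before the body of 'package' starts.
  gulp.task('package', ['task1'], function (done) {
    console.log('package ran after task1');
    done();
  });
}
```

With something like this in place, running gulp executes default (and therefore task1 and task2), while running gulp task1 executes only that one task.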

 

Gulp is a very well documented tool, so I’ve only given you the broad strokes here.  If you have any questions about it, you should take a look at the documentation and the API reference.

Now that we have an elementary understanding of our build tool, let’s see how it fits into our project.

Project Structure

Example project structure

The focus groups say that pictures are helpful, so here’s a pretty picture of our basic project structure. There’s package.json (which we’ll get to next), gulpfile.js (where all of our build tasks will be defined), and a couple of example applications, but otherwise, it’s just a bunch of empty folders. We’ll fill it in with some actual files later, but first let’s go through a few important paths.

/applications/client – Herein lies all of your glorious client code–your JavaScript.  The apps folder contains the individual page applications. The includes folder contains all of your own homegrown dependencies and business logic like models, views, etc.  These things could be directly under /applications/client, but I like to do it this way to help stay organized.

/applications/server – Here be dragons.  Dragons that aren’t part of this tutorial.  Your server side components would live here.  For example, if this were a PHP or Python project, all that stuff would be in here.

/node_modules – This is where npm installs all of your npm dependencies.  You should probably always add this to .gitignore.  Checking this in will cause serious bloat to your repository, and there’s really no need for that.

/public_html/javascript – All your browserified and otherwise built JavaScript goodies will go in here.  The build process basically owns this folder and you will normally not put anything directly in here yourself.

/tests/client – JavaScript unit tests live here.  I like to copy the folder structure of /applications/client here and have a test file for each individual file from that path.  This way, you know where all of your tests are without having to think much about it. But you can organize your tests however your little heart desires.

/tests/server – Unit tests for your server code.  Also not within the scope of this tutorial.

npm Setup

Okay, let’s actually do something. Using gulp will require a number of local npm dependencies. To keep track of those dependencies within your project, and so that collaborators can easily install and update them, you need to initialize npm. To do so, run npm init in the root of your project folder and follow the prompts.

At the end, you’ll have a new file called package.json.  This JSON configuration file is used by npm to track a wide range of information including project version, dependencies, repository information, scripts, and a number of other things.  You should have something like this:

If you already have your project in a Git repository, npm will automatically fill in the repository, bugs and homepage properties. It’s only the polite thing to do.

You may have noticed that your package.json file doesn’t have the private and scripts.publish properties that mine does.  I’ve added these to prevent my project from being published to npm. Setting private to true is sufficient to do so, but scripts.publish is there as a fallback in case it gets removed by accident.  If you have passwords, anything proprietary or otherwise private in your project, publishing your entire code base to a public repository could lead to a few tears.  Best to avoid that.

Installing Dependencies

Now that npm is configured for our project, we can start installing dependencies. npm will keep track of all the dependencies you install if you use either the --save or --save-dev flag when you install them.  It does this by storing the names and versions of each dependency in package.json (this is why you needed to do the npm init step first).  Naturally, you can install more than one package at a time.  To initially install all the dependencies that we need, execute the following commands:

All of the above dependencies will only be used as part of the build and test processes.  It may look like a lot, but that’s how npm is: small and very focused modules.  You’ll see what they’re all for later.

For example’s sake, we’ll also assume that our project uses Twitter Bootstrap and jQuery, so let’s install those as well.

Now, you will see that package.json has a devDependencies property that looks like this:

Before I move on to the real meat of things, let me show you some of the joys of npm. Whenever someone new starts working on the project, they don’t need to know all the dependencies that you installed. All they have to do is run npm install in the project root.

And whenever new dependencies are added, existing collaborators can simply do:

Boom. Done.

Bundle and distribute multiple applications with one build script

From here on out, I’ll use a number of constants which I define up near the top of gulpfile.js (with all the require statements).  This makes changing build configurations much easier as almost all settings can be found in one place.  This is mostly for reference for the below code snippets, so don’t worry too much about it now.
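As a rough idea of what such a block of constants could look like, here is a sketch. The names EXTERNAL_LIBS, APPS_DIST_DIR, AUTOBUILD_FLAG_FILE, and ALLOW_NPM_MODULE_MANAGEMENT are the ones used in this article; every other name and every value is an assumption based on the project structure described earlier, including the paths inside the jquery and bootstrap packages.

```javascript
// Hypothetical build constants for the top of gulpfile.js.
// All values are assumptions; adjust them to your own layout.

// Where the source lives
var CLIENT_DIR = './applications/client';
var APPS_DIR = CLIENT_DIR + '/apps';
var APPS_GLOB = APPS_DIR + '/**/*.js';
var TESTS_GLOB = './tests/client/**/*.js';

// Where built files go
var JS_DIST_DIR = './public_html/javascript';
var APPS_DIST_DIR = JS_DIST_DIR + '/apps';
var EXTERNAL_LIBS_DIST_DIR = JS_DIST_DIR + '/lib';

// Libraries shared by every page: module name -> file to concatenate
var EXTERNAL_LIBS = {
  jquery: './node_modules/jquery/dist/jquery.min.js',
  bootstrap: './node_modules/bootstrap/dist/js/bootstrap.min.js'
};

// Marker file that records that the last build was an automatic one
var AUTOBUILD_FLAG_FILE = APPS_DIST_DIR + '/.autobuild';

// Whether the housekeeping task may install and update npm modules
var ALLOW_NPM_MODULE_MANAGEMENT = true;
```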

And now we’ve reached the good stuff. As I said, most of the browserify examples that I came across were very focused on building one bundle and deploying it to one place.  This didn’t meet my needs at all.  As you can see from the project structure, I have multiple different applications and I didn’t want to make one monolithic bundle that contained all of them.  This is how I broke the problem down.

Creating a cacheable common library

First, I identified major libraries that I knew I would use everywhere. Libraries like jQuery and Bootstrap are going to be used on almost every page of my website. Especially large libraries like jQuery can also really slow down Browserify, which slows down the entire build process, and that is definitely something to avoid. I didn’t want to include these in each application because it would make each bundle considerably larger and I wouldn’t be able to take advantage of browser caching for these common libraries.

After identifying these libraries, I created a gulp task that simply concatenates all of them together into one file.  Here’s that task:

This is very straightforward. We take all files defined by EXTERNAL_LIBS, concatenate them into one file, minify that file with uglify(), and then write that file to /public_html/javascript/lib/common.min.js.

Building the applications

Now that we have our common library, we can build our applications.

The build task itself is very simple, but nonetheless it is the heart of the entire build problem and indeed this article. Instead of processing all the application files as one stream, we process each one individually and create its own bundle. We then do some cleanup that is necessitated by the autobuild task, for reasons we’ll discuss later.

The getBundler() and bundle() functions share the work of the build task.  Typically, you wouldn’t see these things broken out into their own functions, but they will be reused later when we do automatic builds with watchify.  Let’s take a look at these functions now.

The getBundler() function takes a file and a set of options. We always enable source maps and define the browserify transforms that we want executed. In the build task, we do not specify any special options, but we will later in the autobuild task. Lastly, since we externalized certain modules as part of our common library, we need to be sure that we also exclude them from the bundled applications by marking them as external.
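To make that description concrete, here is a sketch of a getBundler()-style helper. It is not the article's actual function: TRANSFORMS and EXTERNAL_MODULES are hypothetical stand-ins for the build constants, and the browserify factory is passed in as an argument so the snippet reads on its own (in gulpfile.js you would pass require('browserify')). It relies only on browserify's debug option for source maps and on the bundler's transform() and external() methods.

```javascript
// Hypothetical stand-ins for constants defined near the top of gulpfile.js.
var TRANSFORMS = ['brfs'];
var EXTERNAL_MODULES = ['jquery', 'bootstrap'];

// file is a Vinyl-like object with a .path; options are extra browserify
// options supplied by the caller (none for build, some for autobuild).
function getBundler(browserify, file, options) {
  // Caller options first, then the settings we always enforce.
  var settings = Object.assign({}, options, {
    entries: [file.path],
    debug: true // always generate source maps
  });

  var bundler = browserify(settings);

  // Apply every transform we want executed during bundling.
  TRANSFORMS.forEach(function (transform) {
    bundler.transform(transform);
  });

  // These modules ship in the common library, so keep them out of
  // each application bundle by marking them as external.
  EXTERNAL_MODULES.forEach(function (name) {
    bundler.external(name);
  });

  return bundler;
}
```

Keeping the always-on settings last in Object.assign() means a caller cannot accidentally switch source maps off, while still being free to add options of its own.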

The bundle() function accepts the Vinyl file object for each application and the bundler that was created with getBundler().  We determine that file’s path relative to /applications/client/apps, bundle that application, write the normal version, minify it, and write the minified version. Everything is detailed in the comments.

Since we set the filename for the bundled application using the relative path of the original application, we end up with a folder structure in APPS_DIST_DIR (i.e., /public_html/javascript/apps) that matches the folder structure in /applications/client/apps. This keeps things neatly organized in the distribution location and prevents bundled applications from overwriting each other.

For example, the /applications/client/apps/login/reset-password/confirm.js application will be bundled and distributed to /public_html/javascript/apps/login/reset-password as confirm.js and confirm.min.js.
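The path mapping in that example can be sketched with nothing but Node's path module. The distPathsFor() helper below is hypothetical and only illustrates the idea of reusing the application's relative path for the output location; it uses path.posix so the result is the same on every platform.

```javascript
var path = require('path');

// Source and distribution roots as described in the article.
var APPS_DIR = '/applications/client/apps';
var APPS_DIST_DIR = '/public_html/javascript/apps';

// Hypothetical helper: where do the normal and minified bundles for a
// given application entry file end up?
function distPathsFor(sourceFile) {
  // Path of the application relative to the apps folder
  var relative = path.posix.relative(APPS_DIR, sourceFile);
  var normal = path.posix.join(APPS_DIST_DIR, relative);

  return {
    normal: normal,
    minified: normal.replace(/\.js$/, '.min.js')
  };
}

var result = distPathsFor('/applications/client/apps/login/reset-password/confirm.js');
console.log(result.normal);
// /public_html/javascript/apps/login/reset-password/confirm.js
console.log(result.minified);
// /public_html/javascript/apps/login/reset-password/confirm.min.js
```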

Automatically rebuild applications

This is all well and good, but we don’t want to have to manually run a build each time we make a change. Switching between our IDE, a terminal, and the browser is time-consuming and tedious. The goal is to have a way to automatically rebuild only the applications that have changed.

Enter watchify.  Watchify is a wrapper for browserify that watches an application and all of its dependencies, and will rebundle that application whenever it sees a change in any of those files.  As promised, we will reuse the getBundler() and bundle() functions from the build task which will make things significantly simpler.  Here’s the autobuild task.

Watchify builds necessarily set browserify’s options.fullPaths to true. This causes full, absolute paths to be included in the resultant bundles. This is fine for development, especially considering the benefits that Watchify provides, but we do not want our bundles to be committed or published like that. It unnecessarily bloats downloads, and exposing file system information could be considered a security risk in some environments. A manual build must be done to remove these full paths after an automatic build.

To prevent Watchified bundles from being committed, I created a Git pre-commit hook that checks for the AUTOBUILD_FLAG_FILE file and blocks the commit if it is found. Running a manual build removes this flag file, and commits can be made again. Ensuring that all collaborators have this pre-commit hook installed is done as part of the housekeeping task that we will cover later.
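Git hooks can be any executable, so one way to picture the check is as a small Node script. This is a sketch under assumptions, not the hook from the project: the .autobuild file name and its location are made up, and it relies on Git running pre-commit hooks from the root of the working tree and aborting the commit when the hook exits with a non-zero status.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical flag file location, relative to the repository root.
var AUTOBUILD_FLAG_FILE = path.join(
  process.cwd(), 'public_html', 'javascript', 'apps', '.autobuild'
);

// Returns the exit code Git should see: non-zero aborts the commit.
function checkAutobuildFlag(flagFile) {
  if (fs.existsSync(flagFile)) {
    console.error(
      'Commit blocked: the last build was an automatic build. ' +
      'Run a manual build and try again.'
    );
    return 1;
  }
  return 0;
}

// Report the result through the exit code when the script finishes.
process.exitCode = checkAutobuildFlag(AUTOBUILD_FLAG_FILE);
```

Saved as an executable pre-commit file with a Node shebang line at the top, a script like this would let commits through only after a manual build has removed the flag file.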

Simplifying require() paths

At this point, we’ve pretty much covered the necessities of our build system.  We can build a cacheable common library, build and distribute multiple applications within our project, and automatically rebuild those applications on the fly.  However, at some point in development, you may find yourself writing require statements that look like this.

If you only had one or two relative paths like this in your whole project, it might not be a big deal. However, if you ever move a module, you will find that every relative require path pointing to or from that module is broken.
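To see how fragile these paths are, the sketch below computes the specifier that a module would need in order to require a shared model from two different locations. The file locations are hypothetical examples that follow the project structure above, and relativeRequire() is a helper written just for this illustration.

```javascript
var path = require('path');

// Hypothetical helper: the specifier a module living in fromDir would
// have to pass to require() in order to load target.
function relativeRequire(fromDir, target) {
  var specifier = path.posix.relative(fromDir, target);
  // Relative specifiers must start with ./ or ../ to be treated as files.
  return specifier.charAt(0) === '.' ? specifier : './' + specifier;
}

var model = '/applications/client/includes/models/user';

// An application three folders deep...
var before = relativeRequire('/applications/client/apps/login/reset-password', model);
// ...and the same application after being moved up one folder.
var after = relativeRequire('/applications/client/apps/login', model);

console.log(before); // ../../../includes/models/user
console.log(after);  // ../../includes/models/user
```

Moving the file by a single folder changes the string, which means every such require call in the moved file has to be found and edited by hand.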

There are several ways to handle this, but the easiest option that I found is to use the NODE_PATH environment variable. From the Node.js documentation:

If the NODE_PATH environment variable is set to a colon-delimited list of absolute paths, then node will search those paths for modules if they are not found elsewhere. (Note: On Windows, NODE_PATH is delimited by semicolons instead of colons.)

With this, we can include our own modules as if we are always in the root of our source location (i.e.: /applications/client).  You can set this variable very easily by simply assigning a value to it whenever you execute gulp.

But this sucks.  It makes running gulp tasks cumbersome and complicated.  The problem is compounded when you add more collaborators to the project, and it is made even worse if you need to change the path later or if you need to add more paths.  Not to worry, there is a solution.  We can set this variable within gulpfile.js itself so that it is always managed by the project.  We just add the following somewhere near the top.

Transforms for templates

In order to be able to unit test all of our code, we’ll inevitably need to include a module that requires a template. There are a lot of browserify transforms that will allow you to simply include your template via require as if it were a normal module. It’s a nice bit of syntactic sugar that looks something like this.

But there’s one little, tiny, insignificant problem with that.  You can no longer run unit tests against this code directly.  If you did, you’d get the following error:

Since we run our unit tests with node, and node is not an HTML parser, we have a little problem. Further, we can’t run the unit tests against any of the bundled applications because they don’t have the same exports as the module that we are targeting, even if they include it.

My first approach was to create individual standalone applications of each module in the /applications/client/includes folder.  This worked and it even got around the template problem described above (since transforms were actually being applied during the build process) but it was an absolute mess.  I had mini-bundles everywhere in /tests/client and it delayed building and testing time, even for a small project.

The solution is to load templates like you would normally load a file in node.  “But wait,” you might be saying, “we can’t load a file from the file system in a web app!”  And you’re right.  But we can fake it.  You may have noticed earlier that I’ve been using the BRFS transform all along.  Imagine you had a simple script, ohai.js, that looks like this.

The brfs transform changes fs.readFileSync and fs.readFile calls into simple string assignments, giving you something like this.
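As an illustration of the before and after, the sketch below shows the original call in a comment and the kind of code that ends up in the bundle. The ohai.html file and its contents are hypothetical, as is the render() function; the point is only that the file's contents are inlined as a string.

```javascript
// Before brfs (what you write in ohai.js):
//   var fs = require('fs');
//   var template = fs.readFileSync(__dirname + '/ohai.html', 'utf8');
//
// After brfs (what the bundle contains), assuming a hypothetical
// ohai.html that holds a single paragraph:
var template = "<p>ohai</p>\n";

// The rest of the module is untouched and just uses the string.
function render() {
  return template;
}

console.log(render());
```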

Now, when we execute the same script directly in node–just like we would with our unit tests–everything continues to work because it uses the actual node fs module.

Now that we have that out of the way, let’s talk about actual unit tests.

Unit tests and linting

Since we need to use the NODE_PATH trick for unit tests as well, we’ll execute all of our tests through gulp. The task for this is as simple as it gets.

Naturally, you are free to use any test harness you wish.  I use the tape test harness to run all of my tests because I like its simplicity. I then pipe its output to the faucet formatter.  TAP has plenty of report formatters available.

To keep unit tests organized, I always mirror the path of my test subject.  For example, if I want to test /applications/client/includes/models/user.js, I’ll create its unit test at /tests/client/includes/models/user.js.

Just as we wanted to have automatic builds, it’s also nice to have automatic unit tests. Luckily, this is also easy-peasy.

The autotest task simply watches every little bit of our application code and all of our tests for changes.  Whenever a change is made, the test task is executed.
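A watcher like that could be sketched as follows, assuming the gulp 3.x API, where gulp.watch() accepts a list of task names to run on every change. The globs are assumptions based on the project structure, the test task is expected to be defined elsewhere, and the gulp instance is passed in so the snippet reads on its own (in gulpfile.js you would call defineAutotest(require('gulp'))).

```javascript
// Hypothetical globs: all client code plus all client tests.
var WATCHED_FILES = [
  './applications/client/**/*',
  './tests/client/**/*.js'
];

// Sketch only: assumes gulp 3.x, where gulp.watch(globs, tasks) runs the
// named tasks each time a matching file changes.
function defineAutotest(gulp) {
  gulp.task('autotest', function () {
    gulp.watch(WATCHED_FILES, ['test']);
  });
}
```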

It’s also a good idea to at least have some basic static analysis of our project to ensure a baseline of quality and to catch simple mistakes as soon as possible. To do so, we have a very simple lint task that looks like this.

This will lint our entire application code base and report any silly mistakes we may have made.

Housekeeping

Earlier, I mentioned a Git pre-commit hook.  Client-side Git hooks aren’t actually a part of the repo and must be installed manually by all collaborators.  To simplify this process and to ensure that everyone working on this project has the appropriate hooks installed, I’ve created the following housekeeping task to be in charge of two things:

  1. Making sure that all client-side Git hooks are installed
  2. Periodically installing missing and updating existing npm modules

You may or may not want this task to do anything with your npm modules; if you don’t, just set ALLOW_NPM_MODULE_MANAGEMENT to false.  At any rate, here’s the task.

As you can see, I keep a master version of each client-side Git hook in /assets/git/hooks and just symlink them to .git/hooks. I’ve omitted it from the code examples here for clarity, but I have almost every other task depend on the housekeeping task. This ensures that the housekeeping task will be run before any builds are created.
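The symlinking step on its own can be sketched with Node's fs module. The installGitHooks() function below is a hypothetical helper, not the housekeeping task itself; it assumes a platform where creating symlinks is permitted, and it replaces whatever already sits at a hook's destination.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: symlink every file in sourceDir into hooksDir.
// Returns the names of the hooks that were (re)linked.
function installGitHooks(sourceDir, hooksDir) {
  var installed = [];

  fs.readdirSync(sourceDir).forEach(function (name) {
    var target = path.resolve(sourceDir, name);
    var link = path.join(hooksDir, name);
    var current = null;

    try {
      current = fs.readlinkSync(link);
    } catch (err) {
      // Missing, or a regular file rather than a symlink.
    }

    // Already pointing at the master copy: nothing to do.
    if (current === target) {
      return;
    }

    try {
      fs.unlinkSync(link); // remove a stale link or an old hook file
    } catch (err) {
      // Nothing was there to remove.
    }

    fs.symlinkSync(target, link);
    installed.push(name);
  });

  return installed;
}
```

In the layout described above, the call would look something like installGitHooks('./assets/git/hooks', './.git/hooks'), and running it a second time is a no-op.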

Convenience tasks

Because laziness is a virtue, I’ve also created a few tasks for my convenience.

The auto task simply runs all other automatic tasks.  This conveniently rebuilds the necessary applications and runs unit tests whenever I make any change.  Pretty handy.

Most of our Gulp tasks run asynchronously. However, occasionally I’ll get into a situation where this asynchronicity makes the build output messy (e.g., one or more tasks are logging error messages). When this happens, I’ll use this serial task to run all my tasks one at a time so that I can debug with some sanity.

Finding other recipes

So this might be all well and good, but maybe it’s not quite what you need. Before you spend the time to figure it out for yourself, you should check out these Gulp recipes. Otherwise, searching for “gulp recipe” plus whatever you’re trying to do is likely to turn up something useful. It’ll save you a bunch of time.

Wrap up

So that’s a lot…but I hope it’s at least remotely as helpful as it is long. This outlines the approach that I’ve taken with Gulp after trying it out on only a few projects. If there’s a better way of doing things, please drop me a comment. I’d be more than happy to have your input.

 

Hi there, I'm Justin!

I like teaching people about programming. Well, yeah, I guess it's obvious, I also like to write. Yes, that was a Digital Underground reference.

 
