Eloquent JavaScript: Modules

Original author: Marijn Haverbeke

A novice programmer writes programs the way ants build an anthill - piece by piece, without thinking about the overall structure. His programs are like loose sand. They may stand for a while, but when they grow too large, they fall apart.

Having understood the problem, the programmer starts spending a lot of time thinking about structure. His programs are rigidly structured, like stone sculptures. They are solid, but when they need to change, force has to be used on them.

The master programmer knows when to apply structure and when to leave things in their simple form. His programs are like clay - solid but pliable.

Master Yuan-Ma, The Book of Programming


Every program has a structure. On a small scale, it is determined by how the programmer divides the code into functions and into blocks inside those functions. Programmers have a lot of freedom in how they structure their programs. The structure is determined more by the programmer's taste than by what the program is meant to do.

In large programs, individual functions get lost in the mass of code, and we need a larger unit of organization. Modules group program code by some specific criterion. In this chapter, we will look at the benefits of this division and at techniques for creating modules in JavaScript.

Why modules are needed


There are several reasons why authors divide their books into chapters and sections. The division helps the reader see how the book is built and find the parts they need. It helps the author concentrate on one part at a time. The benefits of organizing programs into multiple files, or modules, are much the same. Structure helps people unfamiliar with the code find what they need, and helps programmers keep related things in one place.

Some programs are organized like plain text, with a clearly defined order in which the reader is led through the program, and with a lot of prose (comments) describing the code. This makes reading the code less intimidating (and reading someone else's code is usually intimidating), but it takes more effort to write. Such a program is also harder to change, because the parts of the prose are more tightly interconnected than the parts of the code. This style is called literate programming. The chapters of this book that discuss projects can be considered literate programs.

Structuring things usually takes effort. In the early stages of a project, when you are not yet sure what goes where or which modules are needed at all, I advocate a minimal, structureless organization of the code. Just put everything wherever is convenient until the code stabilizes. That way you do not waste time rearranging pieces of the program, and you do not lock yourself into a structure that does not suit it.

Namespace


Most modern programming languages have an intermediate level of scope between global (visible to everyone) and local (visible only inside a function). JavaScript doesn't. By default, everything that needs to be visible outside a top-level function lives in the global scope.

Namespace pollution, where unrelated parts of the code share a single set of variable names, was mentioned in Chapter 4. There, the Math object was given as an example of an object that groups math-related functionality, acting somewhat like a module.

Although JavaScript does not directly offer constructs for creating modules, we can use objects to create namespaces accessible from anywhere, and functions to create isolated, private namespaces inside a module. A little further on, we will discuss a way to build fairly convenient modules that provide namespace isolation using only these basic language concepts.

Reuse


In a project that is not broken down into modules, it is not clear which parts of the code a particular function needs. In my program for spying on my enemies (Chapter 9), I wrote a function for reading settings files. If I want to use it in another project, I have to copy the parts of the old program that seem related to that function into the new project. And if I then find a mistake in it, I will fix it only in the project I am working on at the moment and forget to fix it in all the others.

When you have many such duplicated pieces of code, you will find yourself spending a lot of time and effort copying and updating them. If you put interconnected parts of programs into separate modules, they become easier to track, fix, and update, because wherever the functionality is needed, you can simply load the module from its file.

This idea becomes even more powerful if the relationships between modules - which module depends on which - are stated explicitly. Then you can automate the process of installing and updating external modules (libraries).

Taking the idea further, imagine an online service that tracks and distributes hundreds of thousands of such libraries, lets you search for the functionality you need, and, once you find it, sets up your project to download it automatically.

And there is such a service! It is called NPM (npmjs.org). NPM is an online database of modules and a tool for downloading and upgrading the modules your program depends on. It grew out of Node.js, a JavaScript environment that does not require a browser (discussed in Chapter 20), but it can also be useful for browser programs.

Decoupling


Another job of modules is to isolate unrelated pieces of code from each other, in the same way that object interfaces do. A well-designed module provides an interface for external code to use. When the module is updated or fixed, the existing interface stays the same, so other modules can use the new, improved version without any changes on their side.

A stable interface does not mean that no new functions, methods, or variables are ever added to it. What matters is that existing functionality is not removed and its meaning does not change. A good interface allows the module to grow without breaking the old interface. This means exposing as few of the module's internals as possible, while keeping the interface language flexible and powerful enough to be used in a wide range of situations.

For interfaces that perform a single simple task, such as reading settings from a file, such a design comes naturally. For others, such as a text editor, which has many different aspects that require external access (content, styling, user actions, and so on), the interface needs careful thought.

Using Functions as Namespaces


Functions are the only thing in JavaScript that creates a new scope. If we need the modules to have their own scope, we will have to base them on functions.

Consider this trivial module, which associates names with day-of-the-week numbers as returned by the getDay method of a Date object.

var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
function dayName(number) {
  return names[number];
}
console.log(dayName(1));
// → Monday


The dayName function is part of the module's interface, but the names variable is not. We would prefer not to pollute the global namespace with it.

You can do this:

var dayName = function() {
  var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
  return function(number) {
    return names[number];
  };
}();
console.log(dayName(3));
// → Wednesday


Now names is a local variable of an unnamed function. The function is created and immediately called, and its return value (the actual dayName function) is stored in a variable. We could write pages of code in this function and declare a hundred variables there, and all of them would be internal to our module, invisible to outside code.

A similar pattern can be used to isolate code from the outside world entirely. The following module writes a value to the console but does not provide any values for other modules to use.

(function() {
  function square(x) { return x * x; }
  var hundred = 100;
  console.log(square(hundred));
})();
// → 10000


This code simply outputs the square of one hundred, but in a real program it could be a module that adds a method to some prototype or sets up a widget on a web page. It is wrapped in a function to keep the variables it uses from polluting the global scope.

Why did we enclose the function in parentheses? This is due to a quirk in JavaScript's syntax. If an expression begins with the keyword function, it is a function expression. But if a statement begins with function, it is a function declaration, which requires a name and, not being an expression, cannot be called by writing parentheses () after it. You can think of the extra wrapping parentheses as a trick to force the function to be interpreted as an expression.
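To make the distinction concrete, here is a small sketch of a few ways to get an immediately called function expression. The variable names a, b, and c are only for illustration.

```javascript
// Wrapping the function in parentheses makes it an expression,
// which can then be called right away.
var a = (function() { return "called"; })();

// Putting the call inside the wrapping parentheses works as well.
var b = (function() { return "called"; }());

// On the right-hand side of an assignment, the keyword function
// already appears in expression position, so no wrapping is needed.
var c = function() { return "called"; }();

console.log(a, b, c);
// → called called called

// Without the parentheses, a statement that starts with function is
// parsed as a declaration, so the following line would be a syntax error:
// function() { return "called"; }();
```

The third form is the one used for dayName earlier: the assignment already forces the function to be read as an expression.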

Objects as Interfaces



Imagine that we need to add another function to our “day of the week” module. We can no longer simply return a function, but must wrap the two functions in an object.

var weekDay = function() {
  var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
  return {
    name: function(number) { return names[number]; },
    number: function(name) { return names.indexOf(name); }
  };
}();
console.log(weekDay.name(weekDay.number("Sunday")));
// → Sunday


When a module is large, it is inconvenient to gather all the exported values into an object at the end of the function, because many of the exported functions will be big, and you would rather write them somewhere else, next to the code related to them. A convenient alternative is to declare an object (usually called exports) and add properties to it whenever something needs to be exported. In the following example, the module function takes its interface object as an argument, allowing code outside the function to create it and store it in a variable. Outside of a function, this refers to the global scope object.

(function(exports) {
  var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
  exports.name = function(number) {
    return names[number];
  };
  exports.number = function(name) {
    return names.indexOf(name);
  };
})(this.weekDay = {});
console.log(weekDay.name(weekDay.number("Saturday")));
// → Saturday


Detaching from the global scope


This pattern is often used in JavaScript modules intended for the browser. The module claims a single global variable and wraps its code in a function so that it has its own private namespace. But the pattern runs into problems when several modules claim the same name, or when you need to load two versions of a module at the same time.

With a little plumbing, we can create a system that allows one module to ask for the interface object of another without going through the global scope. Our goal is a require function that, given a module name, loads that module's file (from disk or from the network, depending on the platform) and returns the appropriate interface value.

This approach solves the problems mentioned earlier, and it has one more advantage: the dependencies of your program become explicit, which makes it harder to accidentally use some module without stating that you need it.

We will need two things. First, a readFile function that returns the contents of a file as a string. There is no such function in standard JavaScript, but different environments, such as the browser or Node.js, provide their own ways of accessing files. For now, let's pretend that we have such a function. Second, we need to be able to execute the contents of that string as code.

Evaluating data as code


There are several ways to take data (a string of code) and run it as part of the current program.

The most obvious is the special operator eval, which executes a string of code in the current scope. This is usually a bad idea, because it breaks some of the properties that scopes normally have, such as being isolated from the outside world.

function evalAndReturnX(code) {
  eval(code);
  return x;
}
console.log(evalAndReturnX("var x = 2"));
// → 2


A better way is to use the Function constructor. It takes two arguments: a string containing a comma-separated list of argument names, and a string containing the body of the function.

var plusOne = new Function("n", "return n + 1;");
console.log(plusOne(4));
// → 5


This is what we need. We will wrap the module code in a function, and its scope will become the scope of our module.
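The difference in scoping between eval and the Function constructor can be seen in a small sketch. The function names viaEval and viaFunction are made up for this illustration, and the example assumes that no global variable called local exists.

```javascript
function viaEval() {
  var local = "inside";
  // eval runs the string in the current scope, so the code can see local.
  return eval("typeof local");
}

function viaFunction() {
  var local = "inside";
  // A function built with the Function constructor sees only its own
  // parameters and the global scope, not the variables of viaFunction.
  return new Function("return typeof local;")();
}

console.log(viaEval());
// → string
console.log(viaFunction());
// → undefined
```

Because the code passed to the Function constructor cannot reach into the surrounding local scope, a module loaded this way cannot accidentally read or overwrite the loader's own variables.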

Require


Here is a minimal version of the require function:

function require(name) {
  var code = new Function("exports", readFile(name));
  var exports = {};
  code(exports);
  return exports;
}
console.log(require("weekDay").name(1));
// → Monday


Since the new Function constructor wraps the module code in a function, we do not need to write a namespace-wrapping function inside the module file itself. And since exports is an argument to the module function, the module does not need to declare it. This removes a lot of clutter from our example module.

var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
exports.name = function(number) {
  return names[number];
};
exports.number = function(name) {
  return names.indexOf(name);
};


When using this pattern, a module usually begins by declaring a few variables that load the modules it depends on.

var weekDay = require("weekDay");
var today = require("today");
console.log(weekDay.name(today.dayNumber()));


This simple version of require has drawbacks. First, it loads and executes a module every time it is required, so if several modules have the same dependency, or a require call sits inside a function that is called many times, time and energy are wasted.

This can be solved by storing the modules that have already been loaded in an object and returning the existing value when a module is required again.

The second problem is that a module cannot export a value directly, only through the exports object. For example, a module might want to export only the constructor of the object type it defines. This is not possible right now, because require always uses the exports object it created as the return value.

The traditional solution is to provide modules with another variable, module, which is an object with an exports property. This property initially points to the empty object created by require, but it can be overwritten with another value in order to export something else.

function require(name) {
  if (name in require.cache)
    return require.cache[name];
  var code = new Function("exports, module", readFile(name));
  var exports = {}, module = {exports: exports};
  code(exports, module);
  require.cache[name] = module.exports;
  return module.exports;
}
require.cache = Object.create(null);


Now we have a module system that uses a single global variable, require, to let modules find and use each other without going through the global scope.

This style of module system is called CommonJS, after the pseudo-standard that first described it. It is built into Node.js. Real implementations do much more than the example I have shown. Most importantly, they have a smarter way of getting from a module name to the module's code, which allows loading modules both by a path relative to the current file and by a module name that points to a locally installed module.
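To try out the cached version, including a module that replaces module.exports, here is a self-contained sketch. It makes several assumptions: readFile is faked with an in-memory object called moduleSources, the two module names ("point" and "greeting") are invented, and the loader is named miniRequire only so that it does not clash with a require function the environment may already provide. Its body follows the same steps as the require function above.

```javascript
// Hypothetical stand-in for files on disk: module name → source text.
var moduleSources = {
  // Replaces module.exports to export a constructor directly.
  "point": "module.exports = function Point(x, y) { this.x = x; this.y = y; };",
  // Adds a property to the default exports object.
  "greeting": "exports.hello = function(name) { return 'Hello, ' + name; };"
};

// Stand-in for readFile: returns the source text of a module as a string.
function readFile(name) {
  if (!Object.prototype.hasOwnProperty.call(moduleSources, name))
    throw new Error("No such module: " + name);
  return moduleSources[name];
}

function miniRequire(name) {
  // Return the cached interface if the module was loaded before.
  if (name in miniRequire.cache)
    return miniRequire.cache[name];
  var code = new Function("exports, module", readFile(name));
  var exports = {}, module = {exports: exports};
  code(exports, module);
  // Store whatever the module left in module.exports.
  miniRequire.cache[name] = module.exports;
  return module.exports;
}
miniRequire.cache = Object.create(null);

var Point = miniRequire("point");
var p = new Point(3, 4);
console.log(p.x + p.y);
// → 7
console.log(miniRequire("point") === Point);
// → true
console.log(miniRequire("greeting").hello("world"));
// → Hello, world
```

The second call for "point" returns the very same constructor, which shows that the module code ran only once.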

Slow-loading modules


Although it is possible to use the CommonJS style in the browser, it is not very well suited to it. Downloading a file from the Web is much slower than reading it from a hard drive. While a script is running in the browser, nothing else can happen on the page (for reasons that will become clear in Chapter 14). This means that if every require call fetched something from a distant web server, the page would freeze for a very long time while loading.

You can get around this by running a program like Browserify on your code before putting it on the web. It looks through all the calls to require, resolves all the dependencies, and collects the necessary code into one large file. The website then simply loads this file to get all the modules it needs.

The second option is to wrap the module code in a function so that the module loader first loads the dependencies in the background, and then calls the function that initializes the module after loading the dependencies. The AMD system (asynchronous module definition) does this.

Our simple program with dependencies would look like this in AMD:

define(["weekDay", "today"], function(weekDay, today) {
  console.log(weekDay.name(today.dayNumber()));
});


The define function is central to this approach. It takes an array of module names, followed by a function that takes one argument for each dependency. It loads the dependencies (if they are not already loaded) in the background, allowing the page to keep working while the files are being downloaded. When everything is loaded, define calls the function it was given, with the interfaces of the dependencies as arguments.

Modules loaded this way must themselves contain a call to define. Whatever the function passed to define returns is used as the module's interface. Here is the weekDay module:

define([], function() {
  var names = ["Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"];
  return {
    name: function(number) { return names[number]; },
    number: function(name) { return names.indexOf(name); }
  };
});


To show a minimal implementation of define, let's pretend that we have a backgroundReadFile function that takes a file name and a function, and calls that function with the contents of the file as soon as it has been loaded. (Chapter 17 will explain how to write such a function.)

To keep track of modules while they are loading, define uses objects that describe the state of each module, telling us whether it is available yet and providing its interface once it is.

The getModule function takes a name and returns such an object, making sure that the module is scheduled for loading. It uses a cache object to avoid loading the same module twice.
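For experimenting without a server, backgroundReadFile can be imitated. The following is only a stand-in under stated assumptions: the file contents live in an in-memory object called fakeFiles (an invented name), and setTimeout is used to imitate a download that finishes some time later. It is not how a real network-backed version would be written.

```javascript
// Hypothetical file store: file name → contents.
var fakeFiles = {
  "weekDay": "define([], function() { return {}; });"
};

// Calls callback with the file's contents, but only after the
// current script has finished running, like a real download would.
function backgroundReadFile(name, callback) {
  setTimeout(function() {
    if (!Object.prototype.hasOwnProperty.call(fakeFiles, name))
      throw new Error("No such file: " + name);
    callback(fakeFiles[name]);
  }, 0);
}

var received = null;
backgroundReadFile("weekDay", function(code) {
  received = code;
  console.log("loaded:", code);
});
console.log(received);
// → null (the callback has not run yet at this point)
```

The important property is that the callback runs later, not during the call to backgroundReadFile itself. The code that follows relies on exactly this asynchronous behavior.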

var defineCache = Object.create(null);
var currentMod = null;
function getModule(name) {
  if (name in defineCache)
    return defineCache[name];
  var module = {exports: null,
                loaded: false,
                onLoad: []};
  defineCache[name] = module;
  backgroundReadFile(name, function(code) {
    currentMod = module;
    new Function("", code)();
  });
  return module;
}


We assume that the loaded file also contains a (single) call to define. The currentMod variable is used to tell this call about the module object that is currently being loaded, so that it can update this object when it finishes loading. We will come back to this mechanism in a moment.

The define function itself uses getModule to fetch or create the module objects for the current module's dependencies. Its task is to schedule moduleFunction (the function containing the module's actual code) to run once those dependencies are loaded. To do this, it defines a function whenDepsLoaded and adds it to the onLoad array of every dependency that has not yet loaded. This function returns immediately if there are still unloaded dependencies, so it only does its actual work once the last dependency has finished loading. It is also called right away from define itself, in case there are no dependencies that need to be loaded.

function define(depNames, moduleFunction) {
  var myMod = currentMod;
  var deps = depNames.map(getModule);
  deps.forEach(function(mod) {
    if (!mod.loaded)
      mod.onLoad.push(whenDepsLoaded);
  });
  function whenDepsLoaded() {
    if (!deps.every(function(m) { return m.loaded; }))
      return;
    var args = deps.map(function(m) { return m.exports; });
    var exports = moduleFunction.apply(null, args);
    if (myMod) {
      myMod.exports = exports;
      myMod.loaded = true;
      myMod.onLoad.forEach(function(f) { f(); });
    }
  }
  whenDepsLoaded();
}


When all dependencies are available, whenDepsLoaded calls a function containing the module, passing the dependency interfaces as arguments.

The first thing define does is to store the currentMod value that it had when it was called in the myMod variable. Recall that getModule saved the corresponding module object in currentMod right before executing the module code. This allows whenDepsLoaded to store the return value of the module function in the exports property of this module, set the loaded property of the module to true, and call all functions that were waiting for the module to load.

This code is harder to follow than the require function. Its execution does not proceed along a simple, predictable path. Instead, several operations are set up to happen at some unspecified time in the future, which makes it hard to work out how the code runs.

A real AMD implementation is much smarter about converting module names to URLs and is more robust than this example. The RequireJS project provides a popular implementation of this style of module loader.

Interface Design


Designing interfaces is one of the subtler aspects of programming. Any nontrivial piece of functionality can be implemented in many ways. Finding an approach that works well requires insight and foresight.

The best way to learn the value of good interface design is to use many interfaces. Some will be bad, some will be good. Experience will show you what works and what doesn't. Never accept a bad interface as a given. Fix it, or wrap it in another interface that suits you better.

Predictability


If programmers can predict how your interface works, they will not have to interrupt their work as often to look up how to use it. Try to follow generally accepted conventions. If there is a module or a part of the JavaScript language that does something similar to what you are implementing, it is a good idea to make your interface resemble the existing one. That way, it will feel familiar to people who already know the existing interface.

Predictability also matters in the behavior of your code. You may be tempted to make the interface overly clever, supposedly because that makes it more convenient to use. For example, you could accept all kinds of types and combinations of arguments and do “the right thing” with each of them. Or you could provide dozens of specialized functions that offer slightly different flavors of the same functionality. This may make code that uses your interface a little shorter, but it makes it much harder for people working with it to build a clear mental model of your module.

Composability


Try to use the simplest data structures possible in your interfaces. Make functions do a single, clear thing. Where practical, make them pure functions (see Chapter 3).

For example, modules often provide their own version of array-like collections of objects, with their own interface for counting and retrieving elements. Such objects do not have map or forEach methods, and no function that expects a real array can work with them. This is an example of poor composability: the module cannot easily be combined with other code.

Another example is a module for spell-checking text, which might be useful in a text editor. The spell checker could be written to work directly on whatever complex data structures the editor itself uses, and to call the editor's internal functions to offer the user a choice of spelling suggestions. Done that way, the module cannot be used with any other program. If, on the other hand, we define the spell-checking interface as taking a simple string and returning the position at which it found a possible misspelling, along with an array of suggested corrections, then we have an interface that can be combined with other systems, because strings and arrays are always available in JavaScript.
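A composable spell-checking interface of this kind might look like the following sketch. Everything in it is hypothetical: the tiny knownWords list, the function names checkSpelling and suggest, and the deliberately naive suggestion rule (known words of the same length that differ in at most one letter). The point is only the shape of the interface: a string goes in, and a plain object with a position and an array comes out.

```javascript
// Hypothetical dictionary; a real one would be far larger.
var knownWords = ["the", "quick", "brown", "fox"];

// Naive suggestion rule: same length, at most one differing letter.
function suggest(word) {
  return knownWords.filter(function(known) {
    if (known.length != word.length) return false;
    var differences = 0;
    for (var i = 0; i < word.length; i++)
      if (known[i] != word[i]) differences++;
    return differences <= 1;
  });
}

// Takes a string and returns {position, suggestions} for the first
// unknown word, or null when every word is known.
function checkSpelling(text) {
  var wordPattern = /[a-z]+/gi, match;
  while ((match = wordPattern.exec(text))) {
    var word = match[0].toLowerCase();
    if (knownWords.indexOf(word) == -1)
      return {position: match.index, suggestions: suggest(word)};
  }
  return null;
}

var result = checkSpelling("The quick brwwn fox");
console.log(result.position);
// → 10
console.log(result.suggestions);
// → ["brown"]
```

Because the function deals only in strings, numbers, and arrays, it could be called from an editor, a command-line tool, or a test script without any adaptation.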

Layered interfaces


When designing an interface for a complex piece of functionality (sending email, for example), you often run into a dilemma. On the one hand, you do not want to overload the user of your interface with details. They should not have to study it for 20 minutes before they can send an email. On the other hand, you do not want to hide all the details either: when people need to do something complicated with your module, they should be able to.

Often the solution is to provide two interfaces: a detailed low-level one for complex situations and a simple high-level one for routine use. The second can usually be built on top of the first. In a module for sending email, the high-level interface could simply be a function that takes a message, the sender's address, and the recipient's address, and sends the email. The low-level interface would give access to headers, attachments, HTML mail, and so on.
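The layering can be sketched as follows. This is an illustration under assumptions, not a working mail module: the names buildMessage and simpleMessage are invented, and instead of actually sending anything, both functions only produce the text of a message.

```javascript
// Low-level interface: the caller controls every header.
function buildMessage(headers, body) {
  var lines = Object.keys(headers).map(function(name) {
    return name + ": " + headers[name];
  });
  // Headers and body are separated by an empty line.
  return lines.join("\r\n") + "\r\n\r\n" + body;
}

// High-level interface for the common case, built on the low-level one.
function simpleMessage(from, to, subject, text) {
  return buildMessage({From: from, To: to, Subject: subject}, text);
}

console.log(simpleMessage("a@example.com", "b@example.com", "Hi", "Hello!"));

// Callers with unusual needs can still reach the low-level layer.
console.log(buildMessage({From: "a@example.com",
                          To: "b@example.com",
                          "Reply-To": "c@example.com"}, "Hello!"));
```

Most callers only ever need simpleMessage, while the few who need an extra header do not have to abandon the module to get it.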

Summary


Modules let you structure large programs by dividing the code into separate files and namespaces. If you give them well-designed interfaces, they will be easy to use, easy to reuse in other projects, and easy to keep using as the project itself develops and evolves.

Although JavaScript provides no direct help for creating modules, its flexible functions and objects make it possible to build a fairly good module system. Function scopes serve as the internal namespaces of modules, and objects are used to hold sets of exported values.

There are two popular approaches to using modules. One is CommonJS, built around the require function, which fetches modules by name and returns their interface. The other is AMD, which uses the define function, which takes an array of module names and, after loading them, runs a function whose arguments are their interfaces.

Exercises


Month names

Write a simple module, similar to the weekDay module, that converts month numbers (starting from zero) into names and back again. Give it its own namespace, since it will need an internal array of month names, and use plain JavaScript, without a module loading system.

// Your code here.
console.log(month.name(2));
// → March
console.log(month.number("November"));
// → 10


A return to electronic life

I hope that Chapter 7 has not yet faded from your memory. Go back to the system developed there and come up with a way to split the code into modules. To refresh your memory, here is a list of the functions and types defined in that chapter, in order of appearance:

Vector
Grid
directions
directionNames
randomElement
BouncingCritter
elementFromChar
World
charFromElement
Wall
View
WallFollower
dirPlus
LifelikeWorld
Plant
PlantEater
SmartPlantEater
Tiger


Do not create too many modules. A book that started a new chapter on every page would get on your nerves (if only because the headings would take up all the space). Likewise, there is no need to make ten files for one small project. Aim for three to five modules.

Some functions can be made internal to their module, inaccessible from other modules. There is no single right answer here. Module organization is largely a matter of taste.

Circular dependencies

A tricky subject in dependency management is circular dependencies, where module A depends on B, and B also depends on A. Many module systems simply forbid this. CommonJS modules allow a limited form: it works as long as the modules do not replace their default exports object with another value, and only start using each other's interfaces after loading is complete.

Can you think of a way to implement support for this kind of dependency? Look at the definition of require and consider what the function would have to do to allow it.
