<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html> <head> <meta content = 'uft-8' name = 'charset'></meta> <title>Refactoring Module Dependencies</title> <meta http-equiv="Content-type" content="text/html;charset=UTF-8" /> <meta content = 'summary_large_image' name = 'twitter:card'></meta> <meta content = '16665197' name = 'twitter:site:id'></meta> <meta content = '@martinfowler' name = 'twitter:site'></meta> <meta content = 'Refactoring Module Dependencies' property = 'og:title'></meta> <meta content = 'https://martinfowler.com/articles/refactoring-dependencies.html' property = 'og:url'></meta> <meta content = 'An example of taking a program, splitting it into modules by layers, and refactoring these dependencies with the Service Locator and Dependency Injection patterns.' property = 'og:description'></meta> <meta content = 'https://martinfowler.com/articles/refactoring-dependencies/ref-dep_sl.png' property = 'og:image'></meta> <meta content = 'martinfowler.com' property = 'og:site_name'></meta> <meta content = 'article' property = 'og:type'></meta> <meta content = '2015-10-13' property = 'og:article:modified_time'></meta> <meta content = 'width=device-width, initial-scale=1' name = 'viewport'></meta> <link href = 'article.css' rel = 'stylesheet' type = 'text/css'></link> </head> <body><header id = 'banner' style = 'background-image: url("/img/penob.png"); background-repeat: no-repeat'> <div class = 'name-logo'><a href = 'https://martinfowler.com'><img src = '/mf-name-white.png'></img></a></div> <div class = 'search'> <!-- SiteSearch Google --> <form method='GET' action="https://www.google.com/search"> <input type='hidden' name='ie' value='UTF-8'/> <input type='hidden' name='oe' value='UTF-8'/> <input class = 'field' type='text' name='q' size='15' maxlength='255' value=""/> <button class = 'button' type='submit' name='btnG' value=" " title = "Search"/> <input type='hidden' name='domains' 
value="martinfowler.com"/> <input type='hidden' name='sitesearch' value=""/> <input type='hidden' name='sitesearch' value="martinfowler.com"/> </form> </div> <div class = 'menu-button navmenu-button'><a class = 'icon icon-bars' href = '#navmenu-bottom'></a></div> <nav class = 'top-menu'> <ul> <li><a class = '' href = 'https://refactoring.com'>Refactoring</a></li> <li><a class = '' href = '/agile.html'>Agile</a></li> <li><a class = '' href = '/architecture'>Architecture</a></li> <li><a class = '' href = '/aboutMe.html'>About</a></li> <li><a class = 'tw' href = 'https://www.thoughtworks.com'>Thoughtworks</a></li> <li><a class = 'icon icon-rss' href = '/feed.atom' title = 'feed'></a></li> <li><a class = 'icon icon-twitter' href = 'https://www.twitter.com/martinfowler' title = 'Twitter stream'></a></li> <li class = 'icon'><a href = 'https://toot.thoughtworks.com/@mfowler' title = 'Mastodon stream'><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="currentColor"><path d="M21.2595 13.9898C20.9852 15.4006 18.8033 16.9446 16.2974 17.2439C14.9907 17.3998 13.7041 17.5431 12.3321 17.4802C10.0885 17.3774 8.31809 16.9446 8.31809 16.9446C8.31809 17.163 8.33156 17.371 8.3585 17.5655C8.65019 19.7797 10.5541 19.9124 12.3576 19.9742C14.1779 20.0365 15.7987 19.5254 15.7987 19.5254L15.8735 21.1711C15.8735 21.1711 14.6003 21.8548 12.3321 21.9805C11.0814 22.0493 9.52849 21.9491 7.71973 21.4703C3.79684 20.432 3.12219 16.2504 3.01896 12.0074C2.98749 10.7477 3.00689 9.55981 3.00689 8.56632C3.00689 4.22771 5.84955 2.95599 5.84955 2.95599C7.2829 2.29772 9.74238 2.0209 12.2993 2H12.3621C14.919 2.0209 17.3801 2.29772 18.8133 2.95599C18.8133 2.95599 21.6559 4.22771 21.6559 8.56632C21.6559 8.56632 21.6916 11.7674 21.2595 13.9898ZM18.3029 8.9029C18.3029 7.82924 18.0295 6.97604 17.4805 6.34482C16.9142 5.71359 16.1726 5.39001 15.2522 5.39001C14.187 5.39001 13.3805 5.79937 12.8473 6.61819L12.3288 7.48723L11.8104 6.61819C11.2771 5.79937 10.4706 5.39001 9.40554 5.39001C8.485 5.39001 
7.74344 5.71359 7.17719 6.34482C6.62807 6.97604 6.3547 7.82924 6.3547 8.9029V14.1562H8.43597V9.05731C8.43597 7.98246 8.88822 7.4369 9.79281 7.4369C10.793 7.4369 11.2944 8.08408 11.2944 9.36376V12.1547H13.3634V9.36376C13.3634 8.08408 13.8646 7.4369 14.8648 7.4369C15.7694 7.4369 16.2216 7.98246 16.2216 9.05731V14.1562H18.3029V8.9029Z"></path></svg> </a></li> <li class = 'icon'><a href = 'https://www.linkedin.com/in/martin-fowler-com/' title = 'LinkedIn'><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="currentColor"><path d="M4.00098 3H20.001C20.5533 3 21.001 3.44772 21.001 4V20C21.001 20.5523 20.5533 21 20.001 21H4.00098C3.44869 21 3.00098 20.5523 3.00098 20V4C3.00098 3.44772 3.44869 3 4.00098 3ZM5.00098 5V19H19.001V5H5.00098ZM7.50098 9C6.67255 9 6.00098 8.32843 6.00098 7.5C6.00098 6.67157 6.67255 6 7.50098 6C8.3294 6 9.00098 6.67157 9.00098 7.5C9.00098 8.32843 8.3294 9 7.50098 9ZM6.50098 10H8.50098V17.5H6.50098V10ZM12.001 10.4295C12.5854 9.86534 13.2665 9.5 14.001 9.5C16.072 9.5 17.501 11.1789 17.501 13.25V17.5H15.501V13.25C15.501 12.2835 14.7175 11.5 13.751 11.5C12.7845 11.5 12.001 12.2835 12.001 13.25V17.5H10.001V10H12.001V10.4295Z"></path></svg> </a></li> <li class = 'icon'><a href = 'https://bsky.app/profile/martinfowler.com' title = 'BlueSky'><svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24" fill="currentColor"><path d="M12 11.3884C11.0942 9.62673 8.62833 6.34423 6.335 4.7259C4.13833 3.17506 3.30083 3.4434 2.75167 3.69256C2.11583 3.9784 2 4.95506 2 5.52839C2 6.10339 2.315 10.2367 2.52 10.9276C3.19917 13.2076 5.61417 13.9776 7.83917 13.7309C4.57917 14.2142 1.68333 15.4017 5.48083 19.6292C9.65833 23.9542 11.2058 18.7017 12 16.0392C12.7942 18.7017 13.7083 23.7651 18.4442 19.6292C22 16.0392 19.4208 14.2142 16.1608 13.7309C18.3858 13.9784 20.8008 13.2076 21.48 10.9276C21.685 10.2376 22 6.10256 22 5.52923C22 4.95423 21.8842 3.97839 21.2483 3.6909C20.6992 3.44256 19.8617 3.17423 17.665 4.72423C15.3717 6.34506 12.9058 9.62756 12 
11.3884Z"></path></svg></a></li> </ul> </nav> </header> <nav id = 'top-navmenu'> <nav class = 'navmenu'> <div class = 'nav-head'> <div class = 'search'> <!-- SiteSearch Google --> <form method='GET' action="https://www.google.com/search"> <input type='hidden' name='ie' value='UTF-8'/> <input type='hidden' name='oe' value='UTF-8'/> <input class = 'field' type='text' name='q' size='15' maxlength='255' value=""/> <button class = 'button' type='submit' name='btnG' value=" " title = "Search"/> <input type='hidden' name='domains' value="martinfowler.com"/> <input type='hidden' name='sitesearch' value=""/> <input type='hidden' name='sitesearch' value="martinfowler.com"/> </form> </div> <div class = 'closediv'> <span class = 'close' title = 'close'></span> </div> </div> <div class = 'nav-body'> <div class = 'topics'> <h2>Topics</h2> <p><a href = '/architecture'>Architecture</a></p> <p><a href = 'https://refactoring.com'>Refactoring</a></p> <p><a href = '/agile.html'>Agile</a></p> <p><a href = '/delivery.html'>Delivery</a></p> <p><a href = '/microservices'>Microservices</a></p> <p><a href = '/data'>Data</a></p> <p><a href = '/testing'>Testing</a></p> <p><a href = '/dsl.html'>DSL</a></p> </div> <div class = 'about'> <h2>about me</h2> <p><a href = '/aboutMe.html'>About</a></p> <p><a href = '/books'>Books</a></p> <p><a href = '/faq.html'>FAQ</a></p> </div> <div class = 'content'> <h2>content</h2> <p><a href = '/videos.html'>Videos</a></p> <p><a href = '/tags'>Content Index</a></p> <p><a href = '/articles/eurogames'>Board Games</a></p> <p><a href = '/photos'>Photography</a></p> </div> <div class = 'tw'> <h2>Thoughtworks</h2> <p><a href = 'https://thoughtworks.com/insights'>Insights</a></p> <p><a href = 'https://thoughtworks.com/careers'>Careers</a></p> <p><a href = 'https://thoughtworks.com/radar'>Radar</a></p> </div> <div class = 'feeds'> <h2>follow</h2> <p><a href = '/feed.atom'>RSS</a></p> <p><a href = 'https://toot.thoughtworks.com/@mfowler'>Mastodon</a></p> <p><a href = 
'https://www.linkedin.com/in/martin-fowler-com/'>LinkedIn</a></p> <p><a href = 'https://www.twitter.com/martinfowler'>X (Twitter)</a></p> <p><a href = 'https://boardgamegeek.com/blog/13064/martins-7th-decade'>BGG</a></p> </div> </div> </nav> </nav> <nav id = 'toc-dropdown'> <button class = 'dropdown-button'> <h2>Table of Contents</h2> </button> <div class = 'hidden' id = 'dropdownLinks'> <ul> <li><a href = '#top'>Top</a></li> <li><a href = '#TheStartingPoints'>The Starting Point(s)</a></li> <li><a href = '#Presentation-domain-dataLayering'>Presentation-Domain-Data Layering</a></li> <li><a href = '#PerformingTheSplit'>Performing the split</a></li> <li><a href = '#LinkerSubstitution'>Linker Substitution</a></li> <li><a href = '#DataSourceAsParameterWithEachCall'>Data source as parameter with each call</a> <ul> <li><a href = '#ParamterizingTheDataSourceFileName'>Paramterizing the data source file name</a></li> <li><a href = '#TradeOffsToParameterizing'>Trade offs to parameterizing</a></li> </ul> </li> <li><a href = '#SingularServices'>Singular Services</a></li> <li><a href = '#IntroducingServiceLocator'>Introducing Service Locator</a> <ul> <li><a href = '#RefactoringTheJavascriptToUseTheLocator'>Refactoring the JavaScript to use the locator</a></li> <li><a href = '#Java'>Java</a></li> <li><a href = '#ConsequencesOfUsingAServiceLocator'>Consequences of using a service locator</a></li> </ul> </li> <li><a href = '#SplitPhase'>Split Phase</a></li> <li><a href = '#DependencyInjection'>Dependency Injection</a> <ul> <li><a href = '#JavaExample'>Java example</a></li> <li><a href = '#JavascriptExample'>JavaScript example</a></li> <li><a href = '#Consequences'>Consequences</a></li> </ul> </li> <li><a href = '#FinalThoughts'>Final Thoughts</a></li> </ul> <h3>Sidebars</h3> <ul> <li><a href = '#JavascriptStyle'>JavaScript Style</a></li> <li><a href = '#Es6Modules'>ES6 Modules</a></li> </ul> </div> </nav> <main> <h1>Refactoring Module Dependencies</h1> <section class = 
'frontMatter'> <p class = 'abstract'><i> As a program grows in size it's important to split it into modules, so that you don't need to understand all of it to make a small modification. Often these modules can be supplied by different teams and combined dynamically. In this refactoring essay I split a small program using Presentation-Domain-Data layering. I then refactor the dependencies between these modules to introduce the Service Locator and Dependency Injection patterns. These apply in different languages, yet look different, so I show these refactorings in both Java and a classless JavaScript style. </i></p> <p class = 'date'>13 October 2015</p> <hr></hr> <div class = 'front-grid'> <div class = 'author-list'> <div class = 'author'> <div class = 'photo'><a href = '/'><img alt = 'Photo of Martin Fowler' src = '/mf.jpg' width = '80'></img></a></div> <address class = 'name'><a href = '/' rel = 'author'>Martin Fowler</a></address> </div> </div> <div class = 'tags'> <p class = 'tag-link'><a href = /tags/refactoring.html>refactoring</a></p> <p class = 'tag-link'><a href = /tags/API%20design.html>API design</a></p> <p class = 'tag-link'><a href = /tags/application%20architecture.html>application architecture</a></p> </div> <div class = 'contents'> <h2>Contents</h2> <ul> <li><a href = '#TheStartingPoints'>The Starting Point(s)</a></li> <li><a href = '#Presentation-domain-dataLayering'>Presentation-Domain-Data Layering</a></li> <li><a href = '#PerformingTheSplit'>Performing the split</a></li> <li><a href = '#LinkerSubstitution'>Linker Substitution</a></li> <li><a href = '#DataSourceAsParameterWithEachCall'>Data source as parameter with each call</a> <ul> <li><a href = '#ParamterizingTheDataSourceFileName'>Paramterizing the data source file name</a></li> <li><a href = '#TradeOffsToParameterizing'>Trade offs to parameterizing</a></li> </ul> </li> <li><a href = '#SingularServices'>Singular Services</a></li> <li><a href = '#IntroducingServiceLocator'>Introducing Service 
Locator</a> <ul> <li><a href = '#RefactoringTheJavascriptToUseTheLocator'>Refactoring the JavaScript to use the locator</a></li> <li><a href = '#Java'>Java</a></li> <li><a href = '#ConsequencesOfUsingAServiceLocator'>Consequences of using a service locator</a></li> </ul> </li> <li><a href = '#SplitPhase'>Split Phase</a></li> <li><a href = '#DependencyInjection'>Dependency Injection</a> <ul> <li><a href = '#JavaExample'>Java example</a></li> <li><a href = '#JavascriptExample'>JavaScript example</a></li> <li><a href = '#Consequences'>Consequences</a></li> </ul> </li> <li><a href = '#FinalThoughts'>Final Thoughts</a></li> </ul> <h3>Sidebars</h3> <ul> <li><a href = '#JavascriptStyle'>JavaScript Style</a></li> <li><a href = '#Es6Modules'>ES6 Modules</a></li> </ul> </div> </div> <hr></hr></section> <div class = 'paperBody deep'> <p>As programs grow larger than a few hundred lines of code, you need to think about how to split them up into modules. At the very least it's useful to have smaller files to better manage your editing. But more seriously you want to divide up your program so that you don't have to keep it all in your head in order to make changes. </p> <p>A well-designed modular structure should allow you to understand only a small part of a larger program when you need to make a small change to it. Sometimes a small change will cut across the modules, but most of the time you'll just need to understand a single module and its neighbors.</p> <p>The hardest part of splitting a program into modules is deciding what the module boundaries should be. There are no easy guidelines to follow for this; indeed, a major theme of my life's work is to try and understand what good module boundaries will look like. 
Perhaps the most important part of drawing good module boundaries is paying attention to the changes you make and refactoring your code so that code that changes together is in the same or nearby modules.</p> <p>On top of this are the mechanics of the separation: how the various parts relate to each other. In the simplest case you have client modules that call suppliers. But often the configuration of these clients and suppliers can get tangled because you don't always want the client program to know too much about how its suppliers fit together.</p> <p>I'm going to explore this problem with an example, where I'll take a hunk of code and see how it can be split into pieces. In fact I'm going to do this twice, using two different languages: Java and JavaScript, which, despite their similar names, are really very different when it comes to the affordances they have for modularity. </p> <section id = 'TheStartingPoints'> <h2>The Starting Point(s)</h2> <p>We begin with a startup that is doing sophisticated data analysis of sales data. They have this valuable indicator, the Gondorff number, that is an extremely useful predictor for sales of products. Their web application takes a company's sales data, feeds it into their sophisticated algorithm, and then prints a simple table of products and their Gondorff numbers.</p> <p>The code for the initial state is all in a single file, which I'll walk through in sections. 
First is the code that emits the table in HTML.</p> <p class = 'code-label'>app.js </p> <pre>
function emitGondorff(products) {
  function line(product) {
    return [
      ` <tr>`,
      ` <td>${product}</td>`,
      ` <td>${gondorffNumber(product).toFixed(2)}</td>`,
      ` </tr>`].join('\n');
  }
  return encodeForHtml(`<table>\n${products.map(line).join('\n')}\n</table>`);
}</pre> <p class = 'code-remark'>I don't use multi-line strings as the demands of indentation in the output don't line up with indentation in the source code.</p> <p>This isn't the world's most sophisticated UI; it's positively pedestrian in a world of single page this and responsive that. The only important thing, for this example, is that the UI needs to call the <code>gondorffNumber</code> function at various points.</p> <aside class = 'sidebar' id = 'JavascriptStyle'> <h2>JavaScript Style</h2> <p>For this example I'm going to use the ES6 standard of JavaScript, which provides a number of valuable advantages over older versions of the language. I'm also going to use a classless style of JavaScript, primarily as this shows a greater contrast to Java.</p> <p>(This doesn't mean I don't like using classes in JavaScript - I do - but avoiding classes allows me to illustrate how these patterns play out without them.)</p> </aside> <p>Next I'll move over to the calculation of the gondorff number.</p> <p class = 'code-label'>app.js </p> <pre>
function gondorffNumber(product) {
  return salesDataFor(product, gondorffEpoch(product), hookerExpiry())
      .find(r => r.date.match(/01$/))
      .quantity * Math.PI
    ;
}
function gondorffEpoch(product) {
  const countingBase = recordCounts(baselineRange(product));
  return deriveEpoch(countingBase);
}</pre> <p class = 'code-label'> </p> <pre>
function baselineRange(product){
  // redacted
}
function deriveEpoch(countingBase) {
  // redacted
}
function hookerExpiry() {
  // redacted
}</pre> <p>That may not look like a million-dollar algorithm to us, but that's thankfully not the important part of this code. 
The important part is that the logic for calculating the gondorff number requires two functions (<code>salesDataFor</code> and <code>recordCounts</code>) that simply return basic data from some kind of data source of sales. These data source functions are not particularly sophisticated; they merely filter some data sourced from a CSV file.</p> <p class = 'code-label'>app.js </p> <pre>
function salesDataFor(product, start, end) {
  return salesData()
    .filter(r =>
      (r.product === product)
      && (new Date(r.date) >= start)
      && (new Date(r.date) < end)
    );
}
function recordCounts(start) {
  return salesData()
    .filter(r => new Date(r.date) >= start)
    .length
}
function salesData() {
  const data = readFileSync('sales.csv', {encoding: 'utf8'});
  return data
    .split('\n')
    .slice(1)
    .map(makeRecord)
    ;
}
function makeRecord(line) {
  const [product,date,quantityString,location] = line.split(/\s*,\s*/);
  const quantity = parseInt(quantityString, 10);
  return { product, date, quantity, location };
}</pre> <p>These functions are entirely boring as far as this discussion's concerned - I show them only out of a sense of completeness. The important thing about them is that they take data from some data source, massage it into simple objects, and provide it in two different flavors to the core algorithmic code.</p> <p>At this point the Java version looks very similar. First, the HTML generation.</p> <p class = 'code-label'>class App... </p> <pre>
public String emitGondorff(List<String> products) {
  List<String> result = new ArrayList<>();
  result.add("\n<table>");
  for (String p : products)
    result.add(String.format(" <tr><td>%s</td><td>%4.2f</td></tr>", p, gondorffNumber(p)));
  result.add("</table>");
  return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n")));
}</pre> <p>The gondorff algorithm</p> <p class = 'code-label'>class App... 
</p> <pre>
public double gondorffNumber(String product) {
  return salesDataFor(product, gondorffEpoch(product), hookerExpiry())
      .filter(r -> r.getDate().toString().matches(".*01$"))
      .findFirst()
      .get()
      .getQuantity() * Math.PI
    ;
}
private LocalDate gondorffEpoch(String product) {
  final long countingBase = recordCounts(baselineRange(product));
  return deriveEpoch(countingBase);
}</pre> <p class = 'code-label'> </p> <pre>
private LocalDate baselineRange(String product) {
  //redacted
}
private LocalDate deriveEpoch(long base) {
  //redacted
}
private LocalDate hookerExpiry() {
  // yup, redacted too
}</pre> <p>Since the body of the data source code isn't that important, I'll just show the method declarations.</p> <p class = 'code-label'>class App </p> <pre>
private Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) {
  // unimportant details
}
private long recordCounts(LocalDate start) {
  // unimportant details
}</pre> </section> <section id = 'Presentation-domain-dataLayering'> <h2>Presentation-Domain-Data Layering</h2> <p>I said earlier that setting module boundaries was a subtle and nuanced art, but one guideline that many people follow is <a href = '/bliki/PresentationDomainDataLayering.html'>Presentation-Domain-Data Layering</a> - separating presentation code (UI), business logic, and data access. There are good reasons for following this kind of split. Each of those three categories involves thinking about different concerns, and often uses different frameworks to assist in the task. Furthermore there is also a desire for substitution - multiple presentations using the same core business logic, or the business logic using different data sources in different environments.</p> <p>So for this example I'm going to follow this common split, and I'll also stress the substitution justification. 
After all, this gondorff number is such a valuable metric that many people will want to make use of it - encouraging me to package it as a unit that can easily be reused by multiple applications. Furthermore not all applications will keep their sales data in a CSV file; some will use a database or a remote microservice. We want an application developer to be able to take the gondorff code and plug it into her specific data source, which she may write herself or get from yet another developer.</p> <p>But before we embark on the refactoring to enable all this, I do need to stress that presentation-domain-data layering does have its limitations. The general rule of modularity is that we want to confine the consequences of change to one module if we can. But separate presentation-domain-data modules often do have to change together. The simple act of adding a data field will usually cause all three to update. As a result I favor using this approach in smaller scopes, but larger applications need high level modules to be developed along different lines. In particular you shouldn't use the presentation-domain-data layers as a basis for team boundaries.</p> </section> <section id = 'PerformingTheSplit'> <h2>Performing the split</h2> <p>I'll begin splitting into modules by separating the presentation. For the JavaScript case, this is little more than cutting and pasting code into a new file.</p> <p class = 'code-label'>gondorff.es6 </p> <pre>
export default function gondorffNumber …
function gondorffEpoch(product) {…
function baselineRange(product){…
function deriveEpoch(countingBase) { …
function hookerExpiry() { …
function salesDataFor(product, start, end) { …
function recordCounts(start) { …
function salesData() { …
function makeRecord(line) { …</pre> <aside class = 'sidebar' id = 'Es6Modules'> <h2>ES6 Modules</h2> <p>I'm using the facilities for modules that are part of ECMAScript 6, which became settled as I was writing this. 
I found <a href = 'http://exploringjs.com/es6/'>Axel Rauschmayer's book, Exploring ES6,</a> very helpful in understanding how these features worked.</p> </aside> <p>By using <code>export default</code> I can import the reference to <code>gondorffNumber</code>, so I only have to add an import statement.</p> <p class = 'code-label'>app.es6 </p> <pre>
import gondorffNumber from './gondorff.es6'</pre> <p>On the Java side, it's almost as straightforward. Again I copy everything other than <code>emitGondorff</code> over to a new class.</p> <p class = 'code-label'>class Gondorff… </p> <pre>
public double gondorffNumber(String product) { …
private LocalDate gondorffEpoch(String product) { …
private LocalDate baselineRange(String product) { …
private LocalDate deriveEpoch(long base) { …
private LocalDate hookerExpiry() { …
Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) { …
long recordCounts(LocalDate start) {…
Stream<SalesRecord> salesData() { …
private SalesRecord makeSalesRecord(String line) { …</pre> <p>For the original <code>App</code> class I don't need an import unless I put the new class into a new package, but I do need to instantiate the new class.</p> <p class = 'code-label'>class App... 
</p> <pre>
public String emitGondorff(List<String> products) {
  List<String> result = new ArrayList<>();
  result.add("\n<table>");
  for (String p : products)
    result.add(String.format(" <tr><td>%s</td><td>%4.2f</td></tr>", p, <span class = 'highlight'>new Gondorff().</span>gondorffNumber(p)));
  result.add("</table>");
  return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n")));
}</pre> <p>I now want to do a second separation, between the calculation logic and the code that offers up the data records.</p> <p class = 'code-label'>dataSource.es6… </p> <pre>
export function salesDataFor(product, start, end) {</pre> <p class = 'code-label'> </p> <pre>
export function recordCounts(start) {</pre> <p class = 'code-label'> </p> <pre>
function salesData() { …
function makeRecord(line) { …</pre> <p>A difference between this move and the earlier one is that the gondorff file needs to import two functions rather than just one. It can do that with this import; nothing else needs to change.</p> <p class = 'code-label'>Gondorff.es6… </p> <pre>
import {salesDataFor, recordCounts} from './dataSource.es6'</pre> <p>The Java version is very similar to the previous case: move the methods into a new class, and instantiate that class to get a new object.</p> <p class = 'code-label'>class DataSource… </p> <pre>
public Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) { …
public long recordCounts(LocalDate start) {…
Stream<SalesRecord> salesData() { …
private SalesRecord makeSalesRecord(String line) { …</pre> <p class = 'code-label'>class Gondorff... 
</p> <pre>
public double gondorffNumber(String product) {
  return <span class = 'highlight'>new DataSource().</span>salesDataFor(product, gondorffEpoch(product), hookerExpiry())
      .filter(r -> r.getDate().toString().matches(".*01$"))
      .findFirst()
      .get()
      .getQuantity() * Math.PI
    ;
}
private LocalDate gondorffEpoch(String product) {
  final long countingBase = <span class = 'highlight'>new DataSource().</span>recordCounts(baselineRange(product));
  return deriveEpoch(countingBase);
}</pre> <div class = 'figure ' id = 'ref-dep_split.png'><img src = 'refactoring-dependencies/ref-dep_split.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>This separation into files is a mechanical process that's not really that interesting. But it's a necessary first step before we reach the interesting refactorings.</p> </section> <section id = 'LinkerSubstitution'> <h2>Linker Substitution</h2> <p>Dividing up the code into several modules is helpful, but the interesting difficulty in all of this is the desire to distribute the Gondorff calculations as a separate component. Currently the Gondorff calculations assume that the sales data comes from a CSV file with a particular path. Separating the data source logic gives me some ability to change that, but the current mechanism I have is awkward and there are other options to explore.</p> <p>So what is the current mechanism? Essentially this is what I'll call Linker Substitution. The term “linker” is something of a throwback to compiled languages like C, where the link stage resolves symbols across separate compilation units. In JavaScript I can achieve the moral equivalent of this by manipulating the lookup path of files for the import command.</p> <p>Let's imagine I want to install this application in an environment where they don't keep sales records in a CSV file, but instead run a query on a SQL database. 
To make this work I first need to create a CorporateDatabaseDataSource file with exported functions for <code>salesDataFor</code> and <code>recordCounts</code> that return the data in the form that the Gondorff file expects. I then replace the DataSource file with this new one. Then when I run the application it “links” to the replaced DataSource file.</p> <p>For many dynamic languages that rely on some kind of path lookup mechanism for linking, the Linker Substitution is a pretty nice technique for simple component substitutions. I don't have to do anything with my code to make it work, other than the simple separation into files that I've just done. If I have a build script, I can build the code for different data source environments by simply copying different files into the appropriate points in the path. This illustrates the advantage of keeping a program factored into small pieces - it allows substitution of those pieces, even if the original writer didn't have any substitutions in mind. It enables unforeseen customization. </p> <p>To do Linker Substitution in Java is essentially the same task. I would need to package <code>DataSource</code> in a separate jar file from <code>Gondorff</code>, then instruct the user of <code>Gondorff</code> to create a class called <code>DataSource</code> with the appropriate methods and put it onto the classpath.</p> <p>However with Java I'd do an additional step, applying <a href = 'http://refactoring.com/catalog/extractInterface.html'>Extract Interface</a> on the data source.</p> <pre>
public interface DataSource {
  Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end);
  long recordCounts(LocalDate start);
}</pre> <pre>
public class CsvDataSource implements DataSource {</pre> <p>Using a <a href = '/bliki/RequiredInterface.html'>Required Interface</a> like this is helpful because it makes explicit what functions gondorff is expecting from its data source. 
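</p> <p>To make that concrete, here is a minimal sketch of a second implementation of the required interface. It assumes a simplified <code>SalesRecord</code> with just the three fields the examples touch. <code>InMemoryDataSource</code> and <code>RequiredInterfaceSketch</code> are hypothetical names that don't appear in the original code; the in-memory list stands in for something like a database-backed source.</p>

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo class: client code sees only the DataSource interface.
final class RequiredInterfaceSketch {
  public static void main(String[] args) {
    DataSource source = new InMemoryDataSource(Arrays.asList(
        new SalesRecord("widget", LocalDate.of(2015, 1, 1), 5),
        new SalesRecord("widget", LocalDate.of(2015, 2, 1), 7)));
    System.out.println(source.recordCounts(LocalDate.of(2015, 1, 15)));
  }
}

// Assumed, simplified shape of SalesRecord; the article does not show this class.
final class SalesRecord {
  private final String product;
  private final LocalDate date;
  private final int quantity;

  SalesRecord(String product, LocalDate date, int quantity) {
    this.product = product;
    this.date = date;
    this.quantity = quantity;
  }

  String getProduct() { return product; }
  LocalDate getDate() { return date; }
  int getQuantity() { return quantity; }
}

// The required interface, as extracted in the text.
interface DataSource {
  Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end);
  long recordCounts(LocalDate start);
}

// Hypothetical alternative to CsvDataSource: serves records from a list in memory.
final class InMemoryDataSource implements DataSource {
  private final List<SalesRecord> records;

  InMemoryDataSource(List<SalesRecord> records) {
    this.records = records;
  }

  @Override
  public Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) {
    // Same window as the JavaScript version: start inclusive, end exclusive.
    return records.stream()
        .filter(r -> r.getProduct().equals(product))
        .filter(r -> !r.getDate().isBefore(start))
        .filter(r -> r.getDate().isBefore(end));
  }

  @Override
  public long recordCounts(LocalDate start) {
    // Counts every record dated on or after start, regardless of product.
    return records.stream()
        .filter(r -> !r.getDate().isBefore(start))
        .count();
  }
}
```

<p>Since <code>InMemoryDataSource</code> declares that it implements <code>DataSource</code>, the compiler rejects it if either method is missing or has the wrong signature, which is the kind of explicitness the interface is there to provide.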
</p> <p>One of the downsides of dynamic languages is they lack this explicitness, which can be a problem when combining components that have been developed separately. JavaScript's module system works well here because it statically defines the module dependencies, so they are explicit and can be checked statically. Static declarations have costs and benefits; one of the nice developments in recent language design is a more nuanced approach to static declarations, rather than just treating languages as purely static or dynamic.</p> <p>Linker Substitution has the advantage that it requires little work on the part of the component author, so it fits in with unforeseen customization. But it has its downsides. In some environments, such as Java, it can be fiddly to work with. The code doesn't reveal how the substitution works, so there's no mechanism for controlling the substitution in the code base.</p> <p>An important consequence of this lack of presence in the code is that the substitution cannot occur dynamically - that is, once the program has been assembled and run, I can't change the data source. This usually isn't a big deal in production; there are cases where hot-swapping the data source is useful, but they are a minority. The real value of dynamic substitution comes with testing. It's very common to want to use <a href = '/bliki/TestDouble.html'>Test Doubles</a> to provide canned data for testing, which often means I'll want to throw in different doubles for different test cases. 
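</p> <p>As a rough sketch of what such doubles might look like in Java, assuming the <code>DataSource</code> interface extracted earlier and a simplified <code>SalesRecord</code>. <code>StubDataSource</code> and <code>StubDataSourceSketch</code> are hypothetical names, not part of the original code. The stub ignores its arguments and hands back whatever canned values a test gives it.</p>

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical demo: two test cases, each building the double it needs.
final class StubDataSourceSketch {
  public static void main(String[] args) {
    DataSource noSales = new StubDataSource(0);
    DataSource oneSale = new StubDataSource(1,
        new SalesRecord("widget", LocalDate.of(2015, 1, 1), 5));

    System.out.println(noSales.salesDataFor("widget", LocalDate.MIN, LocalDate.MAX).count());
    System.out.println(oneSale.recordCounts(LocalDate.MIN));
  }
}

// Assumed, simplified shape of SalesRecord; the article does not show this class.
final class SalesRecord {
  private final String product;
  private final LocalDate date;
  private final int quantity;

  SalesRecord(String product, LocalDate date, int quantity) {
    this.product = product;
    this.date = date;
    this.quantity = quantity;
  }

  String getProduct() { return product; }
  LocalDate getDate() { return date; }
  int getQuantity() { return quantity; }
}

// The required interface from the Linker Substitution discussion.
interface DataSource {
  Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end);
  long recordCounts(LocalDate start);
}

// Test double: ignores its arguments and always answers with the canned values.
final class StubDataSource implements DataSource {
  private final long cannedCount;
  private final List<SalesRecord> cannedRecords;

  StubDataSource(long cannedCount, SalesRecord... cannedRecords) {
    this.cannedCount = cannedCount;
    this.cannedRecords = Arrays.asList(cannedRecords);
  }

  @Override
  public Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) {
    return cannedRecords.stream();
  }

  @Override
  public long recordCounts(LocalDate start) {
    return cannedCount;
  }
}
```

<p>Each test case can build the double it needs, but the code under test still has to be handed that double at run time, and a path lookup gives it no way to receive one.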
</p> <p>These demands for greater explicitness in the code base and dynamic substitution for testing usually lead us to explore other alternatives, ones that allow us to specify how components are wired up explicitly rather than just relying on path lookups.</p> </section> <section id = 'DataSourceAsParameterWithEachCall'> <h2>Data source as parameter with each call</h2> <p>If we want to support calling gondorff with different data sources, then one obvious way to do it is to pass the data source as a parameter each time we call it. </p> <p>Let's look at how this might look in the Java version first, beginning with its current state after extracting the DataSource interface.</p> <p class = 'code-label'>class App... </p> <pre>
public String emitGondorff(List<String> products) {
  List<String> result = new ArrayList<>();
  result.add("\n<table>");
  for (String p : products)
    result.add(String.format(
      " <tr><td>%s</td><td>%4.2f</td></tr>",
      p,
      new Gondorff().gondorffNumber(p)
    ));
  result.add("</table>");
  return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n")));
}</pre> <p class = 'code-label'>class Gondorff... </p> <pre>
public double gondorffNumber(String product) {
  return new CsvDataSource().salesDataFor(product, gondorffEpoch(product), hookerExpiry())
      .filter(r -> r.getDate().toString().matches(".*01$"))
      .findFirst()
      .get()
      .getQuantity() * Math.PI
    ;
}
private LocalDate gondorffEpoch(String product) {
  final long countingBase = new CsvDataSource().recordCounts(baselineRange(product));
  return deriveEpoch(countingBase);
}</pre> <p>After changing the code to pass in the data source as a parameter, it looks like this.</p> <p class = 'code-label'>class App... 
</p> <pre> public String emitGondorff(List<String> products) { List<String> result = new ArrayList<>(); result.add("\n<table>"); for (String p : products) result.add(String.format( " <tr><td>%s</td><td>%4.2f</td></tr>", p, new Gondorff().gondorffNumber(p, <span class = 'highlight'>new CsvDataSource()</span>) )); result.add("</table>"); return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n"))); }</pre> <p class = 'code-label'>class Gondorff... </p> <pre> public double gondorffNumber(String product, <span class = 'highlight'>DataSource <span class = 'highlight'>dataSource</span></span>) { return <span class = 'highlight'>dataSource</span>.salesDataFor(product, gondorffEpoch(product, dataSource), hookerExpiry()) .filter(r -> r.getDate().toString().matches(".*01$")) .findFirst() .get() .getQuantity() * Math.PI ; } private LocalDate gondorffEpoch(String product, <span class = 'highlight'>DataSource <span class = 'highlight'>dataSource</span></span>) { final long countingBase = <span class = 'highlight'>dataSource</span>.recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>I can do this refactoring in a few small steps.</p> <ul> <li>Use <a href = 'http://refactoring.com/catalog/addParameter.html'>Add Parameter</a> on <code>gondorffEpoch</code> to add <code>dataSource</code></li> <li>Replace the call to <code>new CsvDataSource()</code> to use the just added <code>dataSource</code> parameter</li> <li>Compile and test</li> <li>Repeat for <code>gondorffNumber</code></li> </ul> <p>Now over to the JavaScript version, again here's the current state. 
</p> <p class = 'code-label'>app.es6… </p> <pre> import gondorffNumber from './gondorff.es6'</pre> <p class = 'code-label'> </p> <pre> function emitGondorff(products) { function line(product) { return [ ` <tr>`, ` <td>${product}</td>`, ` <td>${gondorffNumber(product).toFixed(2)}</td>`, ` </tr>`].join('\n'); } return encodeForHtml(`<table>\n${products.map(line).join('\n')}\n</table>`); }</pre> <p class = 'code-label'>Gondorff.es6… </p> <pre> import {salesDataFor, recordCounts} from './dataSource.es6'</pre> <p class = 'code-label'> </p> <pre> export default function gondorffNumber(product) { return salesDataFor(product, gondorffEpoch(product), hookerExpiry()) .find(r => r.date.match(/01$/)) .quantity * Math.PI ; } function gondorffEpoch(product) { const countingBase = recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>In this case I can pass both functions as parameters</p> <p class = 'code-label'>app.es6… </p> <pre> import gondorffNumber from './gondorff.es6' <span class = 'highlight'> import * as dataSource from './dataSource.es6'</span></pre> <p class = 'code-label'> </p> <pre> function emitGondorff(products) { function line(product) { return [ ` <tr>`, ` <td>${product}</td>`, ` <td>${gondorffNumber(product, <span class = 'highlight'>dataSource.salesDataFor, dataSource.recordCounts)</span>.toFixed(2)}</td>`, ` </tr>`].join('\n'); } return encodeForHtml(`<table>\n${products.map(line).join('\n')}\n</table>`); }</pre> <p class = 'code-label'>Gondorff.es6… </p> <pre> <span class = 'deleted'>import {salesDataFor, recordCounts} from './dataSource.es6'</span> export default function gondorffNumber(product, <span class = 'highlight'>salesDataFor, recordCounts)</span> { return salesDataFor(product, gondorffEpoch(product, recordCounts), hookerExpiry()) .find(r => r.date.match(/01$/)) .quantity * Math.PI ; } function gondorffEpoch(product, <span class = 'highlight'>recordCounts</span>) { const countingBase = 
recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>As with the Java example, I can apply <a href = 'http://refactoring.com/catalog/addParameter.html'>Add Parameter</a> to <code>gondorffEpoch</code> first, compile and test, and then do the same to <code>gondorffNumber</code> for each function.</p> <p>In this situation I'd be inclined to put both the <code>salesDataFor</code> and <code>recordCounts</code> functions onto a single data source object and pass that in instead - essentially using <a href = 'http://refactoring.com/catalog/introduceParameterObject.html'>Introduce Parameter Object</a>. I won't do this in this article, primarily because it's a better demonstration of manipulating first class functions if I don't. But if gondorff needed to use more functions from the data source I would.</p> <section id = 'ParamterizingTheDataSourceFileName'> <h3>Parameterizing the data source file name</h3> <p>As a further step I can parameterize the filename for the data source. 
For the java version I do this by adding a field for the filename to the datasource and using <a href = 'http://refactoring.com/catalog/addParameter.html'>Add Parameter</a> to its constructor.</p> <p class = 'code-label'>class CsvDataSource… </p> <pre> private String filename; public CsvDataSource(String filename) { this.filename = filename; }</pre> <p class = 'code-label'>class App… </p> <pre> public String emitGondorff(List<String> products) { <span class = 'highlight'> DataSource <span class = 'highlight'>dataSource</span> = new CsvDataSource("sales.csv");</span> List<String> result = new ArrayList<>(); result.add("\n<table>"); for (String p : products) result.add(String.format( " <tr><td>%s</td><td>%4.2f</td></tr>", p, new Gondorff().gondorffNumber(p, <span class = 'highlight'>dataSource</span>) )); result.add("</table>"); return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n"))); }</pre> <p>For the JavaScript version I need to use <a href = 'http://refactoring.com/catalog/addParameter.html'>Add Parameter</a> on the functions that need it on the data source.</p> <p class = 'code-label'>dataSource.es6… </p> <pre> export function salesDataFor(product, start, end, <span class = 'highlight'>filename</span>) { return salesData(<span class = 'highlight'>filename</span>) .filter(r => (r.product === product) && (new Date(r.date) >= start) && (new Date(r.date) < end) ); } export function recordCounts(start, <span class = 'highlight'>filename</span>) { return salesData(<span class = 'highlight'>filename</span>) .filter(r => new Date(r.date) >= start) .length }</pre> <p>Left as it is, this would force me to put the filename parameter into the gondorff functions, but really they shouldn't need to know anything about that. 
I can fix this by creating a simple adapter.</p> <p class = 'code-label'>dataSourceAdapter.es6… </p> <pre> import * as ds from './dataSource.es6' export default function(filename) { return { salesDataFor(product, start, end) {return ds.salesDataFor(product, start, end, filename)}, recordCounts(start) {return ds.recordCounts(start, filename)} } }</pre> <p>The application code uses this adapter when it passes the data source into the gondorff function.</p> <p class = 'code-label'>app.es6… </p> <pre> import gondorffNumber from './gondorff.es6' <span class = 'deleted'>import * as dataSource from './dataSource.es6'</span> <span class = 'highlight'> import createDataSource from './dataSourceAdapter.es6'</span></pre> <p class = 'code-label'> </p> <pre> function emitGondorff(products) { function line(product) { <span class = 'highlight'> const dataSource = createDataSource('sales.csv');</span> return [ ` <tr>`, ` <td>${product}</td>`, ` <td>${gondorffNumber(product, dataSource.salesDataFor, dataSource.recordCounts).toFixed(2)}</td>`, ` </tr>`].join('\n'); } return encodeForHtml(`<table>\n${products.map(line).join('\n')}\n</table>`); }</pre> </section> <section id = 'TradeOffsToParameterizing'> <h3>Trade offs to parameterizing</h3> <p>Passing in the data source with each call to gondorff gives me the dynamic substitution that I'm looking for. As an application developer I can use any data source I like, I can also easily test by passing in stub data sources whenever I need to.</p> <p>But there are also downsides to using a parameter with each call like this. Firstly I have to pass the data source (or its functions) as a parameter to every function in gondorff that either needs it, or calls another function that needs it. 
This can result in the data source being a piece of tramp data that wanders around everywhere.</p> <p>The more serious problem is that now every time I have an application module that uses gondorff I have to ensure I can create and configure the data source too. This can easily get messy if I have a more complicated configuration, with a generic component that needs several required components, each of which has its own set of required components. Every time I use gondorff I have to embed the knowledge there as to how I configure the gondorff object. That's a duplication that complicates the code, making it harder to understand and use. </p> <p>I can visualize this by looking at the dependencies. Before introducing the data source as a parameter the dependencies look like this:</p> <div class = 'figure ' id = 'ref-dep-01.png'><img src = 'refactoring-dependencies/ref-dep-01.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>When I pass the data source as a parameter it looks like this.</p> <div class = 'figure ' id = 'ref-dep-04.png'><img src = 'refactoring-dependencies/ref-dep-04.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>In these diagrams, I'm distinguishing between a usage dependency and a creation dependency. A usage dependency means that the client module calls functions defined on the supplier. There will always be a usage dependency between gondorff and data source. The creation dependency is a much more intimate dependency, since you usually need to know more about a supplier module in order to configure and create it. (A creation dependency implies a usage dependency.) Using a parameter with each call reduces the dependency from gondorff from creation to usage, but introduces a creation dependency from any application.</p> <p>As well as the creation dependency problem, there's another issue: I don't actually want to vary the data source in the production code. 
Passing the parameter with each call to gondorff implies that I'm varying the parameter between calls, but here whenever I call <code>gondorffNumber</code> I'm always passing in exactly the same data source. That dissonance is apt to confuse me in six months' time.</p> <p>If I have the same configuration for the data source all the time, it makes sense to set it up once and refer to it each time I use it. But if I do that, I might as well set gondorff up once, and use a fully configured gondorff every time I want to use it.</p> <p>So having explored what using a parameter each time looks like, I'll make use of my version control system and do a hard reset to where I was at the beginning of this section so I can explore another path.</p> </section> </section> <section id = 'SingularServices'> <h2>Singular Services</h2> <p>An important property of both gondorff and dataSource is that they both can act as singular service objects. A service object is part of the <a href = '/bliki/EvansClassification.html'>Evans Classification</a>, referring to an object that's oriented around an activity as opposed to entities or values that are focused around data. Often I refer to service objects as “services”, but they are different to services in SOA as they aren't network accessible components. In a functional world, services are often just functions, but sometimes you do find situations where you want to treat a set of functions as a single thing. We see this with data source, where we have two functions that I can think of as part of a single data source.</p> <p>I also said “singular”; by this I mean it makes conceptual sense to only have one of these for a whole execution context. Since services are usually stateless, it makes sense to only have one around. If something is singular in an execution context, it means that we may refer to it globally within our program. 
We may even want to force it to be a singleton, perhaps because it's expensive to set up or there are concurrency constraints on resources it's manipulating. There may be only one of them in the entire process we're running in, or there may be more, such as one per thread using thread-specific storage. But either way, as far as our code's concerned, there's only one of them.</p> <p>If we choose to make our gondorff calculator and data source be singular services, then it makes sense to configure them once, during the startup of the application, and then refer to them later on when using them.</p> <p>This introduces a separation in the way services are handled: a separation of configuration and use. There are a couple of ways I can refactor this code to do this separation: introducing either the Service Locator pattern or the Dependency Injection pattern. I'll start with Service Locator.</p> </section> <section id = 'IntroducingServiceLocator'> <h2>Introducing Service Locator</h2> <p>The idea behind the Service Locator pattern is to have a singular point for components to locate services. The locator is a <a href = '/eaaCatalog/registry.html'>Registry</a> of services. In use, a client uses a global lookup to find the registry, then asks the registry for a particular service. Configuration sets up the locator with all the services that are needed.</p> <p>The first step in the refactoring to introduce it is to create the locator. It's a pretty simple structure, little more than a global record, so my JavaScript version is just a few variables and a simple initializer.</p> <p class = 'code-label'>serviceLocator.es6… </p> <pre> export let salesDataFor; export let recordCounts; export let gondorffNumber; export function initialize(arg) { salesDataFor = arg.salesDataFor; recordCounts = arg.recordCounts; gondorffNumber = arg.gondorffNumber; }</pre> <p class = 'code-remark'><code>export let</code> exports a variable to other modules as a read-only view. 
<span class = 'foot-ref' data-footnote='footnote-babel-readonly'>1</span></p> <div class = 'post-block-footnote footnote-babel-readonly'> <p><span class = 'num'>1: </span> I developed these examples using Babel. At the moment Babel has <a href = 'https://github.com/babel/babel/issues/2276'>a bug</a> allowing you to reassign exported variables. The specification for ES6 says exported variables are <a href = 'http://exploringjs.com/es6/ch_modules.html#leanpub-auto-imports-are-read-only-views-on-exports'>exported as a read-only view</a>. </p> </div> <p>The Java one is, of course, a bit more long-winded.</p> <p class = 'code-label'>class ServiceLocator… </p> <pre> private static ServiceLocator soleInstance; private DataSource dataSource; private Gondorff gondorff; public static DataSource dataSource() { return soleInstance.dataSource; } public static Gondorff gondorff() { return soleInstance.gondorff; } public static void initialize(ServiceLocator arg) { soleInstance = arg; } public ServiceLocator(DataSource dataSource, Gondorff gondorff) { this.dataSource = dataSource; this.gondorff = gondorff; }</pre> <p class = 'code-remark'>My preference in this situation is to provide an interface of static methods, so that clients of the locator don't need to know about where the data is stored. But I like to use a singleton instance for the data, as that makes it easier to substitute for testing.</p> <p>In both cases, the service locator is a set of attributes.</p> <section id = 'RefactoringTheJavascriptToUseTheLocator'> <h3>Refactoring the JavaScript to use the locator</h3> <p>With the locator defined, the next step is to start moving services over to it; I begin with gondorff. 
To configure the service locator, I'll write a small module.</p> <p class = 'code-label'>configureServices.es6… </p> <pre> import * as locator from './serviceLocator.es6'; import gondorffImpl from './gondorff.es6'; export default function() { locator.initialize({gondorffNumber: gondorffImpl}); }</pre> <p>I need to ensure this function is imported and called at application start up.</p> <p class = 'code-label'>some startup file… </p> <pre> import initializeServices from './configureServices.es6';</pre> <p class = 'code-label'> </p> <pre> initializeServices();</pre> <p>To refresh our memories, here's the current application code (after the earlier revert).</p> <p class = 'code-label'>app.es6… </p> <pre> import gondorffNumber from './gondorff.es6'</pre> <p class = 'code-label'> </p> <pre> function emitGondorff(products) { function line(product) { return [ ` <tr>`, ` <td>${product}</td>`, ` <td>${gondorffNumber(product).toFixed(2)}</td>`, ` </tr>`].join('\n'); } return encodeForHtml(`<table>\n${products.map(line).join('\n')}\n</table>`); }</pre> <p>To use the service locator instead, all I need to do is adjust the import statement.</p> <p class = 'code-label'>app.es6… </p> <pre><span class = 'highlight'> <span class = 'deleted'>import gondorffNumber from './gondorff.es6'</span></span> <span class = 'highlight'> import {gondorffNumber} from './serviceLocator.es6';</span></pre> <p>I can run tests with just this change to ensure I didn't mess it up (that sounds better than “to find how I messed that up”). 
With that change done I make a similar change for the data source.</p> <p class = 'code-label'>configureServices.es6… </p> <pre> import * as locator from './serviceLocator.es6'; import gondorffImpl from './gondorff.es6'; <span class = 'highlight'> import * as dataSource from './dataSource.es6' ;</span> export default function() { locator.initialize({ <span class = 'highlight'> salesDataFor: dataSource.salesDataFor,</span> <span class = 'highlight'> recordCounts: dataSource.recordCounts,</span> gondorffNumber: gondorffImpl }); }</pre> <p class = 'code-label'>Gondorff.es6… </p> <pre> import {salesDataFor, recordCounts} from './<span class = 'highlight'>serviceLocator</span>.es6'</pre> <p>I can use the same refactoring as earlier to parameterize the file name; this time the change only affects the service configuration function.</p> <p class = 'code-label'>configureServices.es6… </p> <pre> import * as locator from './serviceLocator.es6'; import gondorffImpl from './gondorff.es6'; <span class = 'deleted'>import * as dataSource from './dataSource.es6' ;</span> <span class = 'highlight'> import createDataSource from './dataSourceAdapter.es6'</span> export default function() { <span class = 'highlight'> const dataSource = createDataSource('sales.csv');</span> locator.initialize({ salesDataFor: dataSource.salesDataFor, recordCounts: dataSource.recordCounts, gondorffNumber: gondorffImpl }); }</pre> <p class = 'code-label'>dataSourceAdapter.es6… </p> <pre> import * as ds from './dataSource.es6' export default function(filename) { return { salesDataFor(product, start, end) {return ds.salesDataFor(product, start, end, filename)}, recordCounts(start) {return ds.recordCounts(start, filename)} } }</pre> </section> <section id = 'Java'> <h3>Java</h3> <p>The Java case looks much the same. 
I create a configuration class to populate the service locator.</p> <p class = 'code-label'>class ServiceConfigurator… </p> <pre> public class ServiceConfigurator { public static void run() { ServiceLocator locator = new ServiceLocator(null, new Gondorff()); ServiceLocator.initialize(locator); } }</pre> <p>And ensure I have a call to this somewhere in application startup.</p> <p>The current application code looks like this:</p> <p class = 'code-label'>class App… </p> <pre> public String emitGondorff(List<String> products) { List<String> result = new ArrayList<>(); result.add("\n<table>"); for (String p : products) result.add(String.format( " <tr><td>%s</td><td>%4.2f</td></tr>", p, new Gondorff().gondorffNumber(p) )); result.add("</table>"); return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n"))); }</pre> <p>I now use the locator to get the gondorff object.</p> <p class = 'code-label'>class App… </p> <pre> public String emitGondorff(List<String> products) { List<String> result = new ArrayList<>(); result.add("\n<table>"); for (String p : products) result.add(String.format( " <tr><td>%s</td><td>%4.2f</td></tr>", p, <span class = 'highlight'>ServiceLocator.gondorff()</span>.gondorffNumber(p) )); result.add("</table>"); return HtmlUtils.encode(result.stream().collect(Collectors.joining("\n"))); }</pre> <p>To add the data source object into the mix, I start by adding it to the locator.</p> <p class = 'code-label'>class ServiceConfigurator… </p> <pre> public class ServiceConfigurator { public static void run() { ServiceLocator locator = new ServiceLocator(<span class = 'highlight'>new CsvDataSource()</span>, new Gondorff()); ServiceLocator.initialize(locator); } }</pre> <p>Currently the gondorff object looks like this:</p> <p class = 'code-label'>class Gondorff… </p> <pre> public double gondorffNumber(String product) { return new CsvDataSource().salesDataFor(product, gondorffEpoch(product), hookerExpiry()) .filter(r -> 
r.getDate().toString().matches(".*01$")) .findFirst() .get() .getQuantity() * Math.PI ; } private LocalDate gondorffEpoch(String product) { final long countingBase = new CsvDataSource().recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>Using the service locator changes it thus:</p> <p class = 'code-label'>class Gondorff… </p> <pre> public double gondorffNumber(String product) { return <span class = 'highlight'>ServiceLocator.dataSource()</span>.salesDataFor(product, gondorffEpoch(product), hookerExpiry()) .filter(r -> r.getDate().toString().matches(".*01$")) .findFirst() .get() .getQuantity() * Math.PI ; } private LocalDate gondorffEpoch(String product) { final long countingBase = <span class = 'highlight'>ServiceLocator.dataSource()</span>.recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>As with the JavaScript case, parameterizing the filename just alters the service configuration code.</p> <p class = 'code-label'>class ServiceConfigurator… </p> <pre> public class ServiceConfigurator { public static void run() { ServiceLocator locator = new ServiceLocator(new CsvDataSource(<span class = 'highlight'>"sales.csv"</span>), new Gondorff()); ServiceLocator.initialize(locator); } }</pre> </section> <section id = 'ConsequencesOfUsingAServiceLocator'> <h3>Consequences of using a service locator</h3> <p>The immediate effect of using a service locator is altering the dependencies between our three components. After the simple division of components, we can see the dependencies look like this.</p> <div class = 'figure ' id = 'ref-dep_sl.png'><img src = 'refactoring-dependencies/ref-dep_sl.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>Introducing the service locator removes all the creation dependencies between the primary modules. <span class = 'foot-ref' data-footnote='footnote-locator-deps'>2</span> 
Of course this is ignoring the configure services module, which has all the creation dependencies.</p> <div class = 'post-block-footnote footnote-locator-deps'> <p><span class = 'num'>2: </span> One might argue that the java version of the service locator has dependencies on gondorff and data source due to them being mentioned in the type signatures. I'm discounting that here, since the locator doesn't actually invoke any methods on those classes. I could also remove those static type dependencies with some type gymnastics, although I suspect the cure would be worse than the disease. </p> </div> <div class = 'figure ' id = 'ref-dep_sl-create.png'><img src = 'refactoring-dependencies/ref-dep_sl-create.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>I'm sure some of you might have noticed that the application customization is being done by the service configuration function, which implies any customization is being done by the same linker substitution mechanism that I earlier said we need to get away from. That's true to some extent, but the fact that the service configuration module is clearly independent gives me a lot more flexibility. A library provider can supply a range of data source implementations, and clients can write a service configuration module that will select one at runtime based on configuration parameters such as a configuration file, environment variables, or a command line variable. There's a potential refactoring here to introduce parameters from a configuration file, but I'll leave that for another day. </p> <p>But a particular result of using a Service Locator is that I can now easily substitute services for testing. 
I can put a test stub in for gondorff's data source like this:</p> <p class = 'code-label'>test… </p> <pre> it('can stub a data source', function() { const data = [ {product: "p", date: "2015-07-01", quantity: 175} ]; const newLocator = { recordCounts: () => 500, salesDataFor: () => data, gondorffNumber: serviceLocator.gondorffNumber }; serviceLocator.initialize(newLocator); assert.closeTo(549.7787, serviceLocator.gondorffNumber("p"), 0.001); });</pre> <p class = 'code-label'>class Tester... </p> <pre> @Test public void can_stub_data_source() throws Exception { ServiceLocator.initialize(new ServiceLocator(new DataSourceStub(), ServiceLocator.gondorff())); assertEquals(549.7787, ServiceLocator.gondorff().gondorffNumber("p"), 0.001); } private class DataSourceStub implements DataSource { @Override public Stream<SalesRecord> salesDataFor(String product, LocalDate start, LocalDate end) { return Collections.singletonList(new SalesRecord("p", LocalDate.of(2015, 7, 1), 175)).stream(); } @Override public long recordCounts(LocalDate start) { return 500; } }</pre> </section> </section> <section id = 'SplitPhase'> <h2>Split Phase</h2> <p>While I was working on this article, I visited Kent Beck. After being fed his home-made cheese, our conversation turned to refactoring topics and he told me about an important refactoring that he'd recognized a decade ago, but never got into a decent written form. This refactoring involved taking a complex computation and splitting it into two phases, with the first phase passing its result to the second phase in some intermediate results data structure. 
A large scale example of this pattern is that used by compilers, which split their work into many phases: tokenizing, parsing, code generation, with data structures such as token streams and parse trees acting as intermediate results.</p> <p>When I got back home and started on this article again, I quickly recognized that introducing a Service Locator like this is an example of the Split Phase refactoring. I've extracted the configuration of the service objects into its own phase, using the Service Locator as the intermediate results structure that passes the result of the configure-services phase to the rest of the program.</p> <p>Splitting computation like this into separate phases is a useful refactoring because it allows us to think separately about the different needs in each phase, there is a clear indication of the results of each phase (in the intermediate results), and each phase can be tested independently by checking or supplying the intermediate results. This refactoring works especially well when we treat the intermediate results as an immutable data structure, giving us the benefits of working with the later phase code without having to reason about mutation behavior on any data generated by the earlier phase.</p> <p>As I write this, it's barely a month since that conversation with Kent, but I feel that the notion of Split Phase is a powerful one to use for refactoring. Like many great patterns it has that notion of obviousness - I feel like it's just putting a name to something that I've been doing for decades. But such a name isn't a small thing: once you name an oft-used technique like this, it makes it easier to talk to other people about and alters my own thinking, giving it a more central role and more deliberate usage than comes when it's done unconsciously.</p> </section> <section id = 'DependencyInjection'> <h2>Dependency Injection</h2> <p>Using a service locator has the downside that the component objects need to know how the service locator works. 
This isn't a problem if the gondorff calculator is only used in the context of a well-understood range of applications that use the same service locator machinery, but should I want to sell it to make my fortune, that coupling is a problem. Even if all my eager buyers use service locators, it's unlikely that they will all use the same API. What I need is a way to configure gondorff with a data source in such a way that doesn't require any machinery other than what's built into the language itself.</p> <p> This is the need that led to a different form of configuration that's called dependency injection. Dependency injection is trumpeted a lot, particularly in the Java world, with all sorts of frameworks to implement it. While these frameworks can be useful, the basic idea is really very simple and I'll illustrate it by refactoring this example to a simple implementation.</p> <section id = 'JavaExample'> <h3>Java example</h3> <p>The heart of the idea is that you should be able to write components like the gondorff object without needing to know about any special conventions or tools for configuring dependent components. The natural way to do this in Java is for the gondorff object to have a field that holds the data source. That field can be populated by service configuration in the usual ways you populate any field - either with a setter or during construction. Since the gondorff object needs a data source to do anything useful, my usual approach is to put it into the constructor.</p> <p class = 'code-label'>class Gondorff… </p> <pre><span class = 'highlight'> private DataSource dataSource; public Gondorff(DataSource dataSource) { this.dataSource = dataSource; }</span> <span class = 'highlight'> private DataSource <span class = 'highlight'>getDataSource()</span> { return (dataSource != null) ? 
dataSource : ServiceLocator.dataSource(); }</span> public double gondorffNumber(String product) { return <span class = 'highlight'>getDataSource()</span>.salesDataFor(product, gondorffEpoch(product), hookerExpiry()) .filter(r -> r.getDate().toString().matches(".*01$")) .findFirst() .get() .getQuantity() * Math.PI ; } private LocalDate gondorffEpoch(String product) { final long countingBase = <span class = 'highlight'>getDataSource()</span>.recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p class = 'code-label'>class ServiceConfigurator… </p> <pre> public class ServiceConfigurator { public static void run() { ServiceLocator locator = new ServiceLocator(new CsvDataSource("sales.csv"), new Gondorff(<span class = 'highlight'>null</span>)); ServiceLocator.initialize(locator); } }</pre> <p>By putting in the accessor <code>getDataSource</code> I can do the refactoring in smaller steps. This code works fine with the configuration done with service locator, I can gradually replace tests that set up the locator with tests that use this new dependency injection mechanism. The first refactoring just adds the field and applies <a href = 'http://refactoring.com/catalog/addParameter.html'>Add Parameter</a>. Callers can use the constructor with a null argument initially and I can work them one at a time to supply a data source, testing after each change. (Of course, since we do all service configuration in the configuration phase, there are usually not many callers. 
Where we get more callers is the stubbing in tests.)</p> <p class = 'code-label'>class ServiceConfigurator… </p> <pre> public class ServiceConfigurator { public static void run() { DataSource dataSource = new CsvDataSource("sales.csv"); ServiceLocator locator = new ServiceLocator(dataSource, new Gondorff(dataSource)); ServiceLocator.initialize(locator); } }</pre> <p>Once I've done them all, I can remove all references to the service locator from the gondorff object.</p> <p class = 'code-label'>class Gondorff… </p> <pre> private DataSource getDataSource() { <span class = 'deleted'>return (<span class = 'highlight'>dataSource</span> != null) ? dataSource : ServiceLocator.dataSource();</span> return <span class = 'highlight'>dataSource</span>; }</pre> <p class = 'code-remark'>I could also inline <code>getDataSource</code> if I felt so inclined</p> </section> <section id = 'JavascriptExample'> <h3>JavaScript example</h3> <p>Since I'm eschewing classes in the JavaScript example, a way to ensure the gondorff calculator gets the data source functions without extra frameworks is to pass them as parameters with each call.</p> <p class = 'code-label'>Gondorff.es6… </p> <pre> <span class = 'deleted'>import {recordCounts} from './serviceLocator.es6'</span> export default function gondorffNumber(product, salesDataFor, recordCounts) { return salesDataFor(product, gondorffEpoch(product, recordCounts), hookerExpiry()) .find(r => r.date.match(/01$/)) .quantity * Math.PI ; } function gondorffEpoch(product, recordCounts) { const countingBase = recordCounts(baselineRange(product)); return deriveEpoch(countingBase); }</pre> <p>I did this approach before of course, but this time need to ensure that clients don't need to do any set up with each call. 
I can do this by providing a partially applied gondorff function to clients.</p> <p class = 'code-label'>configureServices.es6… </p> <pre> import * as locator from './serviceLocator.es6'; import gondorffImpl from './gondorff.es6'; import createDataSource from './dataSourceAdapter.es6' export default function() { const dataSource = createDataSource('sales.csv'); locator.initialize({ salesDataFor: dataSource.salesDataFor, recordCounts: dataSource.recordCounts, <span class = 'highlight'> gondorffNumber: (product) => gondorffImpl(product, dataSource.salesDataFor, dataSource.recordCounts)</span> }); }</pre> </section> <section id = 'Consequences'> <h3>Consequences</h3> <p>If we look at the dependencies during the usage phase, the diagram looks like this.</p> <div class = 'figure ' id = 'ref-dep_di.png'><img src = 'refactoring-dependencies/ref-dep_di.png'></img> <p class = 'photoCaption'></p> </div> <div class = 'clear'></div> <p>The only difference between this and the earlier use of service locator is that there's no longer any dependency between gondorff and the service locator, which is the whole point of using dependency injection. (The configuration phase dependencies are the same set of creation dependencies.)</p> <p>Once I've removed the dependency from gondorff to service locator, I can also remove the data source field from the service locator entirely if there aren't any other classes that need to get a data source from the service locator. I could also use dependency injection to provide the gondorff object to application classes, although there's much less value in doing that since the application classes aren't shared and thus aren't disadvantaged by using a locator. It's common to see the service locator and dependency injection patterns used together like this, with a service locator used to get an initial service whose further configuration has been done through dependency injection. 
Dependency injection containers are often used as service locators by providing a mechanism to look up a service. </p> </section> </section> <section id = 'FinalThoughts'> <h2>Final Thoughts</h2> <p>The key message of this refactoring episode is that of splitting the phase of service configuration from the use of the services. Exactly how you use service locators and dependency injection to perform this is less of an issue, and depends on the specific circumstances you're in. These circumstances may well lead you to a packaged framework to manage these dependencies, or if your case is simple it may be fine to roll your own.</p> </section> <hr class = 'bodySep'></hr> </div> <div class = 'appendix'> <div class = 'footnote-list' id = 'footnote-list'> <h2>Footnotes</h2> <div class = 'footnote-list-item' id = 'footnote-babel-readonly'> <p><span class = 'num'>1: </span> I developed these examples using Babel. At the moment Babel has <a href = 'https://github.com/babel/babel/issues/2276'>a bug</a> allowing you to reassign exported variables. The specification for ES6 says exported variables are <a href = 'http://exploringjs.com/es6/ch_modules.html#leanpub-auto-imports-are-read-only-views-on-exports'>exported as a read-only view</a>. </p> </div> <div class = 'footnote-list-item' id = 'footnote-locator-deps'> <p><span class = 'num'>2: </span> One might argue that the Java version of the service locator has dependencies on gondorff and data source due to them being mentioned in the type signatures. I'm discounting that here, since the locator doesn't actually invoke any methods on those classes. I could also remove those static type dependencies with some type gymnastics, although I suspect the cure would be worse than the disease. </p> </div> </div> <section id = 'Acknowledgements'> <h2>Acknowledgements</h2> <p>Pete Hodgson and Axel Rauschmayer gave me valuable help in improving my JavaScript. Ben Wu (伍斌) suggested a useful illustration. 
Jean-Noël Rouvignac corrected several typos.</p> </section> </div> <div class = 'appendix'> <details id = 'SignificantRevisions'> <summary>Significant Revisions</summary> <p><i>13 October 2015: </i>First published</p> </details> </div> </main> <footer id='page-footer'> <div class='copyright'> <p>© Martin Fowler</p> </div> </footer> </body> </html>