Chapter 2, EP.1 - A Philosophy of Software Architecture in the Age of AI
Before we begin: the world of code seen through the metaphor of architecture
Programming code and physical buildings may look like entirely different domains at first glance, yet a striking similarity runs between them. Just as an architect draws blueprints and combines materials to create space, a developer holds a design in mind and builds spaces of logic out of code. The term software architecture has long been used for exactly this reason: it helps us understand the structure of a software system by likening it to the structure of a real building. One advantage of the comparison is that it lets us picture both static structure and dynamic flow at a glance. Seen through an architectural metaphor, directories become partitioned spaces, files become the rooms within them, and **paths** come to feel like corridors. The front end is responsible for visible beauty, like a building's exterior walls, windows, paint, and finishing, while the back end holds up function from within, like hidden wiring and plumbing. The buttons in a UI are like the thermostat in a boiler room or a central control panel: a simple press on the surface controls an enormous flow inside the system. The insides of functions and pipelines become channels for the flow of information, like the inside of a pipe through which water runs, and an AI model built with a modular architecture (the Transformer, for example) calls to mind prefabricated construction, or a modern building raised by stacking LEGO blocks. Advances in algorithms and data structures are like the introduction of a new construction method (新工法) in architecture, letting us build larger and more complex software cities. In the end, code is a building made by structuring a developer's thought, and today's enormous AI systems can be called a single city composed of countless structures and flows.
In this piece, let us examine these analogies one by one and take a philosophical look at the world of programming in the language of architecture.
The directory structure by which we organize files inside a computer closely resembles the internal structure of a real building. A directory is a kind of **space (a large partition)**, and the files contained within it each serve as an independent **room**. In fact, when pointing to a file path in UNIX-like operating systems, people sometimes explain it using this very spatial metaphor. For instance, the expression *“open the file /foo/bar”* can be unpacked as **“pass through the corridor called /foo and enter the room called bar.”** A single file is a single room, and the directory entry for that file (the path made of the file name) is likened to the **door** that leads into that room. So opening a file is, in the end, the same act as opening that room's door and stepping inside. And just as a single room can have several doors, several different paths (hard links and the like) can point to a single file. When we visualize directories and files through this analogy, even a complex file system gets drawn in the mind like a single building. Our workspace inside the computer is, in effect, one digital building.
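The "several doors into one room" idea can be tried out directly. The following is a minimal sketch, assuming a filesystem that supports hard links (most UNIX-like systems do); the function name and the file names are made up for illustration.

```python
import os
import tempfile


def demo_two_doors():
    """Create one file (a room) and give it a second path (a second door)."""
    with tempfile.TemporaryDirectory() as building:
        corridor = os.path.join(building, "foo")
        os.mkdir(corridor)

        front_door = os.path.join(corridor, "bar")
        side_door = os.path.join(building, "bar_side_door")

        # Furnish the room through the front door.
        with open(front_door, "w") as room:
            room.write("hello from the room\n")

        # Add a second directory entry that points at the same file.
        os.link(front_door, side_door)

        same_room = os.path.samefile(front_door, side_door)
        door_count = os.stat(front_door).st_nlink

        # Entering through the side door shows the same contents.
        with open(side_door) as room:
            contents = room.read()

        return same_room, door_count, contents
```

Calling `demo_two_doors()` returns whether both paths refer to the same file, how many directory entries point at it, and what is read through the second path.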
This spatial analogy has long been used in the field of user experience (UX) as well. The method of classifying information with files and folders can be said to derive from the old office metaphor of organizing documents with file folders and drawers. Yet information in the digital world differs from reality in that, unlike physical paperwork, it does not exist in only one place — through hard links or references you can create several entrances to a single room. Even so, through this spatial metaphor we are able to draw and remember an invisible file system as a spatial map in our minds. In the end, designing a directory structure well can be said to be the same as drawing up a building's floor plan. If we create a structure that is easy for users or developers to remember and easy to move through, then even a complex project can be explored without getting lost, like a well-ordered building.
Looking at this analogy more closely, the roles of the technologies commonly used in web development correspond to the individual elements of architecture. HTML, for example, is the language responsible for a web page's skeleton and structure, which corresponds to a building's foundation and walls. Once you wrap content in HTML tags to create structure, you can dress it up on top with CSS. CSS is responsible for visual expression — setting text color, decorating the background, arranging the layout, and so on — which is the same as beautifying a home with paint, wallpaper, carpet, and pictures. In fact, the MDN Web Docs likewise put it this way: *“CSS is like the paint and wallpaper that make a house look nice.”* Finally, **JavaScript** grants dynamic interaction to the web, and this is likened to the electronics and home appliances inside a house (the oven, the TV, the microwave, and so on). Just as a house needs appliances to actually bring convenience to daily life, JavaScript breathes life into a static page and creates a system that responds to the user in real time.
As another example of likening front-end elements to architectural finishes, things like buttons, icons, and menus in a UI framework can be seen as the switches or levers a user operates. Just like a boiler room's thermostat or a building elevator's buttons, users press buttons on the screen, but behind them an enormous machine is turning. Just as a user can flip a tiny switch on the wall to control the heating and cooling of an entire building, a single tiny button in a web app can trigger a server's complex computation or pull information from a database. In this sense, the components of a UI are the control panel exposed on the surface, while the actual engine or boiler is the back-end system logic, hidden from view. The front end being designed cleanly and elegantly is similar to how people feel more favorably toward a building the better its well-finished, decorated interior and exterior design. But just as a building that is all gloss on the outside while flimsy within soon reveals its problems, no matter how splendid the front end is, if the back end is not solid it amounts to nothing more than a useless shell to the user. In the end, just as a beautiful exterior and a sturdy structure must come into harmony to make a building of high quality, the harmony of front end and back end can be said to make good software.
When we compare code to architecture, let us think about the dynamic aspect — the part where information moves as time passes. A building has **plumbing (pipes)** installed for water and heating, and wiring laid for electricity. They are usually invisible to people, but they are crucial elements that supply the flow of water and electricity throughout the entire building and give it life. In the same way, inside software there exist invisible connectors that carry data and control flow. The call chains of functions, the data pipelines between modules, event buses, API communication, and so on — all of these correspond to such channels of flow.
In particular, the term “pipeline” in software engineering comes directly from plumbing. The concept appears, for example, in the pipe symbol `|` of Unix shells and in data-processing pipelines. A pipeline is a structure in which several processing stages are connected in series, with the output of one stage becoming the input of the next. It is called that because information flows through it in one direction, just as water flows in one direction through **physical plumbing**. When we write code that joins functions together, or send a data stream through several processors, we are in a sense doing a kind of software plumbing.
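A pipeline of this kind can be sketched with Python generators, where each stage pulls from the one before it, much like `a | b | c` in a shell. The stage names here are hypothetical and chosen only to illustrate the one-directional flow.

```python
def read_lines(text):
    """Stage 1: the source of the pipe, emitting one line at a time."""
    for line in text.splitlines():
        yield line


def strip_blank(lines):
    """Stage 2: a filter that trims whitespace and drops empty lines."""
    for line in lines:
        trimmed = line.strip()
        if trimmed:
            yield trimmed


def to_upper(lines):
    """Stage 3: a transformer that upper-cases whatever flows through."""
    for line in lines:
        yield line.upper()


def run_pipeline(text):
    """Connect the stages in series: each output feeds the next input."""
    return list(to_upper(strip_blank(read_lines(text))))
```

Because generators are lazy, each line travels through all three stages before the next one enters, so nothing has to be held in memory except the item currently in the pipe.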
These “invisible pipes” may go unnoticed by someone encountering a system for the first time, but the most experienced developers are often likened to “plumber programmers.” Someone has remarked that “software plumbing is the important work of connecting the things that other people don't even want to see or think about.” The point is that, even though the UIs and major modules visible on the surface (the big boxes) may look impressive, properly implementing the **arrows (the connecting lines)** that actually link those boxes is the hardest and most important work of all. Put simply, designing and laying the countless pipes and wires that flow behind a building is harder than building its large, beautiful halls and rooms. But in a building that works well, the users or residents enjoy its conveniences without even being aware that the plumbing exists. The apps and web services we use every day are the same: it is precisely because the complex internal data flows and control logic are smoothly connected that we can use the functions we need by pressing just a few buttons on the surface. So a great software architect, like a plumber, designs even the parts that cannot be seen with meticulous care, performing the role of laying the blood vessels and nerves into the building that is the whole system.
In programming, the pipe-and-filter pattern, the flow pattern, and the like are all concepts related to the construction of these flows. Connecting several functions like a pipeline through **function composition** in functional programming, or expressing data flow as streams in reactive programming, is also similar to assembling plumbing. What matters is that you have to match the diameter of each pipe (the buffer size) and the manner of connection (synchronous/asynchronous, blocking/non-blocking, and so on) well, so that you can build a system in which water neither leaks nor clogs. This connects to designing software so that it prevents bottlenecks, avoids deadlocks, and maintains flow without data loss. In the end, the inside of a function is the space inside a pipe through which water flows, and the execution of a function — in which input is transformed into output and flows out — can be thought of as the process of carrying things along a corridor. Seen this way, even a complex algorithm gets drawn like the plumbing diagram of an enormous plant, and we can fix bugs and optimize just as an engineer tightens valves and installs pumps.
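The point about pipe diameter and blocking connections can be illustrated with a bounded queue between two threads. This is a small sketch under assumed names (`run_bounded_pipe`, `pipe_diameter`); it shows backpressure only and leaves out error handling and timeouts.

```python
import queue
import threading


def run_bounded_pipe(items, pipe_diameter=2):
    """Move items from a producer to a consumer through a bounded queue."""
    pipe = queue.Queue(maxsize=pipe_diameter)  # the buffer size is the pipe's diameter
    done = object()  # sentinel that marks the end of the flow
    received = []

    def producer():
        for item in items:
            pipe.put(item)  # blocks while the pipe is full (backpressure)
        pipe.put(done)

    def consumer():
        while True:
            item = pipe.get()  # blocks while the pipe is empty
            if item is done:
                break
            received.append(item * 2)

    threads = [threading.Thread(target=producer), threading.Thread(target=consumer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return received
```

A small `pipe_diameter` makes the producer wait for the consumer instead of flooding memory, while a larger one lets the two sides run more independently. With a single producer and a single consumer, nothing is lost and order is preserved in either case.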
In modern software architecture, **modularity** and **reusability** are core principles. This is in line with how prefabricated construction and modular design are gaining attention in architecture. Just as building a large structure goes up in speed and efficiency when, instead of making the whole thing on-site, you stamp out standardized modules in a factory in advance (rooms or wall units, say) and assemble them on-site, software too becomes easier to maintain and easier to extend when it is developed and combined in modular units. In other words, modular architecture is a way of making independently operable components — like LEGO blocks — and then combining them as needed to form one large system. Each individual module has a clear role and boundary, and communicates through a well-defined interface. In architecture, too, a modular building is assembled by transporting each block by truck and fitting it together on-site; in software, likewise, we build systems by connecting modules with APIs or event buses.
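As a rough illustration of modules with a clear boundary and a well-defined interface, here is a sketch using `typing.Protocol`. The names `Storage`, `InMemoryStorage`, and `NoteService` are invented for this example and do not come from any particular framework.

```python
from typing import Protocol


class Storage(Protocol):
    """The interface: the only thing other modules are allowed to rely on."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> str: ...


class InMemoryStorage:
    """One interchangeable block that satisfies the Storage interface."""

    def __init__(self) -> None:
        self._data = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]


class NoteService:
    """A module that depends on the interface, not on a concrete block."""

    def __init__(self, storage: Storage) -> None:
        self._storage = storage

    def add_note(self, title: str, body: str) -> None:
        self._storage.save(title, body)

    def read_note(self, title: str) -> str:
        return self._storage.load(title)
```

Since `NoteService` only knows about `save` and `load`, the in-memory block could later be swapped for a file-backed or database-backed one without touching the service, much as a prefabricated unit can be replaced without rebuilding the floor around it.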
The pinnacle of this modular thinking is precisely the **Transformer** model that has driven recent AI innovation. A Transformer neural network is made up of multiple **layers** stacked repeatedly, which is similar to continuing to raise the floors of a building with a fixed floor height. Each floor (layer) holds a module of identical structure, but in the process of being stacked it gains ever richer expressive power. By one analogy, if a shallow neural network is a small building of one or two floors, a **deep network** can be likened to a high-rise. As it has been put: **“Each layer of a neural network is like a single floor of a skyscraper, so that with every floor you go up, new perspectives and functions are added that could not be obtained from a single floor below alone.”** In fact, the standard explanation of deep learning is that the first layer learns simple patterns (edges or lines, say), the next layer combines those to learn more complex features (shapes or contours, say), and by the highest layer it comes to recognize whole concepts (faces or objects, say). This is similar to the process of constructing a building by laying the underground foundation and then stacking up the first floor, the second floor, one by one, until at last a tall building is complete. Just as you can gaze out at an open view from the top floor that you could never see from the low floors alone, the deeper the layers of a neural network go, the more abstract and holistic the understanding that becomes possible.
The modular structure of a Transformer model is like a set of modern building blocks. The core elements that make up a Transformer, such as **attention heads** and feed-forward layers, are a kind of standardized part, arranged many at a time in parallel and in series. These parts can be combined and stacked repeatedly according to the size of the model. It feels like raising an enormous structure by placing window panels or steel frames of identical specification over and over. Thanks to this repetition, the Transformer architecture is relatively easy to grasp as a design concept (if you understand one layer, ten layers run on a similar principle), and a large model can be built by scaling up the same design used for a small one. That is, just as you gather and connect several small houses to make an apartment complex, you combine many small Transformer blocks to build an enormous language model (GPT and the like). Indeed, the idea that “a modular architecture is made up of multiple building blocks” is a trend across modern technology as a whole, from cloud architecture to AI models (reference: aws.amazon.com). Once you have such a reusable, assemblable structure, partial modifications and upgrades become easier too, so you can evolve the system flexibly, as if renovating or extending part of a building.
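The idea of stacking identical floors can be sketched in plain Python. This is a deliberately simplified toy, not a real Transformer: it uses single-head attention with no learned projections (queries, keys, and values are the tokens themselves), replaces the learned feed-forward network with a scaled ReLU, and omits layer normalization. All function names are assumptions made for illustration.

```python
import math


def softmax(row):
    """Turn raw scores into weights that sum to 1."""
    peak = max(row)
    exps = [math.exp(value - peak) for value in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product attention with Q = K = V = tokens (a simplification)."""
    dim = len(tokens[0])
    mixed = []
    for query in tokens:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        weights = softmax(scores)
        mixed.append(
            [sum(w * value[i] for w, value in zip(weights, tokens)) for i in range(dim)]
        )
    return mixed


def feed_forward(tokens, scale):
    """Per-token stand-in for a learned MLP: ReLU followed by scaling."""
    return [[scale * max(0.0, x) for x in token] for token in tokens]


def add(left, right):
    """Element-wise sum, used for the residual connections."""
    return [[x + y for x, y in zip(a, b)] for a, b in zip(left, right)]


def block(tokens, scale):
    """One floor: attention, then feed-forward, each with a residual connection."""
    tokens = add(tokens, self_attention(tokens))
    tokens = add(tokens, feed_forward(tokens, scale))
    return tokens


def stack(tokens, scales):
    """Raise the building: same floor plan per block, different parameters."""
    for scale in scales:
        tokens = block(tokens, scale)
    return tokens
```

Every block takes a list of token vectors and returns a list of the same shape, which is what allows any number of floors to be stacked, for example `stack(tokens, [0.1] * 12)`.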
In the world of software, algorithms and data structures are the core ideas about how to solve a given problem and how to organize data. If we compare these concepts to architecture, the invention of an efficient algorithm is like the development of a new construction method. In the history of architecture, for example, the introduction of the arch and the **dome** made it possible to cover a wider space than before without columns, and the advent of reinforced-concrete construction made high-rise buildings possible. In this way, a new construction method or material brought innovation to the scale, form, and stability of buildings. Likewise, in computer science, the invention of the quicksort algorithm made sorting far faster in practice than simple quadratic methods, and the hash table data structure made lookup by key take roughly constant time on average. Each changed what programmers considered feasible. This is as if architects had gotten their hands on new structural-design techniques or cutting-edge materials; for programmers, it means they have gained tools that let them handle larger and more complex problems.
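Both "construction methods" mentioned above fit in a few lines. The sketch below uses a simple list-based variant of quicksort (not the in-place version usually found in textbooks), and Python's built-in `dict` as the hash table; the `floor_directory` data is made up for illustration.

```python
def quicksort(values):
    """Sort by partitioning around a pivot, then sorting each side recursively."""
    if len(values) <= 1:
        return list(values)
    pivot = values[len(values) // 2]
    smaller = [v for v in values if v < pivot]
    equal = [v for v in values if v == pivot]
    larger = [v for v in values if v > pivot]
    return quicksort(smaller) + equal + quicksort(larger)


def find_by_scan(pairs, wanted):
    """Linear search: walk every entry until the key matches."""
    for key, value in pairs:
        if key == wanted:
            return value
    return None


# The same data as a list of pairs and as a hash table (hypothetical example data).
floor_pairs = [("lobby", 1), ("library", 3), ("observatory", 40)]
floor_directory = dict(floor_pairs)
```

`find_by_scan(floor_pairs, "observatory")` has to walk the list, whereas `floor_directory["observatory"]` jumps to the entry by hashing the key, which is why the difference grows with the size of the data.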
Taken together, algorithms and data structures are the technical foundation of software architecture. An outstanding algorithm lets you implement the same function more lightly and quickly, so it is comparable to the architectural innovation of making a stronger structure with lighter materials. Furthermore, in the age of AI, new machine-learning algorithms and optimization techniques keep appearing, and through them we have become able to erect kinds of “intelligent software buildings” that could not be built before. The adoption of the **backpropagation** algorithm in deep learning can be likened to the introduction of rebar into AI architecture, and the introduction of the **attention** mechanism is close to the invention of the arch that holds up a complex structure. In this way, the new construction methods of the code world widen the range and raise the performance of the systems we can build, letting us create larger and more elaborate code cities.
Our writing of code can be seen not merely as listing a series of commands for a computer, but as the act of structuring our thought process and implementing it externally. From a philosophical perspective, an entire codebase reflects the **architecture of thought** of the person (or people) who made the program. Through the structure of the code we can glimpse how a problem was approached and broken down, how the data was made to flow, and how exceptional situations were prepared for. When we try to understand a complex software system, it is, in the end, similar to the process of following the map of the thought of the person who designed that system.
In this context, interesting conversations also take place at the frontier of dealing with AI code. In one research conversation, someone expressed it this way: **“It's an architecture of thought in code.”** In other words, it means that through the medium of code, the components of thought (logic, inference, conditionals, loops, and so on) exist organized like a building. In modern AI agent systems in particular, multiple modules or agents cooperate by exchanging messages with one another, and this is a picture in which each is a single object of thought (a thinking being), interwoven organically to form a larger thought process. Just as the human brain is composed of a network of neurons and unfolds thought, an enormous piece of software also unfolds thought through a network of countless functions, objects, and processes. So designing a software architecture can also be seen as the act of designing a kind of way of thinking. It is the work of deciding which module should know what and be kept from knowing what (information hiding), which part should make the decisions (control-flow structure), and when to unfold thought in parallel and when to do it serially (concurrency control). This is quite similar to designing the cognitive structure of a brain.
What is even more interesting is that, in the age of AI, as machines come to imitate or take over part of human thought, programming itself is becoming a delegation of thought. Whereas in the past a person coded every procedure by hand, now, through machine learning, the computer learns patterns from vast data and forms an internal logical structure of its own. It is as if an AI architect, rather than a human, were designing the building. If you look inside the result, you may find a building of thought far more complex than any code a human wrote directly, and organized in ways that run against human intuition. The more this is the case, the more we return to the essential questions: “What is code, and who draws the blueprint?” For if code is the externalization of thought, then AI may become an entity that expands and reconstructs our buildings of thought on its own.
Now let us go beyond an individual program and compare an entire AI system, in which countless services and modules are interwoven, to one enormous city. It is a comprehensive picture that includes the spatial metaphor, the building metaphor, and the plumbing metaphor mentioned earlier. A city is a complex organism made up of many buildings, roads, bridges, parks, and infrastructure facilities. Likewise, a modern AI system forms one ecosystem in which countless components (microservices, databases, caches, user interfaces, batch processing, and so on) are connected to one another. Software-engineering researchers have long tried to understand such complex systems through software visualization, and one well-known approach is the **city metaphor**. The tool *CodeCity* represents classes as buildings and packages (or directories) as **districts** of a city, visualizing large object-oriented systems as a city (reference: wettel.github.io). Indeed, reading the code of a large project can feel like wandering among countless classes (buildings) and modules (districts), like a traveler strolling through a forest of buildings. In the CodeCity metaphor, the size or complexity of each class is represented by the height or footprint of a building, and dependency relationships among classes are sometimes represented by roads or bridges. Such visualization shows the structure of a complex system at a glance, letting us survey the panorama of a software city.
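The mapping from classes to buildings can be sketched with the standard `ast` module. The particular choice below (method count as height, class-level attribute count as footprint) is an assumption made for illustration, and the function name is hypothetical; the actual CodeCity tool has its own metrics and rendering.

```python
import ast


def buildings_from_source(source):
    """Map each class in a Python source string to simple building dimensions."""
    tree = ast.parse(source)
    buildings = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef):
            # Assumed mapping: number of methods becomes the building's height.
            methods = sum(
                isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef))
                for child in node.body
            )
            # Assumed mapping: number of class-level attributes becomes its footprint.
            attributes = sum(
                isinstance(child, (ast.Assign, ast.AnnAssign))
                for child in node.body
            )
            buildings[node.name] = {"height": methods, "footprint": attributes}
    return buildings


SAMPLE_SOURCE = '''
class Tower:
    floors = 10

    def open(self): pass
    def close(self): pass
    def inspect(self): pass


class Shed:
    def store(self): pass
'''
```

Running `buildings_from_source(SAMPLE_SOURCE)` yields one entry per class, which a drawing layer could then render as boxes of the corresponding height and base.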
The reason the city metaphor is especially useful is, as mentioned earlier, that it lets us call to mind the static structure and the dynamic flow at the same time. If you think about a real city, there is, on the one hand, static infrastructure such as the arrangement of buildings and the road network, and on the other, the dynamic flow in which vehicles constantly move and people are active within it. Good city planning takes both aspects into account. Likewise, in a software city, both the structural aspect (which module is placed where and how it forms a hierarchy) and the dynamic aspect (which data flows where at runtime, and how much traffic arises) are important. For example, if one part of an AI system transmits a large volume of data to another part, this corresponds to a city's highway, and traffic will pile up on that stretch. Some module becomes a bottleneck and causes congestion, while some service may have spare capacity and sit quiet. Likening this picture to a city's traffic flow or population movement makes it easy to understand. And if you think about security, just as a city needs walls or gates to block outside intrusion, a system too needs a firewall or an authentication gateway. Thinking about when a failure occurs: just as, when a fire breaks out in one building, fire trucks are dispatched and there are fire-prevention measures so the flames do not spread to the surroundings, in a software city too, when a problem arises in one component, a structure that reroutes traffic or inserts a circuit breaker to keep it from spreading to the whole is important.
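The firebreak idea at the end of this paragraph is often implemented as a circuit breaker. Here is a minimal sketch with invented names (`CircuitBreaker`, `CircuitOpenError`, `max_failures`); it deliberately omits the timeout and half-open recovery state that production implementations usually have.

```python
class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


class CircuitBreaker:
    """Stop calling a failing component after too many consecutive failures."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.failures = 0

    @property
    def is_open(self):
        return self.failures >= self.max_failures

    def call(self, func, *args, **kwargs):
        if self.is_open:
            # The fire door is shut: do not let the request reach the component.
            raise CircuitOpenError("circuit open; call skipped")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1  # one more consecutive failure
            raise
        self.failures = 0  # a success resets the count
        return result
```

Wrapping calls to a downstream service in `breaker.call(...)` means that once the service has failed `max_failures` times in a row, callers fail fast instead of piling more traffic onto the burning building.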
Going further, at the level of enterprise architecture, a large corporation's IT system can be likened to a nation or a continent made up of several cities. The shift to the cloud corresponds to remodeling an entire city or building a new town, greatly changing the terrain and the buildings of the existing city. Just as Baron Haussmann's renovation in the 19th century changed the face of Paris, a company's migration to a cloud architecture transforms the terrain of its IT city. Seen from this macroscopic perspective, a developer is closer to an urban planner or an architect than to a mere coder. Writing code may be the work of erecting a single building, but designing an entire system is like designing a skyline and drawing the map of a city.
Finally, through the perspective of seeing an AI system as a city, we can also sketch a picture of the future. Just as a smart city makes the city itself intelligent with IoT and AI, an AI-system city too can evolve into an autonomous city that monitors and optimizes itself. For example, APM (application performance management) tools or AIOps grasp the state of the system-city in real time and coordinate problems, much like a city's traffic-control system or environmental-sensor network. In the long run, AI may play the role of the city's **mayor**, automatically handling system resource allocation and optimization, security response, and so on. If that happens, human developers will be able to concentrate on even more creative design and improvement. But no matter how automated it becomes, the basic principle does not change: to build a stable and scalable system, you must design structure and flow in balance, with the mindset of architecture and urban planning. The metaphor of seeing the world of code as a city provides that macroscopic field of vision, and reminds us that the AI systems we build are each a living, breathing city.
Up to now, we have surveyed the world of software through various metaphors that liken code to architecture: directory structure, front end and back end, pipelines and modular architecture, the innovation of algorithms, and the city metaphor for enormous AI systems. These comparisons are more than mere amusement. A metaphor is a powerful tool that aids our thinking. When we understand an unfamiliar object through a familiar concept, even a complex system can be grasped intuitively. Of course, metaphors have their limits. Software, unlike a building, is easy to change (it can be refactored continuously, as agile practice encourages), can change even while running (dynamic reconfiguration, self-modifying code, and so on), and is free from physical constraints. Even so, the architectural metaphor remains useful in software-engineering education and communication, and it enriches our spatial and structural understanding of the code we have made.
As programmers and digital architects, we erect new buildings every day with intangible material. Sometimes a single short script is used like a one-page temporary shed and then disappears; the code of a large platform sometimes becomes entangled like a complex building that has undergone extension and remodeling over and over for decades. On this journey, holding the architectural metaphor in mind is meaningful in that it makes us constantly ponder the balance between aesthetics and structure, between user experience and functionality. Just as a beautiful building is also functional, beautifully structured code tends to be excellent in maintainability, efficiency, and scalability as well. And by taking on the gaze of an urban planner when we look at a large-scale system, we can pursue the health of the whole system without getting buried in partial optimization.
The code of the AI age is becoming ever more complex and vast, but the point that its essence is the result of organizing thought does not change. So we, sitting before the editor again today, are in effect making functions with the mind of building a single small room, connecting those rooms to make corridors, combining modules to raise buildings, and going further to draw the horizon of a digital city. In understanding this enormous creative activity, there is surely no guide as intuitive and rich as the architectural metaphor. I hope these analogies have given new insight and joy to the curious students learning programming, to beginners, and to all adult learners. For the world of code is the world of buildings built with our thought, and within it, we are constructing a magnificent city again today.
References: various discussions on software architecture and metaphor (medium.com, developer.mozilla.org), the room/corridor analogy for file systems (unix.stackexchange.com), the analogy of pipelines and plumbing (encyclopedia.pub, johndcook.com), the concept of modular architecture (aws.amazon.com), the analogy for neural-network layers (medium.com), “code city” visualization and the merits of the city metaphor (wettel.github.io, ewernli.com), and the timely view that *“code is the architecture of thought”* (researchgate.net), among others. As these materials also show, metaphor serves as a powerful bridge to understanding software. I look forward to the buildings of code we imagine rising ever higher and more beautifully in the years to come.