Directory structure with Elixir

How to organize an Elixir/Phoenix project to scale? What decision-making process do you use? A case study from a real estate auction platform.

26 February 2024 — Goulven CLEC'H

  1. What’s the problem?
    1. Inconsistent structures
  2. Main considerations
    1. Context-driven architecture
  3. Three errors avoided
    1. Nesting hell
    2. Scrolling hell
    3. Messy hell
  4. Our final proposal
    1. About test files
    2. About CQRS/ES files
  5. About the decision process
  6. What I learned
    1. Possible improvement

What’s the problem?

Elixir is a functional programming language, where the code is mainly organized as functions (duh!) and modules (a group of functions). However, the Elixir documentation doesn’t give a lot of advice on how to organize a project, as expected from a general-purpose language.

The Phoenix documentation, Elixir’s web framework, does offer a directory structure example for new projects, but it’s mainly detailed for the front-end part (/lib/hello_web/). The back-end part (/lib/hello/) is mostly composed by schemas files (feature.ex, the data structure and its changesets) and context files (features.ex, a module with all the public functions).

But our tiny real-estate start-up has grown over three years, with different developers and new features. Sadly, our back-end directory structure hasn’t evolved with it and has become a mess.

To be honest, I am extensively to blame for this situation. After the departure of the company’s first developer, I remained alone as a junior developer. With the intention of recruiting another CTO, I maintained the project’s initial philosophy as long as possible and delayed making essential architectural decisions.

Even as recruitment difficulties became apparent, it took me some time to step up as the tech lead. I suffered through the situation, which seemed beyond my control. And when we compensated with freelancers/ interns, I didn’t dare nor know how to manage them correctly, leading to less coherent or qualitative code.

Today, the situation is different, as I gradually regained control of the project and built my confidence. I have built a team of five people, including three software developers, destined to grow. And now, it is time to re-organize our project!

Let’s consider an extract of the old structure :

encheres_immo/
├─ accounts.ex <- multi-schemas context
├─ accounts/
│ ├─ user.ex <- schema
│ ├─ organization.ex <- schema
│ ├─ formula.ex <- child schema
│ └─ [... utils]
├─ auctions.ex <- context
├─ auctions/
│ ├─ projections/
│ │ ├─ auction.ex <- schema
│ │ └─ [... utils]
│ ├─ currency.ex <- child schema
│ └─ [... utils]
├─ contacts/
│ ├─ contacts.ex <- context
│ ├─ contact.ex <- schema
│ └─ [... utils]
[...]
├─ application.ex <- app life cycle
├─ release.ex <- utils for migrations and seeding
└─ [... repo related files]

If we can identify three contexts in this example (accounts, auctions, and contacts), their inconsistent structure makes the whole project hard to understand.

Additionally, some of these directory structures simply don’t work. For example, instead of users and organizations context with their related functions, we have an abnormally long and messy accounts.ex. Worse, some schemas file, like user.ex and organization.ex, contain some functions that should also be in context.

This situation arose from poorly documented personal logic, followed by various developers interpreting it in their own way. We can guess the origin of some problems:

  1. The initial developer aimed to group account functions together, but their number made this file unreadable.
  2. Another developer (maybe me) wanted to reduce this file by putting the simplest functions with their schema, making unclear if a function should be in the context or with the the schema.
  3. With an absurdly big accounts/ folder next to a small contacts/ folder, it’s now unclear when a feature should have its folder. This decision became at the discretion of each developer.
  4. Let that sink for a few years and ta-da!

Main considerations

Before starting to iterate, I knew my solution had to address several considerations.

First, our system should be versatile enough to cover all our current and future needs, without requiring a complete overhaul.

Second, « What is conceived well is expressed clearly ». If good readability and consistency are the main objectives, this should make our solution easy to document and, therefore, easy to enforce. A few paragraphs explaining key concepts and a structure diagram should be enough.

Finally, I want to involve the current team as much as possible in decision-making. This approach will not only lead to a better solution but also ensure that the entire team is familiar with the solution, making it easier to implement initially.

We kept Phoenix’s « context-driven architecture » 1, not only because it’s the expected architecture by Elixir/Phoenix developers, but also because reorganizing the project it’s already a time-consuming task, so changing it’s whole philosophy is out of scope.1 — Is there a standard term?

But it was a good opportunity for me to learn about other architectures philosophy, like Clean Architecture, Hexagonal Architecture, and Domain-Driven Design (DDD). I really want to try the first two on a greenfield personal project, but they imply too much changes for our current project. I also definitely took some inspiration from DDD, especially the idea of bounded context to help us define our contexts scope and boundaries, but we needed to keep a strict separation between the back-end (encheres_immo) and front-end parts (encheres_immo_web & encheres_immo_api).22 — This separation is standard in Elixir monolith projects. It has also ruled out a pure vertical architecture.

Contexts still remain the core of our back-end, as they contain all the business logic, that will be used by the front-end parts (web and api). I will also use the term « feature » to describe a context with its related schemas, utils, and sub-features.

Three errors avoided

We made several iterations before finding the right one. Some were deliberately radical and helped us to identify wrong directions.

Here is our initial proposal based on this blog post, with some personal modifications. Each feature is in a folder, inside its parent feature, and so on, following the schema relations.

encheres_immo/
├─ accounts/
│ ├─ accounts.ex <- multi-schemas context
│ ├─ organizations/
│ │ ├─ organizations.ex <- context
│ │ ├─ organization.ex <- schema
│ │ ├─ formulas/
│ │ │ ├─ formulas.ex <- child context
│ │ │ └─ formula.ex <- child schema
│ │ └─ [... other child folders]
│ └─ users/
│ ├─ users.ex <- context
│ ├─ user.ex <- schema
│ └─ [... other child folders]
[...]
├─ application.ex <- app life cycle
├─ release.ex <- utils for migrations and seeding
└─ [... repo related files]

This solution has several advantages: fewer files at the root, coherent structures throughout, and allowing multi-schema contexts (if necessary). Also, module names follow schema relations:

EncheresImmo.Accounts.Organizations.Formulas.Formula
Lib -> Parent -> Parent -> Context -> Schema

One of the downsides of this solution is that it may result in more and more folders nested inside folders, making it challenging to access deeply nested files. With time, module names will be longer and harder to guess, such as : EncheresImmo.Accounts.Organizations.PaymentInfos.Cards.Card.

Finally, complex relations like Users-Orgs or many-to-many tables can be tricky to represent as a nested hierarchy.

For thought, we can try the opposite approach:

encheres_immo/
├─ organizations/
│ ├─ organizations.ex <- context
│ ├─ organization.ex <- schema
│ └─ [... utils]
├─ formulas/
│ ├─ formulas.ex <- context
│ └─ formula.ex <- schema
├─ users/
│ ├─ users.ex <- context
│ ├─ user.ex <- schema
│ └─ [... utils]
[...]
├─ application.ex <- app life cycle
├─ release.ex <- utils for migrations and seeding
└─ [... repo related files]

Like the previous solution, we clean files at the root and get a coherent structure throughout. But this time, every schema has its folder with context/schema/utils, making module names easy to guess.

EncheresImmo.Users.User
Lib -> Context -> Schema

However, there are drawbacks to this solution, including a potentially large number of folders at the root level, with no clear distinction between the central feature and the small sub-feature. Also, folder structure and module names do not represent schema relations.

My colleague tried two solutions, which can be summarized as follows:

encheres_immo/
├─ accounts/
│ ├─ accounts.ex <- multi-schemas context
│ ├─ custom_themes.ex <- context
│ ├─ schema/
│ │ ├─ organization.ex <- schema
│ │ ├─ user.ex <- schema
│ │ ├─ custom_theme.ex <- schema
│ │ └─ [... other schemas]
│ └─ [... utils]
├─ contacts/
│ ├─ contacts.ex <- context
│ ├─ schema/
│ │ └─ contact.ex <- schema
│ └─ [... utils]
[...]
├─ application.ex <- app life cycle
├─ release.ex <- utils for migrations and seeding
└─ [... repo related files]

This architecture features less nesting compared to the first proposal and fewer folders than the second one. Additionally, It makes a clearer distinction between context files and schema files, and allows multi-schemas context.

However, I don’t think there is a necessity for multi-schemas contexts, such as Accounts. Upon reviewing our code, such functions would either belong in a distinct context or are poor programming practices.

Moreover, this solution does not distinguish schemas/contexts associated with more extensive features. When we multiply subsidiary schemas/contexts, this quickly leads to a messy folder:

encheres_immo/
├─ accounts/
│ ├─ accounts.ex <- multi-schemas context
│ ├─ custom_themes.ex <- context
│ ├─ formulas.ex <- context
│ ├─ payment_infos.ex <- context
│ ├─ [... other contexts]
│ ├─ schema/
│ │ ├─ organization.ex <- schema
│ │ ├─ custom_theme.ex <- schema
│ │ ├─ formula.ex <- schema
│ │ ├─ payment_info.ex <- schema
│ │ └─ [... other schemas]
│ └─ [... utils]
[...]

Our final proposal

Not the miracle solution, but I think I found one halfway of all our solutions:

encheres_immo/
├─ organizations/ <- a folder for each feature
│ ├─ organizations.ex <- context
│ ├─ organization.ex <- schema & changesets
│ ├─ ? - utils/ <- utils folder
│ │ └─ [... utils].ex <- utils files
│ └─ ? - formula/ <- subfolder for each child feature
│ ├─ formula.ex <- child schema & changesets
│ ├─ ? - formulas.ex <- child context
│ └─ ? - [... utils] <- child utils
[...]
├─ application.ex <- app life cycle
├─ release.ex <- utils for migrations and seeding
└─ [... repo related files]

A « Feature » is a context folder (example: user) including : a context file with all the business logic and CRUD functions3; a schema file with the data structure (if needed) and its changesets; an optionnal utils folder.

A « Child feature » is a simpler feature, often a schema/ embed (example: formula) with few to none context functions, and few to none utils. A child feature can’t have a child feature itself. If a child feature grows into a feature, we should transfer it to its own folder.

The primary benefit of this solution is its ease of documentation, requiring just one schema and two paragraphs to explain. Furthermore, despite its straightforward approach, it stays versatile enough to ensure compatibility with our existing codebase.

Additionally, even if the module name doesn’t fully describe the entire hierarchy as the first proposal does, it still enables us to grasp the module’s immediate relationships. And it remains concise and straightforward to deduce, like the second proposal :

EncheresImmo.Organizations.Formulas.Formula
Lib -> Feature -> Subfeature -> Schema

Drawbacks include the eventual proliferation of folders at the root, but this should stay slower than the second proposal and without the nesting (seen in the first proposal) or messy (noted in the third proposal) issues. Finally, if this method requires adjustments to the folder structure each time a sub-feature is upgraded to a main feature, those instances should be rare.

Let’s quickly talk about test files, as I’ve intentionally skipped the subject so far.

To respect our legacy code, we’ll maintain a test/ folder at the project’s root. This includes a test_helper.exs file and a support/ folder for shared test functions. test/ (alike lib/) is split between encheres_immo/ and encheres_immo_web/ folders for back-end and front-end parts respectively. This approach, while not my initial choice, is the best compromise between respecting the existing code and making it evolve. It’s also aligned with the structure to which other developers are accustomed, as recommended by the Phoenix documentation.

For my personal project, I plan to explore a slightly different structure: regroup the tests on one feature with the other related files, by creating a [feature name]-test.exs file in the folder of each feature. I will see if it’s possible to make it work, but I think it can improve modularity, readability, and maintenance.

Work in progress.

About the decision process

As I said in the introduction, this is one of my first big decisions as a tech lead. I wanted to involve the team in the process, not only to make a better decision or for democratic management, but also to reassure myself.

It was also the first opportunity to formalize my decision process, and I knew that writing would play a central part. Of course, some discussions happened in meetings, but I tried to keep a written trace of it, so I created a GitHub Issue.

There are multiple benefits in using GitHub Issues. Firstly, these issues are stored in the repository, making them easily accessible. Additionally, writing allows us to express our thoughts clearly, illustrate them with examples, and review proposals from others. By doing so, we can see the progression of the discussion and how we reached the final decision.

We took a few days to discuss the proposal and make the necessary changes, but the effort paid off. By involving everyone, we made a more informed decision that everyone knows about. More oral meetings or rushing the decision alone would not have saved time in the long run.

At the conclusion of our discussions, we have to write down and document our decisions. As this affects the entire project scope, we added the schema and the two paragraphs above in our README.md. If the decision was related to a specific feature, we’ll document it within the @moduledoc of the feature’s context file, ensuring relevance and easy access.

What I learned

This work not only enhanced the day-to-day experience of all developers on the project, but it allowed me to grow both as a developer and tech lead.

Firstly, the journey from navigating a convoluted project structure to a clean, coherent, and balanced architecture taught me the value of versatility in project design. This middle ground proved to be a sustainable solution, marrying simplicity with scalability.

Moreover, the iterative process really helped me understand each approach’s strengths and weaknesses. By examining the problem from multiple perspectives, seeking out new ideas online, and testing them on our codebase… I learned a lot about the language, about project design, but also about my own project!

Finally, the decision-making process highlighted the importance of transparency and active team participation. Using GitHub Issues for documentation promoted clarity, made sure every team member could share their thoughts, and kept everyone aware of the final decision.

It also gave me confidence in my ability to make important decisions for this project, even when it requires reworking the entire architecture.

December 2024:
While our solution has held up so far, the real estate crisis slowed our project’s growth, leaving just two senior developers in the team. So our architecture hasn’t faced the test of a rapid scaling team—yet ;)

Looking back, there are a few things that we could have done better :

We’ll see if these strategies can make our decision-making and documentation better, while fostering a more collaborative and efficient environment… Or if they will slow down and complicate our decision process.