Directory structure with Elixir
How to organize an Elixir/Phoenix project to scale? What decision-making process do you use? A case study from a real estate startup.
26 February 2024 - Goulven CLEC'H
What’s the problem?
Elixir is a functional programming language, where the code is mainly organized as functions (duh!) and modules (a group of functions). However, the Elixir documentation doesn’t give a lot of advice on how to organize a project, as expected from a general-purpose language.
The documentation for Phoenix, Elixir’s web framework, does offer a directory structure example for new projects, but it mainly details the front-end part (`/lib/hello_web/`). The back-end part (`/lib/hello/`) is mostly composed of “schema” files (`feature.ex`, the data structure and its changesets) and “context” files (`features.ex`, a module with all the public functions).
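To make the schema/context split concrete, here is a minimal sketch. The module names, the file paths in the comments, and the `create_feature/1` function are assumptions for illustration, and a plain struct stands in for an Ecto schema so the snippet runs without dependencies.

```elixir
# lib/hello/features/feature.ex (hypothetical "schema" file).
# A real Phoenix project would `use Ecto.Schema` here; a plain struct
# keeps this sketch dependency-free.
defmodule Hello.Features.Feature do
  defstruct [:id, :name]

  # Stand-in for an Ecto changeset: validates the attributes and applies them.
  def changeset(%__MODULE__{} = feature, attrs) do
    case Map.get(attrs, :name) do
      name when is_binary(name) and name != "" -> {:ok, %{feature | name: name}}
      _ -> {:error, :invalid_name}
    end
  end
end

# lib/hello/features.ex (hypothetical "context" file): the public functions.
defmodule Hello.Features do
  alias Hello.Features.Feature

  # Builds a feature from raw attributes; a real context would also persist it.
  def create_feature(attrs) do
    Feature.changeset(%Feature{}, attrs)
  end
end
```

Callers only go through the context, for example `Hello.Features.create_feature(%{name: "search"})`, and never touch the schema module directly.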
But our tiny real-estate start-up has grown over three years, with different developers and new features. Sadly, our back-end directory structure hasn’t evolved with it and has become a mess.
To be honest, I am largely to blame for this situation. After the departure of the company’s first developer, I remained alone as a junior developer. With the intention of recruiting another CTO, I maintained the project’s initial philosophy as long as possible and delayed making essential architectural decisions.
Even as recruitment difficulties became apparent, it took me some time to step up as the tech lead. I suffered through the situation, which seemed beyond my control. And when we compensated with freelancers and interns, I neither dared nor knew how to manage them properly, which led to less coherent, lower-quality code.
Today, the situation is different, as I gradually regained control of the project and built my confidence. I have built a team of five people, including three software developers, that is set to grow. And now, it is time to re-organize our project 💪
Inconsistent structures
Let’s consider an extract of the old structure:
Although we can identify three contexts in this example (`accounts`, `auctions`, and `contacts`), their inconsistent structure makes the whole project hard to understand.
Additionally, some of these directory structures simply don’t work. For example, instead of `users` and `organizations` contexts with their related functions, we have an abnormally long and messy `accounts.ex`. Worse, some schema files, like `user.ex` and `organization.ex`, contain functions that should be in a context.
This situation arose from poorly documented personal logic, followed by various developers interpreting it in their own way. We can guess the origin of some problems:
- The initial developer aimed to group account functions together, but their number made this file unreadable.
- Another developer (maybe me) wanted to shrink this file by moving the simplest functions next to their schema, making it unclear whether a function should live in the context or with the schema.
- With an absurdly big “accounts” folder next to a small “contacts” folder, it’s now unclear when a feature should have its own folder. This decision was left to the discretion of each developer.
- Let that sit for a few years and ta-da!
Main considerations
Before starting to iterate, I knew my solution had to address several considerations.
First, our system should be versatile enough to cover all existing and future features, while remaining consistent. It should also allow scaling small subfeatures into larger ones.
Second, « What is conceived well is expressed clearly ». If good readability and consistency are the main objectives, this should make our solution easy to document and, therefore, easy to enforce. A few paragraphs explaining key concepts and a structure diagram should be enough.
Finally, I want to involve the current team as much as possible in decision-making. This approach will not only lead to a better solution but also ensure that the entire team is familiar with the solution, making it easier to implement initially.
Three errors to avoid
We made several iterations before finding the right one. Some were deliberately radical and helped us to identify wrong directions.
Nesting hell
Here is our initial proposal based on this blog post, with some personal modifications. Each feature is in a folder, inside its parent feature, and so on, following the schema relations.
This solution has several advantages: fewer files at the root, coherent structures throughout, and allowing multi-schema contexts (if necessary). Also, module names follow schema relations:
One of the downsides of this solution is that it may result in more and more folders nested inside folders, making it challenging to access deeply nested files. With time, module names will become longer and harder to guess, such as `EncheresImmo.Accounts.Organizations.PaymentInfos.Cards.Card`.
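As a rough illustration of that cost, the sketch below defines a module with the long name quoted above and a caller that has to spell it out. `EncheresImmo.Billing`, the `last4` field, and `masked/1` are hypothetical names, not taken from the real codebase.

```elixir
# Hypothetical deeply nested schema module, using the name quoted above.
# A plain struct stands in for an Ecto schema.
defmodule EncheresImmo.Accounts.Organizations.PaymentInfos.Cards.Card do
  defstruct [:last4]
end

# Hypothetical caller: the full path must be written at least once per file.
defmodule EncheresImmo.Billing do
  alias EncheresImmo.Accounts.Organizations.PaymentInfos.Cards.Card

  # Returns a masked representation of the card number.
  def masked(%Card{last4: last4}) when is_binary(last4), do: "**** " <> last4
end
```

An `alias` hides the length inside a file, but a developer still has to remember all six segments to write it in the first place.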
Finally, complex relations like `Users`-`Orgs` or many-to-many tables can be tricky to represent as a nested hierarchy.
Scrolling hell
As a thought experiment, we can try the opposite approach:
Like the previous solution, we clean files at the root and get a coherent structure throughout. But this time, every schema has its folder with context/schema/utils, making module names easy to guess.
However, this solution has drawbacks, including a potentially large number of folders at the root level, with no clear distinction between central features and small sub-features. Also, folder structure and module names do not represent schema relations.
Messy hell
My colleague tried two solutions, which can be summarized as follows:
This architecture features less nesting than the first proposal and fewer folders than the second one. Additionally, it makes a clearer distinction between context files and schema files, and allows multi-schema contexts.
However, I don’t think multi-schema contexts, such as `Accounts`, are necessary. Upon reviewing our code, such functions either belong in a distinct context or are poor programming practice.
Moreover, this solution does not distinguish schemas/contexts associated with more extensive features. When we multiply subsidiary schemas/contexts, this quickly leads to a messy folder:
Our final proposal
It is not a miracle solution, but I think I found a middle ground between all our proposals:
A “Feature” is a data structure (example: `user`), with a schema (and changesets), a context (functions to create, use, modify, or delete it), utils (optional: additional functions, boilerplate, templates, or behaviours), and child features (optional: embeds or small tables).
A “Child feature” is a simple schema or embed (example: `formula`) with few or no context functions, and few or no utils. A child feature can’t have a child feature itself. If a child feature grows into a feature, we should move it to its own folder.
The primary benefit of this solution is its ease of documentation, requiring just one diagram and two paragraphs to explain. Furthermore, despite its straightforward approach, it stays versatile enough to ensure compatibility with our existing codebase.
Additionally, even if the module name doesn’t describe the entire hierarchy as the first proposal does, it still lets us grasp the module’s immediate relationships. And it remains concise and easy to deduce, like the second proposal:
Drawbacks include the eventual proliferation of folders at the root, but this should happen more slowly than with the second proposal, and without the nesting of the first proposal or the messiness of the third. Finally, although this method requires adjusting the folder structure each time a sub-feature is upgraded to a main feature, those instances should be rare.
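The two definitions above can be sketched in code as follows. This is an illustration under assumptions, not our actual modules: the module names, fields, and functions are hypothetical, attaching `formula` to users is purely illustrative, and plain structs stand in for Ecto schemas and embeds.

```elixir
# Child feature: a small embedded structure with no context of its own.
defmodule EncheresImmo.Users.Formula do
  defstruct name: "free"
end

# Schema of the feature (a plain struct stands in for an Ecto schema).
defmodule EncheresImmo.Users.User do
  alias EncheresImmo.Users.Formula

  defstruct [:email, formula: %Formula{}]
end

# Context of the feature: the public functions other modules call.
defmodule EncheresImmo.Users do
  alias EncheresImmo.Users.{Formula, User}

  # Builds a user with the default formula.
  def new_user(email) when is_binary(email), do: %User{email: email}

  # The child feature is only modified through its parent's context.
  def change_formula(%User{} = user, name) when is_binary(name) do
    %{user | formula: %Formula{name: name}}
  end
end
```

Every module sits at most one level below its feature, so a name like `EncheresImmo.Users.Formula` shows the immediate parent without encoding the whole hierarchy.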
About test files
Let’s quickly talk about test files, as I’ve intentionally skipped the subject so far.
To respect our legacy code, we’ll maintain a `test/` folder at the project’s root. This includes a `test_helper.exs` file and a `support/` folder for shared test functions. `test/` (like `lib/`) is split between `encheres_immo/` and `encheres_immo_web/` folders for the back-end and front-end parts respectively. This approach, while not my initial choice, is the best compromise between respecting the existing code and making it evolve. It’s also aligned with the structure to which other developers are accustomed, as recommended by the Phoenix documentation.
For my personal project, I plan to explore a slightly different structure: grouping the tests for a feature with its other files, by creating a `[feature name]-test.exs` file in the folder of each feature. I will see if it’s possible to make it work, but I think it can improve modularity, readability, and maintenance.
About the decision process
As I said in the introduction, this is one of my first big decisions as a tech lead. I wanted to involve the team in the process, not only to make a better decision or for democratic management, but also to reassure myself.
It was also the first opportunity to formalize my decision process, and I knew that writing would play a central part. Of course, some discussions happened in meetings, but I tried to keep a written trace of it, so I created a GitHub Issue.
There are multiple benefits in using GitHub Issues. Firstly, these issues are stored in the repository, making them easily accessible. Additionally, writing allows us to express our thoughts clearly, illustrate them with examples, and review proposals from others. By doing so, we can see the progression of the discussion and how we reached the final decision.
We took a few days to discuss the proposal and make the necessary changes, but the effort paid off. By involving everyone, we made a more informed decision that everyone knows about. More oral meetings or rushing the decision alone would not have saved time in the long run.
At the conclusion of our discussions, we have to write down and document our decisions. As this one affects the entire project, we added the diagram and the two paragraphs above to our `README.md`. If a decision relates to a specific feature, we’ll document it within the `@moduledoc` of the feature’s context file, ensuring relevance and easy access.
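For a feature-level decision, the `@moduledoc` could look like the sketch below. The context name, the "Decisions" heading, and the helper function are assumptions made for illustration; the entry itself is a placeholder to be filled in.

```elixir
defmodule EncheresImmo.Contacts do
  @moduledoc """
  Context for the contacts feature (hypothetical example).

  ## Decisions

    * Placeholder entry: summarize the decision here and reference the
      issue where it was discussed.
  """

  @doc "Normalizes an email before it is stored (illustrative helper)."
  def normalize_email(email) when is_binary(email) do
    email |> String.trim() |> String.downcase()
  end
end
```

Keeping the note in the context file means it shows up in generated documentation and in `h EncheresImmo.Contacts` from IEx once the project is compiled.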
What I learned
This work not only enhanced the day-to-day experience of all developers on the project, but also allowed me to grow both as a developer and as a tech lead.
Firstly, the journey from navigating a convoluted project structure to a clean, coherent, and balanced architecture taught me the value of versatility in project design. This middle ground proved to be a sustainable solution, marrying simplicity with scalability.
Moreover, the iterative process really helped me understand each approach’s strengths and weaknesses. By examining the problem from multiple perspectives, seeking out new ideas online, and testing them on our codebase… I learned a lot about the language, about project design, but also about my own project!
Finally, the decision-making process highlighted the importance of transparency and active team participation. Using GitHub Issues for documentation promoted clarity, made sure every team member could share their thoughts, and kept everyone aware of the final decision.
It also gave me confidence in my ability to make important decisions for this project, even when it requires reworking the entire architecture.
Possible improvement
Looking back, there are a few things that we could have done better:
- Formalizing the process. Adopting a standardized template for Request For Comments (RFC) issues will streamline our discussions when we are working on a big feature or a deep rework. This template should include sections for the issue description, proposed solution, alternatives considered, and the anticipated impact.
- Facilitating engagement. Starting the talk with a set of prepared questions could make it easier for everyone to get involved and share their thoughts. As stated before, involving as many developers as possible not only improves the quality of the solution, but makes it known and understood by everyone.
- Documentation and testing from the start. Opening a Pull Request (PR) alongside the issue, containing an initial draft of the documentation and key test files, will allow us to visualize and refine our ideas through tangible examples. This approach not only facilitates more concrete feedback but also ensures that the documentation and tests evolve alongside the decision-making process. It will also make the job easier for the developer developing the solution afterward.
We’ll see if these strategies can make our decision-making and documentation better, while fostering a more collaborative and efficient environment… Or if they will slow down and complicate our decision process.
See you on the next RFC issue!