elixir-code-smells
Catalog of Elixir-specific code smells
Basic Info
- Host: GitHub
- Owner: lucasvegi
- License: mit
- Language: Elixir
- Default Branch: main
- Homepage: https://doi.org/10.1007/s10664-023-10343-6
- Size: 4.55 MB
Catalog of Elixir-specific code smells
Table of Contents
- Introduction
- Design-related smells
  - GenServer Envy
  - Agent Obsession
  - Unsupervised process
  - Large messages
  - Unrelated multi-clause function
  - Complex extractions in clauses [^*]
  - Using exceptions for control-flow
  - Untested polymorphic behaviors
  - Code organization by process
  - Large code generation by macros [^*]
  - Data manipulation by migration
  - Using App Configuration for libraries
  - Compile-time global configuration
  - "Use" instead of "import"
- Low-level concerns smells
- Traditional code smells
- About
- Acknowledgments
[^*]: These code smells were suggested by the Elixir community.
[^**]: This code smell emerged from a study with mining software repositories (MSR).
Introduction
Elixir is a functional programming language whose popularity is rising in the industry. However, there are few works in the scientific literature focused on studying the internal quality of systems implemented in this language.
In order to better understand the types of sub-optimal code structures that can harm the internal quality of Elixir systems, we scoured websites, blogs, forums, and videos (grey literature review), looking for specific code smells for Elixir that are discussed by its developers.
As a result of this investigation, we have initially proposed a catalog of 18 new smells that are specific to Elixir systems. After that, 1 new smell emerged from a study with mining software repositories (MSR) performed by us, and other smells are being suggested by the community, so this catalog is constantly being updated (currently 23 smells). These code smells are categorized into two different groups (design-related and low-level concerns), according to the type of impact and code extent they affect. This catalog of Elixir-specific code smells is presented below. Each code smell is documented using the following structure:
- Name: Unique identifier of the code smell. This name is important to facilitate communication between developers;
- Category: The portion of code affected by smell and its severity;
- Problem: How the code smell can harm code quality and what impacts this can have for developers;
- Example: Code and textual descriptions to illustrate the occurrence of the code smell;
- Refactoring: Examples of refactored code are presented to illustrate higher-quality alternatives compared to smelly code;
- Treatments: How to remove a code smell in a disciplined way with the assistance of refactoring strategies. When a refactoring should be used alone, it is listed in its own bullet point (i.e., •). Conversely, when a refactoring is part of a sequence of operations to assist the removal, it is listed using a pipeline to define its order (e.g., Refactoring1 |> Refactoring2 |> Refactoring3 |> etc.). This disciplined way to refactor a smell will help you change your code one small step at a time, thus minimizing the chances of introducing bugs or altering the original behavior of the system. All the refactoring strategies mapped to the code smells are part of our Catalog of Elixir Refactorings.
In addition to the Elixir-specific code smells, our catalog also documents 12 traditional code smells discussed in the context of Elixir systems.
The objective of this catalog of code smells is to instigate the improvement of the quality of code developed in Elixir. For this reason, we are interested in knowing the Elixir community's opinion about these code smells: Do you agree that these code smells can be harmful? Have you seen any of them in production code? Do you have any suggestions about some Elixir-specific code smell not cataloged by us?
Please feel free to make pull requests and suggestions (Issues tab). We want to hear from you!
Design-related smells
Design-related smells are more complex, affect a coarse-grained code element, and are therefore harder to detect. In this section, 14 different smells classified as design-related are explained and exemplified:
GenServer Envy
Category: Design-related smell.
Problem: In Elixir, processes can be primitively created by the `Kernel.spawn/1`, `Kernel.spawn/3`, `Kernel.spawn_link/1`, and `Kernel.spawn_link/3` functions. Although it is possible to create them this way, it is more common to use abstractions (e.g., `Agent`, `Task`, and `GenServer`) provided by Elixir to create processes. The use of each specific abstraction is not a code smell in itself; however, there can be trouble when either a `Task` or `Agent` is used beyond its suggested purposes, being treated like a `GenServer`.
Example: As shown next, `Agent` and `Task` are abstractions to create processes with specialized purposes. In contrast, `GenServer` is a more generic abstraction used to create processes for many different purposes:
- `Agent`: As Elixir works on the principle of immutability, by default no value is shared between multiple places of code, enabling read and write as in a global variable. An `Agent` is a simple process abstraction focused on solving this limitation, enabling processes to share state.
- `Task`: This process abstraction is used when we only need to execute some specific action asynchronously, often in an isolated way, without communication with other processes.
- `GenServer`: This is the most generic process abstraction. The main benefit of this abstraction is explicitly segregating the server and the client roles, thus providing a better API for the organization of process communication. Besides that, a `GenServer` can also encapsulate state (like an `Agent`), provide sync and async calls (like a `Task`), and more.
Examples of this code smell appear when Agents or Tasks are used for general purposes and not only for the specialized ones that their documentation suggests. To illustrate some smell occurrences, we cite two specific situations: 1) when a `Task` is used not only to execute an action asynchronously, but also to frequently exchange messages with other processes; 2) when an `Agent`, besides sharing some global value between processes, is also frequently used to execute isolated tasks that are not of interest to other processes.
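The second situation, and a `GenServer` it could grow into, might be sketched as follows. This is only an illustration under stated assumptions: the module names `SettingsAgent` and `SettingsServer` and the `checksum/2` function are hypothetical and do not come from the catalog.

```elixir
# Smelly sketch: an Agent that shares a value *and* runs unrelated,
# isolated work inside the Agent process. All names are hypothetical.
defmodule SettingsAgent do
  use Agent

  def start_link(initial) when is_map(initial) do
    Agent.start_link(fn -> initial end)
  end

  # Intended use: sharing state between processes.
  def get(pid, key), do: Agent.get(pid, &Map.get(&1, key))

  # Beyond the intended use: work that ignores the shared state
  # but still occupies the Agent process while it runs.
  def checksum(pid, binary) do
    Agent.get(pid, fn _state -> :erlang.crc32(binary) end)
  end
end

# One possible GenServer version: client API and server callbacks are
# separated, and each kind of request has an explicit message.
defmodule SettingsServer do
  use GenServer

  # Client API
  def start_link(initial) when is_map(initial) do
    GenServer.start_link(__MODULE__, initial)
  end

  def get(pid, key), do: GenServer.call(pid, {:get, key})
  def checksum(pid, binary), do: GenServer.call(pid, {:checksum, binary})

  # Server callbacks
  @impl true
  def init(initial), do: {:ok, initial}

  @impl true
  def handle_call({:get, key}, _from, state) do
    {:reply, Map.get(state, key), state}
  end

  def handle_call({:checksum, binary}, _from, state) do
    {:reply, :erlang.crc32(binary), state}
  end
end
```

Both modules answer the same requests; the difference is that the `GenServer` names each request explicitly, which tends to be easier to extend once a process does more than hold a value.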
Refactoring: When an `Agent` or `Task` goes beyond its suggested use cases and becomes painful, it is better to refactor it into a `GenServer`.
Treatments:
Agent Obsession
Category: Design-related smell.
Problem: In Elixir, an `Agent` is a process abstraction focused on sharing information between processes by means of message passing. It is a simple wrapper around shared information, thus facilitating its read and update from any place in the code. The use of an `Agent` to share information is not a code smell in itself; however, when the responsibility for interacting directly with an `Agent` is spread across the entire system, this can be problematic. This bad practice can increase the difficulty of code maintenance and make the code more prone to bugs.
Example: The following code seeks to illustrate this smell. The responsibility for interacting directly with the `Agent` is spread across four different modules (i.e., `A`, `B`, `C`, and `D`).
```elixir
defmodule A do
  #...

  def update(pid) do
    #...
    Agent.update(pid, fn _list -> 123 end)
    #...
  end
end
```
```elixir
defmodule B do
  #...

  def update(pid) do
    #...
    Agent.update(pid, fn content -> %{a: content} end)
    #...
  end
end
```
```elixir
defmodule C do
  #...

  def update(pid) do
    #...
    Agent.update(pid, fn content -> [:atom_value | [content]] end)
    #...
  end
end
```
```elixir
defmodule D do
  #...

  def get(pid) do
    #...
    Agent.get(pid, fn content -> content end)
    #...
  end
end
```
This spreading of responsibility can generate duplicated code and make code maintenance more difficult. Also, due to the lack of control over the format of the shared data, complex composed data can be shared. This freedom to use any format of data is dangerous and can induce developers to introduce bugs.
```elixir
# start an agent with initial state of an empty list
iex(1)> {:ok, agent} = Agent.start_link fn -> [] end
{:ok, #PID<0.135.0>}

# many data formats (i.e., List, Map, Integer, Atom) are
# combined through direct access spread across the entire system
iex(2)> A.update(agent)
iex(3)> B.update(agent)
iex(4)> C.update(agent)

# state of shared information
iex(5)> D.get(agent)
[:atom_value, %{a: 123}]
```
- Refactoring: Instead of spreading direct access to an `Agent` over many places in the code, it is better to refactor this code by centralizing the responsibility for interacting with an `Agent` in a single module. This refactoring improves the maintainability by removing duplicated code; it also allows you to limit the accepted format for shared data, reducing bug-proneness. As shown below, the module `KV.Bucket` is centralizing the responsibility for interacting with the `Agent`. Any other place in the code that needs to access shared data must now delegate this action to `KV.Bucket`. Also, `KV.Bucket` now only allows data to be shared in `Map` format.
```elixir
defmodule KV.Bucket do
  use Agent

  @doc """
  Starts a new bucket.
  """
  def start_link(_opts) do
    Agent.start_link(fn -> %{} end)
  end

  @doc """
  Gets a value from the `bucket` by `key`.
  """
  def get(bucket, key) do
    Agent.get(bucket, &Map.get(&1, key))
  end

  @doc """
  Puts the `value` for the given `key` in the `bucket`.
  """
  def put(bucket, key, value) do
    Agent.update(bucket, &Map.put(&1, key, value))
  end
end
```
The following are examples of how to delegate access to shared data (provided by an Agent) to KV.Bucket.
```elixir
# start an agent through `KV.Bucket`
iex(1)> {:ok, bucket} = KV.Bucket.start_link(%{})
{:ok, #PID<0.114.0>}

# add shared values to the keys `milk` and `beer`
iex(2)> KV.Bucket.put(bucket, "milk", 3)
iex(3)> KV.Bucket.put(bucket, "beer", 7)

# accessing shared data of specific keys
iex(4)> KV.Bucket.get(bucket, "beer")
7
iex(5)> KV.Bucket.get(bucket, "milk")
3
```
These examples are based on code written in Elixir's official documentation. Source: link
Treatments:
Unsupervised process
Category: Design-related smell.
Problem: In Elixir, creating a process outside a supervision tree is not a code smell in itself. However, when code creates a large number of long-running processes outside a supervision tree, this can make visibility and monitoring of these processes difficult, preventing developers from fully controlling their applications.
Example: The following code example seeks to illustrate a library responsible for maintaining a numerical `Counter` through a `GenServer` process outside a supervision tree. Multiple counters can be created simultaneously by a client (one process for each counter), making these unsupervised processes difficult to manage. This can cause problems with the initialization, restart, and shutdown of a system.
```elixir
defmodule Counter do
  use GenServer

  @moduledoc """
    Global counter implemented through a GenServer process
    outside a supervision tree.
  """

  @doc """
    Function to create a counter.
      initial_value: any integer value.
      pid_name: optional parameter to define the process name.
                Default is Counter.
  """
  def start(initial_value, pid_name \\ __MODULE__)
      when is_integer(initial_value) do
    GenServer.start(__MODULE__, initial_value, name: pid_name)
  end

  @doc """
    Function to get the counter's current value.
      pid_name: optional parameter to inform the process name.
                Default is Counter.
  """
  def get(pid_name \\ __MODULE__) do
    GenServer.call(pid_name, :get)
  end

  @doc """
    Function to change the counter's current value.
    Returns the updated value.
      value: amount to be added to the counter.
      pid_name: optional parameter to inform the process name.
                Default is Counter.
  """
  def bump(value, pid_name \\ __MODULE__) do
    GenServer.call(pid_name, {:bump, value})
    get(pid_name)
  end

  ## Callbacks

  @impl true
  def init(counter) do
    {:ok, counter}
  end

  @impl true
  def handle_call(:get, _from, counter) do
    {:reply, counter, counter}
  end

  def handle_call({:bump, value}, _from, counter) do
    {:reply, counter, counter + value}
  end
end

#...Use examples...

iex(1)> Counter.start(0)
{:ok, #PID<0.115.0>}

iex(2)> Counter.get()
0

iex(3)> Counter.start(15, C2)
{:ok, #PID<0.120.0>}

iex(4)> Counter.get(C2)
15

iex(5)> Counter.bump(-3, C2)
12

iex(6)> Counter.bump(7)
7
```
- Refactoring: To ensure that clients of a library have full control over their systems, regardless of the number of processes used and the lifetime of each one, all processes must be started inside a supervision tree. As shown below, this code uses a `Supervisor` as a supervision tree. When this Elixir application is started, two different counters (`Counter` and `C2`) are also started as child processes of the `Supervisor` named `App.Supervisor`. Both are initialized with zero. By means of this supervision tree, it is possible to manage the lifecycle of all child processes (e.g., stopping or restarting each one), improving the visibility of the entire app.
```elixir
defmodule SupervisedProcess.Application do
  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # The counters are Supervisor children started via Counter.start(0).
      %{
        id: Counter,
        start: {Counter, :start, [0]}
      },
      %{
        id: C2,
        start: {Counter, :start, [0, C2]}
      }
    ]

    opts = [strategy: :one_for_one, name: App.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

#...Use examples...

iex(1)> Supervisor.count_children(App.Supervisor)
%{active: 2, specs: 2, supervisors: 0, workers: 2}

iex(2)> Counter.get(Counter)
0

iex(3)> Counter.get(C2)
0

iex(4)> Counter.bump(7, Counter)
7

iex(5)> Supervisor.terminate_child(App.Supervisor, Counter)
iex(6)> Supervisor.count_children(App.Supervisor)
%{active: 1, specs: 2, supervisors: 0, workers: 2} #only one active

iex(7)> Counter.get(Counter) #Error because it was previously terminated
** (EXIT) no process: the process is not alive...

iex(8)> Supervisor.restart_child(App.Supervisor, Counter)
iex(9)> Counter.get(Counter) #after the restart, this process can be accessed again
0
```
These examples are based on code written in Elixir's official documentation. Source: link
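Supervisors generally expect a child's start function to link the new process to the supervisor, so a variant of the counter that exposes a `start_link` function may fit a supervision tree more directly. The following is a minimal sketch under that assumption; `LinkedCounter`, `LinkedCounter.Demo`, and the `:counter_a` / `:counter_b` names are hypothetical and not part of the original example.

```elixir
defmodule LinkedCounter do
  use GenServer

  # Hypothetical variant of Counter that links to its caller,
  # so a supervisor can observe crashes and restart the process.
  def start_link(opts) do
    initial_value = Keyword.get(opts, :initial_value, 0)
    name = Keyword.get(opts, :name, __MODULE__)
    GenServer.start_link(__MODULE__, initial_value, name: name)
  end

  def get(name \\ __MODULE__), do: GenServer.call(name, :get)

  # Adds `value` to the counter and returns the updated value.
  def bump(value, name \\ __MODULE__), do: GenServer.call(name, {:bump, value})

  @impl true
  def init(counter) when is_integer(counter), do: {:ok, counter}

  @impl true
  def handle_call(:get, _from, counter), do: {:reply, counter, counter}

  def handle_call({:bump, value}, _from, counter) do
    new_counter = counter + value
    {:reply, new_counter, new_counter}
  end
end

defmodule LinkedCounter.Demo do
  # Starts a supervisor with two named counters as children.
  # `use GenServer` defines child_spec/1, which calls start_link/1.
  def start_supervisor do
    children = [
      Supervisor.child_spec({LinkedCounter, name: :counter_a}, id: :counter_a),
      Supervisor.child_spec(
        {LinkedCounter, name: :counter_b, initial_value: 15},
        id: :counter_b
      )
    ]

    Supervisor.start_link(children, strategy: :one_for_one)
  end
end
```

With this shape, each counter can be addressed by its registered name (for example `LinkedCounter.get(:counter_b)`), and the two children are distinguished by the `id` passed to `Supervisor.child_spec/2`.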
Large messages
Category: Design-related smell.
Note: Formerly known as "Large messages between processes".
Problem: In Elixir, processes run in an isolated manner, often concurrently with others. Communication between different processes is performed via message passing. The exchange of messages between processes is not a code smell in itself; however, when processes exchange messages, their contents are copied between them. For this reason, if a huge structure is sent as a message from one process to another, the sender can become blocked, compromising performance. If these large message exchanges occur frequently, the prolonged and frequent blocking of processes can cause a system to behave anomalously.
Example: The following code is composed of two modules which will each run in a different process. As the names suggest, the `Sender` module has a function responsible for sending messages from one process to another (i.e., `send_msg/3`). The `Receiver` module has a function to create a process to receive messages (i.e., `create/0`) and another one to handle the received messages (i.e., `run/0`). If a huge structure, such as a list with 1,000,000 different values, is sent frequently from `Sender` to `Receiver`, the impacts of this smell could be felt.
```elixir
defmodule Receiver do
  @doc """
    Function for receiving messages from processes.
  """
  def run do
    receive do
      {:msg, msg_received} -> msg_received
      {_, _} -> "won't match"
    end
  end

  @doc """
    Create a process to receive a message.
    Messages are received in the run() function of Receiver.
  """
  def create do
    spawn(Receiver, :run, [])
  end
end
```
```elixir
defmodule Sender do
  @doc """
    Function for sending messages between processes.
      pid_receiver: message recipient.
      msg: messages of any type and size can be sent.
      id_msg: used by receiver to decide what to do
              when a message arrives.
              Default is the atom :msg
  """
  def send_msg(pid_receiver, msg, id_msg \\ :msg) do
    send(pid_receiver, {id_msg, msg})
  end
end
```
Examples of large messages between processes:
```elixir
iex(1)> pid = Receiver.create
#PID<0.144.0>

#Simulating a message with large content - List with length 1000000
iex(2)> msg = %{from: inspect(self()), to: inspect(pid), content: Enum.to_list(1..1_000_000)}

iex(3)> Sender.send_msg(pid, msg)
{:msg,
  %{
    content: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
    21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39,
    40, 41, 42, 43, 44, 45, 46, 47, ...],
    from: "#PID<0.105.0>",
    to: "#PID<0.144.0>"
  }}
```
This example is based on an original code by Samuel Mullen. Source: link
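Because message contents are copied, it can help to estimate how large a term is before sending it, and to send only the part the receiver needs. The sketch below is an assumption-laden illustration, not a treatment prescribed by this catalog: the `MessageSize` module and its functions are hypothetical, and `:erlang.external_size/1` (the maximum byte size of the term in the Erlang external term format) is used only as a rough proxy for the cost of the copy.

```elixir
defmodule MessageSize do
  # Rough proxy for how much data a message carries: the maximum byte
  # size of the term when encoded in the Erlang external term format.
  def bytes(term), do: :erlang.external_size(term)

  # Hypothetical helper: keep only the keys the receiver actually needs
  # before the message is sent.
  def slim(%{} = msg, keys), do: Map.take(msg, keys)
end

# Usage sketch (names as in the example above):
#
#   msg = %{from: "a", to: "b", content: Enum.to_list(1..1_000_000)}
#   MessageSize.bytes(msg)
#   Sender.send_msg(pid, MessageSize.slim(msg, [:from, :to]))
```

Whether trimming a message is acceptable depends on what the receiver really needs; the point of the sketch is only that the decision can be made explicitly before `send/2` is called.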
Unrelated multi-clause function
Category: Design-related smell.
Note: Formerly known as "Complex multi-clause function".
Problem: Using multi-clause functions in Elixir, to group functions of the same name, is not a code smell in itself. However, due to the great flexibility provided by this programming feature, some developers may abuse the number of guard clauses and pattern matches to group unrelated functionality.
Example: A recurrent example of abusive use of multi-clause functions is when we mix too much unrelated business logic into the function definitions. This makes it difficult to read and understand the logic involved in the functions, which may impair code maintainability. Some developers use documentation mechanisms such as `@doc` annotations to compensate for poor code readability, but unfortunately, with a multi-clause function, we can only use these annotations once per function name, particularly on the first or header function. As shown next, all other variations of the function need to be documented only with comments, a mechanism that cannot automate tests, leaving the code prone to bugs.
```elixir
@doc """
  Update sharp product with 0 or empty count

  ## Examples

    iex> Namespace.Module.update(...)
    expected result...
"""
def update(%Product{count: nil, material: material})
    when material in ["metal", "glass"] do
  # ...
end

# update blunt product
def update(%Product{count: count, material: material})
    when count > 0 and material in ["metal", "glass"] do
  # ...
end

# update animal...
def update(%Animal{count: 1, skin: skin})
    when skin in ["fur", "hairy"] do
  # ...
end
```
- Refactoring: As shown below, a possible solution to this smell is to break the business rules that are mixed up in a single unrelated multi-clause function into several different simple functions. Each function can have a specific `@doc`, describing its behavior and parameters received. While this refactoring sounds simple, it can have a lot of impact on the function's current clients, so be careful!
```elixir
@doc """
  Update sharp product

  ## Parameter
    struct: %Product{...}

  ## Examples

    iex> Namespace.Module.update_sharp_product(%Product{...})
    expected result...
"""
def update_sharp_product(struct) do
  # ...
end

@doc """
  Update blunt product

  ## Parameter
    struct: %Product{...}

  ## Examples

    iex> Namespace.Module.update_blunt_product(%Product{...})
    expected result...
"""
def update_blunt_product(struct) do
  # ...
end

@doc """
  Update animal

  ## Parameter
    struct: %Animal{...}

  ## Examples

    iex> Namespace.Module.update_animal(%Animal{...})
    expected result...
"""
def update_animal(struct) do
  # ...
end
```
This example is based on an original code by Syamil MJ (@syamilmj). Source: link
Treatments:
Complex extractions in clauses
Category: Design-related smell.
Note: This smell was suggested by the community via issues (#9).
Problem: When we use multi-clause functions, it is possible to extract values in the clauses for further usage and for pattern matching/guard checking. This extraction itself does not represent a code smell, but when you have too many clauses or too many arguments, it becomes hard to know which extracted parts are used for pattern/guards and what is used only inside the function body. This smell is related to Unrelated multi-clause function, but with implications of its own. It impairs the code readability in a different way.
Example: The following code, although simple, tries to illustrate the occurrence of this code smell. The multi-clause function `drive/1` is extracting fields of a `%User{}` struct for usage in the clause expression (e.g., `age`) and for usage in the function body (e.g., `name`). Ideally, a function should not mix pattern matching extractions for usage in its clause expressions and also in the function body.
```elixir
def drive(%User{name: name, age: age}) when age >= 18 do
  "#{name} can drive"
end

def drive(%User{name: name, age: age}) when age < 18 do
  "#{name} cannot drive"
end
```
While the example is small and looks clear, try to imagine a situation where `drive/1` is more complex, having many more clauses, arguments, and extractions. That is the really smelly code!
- Refactoring: As shown below, a possible solution to this smell is to extract only pattern/guard related variables in the signature once you have many arguments or multiple clauses:
```elixir
def drive(%User{age: age} = user) when age >= 18 do
  %User{name: name} = user
  "#{name} can drive"
end

def drive(%User{age: age} = user) when age < 18 do
  %User{name: name} = user
  "#{name} cannot drive"
end
```
This example and the refactoring are proposed by José Valim (@josevalim)
Treatments:
Using exceptions for control-flow
Category: Design-related smell.
Note: Formerly known as "Exceptions for control-flow".
Problem: This smell refers to code that forces developers to handle exceptions for control-flow. Exception handling itself does not represent a code smell, but this should not be the only alternative available to developers to handle an error in client code. When developers have no freedom to decide if an error is exceptional or not, this is considered a code smell.
Example: An example of this code smell, as shown below, is when a library (e.g., `MyModule`) forces its clients to use `try .. rescue` statements to capture and evaluate errors. This library does not allow developers to decide if an error is exceptional or not in their applications.
```elixir
defmodule MyModule do
  def janky_function(value) do
    if is_integer(value) do
      #...
      "Result..."
    else
      raise RuntimeError, message: "invalid argument. Is not integer!"
    end
  end
end
```
```elixir
defmodule Client do
  # Client forced to use exceptions for control-flow.
  def foo(arg) do
    try do
      value = MyModule.janky_function(arg)
      "All good! #{value}."
    rescue
      e in RuntimeError ->
        reason = e.message
        "Uh oh! #{reason}."
    end
  end
end

#...Use examples...

iex(1)> Client.foo(1)
"All good! Result...."

iex(2)> Client.foo("lucas")
"Uh oh! invalid argument. Is not integer!."
```
- Refactoring: Library authors should guarantee that clients are not required to use exceptions for control-flow in their applications. As shown below, this can be done by refactoring the library `MyModule`, providing two versions of the function that forces clients to use exceptions for control-flow (e.g., `janky_function`). 1) A version with the raised exceptions should have the same name as the smelly one, but with a trailing `!` (i.e., `janky_function!`); 2) another version, without raised exceptions, should have a name identical to the original version (i.e., `janky_function`), and should return the result wrapped in a tuple.
```elixir
defmodule MyModule do
  @moduledoc """
    Refactored library
  """

  @doc """
    Refactored version without exceptions for control-flow.
  """
  def janky_function(value) do
    if is_integer(value) do
      #...
      {:ok, "Result..."}
    else
      {:error, "invalid argument. Is not integer!"}
    end
  end

  def janky_function!(value) do
    case janky_function(value) do
      {:ok, result} ->
        result

      {:error, message} ->
        raise RuntimeError, message: message
    end
  end
end
```
This refactoring gives clients more freedom to decide how to proceed in the event of errors, defining what is exceptional or not in different situations. As shown next, when an error is not exceptional, clients can use specific control-flow structures, such as the case statement along with pattern matching.
```elixir
defmodule Client do
  # Clients now can also choose to use control-flow structures
  # for control-flow when an error is not exceptional.
  def foo(arg) do
    case MyModule.janky_function(arg) do
      {:ok, value} -> "All good! #{value}."
      {:error, reason} -> "Uh oh! #{reason}."
    end
  end
end

#...Use examples...

iex(1)> Client.foo(1)
"All good! Result...."

iex(2)> Client.foo("lucas")
"Uh oh! invalid argument. Is not integer!."
```
This example is based on code written by Tim Austin (@neenjaw) and Angelika Tyborska (@angelikatyborska). Source: link
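The bang variant remains useful for clients that do consider the failure exceptional. As a minimal sketch, assuming a hypothetical `StrictClient` for which a non-integer argument is a programming error, the client can call `janky_function!` and simply let the exception propagate. `MyModule` is repeated here in compact form only to keep the block self-contained.

```elixir
defmodule MyModule do
  # Same contract as the refactored library above, written compactly.
  def janky_function(value) when is_integer(value), do: {:ok, "Result..."}
  def janky_function(_value), do: {:error, "invalid argument. Is not integer!"}

  def janky_function!(value) do
    case janky_function(value) do
      {:ok, result} -> result
      {:error, message} -> raise RuntimeError, message: message
    end
  end
end

defmodule StrictClient do
  # Hypothetical client: an invalid argument is treated as exceptional,
  # so no tuple matching or try/rescue is written here.
  def foo(arg) do
    "All good! #{MyModule.janky_function!(arg)}."
  end
end
```

The choice between the two variants now belongs to each client, which is the freedom the refactoring is meant to provide.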
Treatments:
Untested polymorphic behaviors
Category: Design-related smell.
Problem: This code smell refers to functions that have protocol-dependent parameters and are therefore polymorphic. A polymorphic function itself does not represent a code smell, but some developers implement these generic functions without accompanying guard clauses, allowing callers to pass parameters that do not implement the required protocol or that have no meaning.
Example: An instance of this code smell happens when a function uses `to_string()` to convert data received by parameter. The function `to_string()` uses the protocol `String.Chars` for conversions. Many Elixir data types (e.g., `BitString`, `Integer`, `Float`, `URI`) implement this protocol. However, as shown below, other Elixir data types (e.g., `Map`) do not implement it and can cause an error in the `dasherize/1` function. Depending on the situation, this behavior can be desired or not. Besides that, it may not make sense to dasherize a `URI` or a number as shown next.
```elixir
defmodule CodeSmells do
  def dasherize(data) do
    to_string(data)
    |> String.replace("_", "-")
  end
end

#...Use examples...

iex(1)> CodeSmells.dasherize("Lucas_Vegi")
"Lucas-Vegi"

iex(2)> CodeSmells.dasherize(10) #<= Makes sense?
"10"

iex(3)> CodeSmells.dasherize(URI.parse("http://www.code_smells.com")) #<= Makes sense?
"http://www.code-smells.com"

iex(4)> CodeSmells.dasherize(%{last_name: "vegi", first_name: "lucas"})
** (Protocol.UndefinedError) protocol String.Chars not implemented
for %{first_name: "lucas", last_name: "vegi"} of type Map
```
- Refactoring: There are two main alternatives to improve code affected by this smell. 1) You can either remove the protocol use (i.e., `to_string/1`), by adding multi-clauses on `dasherize/1` or just remove it; or 2) You can document that `dasherize/1` uses the protocol `String.Chars` for conversions, showing its consequences. As shown next, we refactored using the first alternative, removing the protocol and restricting the `dasherize/1` parameter only to desired data types (i.e., `BitString` and `Atom`). Besides that, we use `@doc` to validate `dasherize/1` for desired inputs and to document the behavior for some types that we think don't make sense for the function (e.g., `Integer` and `URI`).
```elixir
defmodule CodeSmells do
  @doc """
    Function that converts underscores to dashes.

    ## Parameter
      data: only BitString and Atom are supported.

    ## Examples

        iex> CodeSmells.dasherize(:lucas_vegi)
        "lucas-vegi"

        iex> CodeSmells.dasherize("Lucas_Vegi")
        "Lucas-Vegi"

        iex> CodeSmells.dasherize(%{last_name: "vegi", first_name: "lucas"})
        ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1

        iex> CodeSmells.dasherize(URI.parse("http://www.code_smells.com"))
        ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1

        iex> CodeSmells.dasherize(10)
        ** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1
  """
  def dasherize(data) when is_atom(data) do
    dasherize(Atom.to_string(data))
  end

  def dasherize(data) when is_binary(data) do
    String.replace(data, "_", "-")
  end
end

#...Use examples...

iex(1)> CodeSmells.dasherize(:lucas_vegi)
"lucas-vegi"

iex(2)> CodeSmells.dasherize("Lucas_Vegi")
"Lucas-Vegi"

iex(3)> CodeSmells.dasherize(10)
** (FunctionClauseError) no function clause matching in CodeSmells.dasherize/1
```
This example is based on code written by José Valim (@josevalim). Source: link
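When the protocol dependency is kept on purpose (closer to the second alternative), one way to make it explicit is to check for an implementation before converting. The sketch below is an assumption, not the catalog's refactoring: `SafeDasherize` and the `:not_convertible` reason are hypothetical names, while `String.Chars.impl_for/1` is the function every protocol provides to look up the implementation for a value (it returns `nil` when there is none).

```elixir
defmodule SafeDasherize do
  # Hypothetical variant: accept any value that implements String.Chars
  # and return an error tuple, instead of raising, for the others.
  def dasherize(data) do
    case String.Chars.impl_for(data) do
      nil ->
        {:error, :not_convertible}

      _impl ->
        {:ok, data |> to_string() |> String.replace("_", "-")}
    end
  end
end
```

This keeps the polymorphic behavior but documents it in the return value; note that it still accepts inputs such as integers, so it does not address the question of which types are meaningful to dasherize.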
Treatments:
Code organization by process
Category: Design-related smell.
Problem: This smell refers to code that is unnecessarily organized by processes. A process itself does not represent a code smell, but it should only be used to model runtime properties (e.g., concurrency, access to shared resources, event scheduling). When a process is used for code organization, it can create bottlenecks in the system.
Example: An example of this code smell, as shown below, is a library that implements arithmetic operations (e.g., add, subtract) by means of a `GenServer` process. If the number of calls to this single process grows, this code organization can compromise the system performance, therefore becoming a bottleneck.
```elixir
defmodule Calculator do
  use GenServer

  @moduledoc """
    Calculator that performs two basic arithmetic operations.
    This code is unnecessarily organized by a GenServer process.
  """

  @doc """
    Function to perform the sum of two values.
  """
  def add(a, b, pid) do
    GenServer.call(pid, {:add, a, b})
  end

  @doc """
    Function to perform subtraction of two values.
  """
  def subtract(a, b, pid) do
    GenServer.call(pid, {:subtract, a, b})
  end

  def init(init_arg) do
    {:ok, init_arg}
  end

  def handle_call({:add, a, b}, _from, state) do
    {:reply, a + b, state}
  end

  def handle_call({:subtract, a, b}, _from, state) do
    {:reply, a - b, state}
  end
end

# Start a generic server process
iex(1)> {:ok, pid} = GenServer.start_link(Calculator, :init)
{:ok, #PID<0.132.0>}

#...Use examples...
iex(2)> Calculator.add(1, 5, pid)
6

iex(3)> Calculator.subtract(2, 3, pid)
-1
```
- Refactoring: In Elixir, as shown next, code organization must be done only by modules and functions. Whenever possible, a library should not impose specific behavior (such as parallelization) on its clients. It is better to delegate this behavioral decision to the developers of clients, thus increasing the potential for code reuse of a library.
```elixir
defmodule Calculator do
  def add(a, b) do
    a + b
  end

  def subtract(a, b) do
    a - b
  end
end

#...Use examples...

iex(1)> Calculator.add(1, 5)
6

iex(2)> Calculator.subtract(2, 3)
-1
```
This example is based on code provided in Elixir's official documentation. Source: link
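Leaving the behavioral decision to clients can look like the sketch below, in which a client chooses on its own to run several additions concurrently while `Calculator` stays a plain module. `ReportClient` and `sums/1` are hypothetical names used only for illustration; `Task.async_stream/2` is the standard library function used for the concurrency.

```elixir
defmodule Calculator do
  # Plain functions, as in the refactored version above.
  def add(a, b), do: a + b
  def subtract(a, b), do: a - b
end

defmodule ReportClient do
  # Hypothetical client that decides to compute the sums concurrently.
  # Results come back in input order, each wrapped as {:ok, result}.
  def sums(pairs) do
    pairs
    |> Task.async_stream(fn {a, b} -> Calculator.add(a, b) end)
    |> Enum.map(fn {:ok, sum} -> sum end)
  end
end
```

A different client could call `Calculator.add/2` sequentially, or from inside its own `GenServer`, without the library having to change.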
Treatments:
Large code generation by macros
Category: Design-related smell.
Note: This smell was suggested by the community via issues (#13).
Problem: This code smell is related to `macros` that generate too much code. When a `macro` provides a large code generation, it impacts how the compiler or the runtime works. The reason for this is that Elixir may have to expand, compile, and execute the code multiple times, which will make compilation slower.
Example: The code shown below is an example of this smell. Imagine you are defining a router for a web application, where you could have macros like `get/2`. On every invocation of the macro, which can be hundreds, the code inside `get/2` will be expanded and compiled, which can generate a large volume of code in total.
```elixir
defmodule Routes do
  ...

  defmacro get(route, handler) do
    quote do
      route = unquote(route)
      handler = unquote(handler)

      if not is_binary(route) do
        raise ArgumentError, "route must be a binary"
      end

      if not is_atom(handler) do
        raise ArgumentError, "handler must be a module"
      end

      @store_route_for_compilation {route, handler}
    end
  end
end
```
- Refactoring: To remove this code smell, the developer must simplify the `macro`, delegating part of its work to other functions. As shown below, by encapsulating in the function `__define__/3` the functionality pre-existing inside the `quote`, we reduce the code that is expanded and compiled on every invocation of the `macro`, and instead we dispatch to a function to do the bulk of the work.
```elixir
defmodule Routes do
  # ...

  defmacro get(route, handler) do
    quote do
      Routes.__define__(__MODULE__, unquote(route), unquote(handler))
    end
  end

  def __define__(module, route, handler) do
    if not is_binary(route) do
      raise ArgumentError, "route must be a binary"
    end

    if not is_atom(handler) do
      raise ArgumentError, "handler must be a module"
    end

    Module.put_attribute(module, :store_route_for_compilation, {route, handler})
  end
end
```
This example and the refactoring are proposed by José Valim (@josevalim)
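To make the idea concrete, here is a minimal, self-contained sketch of a router built on the same principle: the macro only emits a single function call, and a regular function does the validation. The names `RoutesSketch`, `MyRouter`, `UserHandler`, `PostHandler`, and the `:routes` attribute are hypothetical, and the `__using__/1` and `__before_compile__/1` plumbing is an assumption added only so the example can run on its own.

```elixir
defmodule RoutesSketch do
  defmacro __using__(_opts) do
    quote do
      import RoutesSketch, only: [get: 2]
      # Accumulate every {route, handler} pair registered in the caller.
      Module.register_attribute(__MODULE__, :routes, accumulate: true)
      @before_compile RoutesSketch
    end
  end

  # The macro expands to one function call per invocation.
  defmacro get(route, handler) do
    quote do
      RoutesSketch.__define__(__MODULE__, unquote(route), unquote(handler))
    end
  end

  # Validation lives in a plain function, compiled only once.
  def __define__(module, route, handler) do
    if not is_binary(route) do
      raise ArgumentError, "route must be a binary"
    end

    if not is_atom(handler) do
      raise ArgumentError, "handler must be a module"
    end

    Module.put_attribute(module, :routes, {route, handler})
  end

  # Expose the collected routes, in declaration order, as routes/0.
  defmacro __before_compile__(_env) do
    quote do
      def routes, do: Enum.reverse(@routes)
    end
  end
end

defmodule MyRouter do
  use RoutesSketch

  get "/users", UserHandler
  get "/posts", PostHandler
end
```

With this layout, adding a hundred `get/2` calls adds a hundred small function calls to the expanded code rather than a hundred copies of the validation logic.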
Treatments:
Data manipulation by migration
Category: Design-related smell.
Problem: This code smell refers to modules that perform both data and structural changes in a database schema via `Ecto.Migration`. Migrations must be used exclusively to modify a database schema over time (e.g., by including or excluding columns and tables). When this responsibility is mixed with data manipulation code, the module becomes less cohesive, more difficult to test, and therefore more prone to bugs.
Example: An example of this code smell is when an `Ecto.Migration` is used simultaneously to alter a table, adding a new column to it, and also to update all pre-existing data in that table, assigning a value to this new column. As shown below, in addition to adding the `is_custom_shop` column in the `guitars` table, this `Ecto.Migration` changes the value of this column for some specific guitar models.
```elixir
defmodule GuitarStore.Repo.Migrations.AddIsCustomShopToGuitars do
  use Ecto.Migration

  import Ecto.Query
  alias GuitarStore.Inventory.Guitar
  alias GuitarStore.Repo

  @doc """
  A function that modifies the structure of table "guitars",
  adding column "is_custom_shop" to it. By default, all data
  pre-stored in this table will have the value false stored
  in this new column.

  Also, this function updates the "is_custom_shop" column value
  of some guitar models to true.
  """
  def change do
    alter table("guitars") do
      add :is_custom_shop, :boolean, default: false
    end

    create index("guitars", ["is_custom_shop"])

    custom_shop_entries()
    |> Enum.map(&update_guitars/1)
  end

  # Updates values of column "is_custom_shop" to true.
  # (Plain comments are used here because @doc is discarded for private functions.)
  defp update_guitars({make, model, year}) do
    from(g in Guitar,
      where: g.make == ^make and g.model == ^model and g.year == ^year,
      select: g
    )
    |> Repo.update_all(set: [is_custom_shop: true])
  end

  # Defines which guitar models need to have the values
  # of the "is_custom_shop" column updated to true.
  defp custom_shop_entries() do
    [
      {"Gibson", "SG", 1999},
      {"Fender", "Telecaster", 2020}
    ]
  end
end
```
You can run this smelly migration above by going to the root of your project and typing the next command via console:
```elixir
mix ecto.migrate
```
- Refactoring: To remove this code smell, it is necessary to separate the data manipulation into a `mix task`, distinct from the module that performs the structural changes in the database via `Ecto.Migration`. This separation of responsibilities is a best practice for increasing code testability. As shown below, the module `AddIsCustomShopToGuitars` now uses `Ecto.Migration` only to perform structural changes in the database schema:
```elixir
defmodule GuitarStore.Repo.Migrations.AddIsCustomShopToGuitars do
  use Ecto.Migration

  @doc """
  A function that modifies the structure of table "guitars",
  adding column "is_custom_shop" to it. By default, all data
  pre-stored in this table will have the value false stored
  in this new column.
  """
  def change do
    alter table("guitars") do
      add :is_custom_shop, :boolean, default: false
    end

    create index("guitars", ["is_custom_shop"])
  end
end
```
Furthermore, the new mix task PopulateIsCustomShop, shown next, has only the responsibility to perform data manipulation, thus improving testability:
```elixir
defmodule Mix.Tasks.PopulateIsCustomShop do
  @shortdoc "Populates is_custom_shop column"

  use Mix.Task

  import Ecto.Query
  alias GuitarStore.Inventory.Guitar
  alias GuitarStore.Repo

  @requirements ["app.start"]

  def run(_) do
    custom_shop_entries()
    |> Enum.map(&update_guitars/1)
  end

  defp update_guitars({make, model, year}) do
    from(g in Guitar,
      where: g.make == ^make and g.model == ^model and g.year == ^year,
      select: g
    )
    |> Repo.update_all(set: [is_custom_shop: true])
  end

  defp custom_shop_entries() do
    [
      {"Gibson", "SG", 1999},
      {"Fender", "Telecaster", 2020}
    ]
  end
end
```
You can run this mix task above by typing the next command via console:
```elixir
mix populate_is_custom_shop
```
This example is based on code originally written by Carlos Souza. Source: link
Treatments:
Using App Configuration for libraries
Category: Design-related smells.
Note: Formerly known as "App configuration for code libs".
Problem: The `Application Environment` is a mechanism that can be used to parameterize values that will be used in several different places in a system implemented in Elixir. This parameterization mechanism can be very useful and therefore is not considered a code smell by itself. However, when `Application Environments` are used as a mechanism for configuring a library's functions, this can make these functions less flexible, making it impossible for a library-dependent application to reuse its functions with different behaviors in different places in the code. Libraries are created to foster code reuse, so this limitation imposed by this parameterization mechanism can be problematic in this scenario.
Example: The `DashSplitter` module represents a library that configures the behavior of its functions through the global `Application Environment` mechanism. These configurations are concentrated in the `config/config.exs` file, shown below:
```elixir
import Config

config :app_config,
  parts: 3

import_config "#{config_env()}.exs"
```
One of the functions implemented by the DashSplitter library is split/1. This function has the purpose of separating a string received via parameter into a certain number of parts. The character used as a separator in split/1 is always "-" and the number of parts the string is split into is defined globally by the Application Environment. This value is retrieved by the split/1 function by calling Application.fetch_env!/2, as shown next:
```elixir
defmodule DashSplitter do
  def split(string) when is_binary(string) do
    parts = Application.fetch_env!(:app_config, :parts) # <= retrieve parameterized value
    String.split(string, "-", parts: parts)             # <= parts: 3
  end
end
```
Due to this type of parameterized value used by the DashSplitter library, all applications dependent on it can only use the split/1 function with identical behavior in relation to the number of parts generated by string separation. Currently, this value is equal to 3, as we can see in the use examples shown below:
```elixir
iex(1)> DashSplitter.split("Lucas-Francisco-Vegi")
["Lucas", "Francisco", "Vegi"]

iex(2)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi")
["Lucas", "Francisco", "da-Matta-Vegi"]
```
- Refactoring: To remove this code smell and make the library more adaptable and flexible, this type of configuration must be performed via parameters in function calls. The code shown below performs the refactoring of the `split/1` function by adding a new optional parameter of type `Keyword list`. With this new parameter it is possible to modify the default behavior of the function at the time of its call, allowing multiple different ways of using `split/2` within the same application:
```elixir
defmodule DashSplitter do
  def split(string, opts \\ []) when is_binary(string) and is_list(opts) do
    parts = Keyword.get(opts, :parts, 2) # <= default config of parts == 2
    String.split(string, "-", parts: parts)
  end
end

#...Use examples...

iex(1)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi", [parts: 5])
["Lucas", "Francisco", "da", "Matta", "Vegi"]

iex(2)> DashSplitter.split("Lucas-Francisco-da-Matta-Vegi") #<= default config is used!
["Lucas", "Francisco-da-Matta-Vegi"]
```
These examples are based on code provided in Elixir's official documentation. Source: link
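One possible extension of this refactoring, sketched below under the assumption that Elixir 1.13 or later is available, is to validate the options with `Keyword.validate!/2` so that a misspelled option fails loudly instead of being silently ignored. The module name `DashSplitterSketch` is hypothetical.

```elixir
defmodule DashSplitterSketch do
  # Assumption: Keyword.validate!/2 requires Elixir 1.13 or later.
  def split(string, opts \\ []) when is_binary(string) and is_list(opts) do
    # Raises ArgumentError on unknown keys and fills in the default for :parts.
    opts = Keyword.validate!(opts, parts: 2)
    String.split(string, "-", parts: Keyword.fetch!(opts, :parts))
  end
end
```

Each call site still chooses its own `:parts` value, while a typo such as `part: 3` raises an `ArgumentError` at the call instead of falling back to the default.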
Treatments:
Compile-time global configuration
Category: Design-related smells.
Note: Formerly known as "Compile-time app configuration".
Problem: As explained in the description of Using App Configuration for libraries, the `Application Environment` can be used to parameterize values in an Elixir system. Although it is not a good practice to use this mechanism in the implementation of libraries, sometimes this can be unavoidable. If these parameterized values are assigned to `module attributes`, it can be especially problematic. As `module attribute` values are defined at compile-time, when trying to assign `Application Environment` values to these attributes, warnings or errors can be triggered by Elixir. This happens because, when defining module attributes at compile time, the `Application Environment` is not yet available in memory.
Example: The `DashSplitter` module represents a library. This module has an attribute `@parts` that has its constant value defined at compile-time by calling `Application.fetch_env!/2`. The `split/1` function, implemented by this library, has the purpose of separating a string received via parameter into a certain number of parts. The character used as a separator in `split/1` is always `"-"` and the number of parts the string is split into is defined by the module attribute `@parts`, as shown next:
```elixir
defmodule DashSplitter do
  @parts Application.fetch_env!(:app_config, :parts) # <= define module attribute
                                                      # at compile-time
  def split(string) when is_binary(string) do
    String.split(string, "-", parts: @parts) #<= reading from a module attribute
  end
end
```
Due to this compile-time configuration based on the Application Environment mechanism, Elixir can raise warnings or errors, as shown next, during compilation:
```elixir
warning: Application.fetch_env!/2 is discouraged in the module body, use Application.compile_env/3 instead...

** (ArgumentError) could not fetch application environment :parts for application :app_config because the application was not loaded nor configured
```
- Refactoring: To remove this code smell, when it is really unavoidable to use the `Application Environment` mechanism to configure library functions, this should be done at runtime and not during compilation. That is, instead of calling `Application.fetch_env!(:app_config, :parts)` at compile-time to set `@parts`, this function must be called at runtime within `split/1`. This will mitigate the risk that `Application Environment` is not yet available in memory when it is necessary to use it. Another possible refactoring, as shown below, is to replace the use of the `Application.fetch_env!/2` function to define `@parts` with `Application.compile_env/3`. The third parameter of `Application.compile_env/3` defines a default value that is returned whenever that `Application Environment` is not available in memory during the definition of `@parts`. This prevents Elixir from raising an error at compile-time:
```elixir
defmodule DashSplitter do
  @parts Application.compile_env(:app_config, :parts, 3) # <= default value 3 prevents an error!

  def split(string) when is_binary(string) do
    String.split(string, "-", parts: @parts) #<= reading from a module attribute
  end
end
```
These examples are based on code provided in Elixir's official documentation. Source: link
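The first refactoring described above (reading the value at runtime inside `split/1`) is not shown in code, so here is a minimal sketch of it. The module name `DashSplitterRuntimeSketch` is hypothetical, and `Application.get_env/3` with a fallback of `3` is used instead of `Application.fetch_env!/2` purely so the example runs even when `:app_config` has not been configured.

```elixir
defmodule DashSplitterRuntimeSketch do
  def split(string) when is_binary(string) do
    # Read on every call, at runtime; 3 is an assumed fallback value.
    parts = Application.get_env(:app_config, :parts, 3)
    String.split(string, "-", parts: parts)
  end
end
```

Because the lookup happens when `split/1` is called, a value changed with `Application.put_env/3` after compilation is picked up by the next call, which is not the case for a module attribute.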
Treatments:
Remark: This code smell can be detected by Credo, a static code analysis tool. During its checks, Credo raises this warning when this smell is found.
"Use" instead of "import"
Category: Design-related smells.
Note: Formerly known as "Dependency with "use" when an "import" is enough".
Problem: Elixir has mechanisms such as `import`, `alias`, and `use` to establish dependencies between modules. Establishing dependencies allows a module to call functions from other modules, facilitating code reuse. Code implemented with these mechanisms does not characterize a smell by itself; however, while the `import` and `alias` directives have lexical scope and only make it easier for a module to use functions of another, the `use` directive has a broader scope, something that can be problematic. The `use` directive allows a module to inject any type of code into another, including propagating dependencies. In this way, using the `use` directive makes code readability worse, because to understand exactly what will happen when it references a module, it is necessary to have knowledge of the internal details of the referenced module.
Example: The code shown below is an example of this smell. Three different modules were defined -- `ModuleA`, `Library`, and `ClientApp`. `ClientApp` is reusing code from the `Library` via the `use` directive, but is unaware of its internal details. Therefore, when `Library` is referenced by `ClientApp`, it injects into `ClientApp` all the content present in its `__using__/1` macro. Due to the decreased readability of the code and the lack of knowledge of the internal details of the `Library`, `ClientApp` defines a local function `foo/0`. This will generate a conflict as `ModuleA` also has a function `foo/0`; when `ClientApp` referenced `Library` via the `use` directive, it had a dependency on `ModuleA` propagated to itself:
```elixir
defmodule ModuleA do
  def foo do
    "From Module A"
  end
end
```
```elixir
defmodule Library do
  defmacro __using__(_opts) do
    quote do
      import ModuleA # <= propagating dependencies!

      def from_lib do
        "From Library"
      end
    end
  end

  def from_lib do
    "From Library"
  end
end
```
```elixir
defmodule ClientApp do
  use Library

  def foo do
    "Local function from client app"
  end

  def from_client_app do
    from_lib() <> " - " <> foo()
  end
end
```
When we try to compile ClientApp, Elixir will detect the conflict and throw the following error:
```elixir
iex(1)> c("client_app.ex")

** (CompileError) client_app.ex:4: imported ModuleA.foo/0 conflicts with local function
```
- Refactoring: To remove this code smell, it may be possible to replace `use` with `alias` or `import` when creating a dependency between an application and a library. This will make code behavior clearer, due to improved readability. In the following code, `ClientApp` was refactored in this way, and with that, the conflict as previously shown no longer exists:
```elixir
defmodule ClientApp do
  import Library

  def foo do
    "Local function from client app"
  end

  def from_client_app do
    from_lib() <> " - " <> foo()
  end
end

#...Uses example...

iex(1)> ClientApp.from_client_app()
"From Library - Local function from client app"
```
These examples are based on code provided in Elixir's official documentation. Source: link
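As a small variation on this refactoring, the sketch below restricts the import with the `:only` option so that the client states exactly which function it takes from the library. The simplified `Library` module here (without `__using__/1`) is an assumption made to keep the example self-contained.

```elixir
defmodule Library do
  def from_lib do
    "From Library"
  end
end

defmodule ClientApp do
  # Only from_lib/0 is brought into scope; nothing else is injected.
  import Library, only: [from_lib: 0]

  def foo do
    "Local function from client app"
  end

  def from_client_app do
    from_lib() <> " - " <> foo()
  end
end
```

A reader of `ClientApp` can now see where `from_lib/0` comes from without opening `Library`.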
Treatments:
Low-level concerns smells
Low-level concerns smells are simpler than design-related smells and affect a small part of the code. Next, all 9 different smells classified as low-level concerns are explained and exemplified:
Working with invalid data
Category: Low-level concerns smells.
Problem: This code smell refers to a function that does not validate its parameters' types and therefore can produce internal non-predicted behavior. When an error is raised inside a function due to an invalid parameter value, this can confuse the developers and make it harder to locate and fix the error.
Example: An example of this code smell is when a function receives an invalid parameter and then passes it to a function from a third-party library. This will cause an error (raised deep inside the library function), which may be confusing for the developer who is working with invalid data. As shown next, the function `foo/1` is a client of a third-party library and doesn't validate its parameters at the boundary. In this way, it is possible that invalid data will be passed from `foo/1` to the library, causing a mysterious error.
```elixir
defmodule MyApp do
  alias ThirdPartyLibrary, as: Library

  def foo(invalid_data) do
    #...some code...
    Library.sum(1, invalid_data)
    #...some code...
  end
end

#...Use examples...

# with valid data is ok
iex(1)> MyApp.foo(2)
3

# with invalid data cause a confusing error deep inside
iex(2)> MyApp.foo("Lucas")
** (ArithmeticError) bad argument in arithmetic expression: 1 + "Lucas"
  :erlang.+(1, "Lucas")
  library.ex:3: ThirdPartyLibrary.sum/2
```
- Refactoring: To remove this code smell, client code must validate input parameters at the boundary with the user, via guard clauses or pattern matching. This will prevent errors from occurring deeply, making them easier to understand. This refactoring will also allow libraries to be implemented without worrying about creating internal protection mechanisms. The next code illustrates the refactoring of `foo/1`, removing this smell:
```elixir
defmodule MyApp do
  alias ThirdPartyLibrary, as: Library

  def foo(data) when is_integer(data) do
    #...some code...
    Library.sum(1, data)
    #...some code...
  end
end

#...Use examples...

# with valid data is ok
iex(1)> MyApp.foo(2)
3

# with invalid data errors are easy to locate and fix
iex(2)> MyApp.foo("Lucas")
** (FunctionClauseError) no function clause matching in MyApp.foo/1

  The following arguments were given to MyApp.foo/1:

      # 1
      "Lucas"

  my_app.ex:6: MyApp.foo/1
```
This example is based on code provided in Elixir's official documentation. Source: link
Treatments:
Complex branching
Category: Low-level concerns smell.
Note: Formerly known as "Complex API error handling".
Problem: When a function assumes the responsibility of handling multiple errors alone, it can increase its cyclomatic complexity (metric of control-flow) and become incomprehensible. This situation can constitute a specific instance of "Long function", a traditional code smell, but has implications of its own. Under these circumstances, this function could get very confusing, difficult to maintain and test, and therefore bug-prone.
Example: An example of this code smell is when a function uses the `case` control-flow structure or other similar constructs (e.g., `cond` or `receive`) to handle multiple variations of response types returned by the same API endpoint. This practice can make the function more complex, long, and difficult to understand, as shown next.
```elixir
def get_customer(customer_id) do
  case get("/customers/#{customer_id}") do
    {:ok, %Tesla.Env{status: 200, body: body}} -> {:ok, body}
    {:ok, %Tesla.Env{body: body}} -> {:error, body}
    {:error, _} = other -> other
  end
end
```
Although `get_customer/1` is not really long in this example, it could be. In a more complex scenario, where the same endpoint can return a large number of different responses, it is not a good idea to concentrate all of the handling in a single function. This is a risky scenario, where a small typo, or any problem introduced by the programmer in handling one response type, could eventually compromise the handling of all responses from the endpoint (if the function raises an exception, for example).
- Refactoring: As shown below, in this situation, instead of concentrating all handlings within the same function, creating a complex branching, it is better to delegate each branch (handling of a response type) to a different private function. In this way, the code will be cleaner, more concise, and readable.
```elixir
def get_customer(customer_id) when is_integer(customer_id) do
  case get("/customers/#{customer_id}") do
    {:ok, %Tesla.Env{status: 200, body: body}} -> success_api_response(body)
    {:ok, %Tesla.Env{body: body}} -> x_error_api_response(body)
    {:error, _} = other -> y_error_api_response(other)
  end
end

defp success_api_response(body) do
  {:ok, body}
end

defp x_error_api_response(body) do
  {:error, body}
end

defp y_error_api_response(other) do
  other
end
```
While this example of refactoring get_customer/1 might seem quite more verbose than the original code, remember to imagine a scenario where get_customer/1 is responsible for handling a number much larger than three different types of possible responses. This is the smelly scenario!
This example is based on code written by Zack MrDoops and Dimitar Panayotov dimitarvp. Source: link. We got suggestions from José Valim (@josevalim) on the refactoring.
Complex else clauses in with
Category: Low-level concerns smell.
Note: This smell was suggested by the community via issues (#7).
Problem: This code smell refers to `with` statements that flatten all their error clauses into a single complex `else` block. This situation is harmful to code readability and maintainability because it is difficult to know from which clause the error value came.
Example: An example of this code smell, as shown below, is a function `open_decoded_file/1` that reads Base64-encoded string content from a file and returns a decoded binary string. This function uses a `with` statement that needs to handle two possible errors, all of which are concentrated in a single complex `else` block.
```elixir
def open_decoded_file(path) do
  with {:ok, encoded} <- File.read(path),
       {:ok, value} <- Base.decode64(encoded) do
    value
  else
    {:error, _} -> :badfile
    :error -> :badencoding
  end
end
```
- Refactoring: As shown below, in this situation, instead of concentrating all error handlings within a single complex `else` block, it is better to normalize the return types in specific private functions. In this way, due to its organization, the code will be cleaner and more readable.
```elixir
def open_decoded_file(path) do
  with {:ok, encoded} <- file_read(path),
       {:ok, value} <- base_decode64(encoded) do
    value
  end
end

defp file_read(path) do
  case File.read(path) do
    {:ok, contents} -> {:ok, contents}
    {:error, _} -> :badfile
  end
end

defp base_decode64(contents) do
  case Base.decode64(contents) do
    {:ok, contents} -> {:ok, contents}
    :error -> :badencoding
  end
end
```
This example and the refactoring are proposed by José Valim (@josevalim)
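For readers who want to try the refactored version, the sketch below wraps the same functions in a module so they can be compiled and called directly. The module name `DecodedFileSketch` is hypothetical; the function bodies follow the refactoring above.

```elixir
defmodule DecodedFileSketch do
  def open_decoded_file(path) do
    # No else block: each helper already returns a normalized error value,
    # and `with` returns the first non-matching value as-is.
    with {:ok, encoded} <- file_read(path),
         {:ok, value} <- base_decode64(encoded) do
      value
    end
  end

  defp file_read(path) do
    case File.read(path) do
      {:ok, contents} -> {:ok, contents}
      {:error, _} -> :badfile
    end
  end

  defp base_decode64(contents) do
    case Base.decode64(contents) do
      {:ok, decoded} -> {:ok, decoded}
      :error -> :badencoding
    end
  end
end
```

Calling `DecodedFileSketch.open_decoded_file/1` with a path that cannot be read yields `:badfile`, while a readable file whose content is not valid Base64 yields `:badencoding`, so each error value can be traced back to exactly one helper.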
Treatments:
Alternative return types
Category: Low-level concerns smell.
Note: This smell was suggested by the community via issues (#6).
Problem: This code smell refers to functions that receive options (e.g., a `keyword list`) as parameters that drastically change their return type. Because options are optional and sometimes set dynamically, if they change the return type it may be hard to understand what the function actually returns.
Example: An example of this code smell, as shown below, is when a library (e.g., `AlternativeInteger`) has a multi-clause function `parse/2` with many alternative return types. Depending on the options received as a parameter, the function will have a different return type.
```elixir
defmodule AlternativeInteger do
  def parse(string, opts) when is_list(opts) do
    case opts[:discard_rest] do
      true -> #only an integer value convert from string parameter
      _    -> #another return type (e.g., tuple)
    end
  end

  def parse(string, opts \\ :default) do
    #another return type (e.g., tuple)
  end
end

#...Use examples...

iex(1)> AlternativeInteger.parse("13")
{13, "..."}

iex(2)> AlternativeInteger.parse("13", discard_rest: true)
13

iex(3)> AlternativeInteger.parse("13", discard_rest: false)
{13, "..."}
```
- Refactoring: To refactor this smell, as shown next, it's better to add in the library a specific function for each return type (e.g., `parse_no_rest/1`), no longer delegating this to an options parameter.
```elixir
defmodule AlternativeInteger do
  def parse_no_rest(string) do
    #only an integer value convert from string parameter
  end

  def parse(string) do
    #another return type (e.g., tuple)
  end
end

#...Use examples...

iex(1)> AlternativeInteger.parse("13")
{13, "..."}

iex(2)> AlternativeInteger.parse_no_rest("13")
13
```
This example and the refactoring are proposed by José Valim (@josevalim)
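The function bodies above are left as comments. One way to fill them in, sketched here on top of the standard `Integer.parse/1`, is shown below. The module name `AlternativeIntegerSketch` and the choice to return `:error` for unparsable input are assumptions of this sketch, not part of the original proposal.

```elixir
defmodule AlternativeIntegerSketch do
  # Always returns {integer, rest} or :error.
  def parse(string) when is_binary(string) do
    Integer.parse(string)
  end

  # Always returns an integer or :error; the rest of the string is discarded.
  def parse_no_rest(string) when is_binary(string) do
    case Integer.parse(string) do
      {int, _rest} -> int
      :error -> :error
    end
  end
end
```

Each function now has one success shape, so a caller can tell what it will get back from the function name alone.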
Treatments:
Accessing non-existent Map/Struct fields
Category: Low-level concerns smells.
Note: Formerly known as "Map/struct dynamic access".
Problem: In Elixir, it is possible to access values from `Maps`, which are key-value data structures, either strictly or dynamically. When trying to dynamically access the value of a key from a `Map`, if the informed key does not exist, a null value (`nil`) will be returned. This return can be confusing and does not allow developers to conclude whether the key is non-existent in the `Map` or just has no bound value. In this way, this code smell may cause bugs in the code.
Example: The code shown below is an example of this smell. The function `plot/1` tries to draw a graphic to represent the position of a point in a cartesian plane. This function receives a parameter of `Map` type with the point attributes, which can be a point of a 2D or 3D cartesian coordinate system. To decide if a point is 2D or 3D, this function uses dynamic access to retrieve values of the `Map` keys:
```elixir
defmodule Graphics do
  def plot(point) do
    #...some code...

    # Dynamic access to use point values
    {point[:x], point[:y], point[:z]}

    #...some code...
  end
end

#...Use examples...
iex(1)> point_2d = %{x: 2, y: 3}
%{x: 2, y: 3}

iex(2)> point_3d = %{x: 5, y: 6, z: nil}
%{x: 5, y: 6, z: nil}

iex(3)> Graphics.plot(point_2d)
{2, 3, nil}   # <= ambiguous return

iex(4)> Graphics.plot(point_3d)
{5, 6, nil}
```
As can be seen in the example above, even when the key :z does not exist in the Map (point_2d), dynamic access returns the value nil. This return can be dangerous because of its ambiguity. It is not possible to conclude from it whether the Map has the key :z or not. If the function relies on the return value to make decisions about how to plot a point, this can be problematic and even cause errors when testing the code.
- Refactoring: To remove this code smell, whenever a `Map` has keys of `Atom` type, replace the dynamic access to its values with strict access. When a non-existent key is strictly accessed, Elixir raises an error immediately, allowing developers to find bugs faster. The next code illustrates the refactoring of `plot/1`, removing this smell:
```elixir
defmodule Graphics do
  def plot(point) do
    #...some code...

    # Strict access to use point values
    {point.x, point.y, point.z}

    #...some code...
  end
end

#...Use examples...
iex(1)> point_2d = %{x: 2, y: 3}
%{x: 2, y: 3}

iex(2)> point_3d = %{x: 5, y: 6, z: nil}
%{x: 5, y: 6, z: nil}

iex(3)> Graphics.plot(point_2d)
** (KeyError) key :z not found in: %{x: 2, y: 3}   # <= explicitly warns that
  graphic.ex:6: Graphics.plot/1                     # <= the z key does not exist!

iex(4)> Graphics.plot(point_3d)
{5, 6, nil}
```
As shown below, another alternative to refactor this smell is to replace a Map with a struct (named map). By default, structs only support strict access to values. In this way, accesses will always return clear and objective results:
```elixir
defmodule Point do
  @enforce_keys [:x, :y]
  defstruct [x: nil, y: nil]
end

#...Use examples...
iex(1)> point = %Point{x: 2, y: 3}
%Point{x: 2, y: 3}

iex(2)> point.x   # <= strict access to use point values
2

iex(3)> point.z   # <= trying to access a non-existent key
** (KeyError) key :z not found in: %Point{x: 2, y: 3}

iex(4)> point[:x] # <= by default, struct does not support dynamic access
** (UndefinedFunctionError) ... (Point does not implement the Access behaviour)
```
These examples are based on code written by José Valim (@josevalim). Source: link
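When a function really has to accept both 2D and 3D points, another option that keeps the distinction explicit is to pattern match on the keys in the function head. The sketch below is an assumption of how that could look; the `GraphicsSketch` module and its return values are hypothetical.

```elixir
defmodule GraphicsSketch do
  # 3D clause first: it only matches maps that actually contain the :z key,
  # even when the value bound to :z is nil.
  def plot(%{x: x, y: y, z: z}), do: {x, y, z}

  # 2D clause: reached only when the :z key is absent.
  def plot(%{x: x, y: y}), do: {x, y}
end
```

Here `%{x: 2, y: 3}` and `%{x: 5, y: 6, z: nil}` take different clauses, so a missing key and a key bound to `nil` are no longer confused, and a map without `:x` or `:y` raises a `FunctionClauseError`.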
Treatments:
Speculative Assumptions
Category: Low-level concerns smells.
Note: Formerly known as "Unplanned value extraction".
Problem: Overall, Elixir applications are composed of many supervised processes, so the effects of an error will be localized in a single process, not propagating to the entire application. A supervisor will detect the failing process and restart it at that level. For this type of design to behave well, it's important that problematic code crashes when it fails to fulfill its purpose. However, some code may behave in undesired ways by making assumptions that were never really planned for, such as returning incorrect values instead of forcing a crash. These speculative assumptions can give a false impression that the code is working correctly.
Example: The code shown below is an example of this smell. The function `get_value/2` tries to extract a value from a specific key of a URL query string. As it is not implemented using pattern matching, `get_value/2` always returns a value, regardless of the format of the URL query string passed as a parameter in the call. Sometimes the returned value will be valid; however, if a URL query string with an unexpected format is used in the call, `get_value/2` will extract incorrect values from it:
```elixir
defmodule Extract do
  @doc """
  Extract value from a key in a URL query string.
  """
  def get_value(string, desired_key) do
    parts = String.split(string, "&")

    Enum.find_value(parts, fn pair ->
      key_value = String.split(pair, "=")
      Enum.at(key_value, 0) == desired_key && Enum.at(key_value, 1)
    end)
  end
end

#...Use examples...

# URL query string in the planned format - OK!
iex(1)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "lab")
"ASERG"

iex(2)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "university")
"UFMG"

# Unplanned URL query string format - Unplanned value extraction!
iex(3)> Extract.get_value("name=Lucas&university=institution=UFMG&lab=ASERG", "university")
"institution"   # <= why not "institution=UFMG"? or only "UFMG"?
```
- Refactoring: To remove this code smell, `get_value/2` can be refactored through the use of pattern matching. So, if an unexpected URL query string format is used, the function will crash instead of returning an invalid value. This behavior, shown below, will allow clients to decide how to handle these errors and will not give a false impression that the code is working correctly when unexpected values are extracted:
```elixir
defmodule Extract do
  @doc """
  Extract value from a key in a URL query string.
  Refactored by using pattern matching.
  """
  def get_value(string, desired_key) do
    parts = String.split(string, "&")

    Enum.find_value(parts, fn pair ->
      [key, value] = String.split(pair, "=") # <= pattern matching
      key == desired_key && value
    end)
  end
end

#...Use examples...

# URL query string in the planned format - OK!
iex(1)> Extract.get_value("name=Lucas&university=UFMG&lab=ASERG", "name")
"Lucas"

# Unplanned URL query string format - Crash explaining the problem to the client!
iex(2)> Extract.get_value("name=Lucas&university=institution=UFMG&lab=ASERG", "university")
** (MatchError) no match of right hand side value: ["university", "institution", "UFMG"]
  extract.ex:7: anonymous fn/2 in Extract.get_value/2 # <= left hand: [key, value] pair

iex(3)> Extract.get_value("name=Lucas&university&lab=ASERG", "university")
** (MatchError) no match of right hand side value: ["university"]
  extract.ex:7: anonymous fn/2 in Extract.get_value/2 # <= left hand: [key, value] pair
```
These examples are based on code written by José Valim (@josevalim). Source: link
Modules with identical names
Category: Low-level concerns smells.
Problem: This code smell is related to possible module name conflicts that can occur when a library is implemented. Due to a limitation of the Erlang VM (BEAM), also used by Elixir, only one instance of a module can be loaded at a time. If there are name conflicts between more than one module, they will be considered the same by BEAM and only one of them will be loaded. This can cause unwanted code behavior.
Example: The code shown below is an example of this smell. Two different modules were defined with identical names (`Foo`). When BEAM tries to load both simultaneously, only the module defined in the file `module_two.ex` stays loaded, redefining the current version of `Foo` (`module_one.ex`) in memory. That makes it impossible to call `from_module_one/0`, for example:
```elixir
defmodule Foo do
  @moduledoc """
  Defined in `module_one.ex` file.
  """
  def from_module_one do
    "Function from module one!"
  end
end
```
```elixir
defmodule Foo do
  @moduledoc """
  Defined in `module_two.ex` file.
  """
  def from_module_two do
    "Function from module two!"
  end
end
```
When BEAM tries to load both simultaneously, the name conflict causes only one of them to stay loaded:
```elixir
iex(1)> c("module_one.ex")
[Foo]

iex(2)> c("module_two.ex")
warning: redefining module Foo (current version defined in memory)
  module_two.ex:1
[Foo]

iex(3)> Foo.from_module_two()
"Function from module two!"

iex(4)> Foo.from_module_one()  # <= impossible to call due to name conflict
** (UndefinedFunctionError) function Foo.from_module_one/0 is undefined...
```
- Refactoring: To remove this code smell, a library must standardize the naming of its modules, always using its own name as a prefix (namespace) for all its modules' names (e.g., `LibraryName.ModuleName`). When a module file is within subdirectories of a library, the names of the subdirectories must also be used in the module naming (e.g., `LibraryName.SubdirectoryName.ModuleName`). In the refactored code shown below, this module naming pattern was used. For this, the `Foo` module, defined in the file `module_two.ex`, was also moved to the `utils` subdirectory. This refactoring, in addition to eliminating the internal conflict of names within the library, will prevent the occurrence of name conflicts with client code:
```elixir
defmodule MyLibrary.Foo do
  @moduledoc """
  Defined in `module_one.ex` file.
  Name refactored!
  """

  def from_module_one do
    "Function from module one!"
  end
end
```
```elixir
defmodule MyLibrary.Utils.Foo do
  @moduledoc """
  Defined in `module_two.ex` file.
  Name refactored!
  """

  def from_module_two do
    "Function from module two!"
  end
end
```
When BEAM tries to load them simultaneously, both will stay loaded successfully:
```elixir
iex(1)> c("module_one.ex")
[MyLibrary.Foo]

iex(2)> c("module_two.ex")
[MyLibrary.Utils.Foo]

iex(3)> MyLibrary.Foo.from_module_one()
"Function from module one!"

iex(4)> MyLibrary.Utils.Foo.from_module_two()
"Function from module two!"
```
This example is based on the description provided in Elixir's official documentation. Source: link
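Longer, namespaced module names do not have to make client code verbose. The sketch below assumes a hypothetical `Client` module (not part of the original example) and uses `alias` to refer to both refactored modules by short local names:

```elixir
defmodule MyLibrary.Foo do
  def from_module_one, do: "Function from module one!"
end

defmodule MyLibrary.Utils.Foo do
  def from_module_two, do: "Function from module two!"
end

# Hypothetical client module, introduced only for this illustration.
defmodule Client do
  # Refers to MyLibrary.Foo as Foo inside this module.
  alias MyLibrary.Foo
  # Both modules end in Foo, so the second one gets an explicit local name.
  alias MyLibrary.Utils.Foo, as: UtilsFoo

  def run do
    {Foo.from_module_one(), UtilsFoo.from_module_two()}
  end
end

IO.inspect(Client.run())
```

The aliases are lexically scoped to `Client`, so the full names loaded by BEAM stay unique.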
Treatments:
Unnecessary macros
Category: Low-level concerns smells.
Problem: Macros are powerful meta-programming mechanisms that can be used in Elixir to extend the language. While implementing macros is not a code smell in itself, this meta-programming mechanism should only be used when absolutely necessary. Whenever a macro is implemented, and it was possible to solve the same problem using functions or other pre-existing Elixir structures, the code becomes unnecessarily more complex and less readable. Because macros are more difficult to implement and understand, their indiscriminate use can compromise the evolution of a system, reducing its maintainability.
Example: The code shown below is an example of this smell. The `MyMath` module implements the `sum/2` macro to perform the sum of two numbers received as parameters. While this code has no syntax errors and can be executed correctly to get the desired result, it is unnecessarily more complex. By implementing this functionality as a macro rather than a conventional function, the code became less clear and less objective:
```elixir
defmodule MyMath do
  defmacro sum(v1, v2) do
    quote do
      unquote(v1) + unquote(v2)
    end
  end
end

#...Use examples...

iex(1)> require MyMath
MyMath

iex(2)> MyMath.sum(3, 5)
8

iex(3)> MyMath.sum(3+1, 5+6)
15
```
- Refactoring: To remove this code smell, the developer must replace the unnecessary macro with structures that are simpler to write and understand, such as named functions. The code shown below is the result of the refactoring of the previous example. Basically, the `sum/2` macro has been transformed into a conventional named function. Note that the `require` command is no longer needed:
```elixir
defmodule MyMath do
  def sum(v1, v2) do # <= macro became a named function!
    v1 + v2
  end
end

#...Use examples...

# No need to require anymore!

iex(1)> MyMath.sum(3, 5)
8

iex(2)> MyMath.sum(3+1, 5+6)
15
```
This example is based on the description provided in Elixir's official documentation. Source: link
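One practical consequence of the function-based version is that `sum/2` is an ordinary value: it can be captured with `&` and handed to higher-order functions without any `require`. The snippet below is a minimal sketch of that idea; the variable names `total` and `shifted` are chosen only for this illustration.

```elixir
defmodule MyMath do
  def sum(v1, v2), do: v1 + v2
end

# Pass the named function as the reducer: ((((0 + 1) + 2) + 3) + 4).
total = Enum.reduce([1, 2, 3, 4], 0, &MyMath.sum/2)

# Partially apply it to add 10 to every element.
shifted = Enum.map([1, 2, 3], &MyMath.sum(&1, 10))

IO.inspect(total)
IO.inspect(shifted)
```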
Treatments:
Dynamic atom creation
Category: Low-level concerns smells.
Note: This smell emerged from a study with mining software repositories (MSR).
Problem: An atom is a basic data type of Elixir whose value is its own name. Atoms are often useful to identify resources or to express the state of an operation. The creation of an atom does not characterize a smell by itself; however, atoms are not collected by Elixir's garbage collector, so values of this type live in memory during the entire lifetime of an application. Also, BEAM limits the number of atoms that can exist in an application (1_048_576 by default) and each atom has a maximum size of 255 Unicode code points. For these reasons, dynamic atom creation is considered a code smell, since the developer has no control over how many atoms will be created during the execution of the application. This unpredictable scenario can expose an app to unexpected behavior caused by excessive memory usage, or even by reaching the maximum number of atoms possible.
Example: The code shown below is an example of this smell. Imagine that you are implementing code that converts string values into atoms to identify resources. These strings can come from user input or from responses to API requests. As this is a dynamic and unpredictable scenario, every distinct string that arrives is turned into a new atom that is never reclaimed. This kind of conversion, in addition to wasting memory, can be problematic for an application if it happens too often.
```elixir
defmodule Identifier do
  ...

  def generate(id) when is_binary(id) do
    String.to_atom(id)  #<= dynamic atom creation!!
  end
end

#...Use examples...

iex(1)> string_from_user_input = "my_id"
"my_id"

iex(2)> string_from_API_response = "my_id"
"my_id"

iex(3)> Identifier.generate(string_from_user_input)
:my_id

iex(4)> Identifier.generate(string_from_API_response)
:my_id  #<= same atom here, but every new string adds one more atom!
```
When we use the `String.to_atom/1` function, an atom is created for every string that does not yet correspond to an existing atom (identical strings map to the same atom). Because the input is not under our control, the number of distinct atoms created this way is unbounded, so we have no control over meeting the limits established by BEAM.
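The growth of the atom table can be observed with `:erlang.system_info/1`. The sketch below is only an illustration: the `AtomTableProbe` module, its variable names, and the `dyn_atom_` prefix are assumptions made for this example. It compares the atom count before and after converting a previously unseen string, and again after converting the same string a second time.

```elixir
# Hypothetical helper module, introduced only for this illustration.
defmodule AtomTableProbe do
  def run do
    # Build a string that is very unlikely to match an existing atom.
    unseen = "dyn_atom_#{System.unique_integer([:positive])}"

    count_before = :erlang.system_info(:atom_count)

    # First conversion: the string is new, so the atom table grows.
    first = String.to_atom(unseen)
    count_after_first = :erlang.system_info(:atom_count)

    # Second conversion of the same string: the existing atom is returned.
    second = String.to_atom(unseen)
    count_after_second = :erlang.system_info(:atom_count)

    %{
      atom_limit: :erlang.system_info(:atom_limit),
      same_atom?: first == second,
      added_by_first_call: count_after_first - count_before,
      added_by_second_call: count_after_second - count_after_first
    }
  end
end

IO.inspect(AtomTableProbe.run())
```

On an otherwise idle node, the first call is expected to add an entry and the second is not. Other processes may create atoms concurrently, so the exact differences should be read as indicative.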
- Refactoring: To remove this smell, as shown below, first you must ensure that all the identifier atoms are created statically, only once, at the beginning of an application's execution:
```elixir
# statically created atoms...
_ = :my_id
_ = :my_id2
_ = :my_id3
_ = :my_id4
```
Next, you should replace the use of the `String.to_atom/1` function with the `String.to_existing_atom/1` function. This will allow string-to-atom conversions to just map the strings to atoms already in memory (statically created at the beginning of the execution), thus preventing new atoms from being created dynamically. This second part of the refactoring is presented below.
```elixir
defmodule Identifier do
  ...

  def generate(id) when is_binary(id) do
    String.to_existing_atom(id)  #<= just maps a string to an existing atom!
  end
end

#...Use examples...

iex(1)> Identifier.generate("my_id")
:my_id

iex(2)> Identifier.generate("my_id2")
:my_id2

iex(3)> Identifier.generate("non_existent_id")
** (ArgumentError) errors were found at the given arguments:

  * 1st argument: not an already existing atom
```
Note that in the third use example, when a string different from an already existing atom is given, Elixir shows an error instead of performing the conversion. This demonstrates that this refactoring creates a more controlled and predictable scenario for the application in terms of memory usage.
This example and the refactoring are based on Elixir's official documentation. Sources: 1, 2
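When the set of valid identifiers is known in advance, another option is an explicit lookup table that never calls an atom-creating function and reports unknown input as a tagged tuple instead of raising. This is only a sketch of that alternative: the `SafeIdentifier` module, the `@known_ids` attribute, and the `:unknown_id` reason are hypothetical names that do not appear in the catalog.

```elixir
# Hypothetical module, introduced only for this illustration.
defmodule SafeIdentifier do
  # Every accepted string is listed explicitly with its atom.
  @known_ids %{
    "my_id" => :my_id,
    "my_id2" => :my_id2,
    "my_id3" => :my_id3,
    "my_id4" => :my_id4
  }

  def generate(id) when is_binary(id) do
    case Map.fetch(@known_ids, id) do
      {:ok, atom} -> {:ok, atom}
      :error -> {:error, :unknown_id}
    end
  end
end

IO.inspect(SafeIdentifier.generate("my_id"))
IO.inspect(SafeIdentifier.generate("non_existent_id"))
```

Compared with `String.to_existing_atom/1`, this version does not depend on which atoms happen to be loaded elsewhere in the system, at the cost of maintaining the table by hand.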
Treatments:
About
This catalog was proposed by Lucas Vegi and Marco Tulio Valente, from ASERG/DCC/UFMG.
For more info, see the following papers:
- Code Smells in Elixir: Early Results from a Grey Literature Review, International Conference on Program Comprehension (ICPC), 2022. [slides] [video] [podcast (pt-BR) - English subtitles available]
- Understanding code smells in Elixir functional language, Empirical Software Engineering Journal (EMSE), 2023.
Please feel free to make pull requests and suggestions (Issues tab).
Acknowledgments
We are supported by Finbits™, a Brazilian Elixir-based fintech.
Our research is also part of the initiative called Research with Elixir (in Portuguese).
Owner
- Name: Lucas Francisco da Matta Vegi
- Login: lucasvegi
- Kind: user
- Location: Belo Horizonte, MG, Brazil
- Company: DPI/UFV | DCC/UFMG
- Website: https://www2.dpi.ufv.br/?page_id=536
- Twitter: lucasvegi
- Repositories: 2
- Profile: https://github.com/lucasvegi
Ph.D. student and @aserg-ufmg member (UFMG). Assistant Professor (UFV).
Citation (CITATION.cff)
cff-version: 1.2.1
message: 'If you use this catalog in your work, please cite it as below.'
authors:
  - given-names: Lucas Francisco da Matta
    family-names: Vegi
    email: lucasvegi@gmail.com
    affiliation: UFMG
    orcid: 'https://orcid.org/0000-0002-7999-7098'
  - given-names: Marco Tulio
    family-names: Valente
    email: mtvalente@gmail.com
    affiliation: UFMG
    orcid: 'https://orcid.org/0000-0002-8180-7548'
title: 'Catalog of Elixir-specific code smells'
version: '1.0'
date-released: '2022-02-15'
url: 'https://github.com/lucasvegi/Elixir-Code-Smells'
keywords:
  - elixir
  - code smells
  - functional programming
license: MIT
preferred-citation:
  type: article
  message: 'If you use this catalog in your work, please cite it as below.'
  authors:
    - given-names: Lucas Francisco da Matta
      family-names: Vegi
      email: lucasvegi@dcc.ufmg.br
      affiliation: Federal University of Minas Gerais (UFMG)
      orcid: 'https://orcid.org/0000-0002-7999-7098'
    - given-names: Marco Tulio
      family-names: Valente
      email: mtov@dcc.ufmg.br
      affiliation: Federal University of Minas Gerais (UFMG)
      orcid: 'https://orcid.org/0000-0002-8180-7548'
  doi: "10.1007/s10664-023-10343-6"
  journal: "Empirical Software Engineering"
  pages: 32
  start: 1 # First page number
  end: 32 # Last page number
  title: "Understanding code smells in Elixir functional language"
  issue: 102
  volume: 28
  year: 2023
GitHub Events
Total
- Watch event: 42
- Issue comment event: 2
- Fork event: 8
Last Year
- Watch event: 42
- Issue comment event: 2
- Fork event: 8
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Lucas Francisco da Matta Vegi | l****i@g****m | 111 |
| Cristine Guadelupe | c****e@m****m | 14 |
| Marco Tulio Valente | m****v | 9 |
| sabiwara | s****a@g****m | 2 |
| c4710n | c****n | 2 |
| Adolfo Neto | a****p@g****m | 2 |
| Rich Morin | r****m@c****m | 1 |
| Kian-Meng Ang | k****g@c****g | 1 |
| José Valim | j****m@g****m | 1 |
| Gabriel Giordano | h****o@g****m | 1 |
| Brian Underwood | b****n@b****s | 1 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 15
- Total pull requests: 14
- Average time to close issues: about 1 month
- Average time to close pull requests: about 12 hours
- Total issue authors: 7
- Total pull request authors: 9
- Average comments per issue: 4.67
- Average comments per pull request: 0.57
- Merged pull requests: 14
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- josevalim (8)
- pdgonzalez872 (2)
- DaniruKun (1)
- adolfont (1)
- Cantido (1)
- sbacarob (1)
- cocoa-xu (1)
Pull Request Authors
- cristineguadelupe (6)
- josevalim (1)
- adolfont (1)
- gabrielgiordan (1)
- RichMorin (1)
- c4710n (1)
- cheerfulstoic (1)
- kianmeng (1)
- sabiwara (1)
