Build a Roles and Permissions System for Phoenix

Hey folks, this is the second part of a mini-series about building a roles and permissions (RAP) system for Phoenix. If you haven't read the first part yet, please go here and read it first :)

In the first post of this mini-series, we learned how to implement basic permissions for your schemas, how to group them into roles, and how to assign the roles to our users. We used the role to check a user's permissions in our Phoenix Controllers. Our controllers lie at the outer bounds of our system and adding the permission checks here is a good first step, but we can do better.

The second security layer of our RAP will be the restriction of queries. This layer aims to prevent data leaks by filtering out query results that the user shouldn't see. What the user shouldn't see depends on your use case, but we will implement two restrictions as an example:

For external users, our queries should return only the data that belong to the requesting user.
For internal users, our queries should hide the data of other internal users. Only special internal users (e.g. HR Managers) are allowed to see such data. Internal users can see the data of all external users though.

For example, customers of our company (external users) should only see their own invoices and addresses, but a customer support employee (internal user) should be able to see the invoices of all customers and their addresses. That employee shouldn't see the addresses of other internal users though. Only special internal users (e.g. HR Managers) can do so.

This post is inspired by the work of my brilliant (ex-)colleagues Lucas, Antonio, and Vitor who built the RAP we use here at Remote. If you like my posts, maybe consider working with me since my team is hiring :)

Okay, that's enough theory. Let's write some code!

Create the Roles

Let's first define the roles of our users. In part 1, we created some DefaultRoles. Let's replace them with these:

defmodule Demo.DefaultRoles do
  def all() do
    [
      %{
        name: "HR Manager",
        permissions: %{
          "addresses" => ["read", "update", "delete"]
        },
        data_scope: :internal
      },
      %{
        name: "Customer Support Employee",
        permissions: %{
          "addresses" => ["read"]
        },
        data_scope: :external_all
      },
      %{
        name: "Customer",
        permissions: %{
          "addresses" => ["read", "update", "delete"],
          "invoices" => ["read"]
        },
        data_scope: :external_one
      }
    ]
  end
end

If you read part 1, you're familiar with the permissions field and what it means. If not, please go back and read part 1 :)

As you might have spotted already, we extended our roles with a new field called data_scope. It defines the scope of the data that a user with this role is allowed to see. For our customers, this scope is :external_one, meaning that they can only see the invoices and addresses of a single external user, namely themselves. Our customer support employees get the :external_all scope, which means they can query the addresses of all external users. Our HR managers get the :internal scope: they can query all internal addresses, but not our customers' addresses.
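To make the three scopes concrete, here is a minimal, database-free sketch of the visibility rules they encode. The module name ScopeSketch and the visible?/3 function are hypothetical and only illustrate the intent. The actual filtering happens in the Repo later in this post.

```elixir
defmodule ScopeSketch do
  # Hypothetical helper, not part of the RAP itself: decides whether a viewer
  # with the given data scope may see a record owned by `owner`.
  # `owner` is a plain map with :id and :group keys.

  # HR managers see the data of internal users only.
  def visible?(:internal, _viewer_id, %{group: :internal}), do: true

  # Customer support sees the data of every external user.
  def visible?(:external_all, _viewer_id, %{group: :external}), do: true

  # Customers see only their own data (the owner's id must equal the viewer's id).
  def visible?(:external_one, viewer_id, %{group: :external, id: viewer_id}), do: true

  # Everything else stays hidden.
  def visible?(_data_scope, _viewer_id, _owner), do: false
end

customer = %{id: 1, group: :external}
employee = %{id: 2, group: :internal}

IO.inspect(ScopeSketch.visible?(:external_one, 1, customer)) # true
IO.inspect(ScopeSketch.visible?(:external_one, 3, customer)) # false
IO.inspect(ScopeSketch.visible?(:external_all, 2, employee)) # false
```

The last clause makes the sketch "deny by default": any combination that isn't explicitly allowed is hidden.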

Configure the User

Before we can restrict the queries, we need a way to track which "group" a user belongs to. Let's add a group field to the user.ex schema to distinguish between internal and external users:

defmodule Demo.User do
  use Ecto.Schema

  schema "users" do
    field(:group, Ecto.Enum, values: [:internal, :external])
    # other fields
  end
end

With the new group field in place, we can easily filter out internal or external users depending on the data scope access a user has. Neat! Let's see how to set up these restrictions next.

Set up the Repo

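We will use Ecto.Repo's prepare_query/3 callback to restrict the queries. Ecto invokes this callback before it runs a query through our Repo, which makes it a good place for our restrictions because it catches the query-based operations such as all, one, and get. Note that, according to the Ecto docs, the callback is not invoked for write functions such as insert, update, or delete that operate on structs and changesets, so those still rely on the permission checks from part 1. This method was inspired by the Multi-tenancy docs of Ecto.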

Let's implement a prepare_query/3 callback that restricts a query based on the data_scope option. It will look something like this:

defmodule Demo.Repo do
  use Ecto.Repo,
    otp_app: :demo,
    adapter: Ecto.Adapters.Postgres

  require Ecto.Query

  @data_scope_key {__MODULE__, :data_scope}
  @user_id_key {__MODULE__, :user_id}

  def put_data_scope(data_scope) do
    Process.put(@data_scope_key, data_scope)
  end

  def get_data_scope() do
    Process.get(@data_scope_key)
  end

  def put_user_id(user_id) do
    Process.put(@user_id_key, user_id)
  end

  def get_user_id() do
    Process.get(@user_id_key)
  end

  @impl true
  def default_options(_operation) do
    [data_scope: get_data_scope(), user_id: get_user_id()]
  end

  @impl true
  def prepare_query(_operation, query, opts) do
    cond do
      opts[:schema_migration] || opts[:data_scope] == :ignore ->
        {query, opts}

      opts[:data_scope] ->
        # RESTRICT THE QUERY HERE (for now, pass the query through unchanged)
        {query, opts}

      true ->
        raise "expected data_scope to be set"
    end
  end
end

We extended our Demo.Repo with a few functions. Let's go through them.

First, we implemented a prepare_query/3 callback that checks if the data_scope field was set and raises an exception otherwise. It allows an automatic override for migrations that run by default with the schema_migration: true option. We also added a manual override with opts[:data_scope] == :ignore. This manual override comes in handy for our test setup and seeds.exs script. When we write tests, we probably don't want to worry about satisfying our query restrictions (but this is up to you to decide). Setting up every test just right would become harder and harder the more restrictions we implement, so we will just disable the query restrictions for all tests unless we explicitly activate them. This only applies to our tests though. If we forget to set the data scope in production, the prepare_query/3 will raise an exception and block the query.

We can easily set the data_scope field using the put_data_scope/1 function. This stores the value in the process dictionary using Process.put/2. This way, we can easily set the data scope in our Phoenix controllers and retrieve the value once we execute a query caused by the request. No need to pass the current_user struct all the way from the controller to every query we execute. Every request spawns its own process, so we don't have to worry about deleting the value after a request completes since the process gets killed and cleaned up automatically.
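Here is a small sketch of the process dictionary behaviour this relies on. ScopeStore is a hypothetical stand-in for the put/get helpers in Demo.Repo. It shows that the stored value is only visible in the process that stored it.

```elixir
defmodule ScopeStore do
  # Hypothetical stand-in for put_data_scope/1 and get_data_scope/0.
  @key {__MODULE__, :data_scope}

  def put(data_scope), do: Process.put(@key, data_scope)
  def get(), do: Process.get(@key)
end

# The value is visible in the process that stored it ...
ScopeStore.put(:external_all)
IO.inspect(ScopeStore.get()) # :external_all

# ... but not in another process, such as a Task started from it.
in_task = Task.await(Task.async(fn -> ScopeStore.get() end))
IO.inspect(in_task) # nil
```

One consequence: if a request handler runs queries in a separate process (for example inside a Task), the data scope isn't available there. With the Repo above, such a query would raise instead of running unrestricted, unless you pass the scope along yourself.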

For our customers with the :external_one scope, we'll need to store their user_id so that we can limit the query to only their data. We'll use the put_user_id/1 and get_user_id/0 functions for that.

However, if we call put_data_scope/1 or put_user_id/1, Ecto won't pass the values to the prepare_query/3 automatically. We need to add the fields to the default_options/1 callback ourselves. This ensures that we fetch the :data_scope and :user_id values just before our query is executed.
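The flow of options can be modelled without a database. The sketch below is only a model, not Ecto's implementation: OptsSketch and its functions are hypothetical, and it assumes that options passed to a single Repo call take precedence over the ones returned by default_options/1.

```elixir
defmodule OptsSketch do
  # Hypothetical model of how options reach prepare_query/3. Assumption:
  # per-call options override the defaults from default_options/1.
  def effective_opts(default_opts, call_opts) do
    Keyword.merge(default_opts, call_opts)
  end

  # Mirrors the three branches of prepare_query/3 without running a query.
  def decide(opts) do
    cond do
      opts[:schema_migration] || opts[:data_scope] == :ignore -> :unrestricted
      data_scope = opts[:data_scope] -> {:restricted, data_scope}
      true -> raise "expected data_scope to be set"
    end
  end
end

# What default_options/1 would return for a logged-in customer.
defaults = [data_scope: :external_one, user_id: 42]

# No per-call options: the data scope from the process dictionary applies.
IO.inspect(OptsSketch.decide(OptsSketch.effective_opts(defaults, [])))
# {:restricted, :external_one}

# A per-call override disables the restriction for this one query.
IO.inspect(OptsSketch.decide(OptsSketch.effective_opts(defaults, data_scope: :ignore)))
# :unrestricted
```

Under that assumption, a single call such as Demo.Repo.all(query, data_scope: :ignore) would skip the restriction for that query only, without touching the process dictionary.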

Now, if you add this code to your project and run your tests, probably all of them will fail. That's because we haven't disabled the query restrictions in our tests yet. Let's fix this in our ConnCase and DataCase helpers like this:

defmodule Demo.DataCase do

  # using do ...

  setup tags do
    Demo.DataCase.setup_sandbox(tags)
    Demo.Repo.put_data_scope(:ignore) # <- Add this line

    :ok
  end
end

defmodule DemoWeb.ConnCase do

  # using do ...

  setup tags do
    Demo.DataCase.setup_sandbox(tags)
    Demo.Repo.put_data_scope(:ignore) # <- Add this line

    {:ok, conn: Phoenix.ConnTest.build_conn()}
  end
end

These two lines of code will prompt the Repo to ignore the query restrictions by default when we run our tests. We can't forget about our seeds.exs script though. It should also ignore the query restrictions. Add the following command to your seeds.exs:

Demo.Repo.put_data_scope(:ignore) # <- Add this line

for role <- default_roles do
  unless Demo.RAP.get_role_by_name(role.name) do
    {:ok, _role} = Demo.RAP.create_role(role)
  end
end

Now, set up your repo and run your tests again. They should succeed. We successfully disabled the check in prepare_query/3 for all existing tests. Yay!

Okay, we set up the basics for restricting our queries, but two more steps are missing: We need to set the data scope based on the role of the requesting user and we need to actually restrict our query. Let's start with the first.

Set the Data Scope

We want to set the data_scope and user_id fields at the top of our controllers. We could create a dedicated plug to do so, but I'll reuse the existing CheckPermissions plug for this. Setting the data scope and user_id takes only two lines of code. See here:

defmodule DemoWeb.Plugs.CheckPermissions do

  # imports and init/1

  def call(conn, opts) do
    user = get_user(conn)
    required_permission = get_required_permission(conn, opts)

    if Permissions.user_has_permission?(user, required_permission) do
      Demo.Repo.put_data_scope(user.role.data_scope) # <- Add this line
      Demo.Repo.put_user_id(user.id) # <- Add this line
      conn
    else
      conn
      |> put_status(:forbidden)
      |> halt()
    end
  end

  # Private functions
end

Now, we set the data scope after checking the permissions of a user. Again, every request spawns a new process and we set the data_scope field on that process and that process alone. Thanks to the actor model of the BEAM, the data scope of one user won't interfere with the data scope of another. Neat!

Now onto the last step: Actually restricting the query.

Restrict the Query

Restricting the query is the easy part. All we want is to add a simple where: user.group == :external or where: user.id == ^user_id statement to the query. Let's do that next:

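defmodule Demo.Repo do

  # use Ecto.Repo, get/put_data_scope and get/put_user_id as before

  # Needed for the from/2 macro used in do_restrict/3 below
  import Ecto.Query, only: [from: 2]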

  @impl true
  def prepare_query(_operation, query, opts) do
    cond do
      opts[:schema_migration] || opts[:data_scope] == :ignore ->
        {query, opts}

      data_scope = opts[:data_scope] ->
        {restrict(query, data_scope, opts), opts} # <- Add this line

      true ->
        raise "expected data_scope to be set"
    end
  end

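  # Add these functions here or move them to a dedicated handler for the address schema
  defp restrict(%{from: %{source: {"addresses", _module}}} = query, data_scope, opts) do
    do_restrict(query, data_scope, opts)
  end

  # Queries against any other source pass through unchanged
  defp restrict(query, _data_scope, _opts), do: query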

  defp do_restrict(query, :internal, _opts) do
    from(address in query,
      left_join: user in assoc(address, :user),
      where: user.group == :internal
    )
  end

  defp do_restrict(query, :external_all, _opts) do
    from(address in query,
      left_join: user in assoc(address, :user),
      where: user.group == :external
    )
  end

  defp do_restrict(query, :external_one, opts) do
    user_id = Keyword.fetch!(opts, :user_id)

    from(address in query,
      left_join: user in assoc(address, :user),
      where: user.group == :external and user.id == ^user_id
    )
  end
end

We extended our Repo with one query restriction for each data scope. In restrict/3, we pattern match against the source table of the query. In this case, we only handle queries for the address schema. If a requesting user has the :internal data scope, we limit the data to only internal users by extending the query with where: user.group == :internal. We do the same for the :external_all and :external_one data scopes.

We have now successfully restricted all queries for the address schema to only users of the correct user group! Nice! This concludes the basic setup of the query restrictions.

Caveats

The setup above will get you started. Once you extend the RAP above, you'll run into a few problems though. I'll explain some of them and their solutions next.

Named Binding exists already

In the do_restrict/3 functions above, I assumed that the users schema wasn't joined to the address schema yet. This won't always be the case though. A safer solution is to check whether a named binding for the user schema already exists before joining the user again. Here's how you can check for that:

defp restrict(%{from: %{source: {"addresses", _module}}} = query, data_scope, opts) do
  {binding, query} = ensure_binding(query, [:user, :users])
  do_restrict(query, binding, data_scope, opts)
end

defp ensure_binding(query, keys) do
  case Enum.find(keys, fn key -> has_named_binding?(query, key) end) do
    nil -> add_binding(query, keys)
    key -> {key, query}
  end
end

defp add_binding(query, [key | _rest]) do
  query = join(query, :left, [a], user in assoc(a, :user), as: ^key)
  {key, query}
end

defp do_restrict(query, binding, :internal, _opts) do
  from([{^binding, user}] in query, where: user.group == :internal)
end

The ensure_binding/2 function checks whether the query already has a named binding for one of the given keys. If not, it joins the user schema into the query. Either way, it returns the key of the named binding together with the query. The snippet uses join/5 and has_named_binding?/2 from Ecto.Query, so both need to be imported in the module. Note that this only works if you joined the user with an as: named binding like this: join: user in assoc(a, :user), as: :user. If you joined the user without a named binding, we'll still join the user again. You can extend ensure_binding/2 to also check query.joins, which includes unnamed bindings as well.

Create Restriction Modules

We added restrictions for the address schema above, but even that required ~35 lines of code already. If you want to add restrictions for other schemas as well, I'd advise moving them into their own modules like this:

defmodule Demo.Restrictions.Address do
  def restrict(%{from: %{source: {"addresses", _module}}} = query, data_scope, opts) do
    # call do_restrict/3
  end

  def restrict(query, _data_scope, _opts), do: query
end

defmodule Demo.Repo do
  @handlers [
    Demo.Restrictions.Address,
    # Other restrictions
  ]

  # Other functions

  defp restrict(query, data_scope, opts) do
    Enum.reduce(@handlers, query, fn handler, acc -> handler.restrict(acc, data_scope, opts) end)
  end
end

This way, you can loop through every handler and restrict the query specifically for the schema of the handler. A setup like this makes it easy to add or remove restrictions.
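The handler loop can be tried out in isolation. In the sketch below, plain maps stand in for Ecto queries, and every module and key name (HandlerSketch, :filters) is hypothetical. It only demonstrates how each handler either matches its own source table or passes the query through.

```elixir
defmodule HandlerSketch.Address do
  # Matches "queries" whose source table is "addresses" and records a filter.
  def restrict(%{from: %{source: {"addresses", _module}}} = query, data_scope, _opts) do
    Map.update!(query, :filters, &[{:address, data_scope} | &1])
  end

  # Every other query passes through untouched.
  def restrict(query, _data_scope, _opts), do: query
end

defmodule HandlerSketch.Invoice do
  def restrict(%{from: %{source: {"invoices", _module}}} = query, data_scope, _opts) do
    Map.update!(query, :filters, &[{:invoice, data_scope} | &1])
  end

  def restrict(query, _data_scope, _opts), do: query
end

defmodule HandlerSketch do
  @handlers [HandlerSketch.Address, HandlerSketch.Invoice]

  # Runs the query through every handler. Only the matching one changes it.
  def restrict(query, data_scope, opts) do
    Enum.reduce(@handlers, query, fn handler, acc ->
      handler.restrict(acc, data_scope, opts)
    end)
  end
end

# A plain map standing in for an Ecto query on the "invoices" table.
query = %{from: %{source: {"invoices", nil}}, filters: []}
IO.inspect(HandlerSketch.restrict(query, :external_one, user_id: 1))
```

The fallback clause in each handler is what keeps the loop safe: a handler that doesn't recognise the source simply returns the query as it received it.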

Conclusion

This concludes the second and last part of the RAP mini-series! I hope you enjoyed the articles! Thanks again to my dear colleagues Lucas, Vitor, and Antonio for their brilliant work and support. If you have questions or comments, let's discuss them on BlueSky. Follow me on BlueSky or subscribe to my newsletter if you want to get notified when I publish the next blog post.