Data Transfer Objects and Data Access Objects in Python with FastAPI

APIs and contracts

Lately I have been playing more with FastAPI and its integration with awesome libraries such as pydantic and sqlalchemy. One thing I noticed while contributing to and looking at projects that use FastAPI is that DTOs and DAOs are used quite often. DTOs in particular reminded me of how protobufs can be used in Go to define API endpoints and the objects that act as a contract when end users interact with the API. So, here are some of my considerations and how I think about these two patterns.

DTOs

A Data Transfer Object (DTO) can be viewed as the contract between a user of an API and the backend: it describes the data structure the backend needs from the user to perform the requested action. We can compare it to a function signature, a protocol or Go's interface: to satisfy the API, the user must send an object whose attributes match the DTO in key names and value types. Another metaphor is a lock: the DTO is a lock with a specific shape, and any key with the matching shape, no matter its color, can open the door.
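To make the lock metaphor concrete, here is a minimal sketch that uses only the standard library. The `FileShape` class and the `fits` helper are hypothetical names invented for this illustration. They only check that the annotated keys are present with the right types, which is a small subset of what pydantic does (it also handles coercion, nested models and optional fields).

```python
from typing import Any, Dict, get_type_hints


class FileShape:
    """Hypothetical contract: only the annotated names and types matter."""

    id: int
    name: str


def fits(payload: Dict[str, Any], contract: type) -> bool:
    """Return True if every annotated key is present with the expected type."""
    hints = get_type_hints(contract)
    return all(
        key in payload and isinstance(payload[key], expected)
        for key, expected in hints.items()
    )


# The extra "color" field is ignored: only the shape of the key matters.
good_key = {"id": 1, "name": "report.pdf", "color": "red"}
wrong_type = {"id": "one", "name": "report.pdf"}
missing_field = {"id": 1}

results = [fits(p, FileShape) for p in (good_key, wrong_type, missing_field)]
```

Only the first payload opens the lock. The other two fail because a value has the wrong type or a required key is missing.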

Let's look at the following example from an API endpoint that is used to obtain a file entity from the API:

from fastapi import Depends, HTTPException, status


@router.get(
    "/{file_id}",
    response_model=FileDTO,
)
async def get_entity(file_id: int, file_dao: FileDAO = Depends()) -> FileDTO:
    try:
        file = await file_dao.get_file_by_id(file_id)
    except Exception as e:
        logger.error(f"There was an error retrieving the file: {e}")
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail="Error retrieving the file",
        )
    # Raise the 404 outside the try block, otherwise the generic
    # "except Exception" above would catch it and turn it into a 500.
    if not file:
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND, detail="File not found"
        )
    return file

We can see that the response model is a FileDTO, which means that whenever we get a 200 response from the API, we can expect to receive a JSON object that matches the FileDTO. In this case, the FileDTO, together with the DTOs used for uploads and updates, looks like the following:

from pydantic import BaseModel


class FileDTO(BaseModel):
    id: int
    name: str
    file_path: str
    file_size: int

class FileUploadDTO(BaseModel):
    name: str
    file: bytes
    description: str | None = None

class FilePatchDTO(BaseModel):
    name: str | None = None
    description: str | None = None

Hence, we can expect our GET response to have an id, a name, a file_path and a file_size. At the same time, we have a FileUploadDTO, so when we upload a file, we should provide a JSON object that matches the FileUploadDTO attributes: a name, a file and, optionally, a description. If we want to update the file, we have to provide a new name, a new description or both, following the FilePatchDTO class.
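As a rough illustration of how such a contract behaves at runtime, the sketch below validates two payloads against a copy of the FilePatchDTO. It assumes pydantic v2 (`model_validate` and `model_dump` are v2 method names), and the sample payloads are made up for the example.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class FilePatchDTO(BaseModel):
    name: Optional[str] = None
    description: Optional[str] = None


# A client sends only the field it wants to change.
patch = FilePatchDTO.model_validate({"name": "report-v2.pdf"})

# exclude_unset keeps only the fields the client actually provided, so an
# omitted description is not confused with "set the description to None".
changes = patch.model_dump(exclude_unset=True)

# A payload whose value has the wrong type is rejected at the boundary.
try:
    FilePatchDTO.model_validate({"name": ["not", "a", "string"]})
    rejected = False
except ValidationError:
    rejected = True
```

In a FastAPI endpoint this validation happens automatically for parameters annotated with the DTO, and an invalid payload results in a 422 response instead of reaching the handler.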

In another endpoint example, we can see how the FilePatchDTO is required as an argument so that it can be passed to the DAO:

  @router.patch("/file/{file_id}", response_model=FileDTO)
  def update_file(
      file_id: int, to_update_file: FilePatchDTO, file_dao: FileDAO = Depends()
  ) -> FileDTO:
      ...

DAOs

Data Access Objects (DAOs) are similar to infrastructure structs and interfaces in Go APIs: they are the interface/adapter between the application and the database table (or other data source) that stores the domain objects. Those objects are usually presented to users via DTOs, which do not necessarily expose every field/attribute that a DAO is aware of. End users never interact with DAOs directly and should not need any knowledge of their existence. DAOs are used to perform all CRUD operations and/or queries against the database, either by calling the ORM's methods or through custom methods written to interact with a given table.
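The point that a DTO does not have to expose everything the DAO knows about can be sketched as follows. `FileRecord` is a hypothetical stand-in for the ORM-mapped row, and the example assumes pydantic v2, where `from_attributes=True` lets a model be built from an object's attributes.

```python
from dataclasses import dataclass
from typing import Optional

from pydantic import BaseModel, ConfigDict


@dataclass
class FileRecord:
    """Hypothetical stand-in for the row object a DAO would return."""

    id: int
    name: str
    file_path: str
    file_size: int
    description: Optional[str] = None


class FileDTO(BaseModel):
    # Allow building the DTO from attributes of an arbitrary object.
    model_config = ConfigDict(from_attributes=True)

    id: int
    name: str
    file_path: str
    file_size: int


record = FileRecord(1, "a.txt", "/srv/files/a.txt", 12, "internal note")

# Only the fields declared on the DTO are copied; description stays internal.
exposed = FileDTO.model_validate(record).model_dump()
```

The description is known to the storage layer but never leaves it, because the DTO simply does not declare that field.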

If you look at our previous example, FastAPI lets you provide the DAO an endpoint needs as an injected dependency, declared in the endpoint function's parameters with FastAPI's Depends. This is what the DAO looks like for our example:

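from typing import List, Optional

from fastapi import Depends
from sqlalchemy import delete, select
from sqlalchemy.ext.asyncio import AsyncSession

# File (the ORM model) and get_db_session (the session dependency) are
# defined elsewhere in the project.


class FileDAO: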
    """Class for accessing file table."""

    def __init__(self, session: AsyncSession = Depends(get_db_session)):
        self.session = session

    async def get_file_by_id(self, _id: int) -> Optional[File]:
        """
        Get a specific file by its id.

        :param _id: id of the file instance.
        :return: File instance if found, otherwise None.
        """
        query = select(File).where(File.id == _id).limit(1)
        row = await self.session.execute(query)
        file = row.scalars().first()  # fetch the first result if exists
        return file

    async def list_files(self, limit: int, offset: int = 0) -> List[File]:
        """
        List all files with limit/offset pagination.

        :param limit: limit of files.
        :param offset: offset of files.
        :return: list of files.
        """
        raw_files = await self.session.execute(select(File).limit(limit).offset(offset))
        return list(raw_files.scalars().fetchall())

    async def delete_file(self, _id: int) -> None:
        """
        Delete a specific file by its id.

        :param _id: id of the file instance.
        """
        query = delete(File).where(File.id == _id)
        await self.session.execute(query)
        await self.session.commit()

    async def create_file(
        self,
        name: str,
        file_path: str,
        file_size: int,
        description: Optional[str] = None,
    ) -> File:
        """
        Create a new file entry.

        :param name: Name of the file.
        :param file_path: Path where the file is stored.
        :param file_size: Size of the file in bytes.
        :param description: Optional description of the file.
        :return: The created File instance.
        """
        new_file = File(
            name=name, file_path=file_path, file_size=file_size, description=description
        )
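        self.session.add(new_file)
        await self.session.commit()
        # Reload the row so generated columns such as id are populated and
        # can be read safely after the commit expires the instance.
        await self.session.refresh(new_file)
        return new_file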

This DAO class for file entities uses asynchronous methods to fetch a single file, list files, and create or delete files; an update method would follow the same pattern. These methods all use sqlalchemy, so they work the same way for any database supported by sqlalchemy, but you could also use libraries for different storage systems and still keep the same FileDAO methods. In this example the methods take individual attributes of the file DTOs as arguments, but they could also take whole DTOs. You can see how, by using this FileDAO, we abstract the interaction with the data storage system.
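To illustrate the claim that the storage system can change while the DAO methods stay the same, here is a minimal sketch of an in-memory variant. `InMemoryFileDAO` and the dataclass `File` are hypothetical stand-ins written for this example; they only mirror the method names and arguments of the FileDAO above.

```python
import asyncio
from dataclasses import dataclass
from itertools import count
from typing import Dict, List, Optional


@dataclass
class File:
    """Hypothetical stand-in for the ORM-mapped File entity."""

    id: int
    name: str
    file_path: str
    file_size: int
    description: Optional[str] = None


class InMemoryFileDAO:
    """Same method names as FileDAO, backed by a dict instead of a database."""

    def __init__(self) -> None:
        self._files: Dict[int, File] = {}
        self._ids = count(1)  # simple auto-incrementing id generator

    async def create_file(
        self,
        name: str,
        file_path: str,
        file_size: int,
        description: Optional[str] = None,
    ) -> File:
        new_file = File(next(self._ids), name, file_path, file_size, description)
        self._files[new_file.id] = new_file
        return new_file

    async def get_file_by_id(self, _id: int) -> Optional[File]:
        return self._files.get(_id)

    async def list_files(self, limit: int, offset: int = 0) -> List[File]:
        return list(self._files.values())[offset : offset + limit]

    async def delete_file(self, _id: int) -> None:
        self._files.pop(_id, None)


async def demo() -> List[Optional[File]]:
    dao = InMemoryFileDAO()
    created = await dao.create_file("a.txt", "/tmp/a.txt", 12)
    found = await dao.get_file_by_id(created.id)
    await dao.delete_file(created.id)
    missing = await dao.get_file_by_id(created.id)
    return [found, missing]


found, missing = asyncio.run(demo())
```

Because the endpoints only depend on the DAO's methods, a class like this could stand in for the real one in tests, for example through FastAPI's `app.dependency_overrides`, without touching the endpoint code.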

Conclusion

DTOs and DAOs work well together and promote a robust, clean architecture. Applications become easier to develop and maintain, and the clear, precise contracts they establish help minimize mistakes, especially since they integrate well with Python libraries such as FastAPI, sqlalchemy and pydantic. By clearly delineating responsibilities, we achieve separation of concerns, which ultimately contributes to resilient and efficient software. Long story short, DTOs and DAOs are pretty nice and fun to work with when developing FastAPI applications.