Protobuf Python Imports

Problem

The import statements in Python code generated from factored Protobuf files do not correctly point to the modules to be imported.

Note: By factored, I mean at least two files, a.proto and b.proto, with dependencies between them (e.g. a.proto imports b.proto).

Solution

The package structure should be set up as follows:

<project-root>/
  |- <protobuf-source>/
    |- <module-a1>/
      |- (...)
        |- <module-an>/
          a.proto
    |- <module-b1>/
      |- (...)
        |- <module-bn>/
          b.proto
  |- <python-source>/
    |- <module-a1>/
      |- (...)
        |- <module-an>/
          (...)
    |- <module-b1>/
      |- (...)
        |- <module-bn>/
          (...)

I.e. the package structure of the Python source should mirror the package structure of the Protobuf source.

Then change to the project root folder and generate code with protoc for each Protobuf source file as follows:

cd <project-root>
protoc --proto_path=<protobuf-source> --python_out=<python-source> <module-a1>/(...)/<module-an>/a.proto

This should create a file a_pb2.py in <project-root>/<python-source>/<module-a1>/(...)/<module-an>/.
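The per-file invocation above can be scripted. The following is a minimal sketch, not a definitive build step: the helper name protoc_commands and the directory names used with it are hypothetical, and it only builds the argument lists (it does not run protoc). The key point it illustrates is that each .proto file is passed relative to the --proto_path root, so that protoc replicates the directory prefix.

```python
from pathlib import Path


def protoc_commands(proto_root, python_out):
    """Build one protoc argument list per .proto file under proto_root.

    Hypothetical helper: each file is named relative to proto_root, which
    is what makes protoc reproduce the directory prefix in the output.
    """
    root = Path(proto_root)
    out = Path(python_out)
    commands = []
    for proto in sorted(root.rglob("*.proto")):
        # Path relative to the --proto_path root, always with forward slashes.
        relative = proto.relative_to(root).as_posix()
        commands.append([
            "protoc",
            f"--proto_path={root.as_posix()}",
            f"--python_out={out.as_posix()}",
            relative,
        ])
    return commands
```

Each returned list can then be run from the project root, for example with subprocess.run(command, check=True), assuming protoc is on the PATH.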

Assuming that we have an import statement

import "<module-b1>/(...)/<module-bn>/b.proto"

in a.proto, the generated source code in a_pb2.py should now import the module <module-b1>.(...).<module-bn>.b_pb2, i.e. it should contain a statement of the form from <module-b1>.(...).<module-bn> import b_pb2.

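In essence, the path prefix of each Protobuf file, taken relative to the --proto_path directory, is replicated by protoc both in the output directory and in the generated import statements. This is a bit tricky since there is very little indication of that in the documentation or in snippets on the web.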
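The mapping from a Protobuf import path to the Python import can be sketched as follows. This mirrors the naming rule described in this note and is not protoc's actual implementation; the function names and the example path are hypothetical, and the alias that protoc appends to the import (as ...) is left out.

```python
from pathlib import PurePosixPath


def generated_module(proto_import_path):
    """Map a Protobuf import path to the dotted Python module name."""
    path = PurePosixPath(proto_import_path)
    if path.suffix != ".proto":
        raise ValueError(f"not a .proto file: {proto_import_path}")
    # Directory parts become packages, the file name gets the _pb2 suffix.
    return ".".join(path.parent.parts + (path.stem + "_pb2",))


def generated_import(proto_import_path):
    """Return the import statement expected in the generated code (no alias)."""
    package, _, name = generated_module(proto_import_path).rpartition(".")
    if not package:
        # A file directly under the proto root has no package prefix.
        return f"import {name}"
    return f"from {package} import {name}"


# Hypothetical path, standing in for <module-b1>/(...)/<module-bn>/b.proto.
print(generated_import("pkg_b/sub/b.proto"))  # from pkg_b.sub import b_pb2
```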

Notes

Jargon

Protobuf and Python jargon differ slightly: what Protobuf calls a package roughly corresponds to a module in Python.

I.e. <module-a1>/(...)/<module-an>/a.proto belongs to the package declared by the package statement within a.proto. Protobuf import statements do not take a module path but rather a file path; the package is a separate concept.

The generated Python code, however, is in the module identified by the module path <module-a1>.(...).<module-an>, which follows the file path (whereas the package, in Python terms, is the collection of all modules within the project).

Module Initialization Files (__init__.py)

We have to create a module initialization file __init__.py for each module layer in the generated code. The last one, in the directory containing the generated code, has to be empty, however, since Protobuf import statements will be translated to

from <module-a1>.(...).<module-an> import <generated-module>

I.e. the __init__.py files in <module-an> and <module-bn> should be empty.
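Creating these files by hand is easy to forget, so it can be automated. The following is a minimal sketch under the assumption that generated files are named *_pb2.py and that the Python source root itself is the import root (and therefore gets no __init__.py); the helper name ensure_init_files is hypothetical. It only creates missing files, which are empty, and leaves existing ones untouched.

```python
from pathlib import Path


def ensure_init_files(python_root):
    """Create empty __init__.py files for every module layer that leads
    to a generated *_pb2.py file. Returns the list of files created."""
    root = Path(python_root).resolve()
    created = []
    for generated in sorted(root.rglob("*_pb2.py")):
        directory = generated.parent
        # Walk upwards from the generated file, stopping at the source root.
        while directory != root:
            init_file = directory / "__init__.py"
            if not init_file.exists():
                init_file.touch()  # creates an empty file
                created.append(init_file)
            directory = directory.parent
    return created
```

Running it after every protoc invocation keeps the module layers importable without overwriting any __init__.py that already has content.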