How to implement your model.
This page covers the `model/model.py` file. To recap, the simplest directory structure for a model places the model code at `model/model.py`.

The `model.py` file contains a class with particular methods:
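
A minimal sketch of such a class follows. The method bodies are placeholders, the `**kwargs` signature for `__init__` is an assumption chosen for brevity, and the uppercase "model" is a hypothetical stand-in for real inference logic.

```python
class Model:
    def __init__(self, **kwargs):
        # Read configuration parameters and other information here.
        self._model = None

    def load(self):
        # Initialize the model here, e.g. download weights or move them to a GPU.
        # A trivial callable stands in for a real model (hypothetical).
        self._model = lambda text: text.upper()

    def predict(self, model_input):
        # Run inference and return a JSON-serializable result.
        return {"output": self._model(model_input["text"])}
```

Calling `load()` once and then `predict({"text": "hi"})` on an instance exercises the three methods in the order they are described here.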
- The `__init__` method is used to initialize the `Model` class, and allows you to read in configuration parameters and other information.
- The `load` method is where you define the logic for initializing the model. This might include downloading model weights, or loading them onto a GPU.
- The `predict` method is where you define the logic for inference.

The `__init__` method is used to initialize the `Model` class, and allows you to read in configuration parameters and runtime information.

The simplest signature for `__init__` is:
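
The sketch below shows two possible shapes for `__init__`. Both are assumptions rather than confirmed signatures: a bare form that takes nothing beyond `self`, and a fuller form that accepts the four parameters described in this section. The class names `MinimalModel` and `ConfiguredModel` are hypothetical and used only to show the two forms side by side.

```python
class MinimalModel:
    # Assumed simplest form: no parameters beyond self.
    def __init__(self):
        pass


class ConfiguredModel:
    # Assumed fuller form: accepts config, data_dir, secrets, and environment.
    def __init__(self, config: dict, data_dir: str, secrets: dict, environment: str):
        self._config = config            # contents of config.yaml
        self._data_dir = data_dir        # path to the model's data directory
        self._secrets = secrets          # populated with real values at runtime
        self._environment = environment  # environment the model is deployed to
```

Storing the values on `self` keeps them available to `load` and `predict` later.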
`__init__` can accept the following parameters:

- `config`: A dictionary containing the contents of the model's `config.yaml`.
- `data_dir`: A string containing the path to the data directory for the model.
- `secrets`: A dictionary containing the secrets for the model. Note that at runtime, these will be populated with the actual values as stored on Baseten.
- `environment`: A string containing the environment for the model, if the model has been deployed to an environment.

The `load` method is where you define the logic for initializing the model. As mentioned before, this might include downloading model weights or loading them onto the GPU.

`load`, unlike the other methods mentioned, does not accept any parameters:
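
As a rough illustration, `load` might look like the sketch below. The `build_model` helper is hypothetical and stands in for whatever downloads weights or moves them onto a GPU, and the `**kwargs` signature for `__init__` is likewise an assumption.

```python
def build_model():
    # Hypothetical stand-in for downloading weights or moving them to a GPU.
    return lambda text: text[::-1]


class Model:
    def __init__(self, **kwargs):
        self._model = None  # populated later by load()

    def load(self):
        # No parameters beyond self: anything load needs should already
        # have been stored on the instance by __init__.
        self._model = build_model()
```

Because `load` receives no arguments, values such as paths or secrets have to be captured in `__init__` and read from `self` here.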
A deployment is ready only once `load` has completed successfully. Note that there is a timeout of 30 minutes for this; if `load` has not completed by then, the deployment will be marked as failed.
The `predict` method is where you define the logic for performing inference.

The simplest signature for `predict` is:
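
A minimal sketch of `predict`, assuming it takes a single `model_input` argument and returns a plain `dict`. The word-count logic and the `"text"` input key are hypothetical placeholders for real inference.

```python
import json


class Model:
    def predict(self, model_input):
        # Placeholder "inference": count the words in the input text.
        text = model_input["text"]
        return {"word_count": len(text.split())}


result = Model().predict({"text": "hello from the model"})
# json.dumps raises TypeError if the value is not JSON-serializable,
# which makes it a quick local check on the return value.
encoded = json.dumps(result)
```

Running the return value through `json.dumps` locally is one way to catch non-serializable outputs before deploying.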
The return value of `predict` must be JSON-serializable, so it can be a:

- `dict`
- `list`
- `str`

`predict` can also return a Pydantic object.
The `predict` method is synchronous by default. However, if your model inference depends on APIs that require `asyncio`, `predict` can also be written as a coroutine.
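
A coroutine version of `predict` might look like the sketch below. Here `asyncio.sleep(0)` stands in for an awaitable API call, and `asyncio.run` is used only to drive the coroutine locally; how the serving runtime awaits it is not shown.

```python
import asyncio


class Model:
    async def predict(self, model_input):
        # Await an asyncio-based API here; asyncio.sleep is a stand-in.
        await asyncio.sleep(0)
        return {"echo": model_input}


# Drive the coroutine locally for illustration.
result = asyncio.run(Model().predict({"text": "hi"}))
```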
If you use `asyncio` in your `predict` method, be sure not to perform any blocking operations, such as a synchronous file download. Blocking operations can result in degraded performance.
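
One possible way to keep a blocking call off the event loop, offered as an assumption rather than platform guidance, is to run it in a worker thread with `asyncio.to_thread` (Python 3.9+). The `blocking_download` function and the `"url"` input key are hypothetical.

```python
import asyncio
import time


def blocking_download(url: str) -> bytes:
    # Hypothetical stand-in for a synchronous file download.
    time.sleep(0.01)
    return url.encode()


class Model:
    async def predict(self, model_input):
        # Run the blocking call in a worker thread so the event loop
        # stays free while the download is in progress.
        data = await asyncio.to_thread(blocking_download, model_input["url"])
        return {"num_bytes": len(data)}


result = asyncio.run(Model().predict({"url": "https://example.com/file"}))
```

Calling `blocking_download` directly inside the coroutine would still work, but the event loop would be stalled for the duration of the call.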