.. _label_build_python_model:

Build a SMAUG model with Python API
===================================

SMAUG's Python frontend provides easy APIs to build DL models. The created
model is a computational graph, which is serialized into two protobuf files,
one for the model topology and the other for the parameters. These two files
are inputs to SMAUG's C++ runtime that performs the actual simulation. In this
tutorial, we will be using the SMAUG Python APIs to build new DL models.

Before building a model, we need to create a `Graph` context in which we will
add operators.

.. code-block:: python

  import smaug as sg

  with sg.Graph(name="my_model", backend="SMV") as graph:
    # Any operators instantiated within the context will be added to `graph`.
    pass

When using the :class:`smaug.Graph` API to create a graph context, we need to
give a name for the model and select the backend we want to use when running
the model through the C++ runtime. A backend in SMAUG is a logical combination
of hardware blocks that implements all the SMAUG operators. Refer to the
`C++ docs`_ for more details on a backend implementation in the C++ runtime.
Here, we choose the :code:`SMV` backend that comes with SMAUG, which is
modeled after the NVDLA architecture. SMAUG also has another backend named
:code:`Reference`, which is a reference implementation without specific
performance optimizations. Also, refer to :class:`smaug.Graph` for a detailed
description of the parameters.

Now we can start adding operators to the graph context. The following gives an
example of building a simple 3-layer model.

.. code-block:: python

  import numpy as np
  import smaug as sg

  def generate_random_data(shape):
    r = np.random.RandomState(1234)
    return (r.rand(*shape) * 0.005).astype(np.float16)

  with sg.Graph(name="my_model", backend="SMV") as graph:
    input_tensor = sg.Tensor(
        data_layout=sg.NHWC, tensor_data=generate_random_data((1, 28, 28, 1)))
    conv_weights = sg.Tensor(
        data_layout=sg.NHWC, tensor_data=generate_random_data((32, 3, 3, 1)))
    fc_weights = sg.Tensor(
        data_layout=sg.NC, tensor_data=generate_random_data((10, 6272)))
    # Shape of act: [1, 28, 28, 1].
    act = sg.input_data(input_tensor)
    # After the convolution, shape of act: [1, 28, 28, 32].
    act = sg.nn.convolution(
        act, conv_weights, stride=[1, 1], padding="same", activation="relu")
    # After the max pooling, shape of act: [1, 14, 14, 32].
    act = sg.nn.max_pool(act, pool_size=[2, 2], stride=[2, 2])
    # After the matrix multiply, shape of act: [1, 10].
    act = sg.nn.mat_mul(act, fc_weights)

To create the first operator of the model, we first need to prepare an input
tensor and the weight tensors used by the operators. Tensors are represented
by :class:`smaug.Tensor`. In the example, we create an input tensor
:code:`input_tensor` using the API. Here, we specify the :code:`data_layout`
as :code:`sg.NHWC`, which stands for a 4D tensor shape in the channels-last
layout. We also specify the :code:`tensor_data` parameter with a randomly
generated NumPy array of shape :code:`[1, 28, 28, 1]`; in practice, the user
can instead supply real data, such as weights extracted from a pretrained
model (see the sketch below). Likewise, we create two weight tensors that will
be used by a convolution operator and a matrix multiply operator,
respectively. A :func:`smaug.data_op`, which simply forwards an input tensor
to its output, is required for any tensor that is not the output of another
operator. Here, :code:`sg.input_data` adds that operator for
:code:`input_tensor`, so :code:`act` is effectively a reference to
:code:`input_tensor`.
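As a concrete sketch of that point, the snippet below replaces the randomly
generated convolution weights with an array loaded from disk. This is only an
illustration: the file name is hypothetical, and we assume the weights were
exported ahead of time with the same :code:`(32, 3, 3, 1)` filter shape and
:code:`float16` precision used above.

.. code-block:: python

  import numpy as np
  import smaug as sg

  with sg.Graph(name="my_model", backend="SMV") as graph:
    # Hypothetical file holding pretrained filters, already arranged in the
    # (32, 3, 3, 1) NHWC shape that the convolution above expects.
    pretrained = np.load("conv_weights.npy").astype(np.float16)
    conv_weights = sg.Tensor(data_layout=sg.NHWC, tensor_data=pretrained)

Everything else in the model stays the same.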
Then, :code:`act` is fed to a convolution operator that also takes
:code:`conv_weights` as its filter input. As described in more detail in
:func:`smaug.nn.convolution`, it computes a 3D convolution given the 4D input
and filter tensors; here we use 1x1 strides, :code:`same` padding and a ReLU
activation fused with the convolution operation. Its output then goes through
a max pooling operator with a 2x2 filter size, which in turn feeds the final
matrix multiply operator. Note that since the output of the max pooling
operator is a 4D tensor while :func:`smaug.nn.mat_mul` expects a 2D input
tensor, SMAUG will automatically add a layout transformation operator,
:func:`smaug.tensor.reorder`, in between to make the data layout formats
compatible. Thus, the 4D tensor of shape :code:`[1, 14, 14, 32]` will be
flattened into a 2D tensor of shape :code:`[1, 6272]` before running the
matrix multiply. Similarly, SMAUG will also perform an NHWC-to-NCHW layout
transformation (or vice versa) whenever the backend expects a different
layout.

After finishing adding operators to the model, we can now take a look at the
summary of the model using the :func:`smaug.Graph.print_summary` API.

.. code-block:: python

  graph.print_summary()

This prints model-level information and operator-specific properties as
below::

  =================================================================
  Summary of the network: my_model (SMV)
  =================================================================
  Host memory access policy: AllDma.
  -----------------------------------------------------------------
  Name: data (Data)
  Parents:
  Children:conv
  Input tensors:
    data/input0 Float16 [1, 28, 28, 1] NHWC alignment(8)
  Output tensors:
    data/output0 Float16 [1, 28, 28, 1] NHWC alignment(8)
  -----------------------------------------------------------------
  Name: data_1 (Data)
  Parents:
  Children:conv
  Input tensors:
    data_1/input0 Float16 [32, 3, 3, 1] NHWC alignment(8)
  Output tensors:
    data_1/output0 Float16 [32, 3, 3, 1] NHWC alignment(8)
  -----------------------------------------------------------------
  Name: conv (Convolution3d)
  Parents:data data_1
  Children:max_pool
  Input tensors:
    data/output0 Float16 [1, 28, 28, 1] NHWC alignment(8)
    data_1/output0 Float16 [32, 3, 3, 1] NHWC alignment(8)
  Output tensors:
    conv/output0 Float16 [1, 28, 28, 32] NHWC alignment(8)
  -----------------------------------------------------------------
  Name: max_pool (MaxPooling)
  Parents:conv
  Children:reorder
  Input tensors:
    conv/output0 Float16 [1, 28, 28, 32] NHWC alignment(8)
  Output tensors:
    max_pool/output0 Float16 [1, 14, 14, 32] NHWC alignment(8)
  -----------------------------------------------------------------
  Name: reorder (Reorder)
  Parents:max_pool
  Children:mat_mul
  Input tensors:
    max_pool/output0 Float16 [1, 14, 14, 32] NHWC alignment(8)
  Output tensors:
    reorder/output0 Float16 [1, 6272] NC alignment(8)
  -----------------------------------------------------------------
  Name: data_2 (Data)
  Parents:
  Children:mat_mul
  Input tensors:
    data_2/input0 Float16 [10, 6272] NC alignment(8)
  Output tensors:
    data_2/output0 Float16 [10, 6272] NC alignment(8)
  -----------------------------------------------------------------
  Name: mat_mul (InnerProduct)
  Parents:reorder data_2
  Children:
  Input tensors:
    reorder/output0 Float16 [1, 6272] NC alignment(8)
    data_2/output0 Float16 [10, 6272] NC alignment(8)
  Output tensors:
    mat_mul/output0 Float16 [1, 10] NC alignment(8)
  -----------------------------------------------------------------
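The :code:`reorder` node in the summary above is exactly the layout
transformation SMAUG inserted on our behalf. As a quick sanity check (our own
addition, plain NumPy rather than a SMAUG API), we can verify that flattening
the max pooling output yields the inner dimension that :code:`fc_weights` was
declared with:

.. code-block:: python

  import numpy as np

  # NHWC output shape of max_pool, as reported in the summary above.
  pool_out_shape = (1, 14, 14, 32)
  # Flattening the H, W and C dimensions yields the 2D NC shape that
  # mat_mul expects.
  flattened = (pool_out_shape[0], int(np.prod(pool_out_shape[1:])))
  assert flattened == (1, 6272)  # Matches the inner dimension of fc_weights.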
Finally, we can export the model files using the
:func:`smaug.Graph.write_graph` API.

.. code-block:: python

  graph.write_graph()

This gives us two files named :code:`my_model_topo.pbtxt` and
:code:`my_model_params.pb`, where the former stores all the model information
except for the parameters, which are stored in the latter. This separation
lets us quickly check things in the human-readable topology file while still
compressing the oftentimes large parameters as much as possible. We can now
move on to the `C++ side tutorials`_ that explain the details of using these
two files to run the model.
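Before switching to the C++ side, a last check can confirm that both files
were produced. The snippet below assumes :func:`smaug.Graph.write_graph`
writes into the current working directory; adjust the paths if your setup
places the files elsewhere.

.. code-block:: python

  import os

  # Both files should exist after graph.write_graph() returns. The text
  # topology file is typically much smaller than the binary parameter file.
  for fname in ["my_model_topo.pbtxt", "my_model_params.pb"]:
    print(fname, os.path.getsize(fname), "bytes")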