smaug.nn

smaug.nn.convolution(input_tensor, filter_tensor, stride, padding, activation=None, activation_params=None, name='conv')

Compute a 3D convolution given a 4D input_tensor and filter_tensor.

Parameters
  • input_tensor – A 4D Tensor.

  • filter_tensor – A 4D Tensor.

  • stride – A list of two integers: [row_stride, col_stride].

  • padding – A string, either same or valid, specifying the zero padding scheme.

  • activation – A string representing the activation function (optional).

  • activation_params – kwargs for the activation function (optional).

  • name – Operator name (optional).
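
For example, a minimal sketch of a single convolution layer. This assumes the sg.Graph, sg.Tensor, and sg.input_data APIs and the sg.NHWC layout constant from the SMAUG Python tutorial, which are not documented in this section; shapes and data are illustrative:

  import numpy as np
  import smaug as sg

  def random_data(shape):
    # The SMV backend operates on float16 data.
    return np.random.rand(*shape).astype(np.float16)

  with sg.Graph(name="conv_example", backend="SMV") as graph:
    # NHWC input: batch 1, a 28x28 image with 1 channel.
    inputs = sg.Tensor(data_layout=sg.NHWC, tensor_data=random_data((1, 28, 28, 1)))
    # 32 filters of size 3x3 over 1 input channel.
    filters = sg.Tensor(data_layout=sg.NHWC, tensor_data=random_data((32, 3, 3, 1)))
    act = sg.input_data(inputs)
    act = sg.nn.convolution(
        act, filters, stride=[1, 1], padding="same", activation="relu")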

smaug.nn.batch_norm(input_tensor, mean_tensor, var_tensor, gamma_tensor, beta_tensor, activation=None, activation_params=None, name='batch_norm')

Perform batch normalization.

Parameters
  • input_tensor – A 2D or 4D Tensor.

  • mean_tensor – Mean parameter.

  • var_tensor – Variance parameter. For performance reasons, this must be precomputed as 1/sqrt(variance + eps) rather than passed as the raw variance.

  • gamma_tensor – Gamma parameter.

  • beta_tensor – Beta parameter.

  • activation/activation_params – Activation function to use (optional).

  • name – Operator name (optional).
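
A sketch of batch normalization on a 2D activation, under the same assumed sg.Graph/sg.Tensor setup as above. The parameter shapes here are assumptions; note that var_tensor holds 1/sqrt(variance + eps), not the raw variance:

  import numpy as np
  import smaug as sg

  chans = 32

  def param(shape):
    return np.random.rand(*shape).astype(np.float16)

  with sg.Graph(name="bn_example", backend="SMV") as graph:
    act = sg.input_data(sg.Tensor(data_layout=sg.NC, tensor_data=param((1, chans))))
    mean = sg.Tensor(data_layout=sg.NC, tensor_data=param((1, chans)))
    # Precompute 1/sqrt(variance + eps) on the host before wrapping it in a Tensor.
    var = sg.Tensor(
        data_layout=sg.NC,
        tensor_data=(1.0 / np.sqrt(param((1, chans)) + 1e-5)).astype(np.float16))
    gamma = sg.Tensor(data_layout=sg.NC, tensor_data=param((1, chans)))
    beta = sg.Tensor(data_layout=sg.NC, tensor_data=param((1, chans)))
    act = sg.nn.batch_norm(act, mean, var, gamma, beta, activation="relu")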

smaug.nn.max_pool(input_tensor, pool_size, stride, name='max_pool')

Compute max pooling.

Parameters
  • input_tensor – A 4D Tensor.

  • pool_size – A list of two integers: [pool_rows, pool_cols].

  • stride – A list of two integers: [row_stride, col_stride].

  • name – Operator name (optional).
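
For instance, a 2x2 max pooling with stride 2 (a sketch under the same assumed sg.Graph/sg.Tensor setup):

  import numpy as np
  import smaug as sg

  with sg.Graph(name="pool_example", backend="SMV") as graph:
    inputs = sg.Tensor(
        data_layout=sg.NHWC,
        tensor_data=np.random.rand(1, 28, 28, 32).astype(np.float16))
    act = sg.input_data(inputs)
    # A 2x2 window with stride 2 in both dimensions halves each spatial
    # dimension: 28x28 -> 14x14.
    act = sg.nn.max_pool(act, pool_size=[2, 2], stride=[2, 2])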

smaug.nn.mat_mul(input_tensor, weight_tensor, activation=None, activation_params=None, name='mat_mul')

Compute a matrix multiplication for input_tensor and weight_tensor.

Parameters
  • input_tensor – A 2D Tensor. Shaped as NC, where N is batch size and C is number of channels.

  • weight_tensor – A 2D Tensor. Shaped as NC or CN, where N is number of neurons and C is the same as in input_tensor.

  • activation/activation_params – Activation function to use (optional).

  • name – Operator name (optional).
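
A sketch of a fully connected layer built from mat_mul, under the same assumed setup; the weight tensor here uses the NC layout (neurons by channels), and the shapes are illustrative:

  import numpy as np
  import smaug as sg

  with sg.Graph(name="fc_example", backend="SMV") as graph:
    # Input: batch of 4 samples with 256 channels each.
    inputs = sg.Tensor(
        data_layout=sg.NC, tensor_data=np.random.rand(4, 256).astype(np.float16))
    # Weights: 10 neurons, 256 channels.
    weights = sg.Tensor(
        data_layout=sg.NC, tensor_data=np.random.rand(10, 256).astype(np.float16))
    act = sg.input_data(inputs)
    act = sg.nn.mat_mul(act, weights, activation="relu")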

class smaug.nn.LSTM(weight_tensors, activation='tanh', activation_params={}, name='lstm')

An LSTM layer.

Parameters
  • weight_tensors – A list of two weight tensors.

  • activation – Activation function used in LSTM.

  • activation_params – kwargs for the activation function.

prepare_states()

Initialize states as zeros.

step(input_tensor, timestep)

Invoke this cell for a single timestep.

Parameters
  • input_tensor – An input tensor of shape [batch, depth].

  • timestep – The current timestep index. This is used for naming the output tensors.

Returns

A two-part output:

  1. An output tensor of shape [batch, depth].

  2. The final state of the LSTM.
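
A sketch of unrolling the cell with prepare_states() and step(). The gate-stacked weight shapes below ([4 * hidden, depth] for the input weights, [4 * hidden, hidden] for the recurrent weights) are assumptions, not part of this API's documented contract:

  import numpy as np
  import smaug as sg

  batch, depth, hidden = 2, 32, 32

  def w(shape):
    return sg.Tensor(
        data_layout=sg.NC, tensor_data=np.random.rand(*shape).astype(np.float16))

  with sg.Graph(name="lstm_example", backend="SMV") as graph:
    cell = sg.nn.LSTM([w((4 * hidden, depth)), w((4 * hidden, hidden))])
    cell.prepare_states()
    x = sg.input_data(sg.Tensor(
        data_layout=sg.NC,
        tensor_data=np.random.rand(batch, depth).astype(np.float16)))
    # Feed the same [batch, depth] tensor at every timestep, purely for
    # illustration; a real model would slice one timestep from a sequence.
    for t in range(3):
      output, state = cell.step(x, timestep=t)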

class smaug.nn.BidirectionalLSTM(fwd_weight_tensors, bwd_weight_tensors, activation='tanh', activation_params={}, name='bidir_lstm')

A bidirectional LSTM layer.

Parameters
  • fwd_weight_tensors – Weights used for the forward LSTM.

  • bwd_weight_tensors – Weights used for the backward LSTM.

  • activation/activation_params – See the LSTM class.
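
Construction mirrors LSTM, with one weight list per direction (a sketch; the weight shapes are the same assumptions as in the LSTM example above):

  import numpy as np
  import smaug as sg

  depth, hidden = 32, 32

  def w(shape):
    return sg.Tensor(
        data_layout=sg.NC, tensor_data=np.random.rand(*shape).astype(np.float16))

  with sg.Graph(name="bidir_example", backend="SMV") as graph:
    fwd_weights = [w((4 * hidden, depth)), w((4 * hidden, hidden))]
    bwd_weights = [w((4 * hidden, depth)), w((4 * hidden, hidden))]
    bidir = sg.nn.BidirectionalLSTM(fwd_weights, bwd_weights)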

class smaug.nn.BahdanauAttention(memory, w_encoder, w_decoder, w_alignment, name='bahdanau_attention')

Implements Bahdanau attention.

The attention implementation is described in:

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio. “Neural Machine Translation by Jointly Learning to Align and Translate.” ICLR 2015. https://arxiv.org/abs/1409.0473

Construct the base attention class.

Parameters
  • name – Name to use when creating ops.

  • memory – The memory to query; usually the outputs of an RNN encoder. The tensor should be shaped [batch, time, depth].

  • w_encoder – The weight used for the memory layer shaped [depth, depth].

  • w_decoder – The weight used for the query layer shaped [depth, depth].

  • w_alignment – The weight used to compute the alignment score.

compute_score(query)

Compute the attention score for the given query.
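
A construction sketch: the memory is typically the stacked encoder outputs, and the w_alignment shape of [depth, 1] below is an assumption, as is the sg.NTC layout constant:

  import numpy as np
  import smaug as sg

  batch, time, depth = 2, 10, 32

  def t(shape, layout):
    return sg.Tensor(
        data_layout=layout, tensor_data=np.random.rand(*shape).astype(np.float16))

  with sg.Graph(name="attn_example", backend="SMV") as graph:
    memory = t((batch, time, depth), sg.NTC)  # e.g. encoder outputs
    attn = sg.nn.BahdanauAttention(
        memory,
        w_encoder=t((depth, depth), sg.NC),
        w_decoder=t((depth, depth), sg.NC),
        w_alignment=t((depth, 1), sg.NC))  # assumed shape [depth, 1]
    # Score a decoder query of shape [batch, depth].
    score = attn.compute_score(t((batch, depth), sg.NC))
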
smaug.nn.relu(input_tensor, name='relu')

Rectified linear unit operator.

smaug.nn.lrelu(input_tensor, slope=0.2, name='lrelu')

Leaky rectified linear unit operator: x if x > 0, else slope * x (equivalently, max(x, slope * x) for slope < 1).

smaug.nn.elu(input_tensor, alpha=0.1, name='elu')

Exponential linear unit function.

Defined as:

if input_tensor > 0, input_tensor; else alpha * (exp(input_tensor) - 1).

smaug.nn.selu(input_tensor, alpha=1.6733, lambda_param=1.0507, name='selu')

Scaled exponential linear unit function.

Defined as: lambda_param * elu(input_tensor, alpha).

smaug.nn.tanh(input_tensor, name='tanh')

Tanh operator.

smaug.nn.hard_tanh(input_tensor, min=-1, max=1, name='hard_tanh')

Hard tanh operator.

This bounds the output of the tanh operator to the range [min, max].

smaug.nn.sigmoid(input_tensor, name='sigmoid')

Sigmoid operator.

Defined as 1/(1 + exp(-input_tensor)).

smaug.nn.softmax(input_tensor, name=None)

Softmax operator.
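
These activation operators compose like any other op; a short sketch chaining a few of them under the same assumed sg.Graph/sg.Tensor setup:

  import numpy as np
  import smaug as sg

  with sg.Graph(name="act_example", backend="SMV") as graph:
    act = sg.input_data(sg.Tensor(
        data_layout=sg.NC, tensor_data=np.random.rand(1, 10).astype(np.float16)))
    act = sg.nn.lrelu(act, slope=0.1)          # x if x > 0, else 0.1 * x
    act = sg.nn.hard_tanh(act, min=-1, max=1)  # bound values to [-1, 1]
    probs = sg.nn.softmax(act)                 # normalize to a distribution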