๐Ÿญ Preprocessing Layers Factory

The PreprocessorLayerFactory class provides a convenient way to create and manage preprocessing layers for your machine learning models. It supports both standard Keras preprocessing layers and custom layers defined within the KDP framework.

🎡 Using Keras Preprocessing Layers

Every preprocessing layer available in Keras can be created through the PreprocessorLayerFactory by passing its class name to create_layer. For example, to create a Normalization layer:

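from kdp.layers_factory import PreprocessorLayerFactory  # import path assumed; adjust to your KDP install

normalization_layer = PreprocessorLayerFactory.create_layer(
    "Normalization",
    axis=-1,
    mean=None,      # left as None so the statistics can be learned later with adapt()
    variance=None,
)

Available layers: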

  • Normalization - Standardizes numerical features
  • Discretization - Bins continuous features into discrete intervals
  • CategoryEncoding - Encodes integer categories as one-hot, multi-hot, or count vectors
  • Hashing - Performs feature hashing for categorical variables
  • HashedCrossing - Creates feature crosses using hashing
  • StringLookup - Converts string inputs to integer indices
  • IntegerLookup - Maps integer inputs to contiguous integer indices
  • TextVectorization - Processes raw text into encoded representations
  • ... and more
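
To make the descriptions above concrete, here is a minimal pure-Python sketch of what three of these layers compute conceptually. The functions normalize, discretize, and string_lookup are illustrative stand-ins, not Keras or KDP code. They assume the default behavior of the corresponding Keras layers: a value equal to a boundary falls into the bin on its right, and index 0 is reserved for out-of-vocabulary strings.

```python
import bisect
import math


def normalize(values, mean, variance):
    """Standardize values: (x - mean) / sqrt(variance)."""
    std = math.sqrt(variance)
    return [(x - mean) / std for x in values]


def discretize(values, bin_boundaries):
    """Map each value to the index of the bin it falls into."""
    # bisect_right sends a value equal to a boundary to the bin on its right.
    return [bisect.bisect_right(bin_boundaries, x) for x in values]


def string_lookup(tokens, vocabulary):
    """Map strings to integer indices; unknown strings map to index 0."""
    index = {token: i + 1 for i, token in enumerate(vocabulary)}
    return [index.get(token, 0) for token in tokens]


print(normalize([1.0, 2.0, 3.0], mean=3.0, variance=2.0))
print(discretize([-1.5, 1.0, 3.4, 0.5], bin_boundaries=[0.0, 1.0, 2.0]))  # [0, 2, 3, 1]
print(string_lookup(["d", "z", "b"], vocabulary=["a", "b", "c", "d"]))     # [4, 0, 2]
```

The real layers operate on tensors and can learn their statistics or vocabulary from data; the sketch only shows the per-value mapping.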

๐Ÿ—๏ธ Custom KDP Preprocessing Layers

In addition to Keras layers, the PreprocessorLayerFactory exposes static methods that create custom layers specific to the KDP framework. The reference below documents each method, including the generic create_layer:

cast_to_float32_layer(name='cast_to_float32', **kwargs) staticmethod

Create a CastToFloat32Layer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'cast_to_float32' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the CastToFloat32Layer layer. |

create_layer(layer_class, name=None, **kwargs) staticmethod

Create a layer using the layer class name, automatically filtering kwargs based on the layer class.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| layer_class | str or class object | The name of the layer class to be created (e.g., 'Normalization', 'Rescaling') or the class object itself. | required |
| name | str | The name of the layer. Optional. | None |
| **kwargs | | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the specified layer class. |
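
The "automatic filtering of kwargs" that create_layer describes can be pictured with constructor introspection. The sketch below is one plausible way to do it, not KDP's actual implementation: FakeNormalization is a hypothetical stand-in for a layer class, and create_layer_sketch is an illustrative name.

```python
import inspect


class FakeNormalization:
    """Hypothetical stand-in for a layer class; not a real Keras layer."""

    def __init__(self, axis=-1, mean=None, variance=None, name=None):
        self.axis = axis
        self.mean = mean
        self.variance = variance
        self.name = name


def create_layer_sketch(layer_class, name=None, **kwargs):
    """Instantiate layer_class, passing only the kwargs its constructor accepts."""
    params = inspect.signature(layer_class.__init__).parameters
    # If the constructor itself takes **kwargs, nothing needs to be filtered.
    accepts_any = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    accepted = kwargs if accepts_any else {k: v for k, v in kwargs.items() if k in params}
    return layer_class(name=name, **accepted)


# "units" is not a constructor argument of FakeNormalization, so it is dropped.
layer = create_layer_sketch(FakeNormalization, name="norm", axis=-1, units=64)
print(layer.name, layer.axis)
```

Dropping unknown arguments silently is convenient when one configuration dictionary feeds several layer types, but it can also hide typos in argument names.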

date_encoding_layer(name='date_encoding_layer', **kwargs) staticmethod

Create a DateEncodingLayer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'date_encoding_layer' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the DateEncodingLayer layer. |

date_parsing_layer(name='date_parsing_layer', **kwargs) staticmethod

Create a DateParsingLayer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'date_parsing_layer' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the DateParsingLayer layer. |

date_season_layer(name='date_season_layer', **kwargs) staticmethod

Create a SeasonLayer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'date_season_layer' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the SeasonLayer layer. |

distribution_aware_encoder(name='distribution_aware', num_bins=1000, epsilon=1e-06, detect_periodicity=True, handle_sparsity=True, adaptive_binning=True, mixture_components=3, prefered_distribution=None, **kwargs) staticmethod

Create a DistributionAwareEncoder layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | Name of the layer | 'distribution_aware' |
| num_bins | int | Number of bins for quantile encoding | 1000 |
| epsilon | float | Small value for numerical stability | 1e-06 |
| detect_periodicity | bool | Whether to detect and handle periodic patterns | True |
| handle_sparsity | bool | Whether to handle sparse data specially | True |
| adaptive_binning | bool | Whether to use adaptive binning | True |
| mixture_components | int | Number of components for mixture modeling | 3 |
| prefered_distribution | DistributionType | Optional specific distribution type to use | None |
| **kwargs | | Additional keyword arguments | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | DistributionAwareEncoder layer |
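
The num_bins parameter refers to quantile encoding. As a rough illustration of that idea only (the source does not describe the encoder's internals), the sketch below derives bin edges from sample quantiles so that each bin holds a similar share of the data. The names quantile_edges and quantile_bin are hypothetical.

```python
import bisect
import statistics


def quantile_edges(samples, num_bins):
    """Return num_bins - 1 cut points that split samples into similarly populated bins."""
    return statistics.quantiles(samples, n=num_bins, method="inclusive")


def quantile_bin(value, edges):
    """Index of the quantile bin that contains value."""
    return bisect.bisect_right(edges, value)


# A skewed sample: the outlier 100 does not stretch the bins, it just lands in the last one.
samples = [1, 2, 3, 4, 5, 6, 7, 8, 100]
edges = quantile_edges(samples, num_bins=4)
print(edges)                                            # [3.0, 5.0, 7.0]
print([quantile_bin(v, edges) for v in (1, 5, 100)])    # [0, 2, 3]
```

Because the edges follow the data's quantiles rather than its raw range, heavy-tailed features keep useful resolution where most values lie.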

distribution_transform_layer(name='distribution_transform', transform_type='none', lambda_param=0.0, epsilon=1e-10, min_value=0.0, max_value=1.0, clip_values=True, auto_candidates=None, **kwargs) staticmethod

Create a DistributionTransformLayer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | Name of the layer | 'distribution_transform' |
| transform_type | str | Type of transformation to apply | 'none' |
| lambda_param | float | Parameter for parameterized transformations | 0.0 |
| epsilon | float | Small value for numerical stability | 1e-10 |
| min_value | float | Minimum value for min-max scaling | 0.0 |
| max_value | float | Maximum value for min-max scaling | 1.0 |
| clip_values | bool | Whether to clip values to the specified range | True |
| auto_candidates | list[str] | List of transformations to consider in auto mode | None |
| **kwargs | | Additional keyword arguments | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | DistributionTransformLayer layer |
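
The source does not list the accepted transform_type values, so the following is only a sketch of how parameters like these are commonly used. It assumes lambda_param drives a Box-Cox style transform and that min_value, max_value, and clip_values control min-max scaling. The function names and the data_min and data_max arguments are hypothetical.

```python
import math


def box_cox(x, lambda_param=0.0, epsilon=1e-10):
    """Box-Cox transform of a non-negative value; epsilon guards against log(0)."""
    x = x + epsilon
    if lambda_param == 0.0:
        return math.log(x)
    return (x ** lambda_param - 1.0) / lambda_param


def min_max_scale(x, data_min, data_max, min_value=0.0, max_value=1.0,
                  clip_values=True, epsilon=1e-10):
    """Rescale x from [data_min, data_max] to [min_value, max_value]."""
    # epsilon avoids division by zero when data_min == data_max.
    scaled = (x - data_min) / (data_max - data_min + epsilon)
    scaled = scaled * (max_value - min_value) + min_value
    if clip_values:
        # Values outside the observed range are pulled back to the target range.
        scaled = min(max(scaled, min_value), max_value)
    return scaled


print(box_cox(4.0, lambda_param=0.5))
print(min_max_scale(15.0, data_min=0.0, data_max=10.0))                     # 1.0
print(min_max_scale(15.0, data_min=0.0, data_max=10.0, clip_values=False))
```

Clipping matters at inference time, when inputs can fall outside the range seen during fitting.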

gated_linear_unit_layer(units, name='gated_linear_unit', **kwargs) staticmethod

Create a GatedLinearUnit layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| units | int | Dimensionality of the output space | required |
| name | str | Name of the layer | 'gated_linear_unit' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| GatedLinearUnit | tf.keras.layers.Layer | A GatedLinearUnit layer instance |
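
For readers unfamiliar with gated linear units, the sketch below shows the standard formulation: a linear projection multiplied elementwise by a sigmoid gate computed from the same input. It is written in plain Python with hand-picked weights; it is not the KDP layer, and the weight arguments are illustrative. The number of weight rows plays the role of units.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, weights, bias):
    """Plain affine map: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def gated_linear_unit(x, w_value, b_value, w_gate, b_gate):
    """GLU(x) = (W_v x + b_v) * sigmoid(W_g x + b_g), elementwise."""
    value = dense(x, w_value, b_value)
    gate = dense(x, w_gate, b_gate)
    return [v * sigmoid(g) for v, g in zip(value, gate)]


# units = 2: identity projection for the value path, zero weights for the gate.
# A zero gate pre-activation gives sigmoid(0) = 0.5, so every value is halved.
out = gated_linear_unit(
    [1.0, 2.0],
    w_value=[[1.0, 0.0], [0.0, 1.0]], b_value=[0.0, 0.0],
    w_gate=[[0.0, 0.0], [0.0, 0.0]], b_gate=[0.0, 0.0],
)
print(out)  # [0.5, 1.0]
```

In a trained layer the gate weights are learned, which lets the network suppress or pass each output dimension depending on the input.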

gated_residual_network_layer(units, dropout_rate=0.2, name='gated_residual_network', **kwargs) staticmethod

Create a GatedResidualNetwork layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| units | int | Dimensionality of the output space | required |
| dropout_rate | float | Fraction of the input units to drop | 0.2 |
| name | str | Name of the layer | 'gated_residual_network' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| GatedResidualNetwork | tf.keras.layers.Layer | A GatedResidualNetwork layer instance |

global_numerical_embedding_layer(global_embedding_dim=8, global_mlp_hidden_units=16, global_num_bins=10, global_init_min=-3.0, global_init_max=3.0, global_dropout_rate=0.1, global_use_batch_norm=True, global_pooling='average', name='global_numerical_embedding', **kwargs) staticmethod

Create a GlobalNumericalEmbedding layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| global_embedding_dim | int | Dimension of the final global embedding | 8 |
| global_mlp_hidden_units | int | Number of hidden units in the global MLP | 16 |
| global_num_bins | int | Number of bins for discretization | 10 |
| global_init_min | float | Minimum value for initialization | -3.0 |
| global_init_max | float | Maximum value for initialization | 3.0 |
| global_dropout_rate | float | Dropout rate for regularization | 0.1 |
| global_use_batch_norm | bool | Whether to use batch normalization | True |
| global_pooling | str | Pooling method to use ("average" or "max") | 'average' |
| name | str | Name of the layer | 'global_numerical_embedding' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| GlobalNumericalEmbedding | tf.keras.layers.Layer | A GlobalNumericalEmbedding layer instance |
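
The global_pooling option chooses how per-feature embeddings are collapsed into a single vector. The sketch below illustrates the two documented modes on plain lists. It assumes the pooling runs across features for each embedding dimension, which the source does not state explicitly; global_pool is a hypothetical helper.

```python
def global_pool(feature_embeddings, pooling="average"):
    """Collapse per-feature embeddings (num_features x dim) into one vector of length dim."""
    columns = list(zip(*feature_embeddings))  # group values by embedding dimension
    if pooling == "average":
        return [sum(col) / len(col) for col in columns]
    if pooling == "max":
        return [max(col) for col in columns]
    raise ValueError(f"Unsupported pooling: {pooling!r}")


embeddings = [[1.0, 4.0], [3.0, 2.0]]  # two features, embedding dimension 2
print(global_pool(embeddings, "average"))  # [2.0, 3.0]
print(global_pool(embeddings, "max"))      # [3.0, 4.0]
```

Average pooling blends all features evenly, while max pooling keeps only the strongest activation per dimension.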

multi_resolution_attention_layer(num_heads, d_model, embedding_dim=32, name='multi_resolution_attention', **kwargs) staticmethod

Create a MultiResolutionTabularAttention layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| num_heads | int | Number of attention heads | required |
| d_model | int | Dimensionality of the attention model | required |
| embedding_dim | int | Dimension for categorical embeddings | 32 |
| name | str | Name of the layer | 'multi_resolution_attention' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| MultiResolutionTabularAttention | tf.keras.layers.Layer | A MultiResolutionTabularAttention layer instance |

numerical_embedding_layer(embedding_dim=8, mlp_hidden_units=16, num_bins=10, init_min=-3.0, init_max=3.0, dropout_rate=0.1, use_batch_norm=True, name='numerical_embedding', **kwargs) staticmethod

Create a NumericalEmbedding layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| embedding_dim | int | Dimension of the output embedding | 8 |
| mlp_hidden_units | int | Number of hidden units in the MLP | 16 |
| num_bins | int | Number of bins for discretization | 10 |
| init_min | float | Minimum value for initialization | -3.0 |
| init_max | float | Maximum value for initialization | 3.0 |
| dropout_rate | float | Dropout rate for regularization | 0.1 |
| use_batch_norm | bool | Whether to use batch normalization | True |
| name | str | Name of the layer | 'numerical_embedding' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| NumericalEmbedding | tf.keras.layers.Layer | A NumericalEmbedding layer instance |
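
The source does not say exactly what init_min and init_max initialize. One plausible reading, shown here purely as an assumption, is that they bound the initial bin boundaries used for discretization, with num_bins evenly spaced bins between them. The helpers initial_boundaries and bin_index are hypothetical.

```python
import bisect


def initial_boundaries(num_bins=10, init_min=-3.0, init_max=3.0):
    """Evenly spaced interior boundaries that split [init_min, init_max] into num_bins bins."""
    width = (init_max - init_min) / num_bins
    return [init_min + i * width for i in range(1, num_bins)]


def bin_index(x, boundaries):
    """Index of the bin containing x; values outside the range fall into the end bins."""
    return bisect.bisect_right(boundaries, x)


boundaries = initial_boundaries(num_bins=6, init_min=-3.0, init_max=3.0)
print(boundaries)                                            # [-2.0, -1.0, 0.0, 1.0, 2.0]
print([bin_index(x, boundaries) for x in (-10.0, 0.5, 10.0)])  # [0, 3, 5]
```

Under this reading, the default range of -3.0 to 3.0 would suit inputs that have already been standardized.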

tabular_attention_layer(num_heads, d_model, name='tabular_attention', **kwargs) staticmethod

Create a TabularAttention layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| num_heads | int | Number of attention heads | required |
| d_model | int | Dimensionality of the attention model | required |
| name | str | Name of the layer | 'tabular_attention' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| TabularAttention | tf.keras.layers.Layer | A TabularAttention layer instance |
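
As background for the num_heads and d_model parameters, the sketch below shows a single head of scaled dot-product attention applied across feature vectors, in plain Python and without learned projections. It is a generic illustration, not the TabularAttention implementation; a multi-head layer would run num_heads such computations on learned projections of the input.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention over lists of vectors (one vector per feature)."""
    scale = math.sqrt(len(keys[0]))
    output = []
    for q in queries:
        # Similarity of this query to every key, turned into weights that sum to 1.
        weights = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in keys])
        # Weighted average of the value vectors.
        output.append([sum(w * v[i] for w, v in zip(weights, values))
                       for i in range(len(values[0]))])
    return output


features = [[1.0, 0.0], [0.0, 1.0]]  # two features, vector size 2
out = attend(features, features, features)
print(out)  # each feature attends mostly to itself
```

Each output row is a mixture of all feature vectors, which is how attention lets one column's representation depend on the others.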

text_preprocessing_layer(name='text_preprocessing', **kwargs) staticmethod

Create a TextPreprocessingLayer layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'text_preprocessing' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the TextPreprocessingLayer layer. |

transformer_block_layer(name='transformer', **kwargs) staticmethod

Create a TransformerBlock layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| name | str | The name of the layer. | 'transformer' |
| **kwargs | dict | Additional keyword arguments to pass to the layer constructor. | {} |

Returns:

| Type | Description |
|------|-------------|
| tf.keras.layers.Layer | An instance of the TransformerBlock layer. |

variable_selection_layer(nr_features=None, units=16, dropout_rate=0.2, name='variable_selection', **kwargs) staticmethod

Create a VariableSelection layer.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| nr_features | int | Number of input features | None |
| units | int | Dimensionality of the output space | 16 |
| dropout_rate | float | Fraction of the input units to drop | 0.2 |
| name | str | Name of the layer | 'variable_selection' |
| **kwargs | dict | Additional arguments to pass to the layer | {} |

Returns:

| Name | Type | Description |
|------|------|-------------|
| VariableSelection | tf.keras.layers.Layer | A VariableSelection layer instance |
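
The source does not describe how VariableSelection works internally. A common design for variable selection, assumed here for illustration only, scores each feature, turns the scores into softmax weights, and returns the weighted sum of the feature vectors. In the sketch below the scores are passed in directly (a real layer would learn them), and select_variables is a hypothetical name.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_variables(feature_vectors, selection_scores):
    """Weight each feature vector by softmax(selection_scores) and sum them."""
    weights = softmax(selection_scores)
    dim = len(feature_vectors[0])
    combined = [sum(w * vec[i] for w, vec in zip(weights, feature_vectors))
                for i in range(dim)]
    return combined, weights


# Equal scores give equal weights, so the result is the plain average.
combined, weights = select_variables([[1.0, 1.0], [3.0, 5.0]], [0.0, 0.0])
print(weights)   # [0.5, 0.5]
print(combined)  # [2.0, 3.0]
```

The weights double as a per-feature importance signal, since they sum to 1 across the nr_features inputs.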