Passthrough Features

Passthrough Features in KDP

Include pre-processed data and custom vectors in your model without additional transformations.

Overview

Passthrough features let you include data in your model exactly as provided, with no preprocessing applied. They suit pre-processed data, custom vectors, and cases where exact values must be preserved.

Direct Integration

Include pre-processed data without modifications

Custom Vectors

Use pre-computed embeddings and vectors

Raw Values

Preserve exact original values in your model

Flexible Integration

Seamlessly combine with other feature types

When to Use Passthrough Features

Pre-processed Data

You have already processed the data externally

Custom Vectors

You want to include pre-computed embeddings or vectors

Raw Values

You need the exact original values in your model

Feature Testing

You want to compare raw vs processed feature performance

Gradual Migration

You're moving from a legacy system and need compatibility

Defining Passthrough Features

from kdp import PreprocessingModel, FeatureType
from kdp.features import PassthroughFeature
import tensorflow as tf

# Simple approach using enum
features = {
    "embedding_vector": FeatureType.PASSTHROUGH,
    "age": FeatureType.FLOAT_NORMALIZED,
    "category": FeatureType.STRING_CATEGORICAL
}

# Advanced configuration with PassthroughFeature class
features = {
    "embedding_vector": PassthroughFeature(
        name="embedding_vector",
        dtype=tf.float32  # Specify the data type
    ),
    "raw_text_embedding": PassthroughFeature(
        name="raw_text_embedding",
        dtype=tf.float32
    ),
    "age": FeatureType.FLOAT_NORMALIZED,
    "category": FeatureType.STRING_CATEGORICAL
}

# Create your preprocessor
preprocessor = PreprocessingModel(
    path_data="customer_data.csv",
    features_specs=features
)

How Passthrough Features Work

Passthrough Feature Model

Passthrough features are added to the model inputs and cast to the configured dtype, but are otherwise left untransformed, so their original values are preserved throughout the pipeline.

Added to Inputs

Included in model inputs like other features

Type Casting

Cast to specified dtype for compatibility

No Transformation

Pass through without normalization or encoding

Feature Selection

Optional feature selection if enabled
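The steps above can be pictured with a small, framework-free sketch. This is not KDP's implementation: `zscore`, `passthrough`, and `build_rows` are hypothetical helpers, and a plain Python `float` cast stands in for the dtype cast. It only illustrates that a normalized feature is rescaled while a passthrough vector keeps its values.

```python
from statistics import mean, pstdev


def zscore(values):
    """Rescale values to zero mean and unit variance (stand-in for a normalized feature)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma if sigma else 0.0 for v in values]


def passthrough(vector):
    """Cast each value to float and otherwise leave it untouched."""
    return [float(x) for x in vector]


def build_rows(ages, embeddings):
    """Concatenate the normalized age with each passthrough embedding."""
    scaled = zscore(ages)
    return [[a] + passthrough(e) for a, e in zip(scaled, embeddings)]


# Hypothetical sample data: one numeric column and one pre-computed 2-d vector per row.
ages = [20, 30, 40]
embeddings = [[1, 0], [0.5, 0.5], [0, 1]]
rows = build_rows(ages, embeddings)
```

In each resulting row the first entry is the rescaled age, while the remaining entries are the embedding values exactly as supplied (only converted to float).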

Configuration Options

Parameter    | Type        | Description
name         | str         | The name of the feature
feature_type | FeatureType | Set to FeatureType.PASSTHROUGH by default
dtype        | tf.DType    | The data type of the feature (default: tf.float32)

Example: Using Pre-computed Embeddings

import pandas as pd
from kdp import PreprocessingModel, FeatureType
from kdp.features import PassthroughFeature, NumericalFeature
import tensorflow as tf

# Define features
features = {
    # Regular features
    "age": NumericalFeature(
        name="age",
        feature_type=FeatureType.FLOAT_NORMALIZED
    ),
    "category": FeatureType.STRING_CATEGORICAL,

    # Passthrough features for pre-computed embeddings
    "product_embedding": PassthroughFeature(
        name="product_embedding",
        dtype=tf.float32
    )
}

# Create your preprocessor
preprocessor = PreprocessingModel(
    path_data="data.csv",
    features_specs=features
)

# Build the model
model = preprocessor.build_preprocessor()

Things to Consider

Data Type Compatibility

Ensure the dtype of each passthrough feature is compatible with the layers that consume it and with the other features it is combined with

Dimensionality

Make sure the feature dimensions fit your model architecture

Data Quality

Since no preprocessing is applied, ensure your data is clean and ready for use

Performance Impact

Unscaled raw values may hurt model performance; compare the passthrough version of a feature against a preprocessed version before committing to either
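Because nothing downstream will clean a passthrough feature, the dimensionality and data quality checks above can be run before the data reaches the model. The following is a minimal sketch under stated assumptions: `validate_passthrough` is a hypothetical helper (not part of KDP), and the vectors are assumed to be already loaded as Python lists.

```python
import math


def validate_passthrough(rows, expected_dim):
    """Return (row_index, problem) pairs for vectors that are not ready to pass through."""
    problems = []
    for i, vec in enumerate(rows):
        if len(vec) != expected_dim:
            problems.append((i, f"expected {expected_dim} values, got {len(vec)}"))
        elif any(not isinstance(x, (int, float)) for x in vec):
            problems.append((i, "non-numeric value"))
        elif any(not math.isfinite(x) for x in vec):
            problems.append((i, "NaN or infinite value"))
    return problems


# Hypothetical sample embeddings with two deliberate defects.
embeddings = [
    [0.1, 0.2, 0.3],
    [0.4, float("nan"), 0.6],  # a missing value slipped through
    [0.7, 0.8],                # wrong dimensionality
]
issues = validate_passthrough(embeddings, expected_dim=3)
```

An empty result means every vector has the expected length and contains only finite numbers; anything else is worth fixing upstream, since the passthrough path will not repair it.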

Best Practices

Document Your Decision

Make it clear why certain features are passed through

Test Both Approaches

Compare passthrough vs preprocessed features for performance

Feature Importance

Use feature selection to see if passthrough features contribute meaningfully

Monitor Gradients

Watch for unstable gradients, since passthrough features are not rescaled and may differ in scale from normalized features
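One inexpensive way to anticipate scale-related gradient problems is to compare feature scales before training. The sketch below is an illustration only: `scale_report` and `flag_large_scale` are hypothetical helpers, the column names are invented, and the threshold of 10.0 is an arbitrary assumption rather than a recommended value.

```python
from statistics import mean, pstdev


def scale_report(columns):
    """Summarize mean, standard deviation, and largest absolute value per column."""
    return {
        name: {
            "mean": mean(values),
            "std": pstdev(values),
            "max_abs": max(abs(v) for v in values),
        }
        for name, values in columns.items()
    }


def flag_large_scale(report, max_abs_limit=10.0):
    """Return the names of columns whose magnitude exceeds the (assumed) limit."""
    return sorted(name for name, s in report.items() if s["max_abs"] > max_abs_limit)


# Hypothetical columns: one already normalized, one passed through unscaled.
columns = {
    "age_normalized": [-1.2, 0.0, 1.2],
    "raw_income": [32000.0, 54000.0, 120000.0],
}
report = scale_report(columns)
flagged = flag_large_scale(report)
```

A flagged passthrough feature is a candidate for either external rescaling or a switch to a preprocessed feature type, which ties back to comparing both approaches.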