Passthrough Features
Passthrough Features in KDP
Include pre-processed data and custom vectors in your model without additional transformations.
Overview
Passthrough features allow you to include data in your model without any preprocessing modifications. They're perfect for pre-processed data, custom vectors, and scenarios where you need to preserve exact values.
Direct Integration: Include pre-processed data without modifications
Custom Vectors: Use pre-computed embeddings and vectors
Raw Values: Preserve exact original values in your model
Flexible Integration: Seamlessly combine with other feature types
When to Use Passthrough Features
Pre-processed Data: You have already processed the data externally
Custom Vectors: You want to include pre-computed embeddings or vectors
Raw Values: You need the exact original values in your model
Feature Testing: You want to compare raw vs processed feature performance
Gradual Migration: You're moving from a legacy system and need compatibility
Defining Passthrough Features
from kdp import PreprocessingModel, FeatureType
from kdp.features import PassthroughFeature
import tensorflow as tf

# Simple approach using enum
features = {
    "embedding_vector": FeatureType.PASSTHROUGH,
    "age": FeatureType.FLOAT_NORMALIZED,
    "category": FeatureType.STRING_CATEGORICAL,
}

# Advanced configuration with PassthroughFeature class
features = {
    "embedding_vector": PassthroughFeature(
        name="embedding_vector",
        dtype=tf.float32,  # Specify the data type
    ),
    "raw_text_embedding": PassthroughFeature(
        name="raw_text_embedding",
        dtype=tf.float32,
    ),
    "age": FeatureType.FLOAT_NORMALIZED,
    "category": FeatureType.STRING_CATEGORICAL,
}

# Create your preprocessor
preprocessor = PreprocessingModel(
    path_data="customer_data.csv",
    features_specs=features,
)
How Passthrough Features Work

Passthrough features are added to the model inputs and, apart from a cast to the specified dtype, undergo no transformation, so their original values are preserved throughout the pipeline.
Added to Inputs: Included in model inputs like other features
Type Casting: Cast to specified dtype for compatibility
No Transformation: Pass through without normalization or encoding
Feature Selection: Optional feature selection if enabled
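To make the contrast concrete, here is a minimal pure-Python sketch of the behaviour described above. It does not use KDP and does not reproduce its internals: `passthrough` and `normalize` are hypothetical helper names introduced only for illustration, and a plain Python `float` cast stands in for the `dtype` cast.

```python
from statistics import mean, pstdev


def passthrough(values, cast=float):
    """Cast each value to the target type and otherwise leave it untouched."""
    return [cast(v) for v in values]


def normalize(values):
    """Z-score normalization, shown only as a contrast to passthrough."""
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # fall back to 1.0 for constant columns
    return [(v - mu) / sigma for v in values]


raw = [10, 20, 30]

passed = passthrough(raw)  # same values, now floats
scaled = normalize(raw)    # rescaled to zero mean and unit variance

print(passed)
print(scaled)
```

The passthrough output keeps the magnitudes of the input, while the normalized output does not. Any downstream layer therefore sees passthrough values at whatever scale they arrived in.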
Configuration Options
Parameter | Type | Description |
---|---|---|
name | str | The name of the feature |
feature_type | FeatureType | Set to FeatureType.PASSTHROUGH by default |
dtype | tf.DType | The data type of the feature (default: tf.float32) |
Example: Using Pre-computed Embeddings
import pandas as pd
from kdp import PreprocessingModel, FeatureType
from kdp.features import PassthroughFeature, NumericalFeature
import tensorflow as tf

# Define features
features = {
    # Regular features
    "age": NumericalFeature(
        name="age",
        feature_type=FeatureType.FLOAT_NORMALIZED,
    ),
    "category": FeatureType.STRING_CATEGORICAL,
    # Passthrough features for pre-computed embeddings
    "product_embedding": PassthroughFeature(
        name="product_embedding",
        dtype=tf.float32,
    ),
}

# Create your preprocessor
preprocessor = PreprocessingModel(
    path_data="data.csv",
    features_specs=features,
)

# Build the model
model = preprocessor.build_preprocessor()
Things to Consider
Data Type Compatibility: Ensure the dtype of each passthrough feature is compatible with the rest of the model
Dimensionality: Make sure the feature dimensions fit your model architecture
Data Quality: Since no preprocessing is applied, ensure your data is clean and ready for use
Performance Impact: Unprocessed values may hurt model performance; compare against a preprocessed version of the same feature
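Because nothing downstream will clean a passthrough feature, it can help to check dimensionality and data quality before the data reaches the preprocessor. The sketch below is one possible pre-flight check in plain Python. `validate_embeddings` and `expected_dim` are hypothetical names, not part of KDP, and the sketch assumes each embedding is already available as a list of floats.

```python
import math


def validate_embeddings(vectors, expected_dim):
    """Return (row_index, problem) tuples; an empty list means no issues found."""
    problems = []
    for i, vec in enumerate(vectors):
        if len(vec) != expected_dim:
            problems.append((i, f"expected {expected_dim} values, got {len(vec)}"))
        elif not all(math.isfinite(x) for x in vec):
            problems.append((i, "contains NaN or infinite values"))
    return problems


embeddings = [
    [0.1, 0.2, 0.3],
    [0.4, float("nan"), 0.6],  # non-finite value
    [0.7, 0.8],                # wrong length
]

issues = validate_embeddings(embeddings, expected_dim=3)
for row, problem in issues:
    print(f"row {row}: {problem}")
```

Running a check like this once per dataset is cheap, and it surfaces problems that would otherwise only appear as shape errors or NaN losses during training.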
Best Practices
Document Your Decision: Make it clear why certain features are passed through
Test Both Approaches: Compare passthrough vs preprocessed features for performance
Feature Importance: Use feature selection to see if passthrough features contribute meaningfully
Monitor Gradients: Watch for gradient issues since passthrough features may have different scales
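A simple way to anticipate scale-related gradient problems is to compare the value ranges of passthrough features with those of normalized features before training. The following sketch uses only the standard library. `scale_summary`, `flag_scale_outliers`, and the `max_abs=10.0` threshold are illustrative assumptions rather than KDP functionality or a recommended cutoff.

```python
from statistics import mean, pstdev


def scale_summary(columns):
    """Map each feature name to its (min, max, mean, std)."""
    return {
        name: (min(vals), max(vals), mean(vals), pstdev(vals))
        for name, vals in columns.items()
    }


def flag_scale_outliers(columns, max_abs=10.0):
    """Names of features whose largest absolute value exceeds max_abs."""
    return [
        name
        for name, vals in columns.items()
        if max(abs(v) for v in vals) > max_abs
    ]


columns = {
    "age_normalized": [-1.2, 0.0, 1.2],          # already scaled
    "raw_income": [32000.0, 54000.0, 120000.0],  # passthrough, unscaled
}

flagged = flag_scale_outliers(columns)
print(flagged)
for name, stats in scale_summary(columns).items():
    print(name, stats)
```

A feature flagged this way is a candidate either for external scaling before it is passed through or for switching to a preprocessed feature type.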