🌟 Keras Data Processor (KDP)

Transform your raw data into powerful ML-ready features

A high-performance preprocessing library for tabular data, built on TensorFlow. KDP combines traditional preprocessing with neural approaches, such as distribution-aware encoding, tabular attention, and feature-wise Mixture of Experts, to turn raw columns into model-ready features.

📈 Key Features

  • ✓ Smart distribution detection
  • ✓ Neural feature interactions
  • ✓ Feature-wise Mixture of Experts
  • ✓ Memory-efficient processing
  • ✓ Single-pass optimization
  • ✓ Production-ready scaling

๐Ÿ† Why Choose KDP?

Challenge Traditional Approach KDP's Solution
Complex Distributions Fixed binning strategies ๐Ÿ“Š Distribution-Aware Encoding that adapts to your specific data
Interaction Discovery Manual feature crosses ๐Ÿ‘๏ธ Tabular Attention that automatically finds important relationships
Heterogeneous Features Uniform processing ๐Ÿงฉ Feature-wise Mixture of Experts that specializes processing per feature
Feature Importance Post-hoc analysis ๐ŸŽฏ Built-in Feature Selection during training
Performance at Scale Memory issues with large datasets โšก Optimized Processing Pipeline with batching and caching
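
As a rough illustration of the "Complex Distributions" row, the sketch below shows what it can mean for an encoding to adapt to the data instead of applying one fixed strategy. It is a toy written for this page, assuming a simple skewness threshold. The function names (`skewness`, `choose_encoding`, `encode`) and the threshold value are hypothetical and do not describe how KDP implements distribution-aware encoding.

```python
import math
import statistics


def skewness(values):
    """Population skewness: the mean of the cubed z-scores."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return 0.0
    return sum(((v - mean) / std) ** 3 for v in values) / len(values)


def choose_encoding(values, skew_threshold=1.0):
    """Pick a transformation from the shape of the data (hypothetical rule)."""
    # A strongly right-skewed, non-negative column is compressed with a log.
    if min(values) >= 0 and skewness(values) > skew_threshold:
        return "log"
    # Everything else is simply standardized.
    return "standardize"


def encode(values):
    """Apply the chosen transformation, then scale to zero mean, unit variance."""
    strategy = choose_encoding(values)
    if strategy == "log":
        transformed = [math.log1p(v) for v in values]
    else:
        transformed = list(values)
    mean = statistics.fmean(transformed)
    std = statistics.pstdev(transformed) or 1.0  # guard against constant columns
    return strategy, [(v - mean) / std for v in transformed]
```

Calling `encode([1, 2, 3, 4, 5])` takes the plain standardization path, while a column with one large outlier, such as `[1, 1, 2, 2, 3, 1000]`, is log-compressed first. A fixed strategy would treat both columns identically.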

🚀 Quick Example

from kdp import PreprocessingModel, FeatureType

# Define your features
features = {
    "age": FeatureType.FLOAT_NORMALIZED,
    "income": FeatureType.FLOAT_RESCALED,
    "occupation": FeatureType.STRING_CATEGORICAL,
    "description": FeatureType.TEXT
}

# Create and build your preprocessor
preprocessor = PreprocessingModel(
    path_data="data.csv",
    features_specs=features,
    use_distribution_aware=True,  # Smart distribution handling
    tabular_attention=True,       # Automatic feature interactions
    use_feature_moe=True,         # Specialized processing per feature
    feature_moe_num_experts=4     # Number of specialized experts
)

# Build the preprocessor and retrieve the resulting model
result = preprocessor.build_preprocessor()
model = result["model"]
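
The example reads from `data.csv`, whose layout this page does not show. The following is a minimal sketch for trying it out, assuming the file is a plain CSV with a header row and one column per key in `features`. The rows are made up for illustration, and `write_sample_csv` is a hypothetical helper, not part of KDP.

```python
import csv

# Column names mirror the keys of the `features` dict in the example above.
COLUMNS = ["age", "income", "occupation", "description"]

# Made-up rows, for illustration only.
SAMPLE_ROWS = [
    {"age": 34, "income": 52000.0, "occupation": "engineer",
     "description": "Builds data pipelines"},
    {"age": 51, "income": 87000.0, "occupation": "manager",
     "description": "Leads a regional sales team"},
    {"age": 27, "income": 39000.0, "occupation": "teacher",
     "description": "Teaches secondary school mathematics"},
]


def write_sample_csv(path="data.csv"):
    """Write the sample rows to `path` with a header row and return the path."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS)
        writer.writeheader()
        writer.writerows(SAMPLE_ROWS)
    return path
```

Running `write_sample_csv()` before the quick example gives `path_data="data.csv"` a file to point at. Three rows are enough to check the file layout but far too few for meaningful statistics.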

🔄 Architecture Diagram

๐Ÿ” Find What You Need

๐Ÿ”ฐ New to KDP? Start with the Quick Start Guide
๐Ÿ” Specific feature type? Check the Feature Processing section
โšก Performance issues? See the Optimization guides
๐Ÿ”Œ Integration help? Visit the Integration Overview section
๐Ÿ“ Practical examples? Browse our Examples
๐Ÿ“š API details? Refer to the API Reference documentation

📣 Community & Support

🐙 GitHub Repository
🐛 Issue Tracker
📜 MIT License - Open source and free to use