Tabular Attention

Discover hidden relationships in your data with sophisticated attention mechanisms

📋 Overview

Tabular Attention is a powerful feature in KDP that enables your models to automatically discover complex interactions between features in tabular data. Based on attention mechanisms from transformers, it helps your models focus on the most important feature relationships without explicit feature engineering.

  • 🔍 Automatic Interaction Discovery: Finds complex feature relationships without manual engineering
  • 🎯 Context-Aware Processing: Each feature is processed in the context of other features
  • 📈 Improved Performance: Better predictions through enhanced feature understanding
  • 🔄 Flexible Integration: Works seamlessly with other KDP processing techniques
  • 📊 Hierarchical Learning: Captures both low-level and high-level patterns

🚀 Getting Started

from kdp import PreprocessingModel, FeatureType

# Define features
features_specs = {
    "age": FeatureType.FLOAT_NORMALIZED,
    "income": FeatureType.FLOAT_RESCALED,
    "occupation": FeatureType.STRING_CATEGORICAL,
    "education": FeatureType.INTEGER_CATEGORICAL
}

# Initialize model with standard tabular attention
preprocessor = PreprocessingModel(
    path_data="data/my_data.csv",
    features_specs=features_specs,
    tabular_attention=True,              # Enable tabular attention
    tabular_attention_heads=4,           # Number of attention heads
    tabular_attention_dim=64,            # Attention dimension
    tabular_attention_dropout=0.1        # Dropout rate
)

🧠 How It Works

Tabular Attention Architecture

KDP's tabular attention transforms features through multi-head attention layers, allowing the model to learn complex patterns across features.

Standard Tabular Attention

1. Inter-Feature Attention: Features attend to each other within each sample, capturing dependencies between different features.

2. Inter-Sample Attention: Samples attend to each other for each feature, capturing patterns across different samples.

3. Feed-Forward Networks: Process attended features further with non-linear transformations.
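To make these three steps concrete, here is a minimal conceptual sketch built from standard Keras layers. It is not KDP's internal implementation: the shapes, the residual and normalization choices, and the use of tf.keras.layers.MultiHeadAttention are illustrative assumptions only.

import tensorflow as tf

# Conceptual sketch of standard tabular attention (not KDP's actual layer code).
# x: a batch of embedded features with shape (batch, num_features, dim).
batch, num_features, dim = 32, 6, 64
x = tf.random.normal((batch, num_features, dim))

# 1. Inter-feature attention: features attend to each other within each sample.
inter_feature = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=dim // 4, dropout=0.1)
x = tf.keras.layers.LayerNormalization()(x + inter_feature(query=x, value=x, key=x))

# 2. Inter-sample attention: samples attend to each other for each feature,
#    so we transpose to make the batch axis the "sequence" axis.
xt = tf.transpose(x, [1, 0, 2])                      # (num_features, batch, dim)
inter_sample = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=dim // 4, dropout=0.1)
xt = tf.keras.layers.LayerNormalization()(xt + inter_sample(query=xt, value=xt, key=xt))
x = tf.transpose(xt, [1, 0, 2])                      # back to (batch, num_features, dim)

# 3. Feed-forward network: non-linear transformation of the attended features.
ffn = tf.keras.Sequential([
    tf.keras.layers.Dense(4 * dim, activation="relu"),
    tf.keras.layers.Dense(dim),
])
x = tf.keras.layers.LayerNormalization()(x + ffn(x))
print(x.shape)  # (32, 6, 64)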

Multi-Resolution Tabular Attention

1. Specialized Processing: Numerical and categorical features are processed through type-specific attention mechanisms.

2. Cross-Attention: Enables features to attend across different types, capturing complex interactions.

3. Type-Specific Projections: Each feature type gets custom embedding dimensions for optimal representation.
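The cross-attention step can be sketched in the same hedged way: categorical embeddings, which live in a smaller space, are projected to the numerical dimension and then attended to by the numerical features. The shapes and layer choices below are assumptions for illustration, not KDP's actual code.

import tensorflow as tf

# Conceptual sketch of multi-resolution cross-attention (illustrative only).
batch = 32
num_numeric, num_categorical = 5, 5
numeric_dim, cat_embed_dim = 64, 32          # type-specific dimensions

numeric = tf.random.normal((batch, num_numeric, numeric_dim))            # numerical representations
categorical = tf.random.normal((batch, num_categorical, cat_embed_dim))  # smaller categorical embeddings

# Type-specific projection: bring categorical embeddings up to the numerical dimension.
cat_projected = tf.keras.layers.Dense(numeric_dim)(categorical)

# Cross-attention: numerical features attend to the projected categorical features.
cross_attention = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=numeric_dim // 4)
numeric = tf.keras.layers.LayerNormalization()(
    numeric + cross_attention(query=numeric, value=cat_projected, key=cat_projected)
)
print(numeric.shape)  # (32, 5, 64)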

📊 Model Architecture

KDP's tabular attention mechanism:

[Diagram: Tabular Attention Architecture]

The diagram shows how tabular attention transforms features through multi-head attention, allowing the model to learn complex patterns across features.

💡 How to Enable
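Attention is enabled entirely through PreprocessingModel constructor arguments. The sketch below combines the standard flags with the multi-resolution options; every parameter is described in the tables that follow, and the data path and feature names are placeholders.

from kdp import PreprocessingModel, FeatureType
from kdp.enums import TabularAttentionPlacementOptions

features_specs = {
    "age": FeatureType.FLOAT_NORMALIZED,
    "occupation": FeatureType.STRING_CATEGORICAL,
}

# Enable multi-resolution tabular attention; see the option tables below.
preprocessor = PreprocessingModel(
    path_data="data/my_data.csv",
    features_specs=features_specs,
    tabular_attention=True,
    tabular_attention_heads=4,
    tabular_attention_dim=64,
    tabular_attention_dropout=0.1,
    tabular_attention_embedding_dim=32,     # categorical embedding size (multi-resolution)
    tabular_attention_placement=TabularAttentionPlacementOptions.MULTI_RESOLUTION.value,
)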

⚙️ Configuration Options

General Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| tabular_attention | bool | False | Enable/disable attention mechanisms |
| tabular_attention_heads | int | 4 | Number of attention heads |
| tabular_attention_dim | int | 64 | Dimension of the attention model |
| tabular_attention_dropout | float | 0.1 | Dropout rate for regularization |

Multi-Resolution Options

| Parameter | Type | Default | Description |
|---|---|---|---|
| tabular_attention_embedding_dim | int | 32 | Dimension for categorical embeddings |
| tabular_attention_placement | str | "ALL_FEATURES" | Where to apply attention |

Placement Options

| Option | Description | Best For |
|---|---|---|
| ALL_FEATURES | Apply to all features uniformly | General purpose, balanced datasets |
| NUMERIC | Only numerical features | Datasets dominated by numerical features |
| CATEGORICAL | Only categorical features | Datasets with important categorical relationships |
| MULTI_RESOLUTION | Type-specific attention | Mixed data types with different importance |
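For instance, to focus attention on numerical features only, set the placement to the NUMERIC option. This is a sketch based on the table above; confirm the exact member name on TabularAttentionPlacementOptions in your installed KDP version.

from kdp import PreprocessingModel, FeatureType
from kdp.enums import TabularAttentionPlacementOptions

features_specs = {
    "age": FeatureType.FLOAT_NORMALIZED,
    "income": FeatureType.FLOAT_RESCALED,
    "region": FeatureType.STRING_CATEGORICAL,
}

# Apply attention to numerical features only (NUMERIC member name assumed from the table above).
preprocessor = PreprocessingModel(
    path_data="data/my_data.csv",
    features_specs=features_specs,
    tabular_attention=True,
    tabular_attention_placement=TabularAttentionPlacementOptions.NUMERIC.value,
)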

🎯 Best Use Cases

When to Use Standard Tabular Attention

  • When your features are mostly of the same type
  • When you have a balanced mix of numerical and categorical features
  • When feature interactions are likely uniform across feature types
  • When computational efficiency is a priority

When to Use Multi-Resolution Attention

  • When you have distinctly different numerical and categorical features
  • When categorical features need special handling (high cardinality)
  • When feature interactions between types are expected to be important
  • When certain feature types dominate your dataset

๐Ÿ” Examples

Customer Analytics with Standard Attention

from kdp import PreprocessingModel, FeatureType
from kdp.enums import TabularAttentionPlacementOptions

features_specs = {
    "customer_age": FeatureType.FLOAT_NORMALIZED,
    "account_age": FeatureType.FLOAT_NORMALIZED,
    "avg_purchase": FeatureType.FLOAT_RESCALED,
    "total_orders": FeatureType.FLOAT_RESCALED,
    "customer_type": FeatureType.STRING_CATEGORICAL,
    "region": FeatureType.STRING_CATEGORICAL
}

preprocessor = PreprocessingModel(
    path_data="data/customer_data.csv",
    features_specs=features_specs,
    tabular_attention=True,
    tabular_attention_heads=4,
    tabular_attention_dim=64,
    tabular_attention_dropout=0.1,
    tabular_attention_placement=TabularAttentionPlacementOptions.ALL_FEATURES.value
)

Product Recommendations with Multi-Resolution Attention

from kdp import PreprocessingModel, FeatureType
from kdp.enums import TabularAttentionPlacementOptions

features_specs = {
    # Numerical features
    "user_age": FeatureType.FLOAT_NORMALIZED,
    "days_since_last_purchase": FeatureType.FLOAT_RESCALED,
    "avg_session_duration": FeatureType.FLOAT_NORMALIZED,
    "total_spend": FeatureType.FLOAT_RESCALED,
    "items_viewed": FeatureType.FLOAT_RESCALED,

    # Categorical features
    "gender": FeatureType.STRING_CATEGORICAL,
    "product_category": FeatureType.STRING_CATEGORICAL,
    "device_type": FeatureType.STRING_CATEGORICAL,
    "subscription_tier": FeatureType.INTEGER_CATEGORICAL,
    "day_of_week": FeatureType.INTEGER_CATEGORICAL
}

preprocessor = PreprocessingModel(
    path_data="data/recommendation_data.csv",
    features_specs=features_specs,
    tabular_attention=True,
    tabular_attention_heads=8,              # More heads for complex interactions
    tabular_attention_dim=128,              # Larger dimension for rich representations
    tabular_attention_dropout=0.15,         # Slightly higher dropout for regularization
    tabular_attention_embedding_dim=64,     # Larger embedding for categorical features
    tabular_attention_placement=TabularAttentionPlacementOptions.MULTI_RESOLUTION.value
)

📊 Performance Considerations

Memory Usage

  • Standard Attention: O(n²) memory complexity for n features
  • Multi-Resolution: O(n_num² + n_cat²) memory complexity
  • For large feature sets, multi-resolution is more efficient (see the worked example below)
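As a worked example (with an assumed split): for 100 features, standard attention scales with roughly 100² = 10,000 pairwise feature interactions, while splitting the same features into 80 numerical and 20 categorical gives multi-resolution attention roughly 80² + 20² = 6,800, about a third less.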

Computational Cost

  • Attention mechanisms introduce additional training time
  • Multi-head attention scales linearly with number of heads
  • Multi-resolution can be faster when categorical features dominate

Guidelines:

| Dataset Size | Attention Type | Recommended Heads | Dimension |
|---|---|---|---|
| Small (<10K) | Standard | 2-4 | 32-64 |
| Medium (10K-100K) | Standard/Multi-Resolution | 4-8 | 64-128 |
| Large (>100K) | Multi-Resolution | 8-16 | 128-256 |

💡 Pro Tips

Head Count Selection

Start with 4 heads for most problems; increase the count for complex feature interactions, but beware of overfitting with too many heads.

Dimension Tuning

Choose a dimension that is divisible by the number of heads; larger dimensions capture more complex patterns, but balance them against dataset size to avoid overfitting.
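For example, tabular_attention_dim=64 with tabular_attention_heads=4 gives each head a 16-dimensional subspace, whereas 6 heads would not divide 64 evenly.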

Placement Strategy

Use ALL_FEATURES for initial experimentation, MULTI_RESOLUTION for mixed data types, and NUMERIC/CATEGORICAL for targeted focus.