
Welcome to etl_toolkit’s documentation!

The etl_toolkit provides utilities for writing better PySpark and simplifying pipelines on Databricks. Navigate to each module of the etl_toolkit to learn about the functions it provides and how to use them.

  • expressions: Contains functions for deriving complex pyspark.Column objects.

  • analyses: Contains functions for deriving complex pyspark.DataFrame objects that apply many data transformations.
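
As a quick orientation, the sketch below shows how the two module families are meant to compose: expressions produce pyspark.Column objects that slot into ordinary DataFrame transformations, while analyses take and return whole pyspark.DataFrames. This is a minimal sketch only; the E/A aliases follow the module names used in this documentation, but the import path and the specific function names are illustrative assumptions, so consult each module’s page for the real signatures.

    # A minimal usage sketch. ASSUMPTIONS: the import path and the
    # function names E.normalize_string / A.dedupe are illustrative
    # placeholders; only the Column-vs-DataFrame split is documented
    # on this page.
    from pyspark.sql import SparkSession

    from etl_toolkit import A, E  # assumed import path

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame(
        [("2024-01-01", "ACME corp", 10.0)],
        ["order_date", "merchant", "amount"],
    )

    # An expression (E) derives a pyspark.Column, so it is used inside
    # an ordinary withColumn/select:
    df = df.withColumn("merchant_clean", E.normalize_string("merchant"))

    # An analysis (A) applies many transformations at once and returns
    # a new pyspark.DataFrame:
    deduped = A.dedupe(df, keys=["order_date", "merchant"])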

Contents:

  • E module (“expressions”)
    • Aggregate Expressions
    • Boolean Expressions
    • Calculation Expressions
    • ID Expressions
    • Mapping Expressions
      • chain_assigns()
    • Normalization Expressions
    • Regex Expressions
    • Time Expressions
  • A module (“analyses”)
    • Calculation Analyses
    • Card Analyses
    • E-Receipt Analyses
    • Dedupe Analyses
    • Index Analyses
    • Investor Standard Analyses
    • Investor Reporting Analyses
      • backtest_configuration()
      • add_unified_consensus_column()
    • Ordering Analyses
    • Parser Analyses
      • parser()
    • Scalar Analyses
    • Time Analyses
    • Comparison Analyses
    • Investor Standard Metrics Analyses
    • Calendar Analyses
  • Table creation functions
    • Batch Tables
    • Standard Metrics Tables
  • Mutable Table functions
    • Table Operations

Indices and tables

  • Index

  • Module Index

  • Search Page

© Copyright 2024, YipitData.
