
Databricks Temp Views and Caching


There are two kinds of temp views:

  1. Session based
  2. Global

Temp views, once created, are not registered in the underlying metastore. Session (non-global) temp views are scoped to the Spark session that created them and are purged when that session ends.

Global temp views are stored in a system-preserved temporary database called global_temp and are visible to all sessions within the same Spark application.

There are two ways to create a temp view from a DataFrame:

  1. createOrReplaceTempView
  2. createOrReplaceGlobalTempView
# Python
spark.read \
  .format("delta") \
  .load(batch_source_path) \
  .createOrReplaceTempView(batch_temp_view)
#  .createOrReplaceGlobalTempView(batch_temp_view)

The Delta Engine gains part of its optimization from a caching layer that sits between the execution layer and the cloud object store.

There are also two ways to cache a temp view:

  1. spark.catalog.cacheTable(name)
  2. dataFrame.cache()
# Python
# Cache using the spark catalog
spark.catalog.cacheTable(batch_temp_view)

To cache a DataFrame object directly:

# Python
df = spark.read \
  .format("delta") \
  .load(batch_source_path)

df.cache()