Databricks Cheat Sheet 1

Mayur Saparia
1 min read · Apr 8, 2023

Cluster Management

  • Create a cluster: Clusters > Create Cluster.
  • Edit cluster configuration: Clusters > Edit.
  • Terminate a cluster: Clusters > Terminate.

Notebook Basics

  • Create a notebook: Workspace > Create > Notebook.
  • Rename a notebook: Click on the notebook’s name and type the new name.
  • Delete a notebook: Right-click on the notebook and select Delete.
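  • Run a cell: Click on the cell and press Shift + Enter (or Ctrl + Enter to run it without moving to the next cell).
  • Add a new cell: Click on the + button, or press B in command mode to insert a cell below (A inserts one above).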
  • Move a cell: Click on the up/down arrow buttons.
  • Copy a cell: Click on the copy button.
  • Delete a cell: Click on the delete button.

Data Management

  • Upload a file: Workspace > Upload Data.
  • Mount external storage: Call dbutils.fs.mount from a notebook with the storage URI, a mount point under /mnt, and the required credentials.
  • Create a table: Workspace > Create > Table.
  • Browse tables: Data > Tables.
  • Create a view: Run CREATE VIEW in SQL, or call df.createOrReplaceTempView("my_view") for a temporary view.
  • Query data: Use SQL or Spark code in a notebook.
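
As a rough illustration of querying data from a notebook, the sketch below registers a DataFrame as a temporary view and queries it with SQL. The sales data, the view name and the column names are made up for the example. In a Databricks notebook the spark session already exists, so the builder call simply returns it; the local master setting is only an assumption for running the snippet elsewhere.

```scala
import org.apache.spark.sql.SparkSession

// In a Databricks notebook `spark` is predefined and getOrCreate() returns it.
// The local master is only a fallback for running outside Databricks.
val spark = SparkSession.builder()
  .appName("QuerySketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical sample data standing in for an uploaded file or table.
val sales = Seq(
  ("north", 120.0),
  ("south", 80.0),
  ("north", 45.5)
).toDF("region", "amount")

// Register a temporary view so the data can be queried with SQL.
sales.createOrReplaceTempView("sales")

// Query the view with SQL; the result is an ordinary DataFrame.
val bigSales = spark.sql(
  "SELECT region, amount FROM sales WHERE amount > 50 ORDER BY amount DESC"
)
bigSales.show()
```

The same query could be written with the DataFrame API (filter and orderBy); which one to use is mostly a matter of taste.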

Spark Basics

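  • Get the Spark context: val sc = spark.sparkContext (in Databricks notebooks, sc is already defined).
  • Create a Spark session: val spark = SparkSession.builder().appName("MyApp").getOrCreate() (requires import org.apache.spark.sql.SparkSession; in Databricks notebooks, spark is already defined).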
  • Read data: val df = spark.read.format("csv").option("header", "true").load("file.csv").
  • Write data: df.write.format("csv").mode("overwrite").save("output").
  • Transform data: Use DataFrame methods such as select, filter, and withColumn.
  • Aggregate data: Use groupBy followed by agg, or GROUP BY in SQL.
  • Join data: Use df1.join(df2, Seq("key"), "inner"), or JOIN in SQL.
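
The transform, aggregate and join steps can be sketched as follows. This is a minimal example under stated assumptions: the orders and customers data, the column names and the 1.2 tax factor are all hypothetical, and the local master setting is only there so the snippet can run outside Databricks.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{avg, col, sum}

// In a Databricks notebook `spark` is predefined and getOrCreate() returns it.
val spark = SparkSession.builder()
  .appName("DataFrameSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical in-memory data in place of files read with spark.read.
val orders = Seq(
  (1, "alice", 30.0),
  (2, "bob", 12.5),
  (3, "alice", 7.5)
).toDF("order_id", "customer", "amount")

val customers = Seq(
  ("alice", "UK"),
  ("bob", "US")
).toDF("customer", "country")

// Transform: keep orders above 10 and add a derived column.
val withTax = orders
  .filter(col("amount") > 10)
  .withColumn("amount_with_tax", col("amount") * 1.2)
withTax.show()

// Aggregate: total and average order amount per customer.
val totals = orders
  .groupBy("customer")
  .agg(sum("amount").as("total"), avg("amount").as("average"))

// Join: attach each customer's country to the totals.
val joined = totals.join(customers, Seq("customer"), "inner")
joined.orderBy("customer").show()
```

Passing the join key as Seq("customer") keeps a single customer column in the result, which avoids ambiguous column references later on.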

Visualization

  • Plot data: In a Python cell, use matplotlib (with the %matplotlib inline magic command) or third-party libraries like plotly.
  • Show data: Call display(df) to render a DataFrame as an interactive table or chart.

Machine Learning

  • Import MLlib: import org.apache.spark.ml._.
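  • Train a model: Call fit on an estimator such as LogisticRegression, or on a Pipeline of stages.
  • Evaluate a model: Use an evaluator such as BinaryClassificationEvaluator or RegressionEvaluator.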
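
A minimal sketch of training and evaluating a model with MLlib's DataFrame-based API might look like the following. The toy data, the feature names x1 and x2, and the hyperparameters are assumptions chosen for illustration. The model is evaluated on its own training data only to keep the example short; a real workflow would hold out a test set.

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

// In a Databricks notebook `spark` is predefined and getOrCreate() returns it.
val spark = SparkSession.builder()
  .appName("MLlibSketch")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data: the label is 1.0 when both features are large.
val data = Seq(
  (0.1, 0.2, 0.0), (0.3, 0.1, 0.0), (0.2, 0.4, 0.0), (0.4, 0.3, 0.0),
  (0.8, 0.9, 1.0), (0.9, 0.7, 1.0), (0.7, 0.8, 1.0), (0.6, 0.9, 1.0)
).toDF("x1", "x2", "label")

// MLlib estimators expect the features in a single vector column.
val assembler = new VectorAssembler()
  .setInputCols(Array("x1", "x2"))
  .setOutputCol("features")

val lr = new LogisticRegression()
  .setMaxIter(20)
  .setRegParam(0.01)

// Train: fit the assembler and the classifier as one pipeline.
val pipeline = new Pipeline().setStages(Array(assembler, lr))
val model = pipeline.fit(data)

// Evaluate: score the data and compute the area under the ROC curve.
val predictions = model.transform(data)
val evaluator = new BinaryClassificationEvaluator()
  .setMetricName("areaUnderROC")
val auc = evaluator.evaluate(predictions)
println(s"Area under ROC (training data): $auc")
```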

Additional Resources

Written by Mayur Saparia

Data engineering is my profession; making data available for analytics from various sources is my responsibility. Passionate about big data technology and the cloud.
