Databricks Cheat Sheet 1

Mayur Saparia
1 min read · Apr 8, 2023


Cluster Management

  • Create a cluster: Clusters > Create Cluster.
  • Edit cluster configuration: Clusters > Edit.
  • Terminate a cluster: Clusters > Terminate.

Notebook Basics

  • Create a notebook: Workspace > Create > Notebook.
  • Rename a notebook: Click on the notebook’s name and type the new name.
  • Delete a notebook: Right-click on the notebook and select Delete.
  • Run a cell: Press Shift + Enter (run and advance) or Ctrl + Enter (run in place).
  • Add a new cell: Hover between cells and click the + button, or press Esc then B to add a cell below (A for above).
  • Move a cell: Click on the up/down arrow buttons.
  • Copy a cell: Click on the copy button.
  • Delete a cell: Click on the delete button.

Data Management

  • Upload a file: Workspace > Upload Data.
  • Mount external storage: Run dbutils.fs.mount in a notebook (mounts are created in code, not from a workspace menu).
  • Create a table: Workspace > Create > Table.
  • Browse tables: Data > Tables.
  • Create a view: Run CREATE VIEW in a SQL cell.
  • Query data: Use SQL or Spark code in a notebook.
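As a sketch of the last two steps, a notebook cell can mount storage and then query a registered table. The storage account, container, secret scope, and table names below are placeholders, not values from this article:

```scala
// Mount an Azure Blob Storage container (all names here are placeholders).
dbutils.fs.mount(
  source = "wasbs://my-container@mystorageacct.blob.core.windows.net/",
  mountPoint = "/mnt/my-data",
  extraConfigs = Map(
    // Read the storage key from a secret scope rather than hard-coding it.
    "fs.azure.account.key.mystorageacct.blob.core.windows.net" ->
      dbutils.secrets.get("my-scope", "storage-key")
  )
)

// Query a registered table with SQL from Spark code.
val salesByRegion = spark.sql(
  "SELECT region, SUM(amount) AS total FROM sales GROUP BY region")
salesByRegion.show()
```

Using a secret scope for the storage key keeps credentials out of the notebook source.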

Spark Basics

  • Create a Spark context: val sc = spark.sparkContext (in Databricks notebooks, sc and spark are predefined).
  • Create a Spark session: val spark = SparkSession.builder().appName("MyApp").getOrCreate().
  • Read data: val df = spark.read.format("csv").option("header", "true").load("file.csv").
  • Write data: df.write.format("csv").mode("overwrite").save("output").
  • Transform data: Use Spark’s DataFrame API.
  • Aggregate data: Use Spark’s DataFrame API or SQL.
  • Join data: Use Spark’s DataFrame API or SQL.
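The transform, aggregate, and join steps above can be sketched with the DataFrame API. The file paths and column names (customer_id, amount, id, name) are assumptions for illustration:

```scala
import org.apache.spark.sql.functions._

// Read two CSVs; paths and schemas are placeholders.
val orders = spark.read.format("csv")
  .option("header", "true").option("inferSchema", "true")
  .load("/mnt/my-data/orders.csv")
val customers = spark.read.format("csv")
  .option("header", "true").option("inferSchema", "true")
  .load("/mnt/my-data/customers.csv")

// Transform: project the needed columns and filter rows.
val bigOrders = orders.select("customer_id", "amount").filter(col("amount") > 100)

// Aggregate: total order amount per customer.
val totals = bigOrders.groupBy("customer_id").agg(sum("amount").as("total_amount"))

// Join: attach customer names to the aggregated totals.
val report = totals.join(customers, totals("customer_id") === customers("id"), "inner")
report.show()
```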

Visualization

  • Plot data: Use the %matplotlib magic command or third-party libraries like plotly.
  • Show data: Use the display function.
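A minimal example of the display function, which renders a sortable table with built-in chart options in Databricks (the derived column is just an illustration):

```scala
import org.apache.spark.sql.functions.col

// display is a Databricks notebook function, not part of open-source Spark.
display(spark.range(10).withColumn("squared", col("id") * col("id")))
```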

Machine Learning

  • Import MLlib: import org.apache.spark.ml._.
  • Train a model: Use Spark’s MLlib API.
  • Evaluate a model: Use Spark’s MLlib API.
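A sketch of training and evaluating with MLlib, assuming a DataFrame df with numeric feature columns f1 and f2 and a binary label column (all of these names are placeholders):

```scala
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.evaluation.BinaryClassificationEvaluator
import org.apache.spark.ml.feature.VectorAssembler

// Assemble raw columns into the single vector column MLlib expects.
val assembler = new VectorAssembler()
  .setInputCols(Array("f1", "f2"))
  .setOutputCol("features")
val data = assembler.transform(df)

// Hold out 20% of the rows for evaluation.
val Array(train, test) = data.randomSplit(Array(0.8, 0.2), seed = 42)

// Train a logistic regression classifier.
val model = new LogisticRegression()
  .setLabelCol("label")
  .setFeaturesCol("features")
  .fit(train)

// Evaluate on the held-out set (area under the ROC curve by default).
val auc = new BinaryClassificationEvaluator()
  .setLabelCol("label")
  .evaluate(model.transform(test))
println(s"Test AUC: $auc")
```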


Mayur Saparia

Data engineering is my profession; making data available for analytics from various sources is my responsibility. Passionate about big data technology and the cloud.