Which type of Databricks cluster should you use?

Posted by: Pdfprep Category: DP-200

You plan to perform batch processing in Azure Databricks once daily.

Which type of Databricks cluster should you use?
A. Job
B. Interactive
C. High Concurrency

Answer: A

Explanation:

Example: Scheduled batch workloads (data engineers running ETL jobs)

This scenario involves running batch job JARs and notebooks on a regular cadence through the Databricks platform.

The suggested best practice is to launch a new cluster for each run of critical jobs. This helps avoid issues (failures, missed SLAs, and so on) caused by other workloads (noisy neighbors) on a shared cluster.

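Note: Azure Databricks has two types of clusters: interactive and automated. You use interactive clusters to analyze data collaboratively with interactive notebooks. You use automated clusters, which the job scheduler creates for a run and terminates when the run completes, to run fast and robust automated jobs. Answer A (job) refers to this automated type.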
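
To make the "new cluster for each run" practice concrete, here is a minimal Python sketch that builds a request body for the Databricks Jobs API `jobs/create` endpoint (version 2.1). It only constructs the payload and makes no network calls. The job name, notebook path, runtime version, node type, and worker count are illustrative assumptions, not values from the question.

```python
import json


def build_daily_job_payload(notebook_path: str, hour: int = 2) -> dict:
    """Build a Jobs API 2.1 `jobs/create` body for a once-daily batch job."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return {
        "name": "daily-batch-etl",  # hypothetical job name
        "tasks": [
            {
                "task_key": "run_etl",  # hypothetical task key
                "notebook_task": {"notebook_path": notebook_path},
                # Using `new_cluster` instead of `existing_cluster_id` requests
                # a fresh job cluster for every run, so no other workload
                # shares its resources.
                "new_cluster": {
                    "spark_version": "13.3.x-scala2.12",  # assumed runtime
                    "node_type_id": "Standard_DS3_v2",  # assumed Azure VM size
                    "num_workers": 2,  # assumed cluster size
                },
            }
        ],
        # Quartz cron fields: second, minute, hour, day-of-month, month,
        # day-of-week. This expression fires once a day at the given hour.
        "schedule": {
            "quartz_cron_expression": f"0 0 {hour} * * ?",
            "timezone_id": "UTC",
            "pause_status": "UNPAUSED",
        },
    }


if __name__ == "__main__":
    # Hypothetical notebook path, used only for illustration.
    payload = build_daily_job_payload("/Shared/etl/daily_batch")
    print(json.dumps(payload, indent=2))
```

The choice that matters is `new_cluster` on the task: pointing the task at an interactive cluster through `existing_cluster_id` would reintroduce the shared-cluster risk described above.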

References: https://docs.databricks.com/administration-guide/cloud-configurations/aws/cmbp.html#scenario-3-scheduledbatch-workloads-data-engineers-running-etl-jobs
