You plan to perform batch processing in Azure Databricks once daily.
Which type of Databricks cluster should you use?
A . job
B . interactive
C . High Concurrency
Answer: A
Explanation:
Example: Scheduled batch workloads (data engineers running ETL jobs)
This scenario involves running batch job JARs and notebooks on a regular cadence through the Databricks platform.
The suggested best practice is to launch a new cluster for each run of critical jobs. This helps avoid any issues (failures, missing SLA, and so on) due to an existing workload (noisy neighbor) on a shared cluster.
Note: Azure Databricks has two types of clusters: interactive and automated. You use interactive clusters to analyze data collaboratively with interactive notebooks. You use automated clusters to run fast and robust automated jobs.
References: https://docs.databricks.com/administration-guide/cloud-configurations/aws/cmbp.html#scenario-3-scheduledbatch-workloads-data-engineers-running-etl-jobs
Leave a Reply