Databricks Certified Data Engineer Associate Questions v2 & v3

Topics

  1. Data not available due to - vacuum or merge or optimize command
  2. Delta lake becomes - single source of truth based question
  3. Delta table contains - single/multiple files for history, metadata and data
  4. Advantage of delta lake over data lake
  5. Web application is part of - Control plane
  6. What needs to be done outside repo - pull/push/commit/clone
  7. Advantage of repo over notebook version - branching
  8. Delta lake is ACID compliant
  9. How to avoid duplicates - MERGE
  10. Question on INSERT OVERWRITE
  11. Question on Z order for practice exam
  12. Why Copy into is not working in this code - reference
  13. Expect or Drop on violation in DLT - reference
  14. Unity Catalog GRANT ALL PRIVILEGES - when to use?
  15. Unity Catalog GRANT USAGE - when to use?
  16. Advantage of array function
  17. processingTime = "5 seconds" - refer practice exam
  18. Practice exam Q36 but Continuous + Production
  19. Which physical object to create for 10 tables so that other teams can use
  20. Delete metadata but retain files - external table
  21. When "Streaming Live" - refer practice exam
  22. PII data using comment - CREATE TABLE <tbl> COMMENT "Contains PII"
  23. DESCRIBE DATABASE customer360 to get path
  24. Adv of gold table over silver table
  25. Bronze vs raw table
  26. Practice exam Q31 but which one is silver to bronze code
  27. How to create dependent task in DLT pipeline
  28. How to speed up query execution - refer practice exam
  29. How not to run a particular block of code on Sunday
  30. Where to see DQ metrics in DLT
  31. Execute DLT from?
  32. Save cost by using serverless endpoint or control DBU in sql warehouse
  33. Manager is worried about cost overruns after project release - how to save cost
  34. Practice exam Q40
  35. Reduce cluster cost - add auto stop in SQL endpoint?
  36. Practice exam Q1
  37. Practice exam Q3
  38. spark.table("mytable") or spark.delta.table("mytable") or spark.sql("mytable") in pyspark
  39. jdbc driver name for sqlite
  40. Two tables, march_transaction and april_transaction: create all_transaction without duplicates - join/merge/union
  41. Practice exam Q27
  42. Practice exam Q33
  43. Check failed status of a task in DLT pipeline?
  44. Practice exam Q42 - webhook or email alert?
  45. To speed up query - use cluster pools?
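
Topic 40 can be tried out locally without a cluster. The sketch below is only an illustration, not the exam's own code: it uses Python's built-in sqlite3 module instead of Databricks, and the columns (id, amount) and sample rows are made up. It shows that UNION drops duplicate rows while UNION ALL keeps them. Spark SQL's UNION also deduplicates by default, whereas the PySpark DataFrame union() method behaves like UNION ALL and needs a following distinct().

```python
import sqlite3

# In-memory database standing in for the lakehouse (assumption: local demo only).
conn = sqlite3.connect(":memory:")

# Hypothetical schema; the topic list only names the tables, not their columns.
conn.execute("CREATE TABLE march_transaction (id INTEGER, amount REAL)")
conn.execute("CREATE TABLE april_transaction (id INTEGER, amount REAL)")

# Row (2, 20.0) appears in both months to simulate a duplicate.
conn.executemany(
    "INSERT INTO march_transaction VALUES (?, ?)", [(1, 10.0), (2, 20.0)]
)
conn.executemany(
    "INSERT INTO april_transaction VALUES (?, ?)", [(2, 20.0), (3, 30.0)]
)

# UNION ALL keeps every row from both tables, duplicates included.
union_all_count = conn.execute(
    """
    SELECT COUNT(*) FROM (
        SELECT * FROM march_transaction
        UNION ALL
        SELECT * FROM april_transaction
    )
    """
).fetchone()[0]

# UNION removes duplicate rows, which is what the question asks for.
conn.execute(
    """
    CREATE TABLE all_transaction AS
    SELECT * FROM march_transaction
    UNION
    SELECT * FROM april_transaction
    """
)
union_count = conn.execute("SELECT COUNT(*) FROM all_transaction").fetchone()[0]

print("UNION ALL rows:", union_all_count)
print("UNION rows:", union_count)
```

A join would combine columns from matching rows rather than stack the two tables, and MERGE updates an existing target table, so union is the natural fit when a new combined table is wanted.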

Lessons Learned

No matter what, please attempt the practice exam thoroughly. The answer options in each practice question often turn up as questions of their own. You will have ample time during the assessment.