Azure Data Engineer Associate Questions

Suggestions

Read this DP203 Notes1
Read this DP203 Notes2
Practice with this GitHub repo by Microsoft
Udemy course [I personally opted for this one, which includes practice questions too. Use Udemy for Business for free]

Repo link

Questions

hot vs cold vs archive tiers- days/when to choose
When to switch from Databricks standard cluster to premium based on limitations.
Read Databricks cluster config JSON based questions
Table with clustered column, hash distribution- which columns - id col/date col
Dim table with star schema- which is SK
When to choose SHIR
Azure HDInsight based question - basic functionality
Synapse life cycle management vs soft delete vs retention vs delete from event hub
Event hub - reference input and Stream input
Is there data skewness in given table
Monitor activities of ADF + ADB containing jar/notebook/copy
Is the table of type SCD 2
Date dim table and Transactions on fact table
Dim table- type- replicated
SCD 2 update using - MERGE/UPDATE/INSERT
IoT folder structure when engineers from multiple regions access the data - raw/regionid/yyyy/mm/dd/deviceid.csv
find the SK- identity col
Remove old data - switch partition or delete-where
Clustered index usage
Copy from SQL to Synapse using R language - MDF/COPY INTO/Databricks
storage format preferred for IoT with high compressibility - csv/txt/avro
Repo - collab branch/publish branch/root folder - where is ARM located & where is ARM of xyz located
Realtime data in ADLS - autoloader or copy data
Retention setting in Event Hub?
Log Analytics monitor - KQL or ADF monitor based
Count of tweets every 10 sec - query
Count of tweets every 10 sec in last sec - query - hop/window
Read ADLS based on given situation - SAS/managed identity/access keys
CLS based Q
RLS based Q
ADF log for 180 days- how?
Data flow debug based- delay
Which pipeline failed from given image
Read JSON Synapse query - fieldquote?
cross apply - openjson/opendataset/openrowset
A txt file has a list of table names. Read those tables in ADF - filter/lookup
%%scala, scala_df.write.__(db) - load/saveastable/synapsesql
Synapse Spark pool measuring unit - monitor?
Trigger type from given scenario
data>10000? from the shown table, dbcc pdw_showspace….
streaming data in adls - how to read in databricks
transaction if failed should rollback - begin tran/rollback tran in catch statement/commit tran - query
sql pool/spark pool when to use based on situation
append or update based on situation
json- flatten/expand/explode- query synapse
SAS key least maintenance based on situation
Transparent data encryption based question
Case study on Contoso - partition col?, distribution type for transaction?, table or ext table?, range right for right boundaries?
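
For the two "count of tweets every 10 sec" questions, it helps to see how non-overlapping (tumbling) and overlapping (hopping) windows differ. Below is a minimal Python sketch, not Stream Analytics code. The function names, the sample timestamps, and the 5-second hop are all assumptions made for illustration. Windows are keyed by their start time and treated as start-inclusive for simplicity; the service's own boundary and labeling rules may differ.

```python
from collections import Counter


def tumbling_counts(timestamps, size):
    """Count events per non-overlapping window [k*size, (k+1)*size)."""
    counts = Counter(int(t // size) * size for t in timestamps)
    return dict(sorted(counts.items()))


def hopping_counts(timestamps, size, hop):
    """Count events per window of length `size` that starts every `hop` seconds.

    When hop < size the windows overlap, so one event can be counted
    in more than one window.
    """
    if not timestamps:
        return {}
    counts = {}
    start = 0
    last = max(timestamps)
    while start <= last:
        counts[start] = sum(1 for t in timestamps if start <= t < start + size)
        start += hop
    return counts


# Hypothetical tweet arrival times, in seconds.
tweets = [1, 3, 9, 10, 12, 19, 21]

tumbling = tumbling_counts(tweets, 10)   # each tweet lands in exactly one window
hopping = hopping_counts(tweets, 10, 5)  # 10 s windows, a new one every 5 s

print(tumbling)
print(hopping)
```

In Stream Analytics the same two ideas are expressed in the GROUP BY clause with TumblingWindow(second, 10) and HoppingWindow(second, 10, 5). The exam question usually comes down to whether the counts must overlap.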

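The transaction question (begin tran / rollback tran in the catch block / commit tran) follows one pattern: start the transaction, do the work, commit at the end, and roll back if anything throws. Here is a rough sketch using Python's built-in sqlite3 as a stand-in for T-SQL. The accounts table, the transfer function, and the amounts are hypothetical and exist only to show the control flow.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True if committed."""
    try:
        conn.execute("BEGIN")  # begin tran
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        # This statement violates the CHECK constraint if src lacks funds.
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute("COMMIT")  # commit tran
        return True
    except sqlite3.Error:
        conn.execute("ROLLBACK")  # rollback tran in the catch block
        return False


# isolation_level=None puts sqlite3 in autocommit mode, so the explicit
# BEGIN / COMMIT / ROLLBACK statements above control the transaction.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])

ok = transfer(conn, 1, 2, 30)       # succeeds and commits
failed = transfer(conn, 1, 2, 500)  # debit fails, so the credit is rolled back too
balances = dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))
print(ok, failed, balances)
```

The second call shows why the rollback belongs in the catch block: the credit to account 2 had already run inside the transaction, and without the rollback that partial change would remain.
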
Lessons Learned

No matter what, please read the following topics carefully