In a traditional data warehousing architecture—often guided by the Kimball methodology—data is organized around two […]
Databricks: Job Clusters vs. All-Purpose Clusters
In Databricks, clusters are distributed environments used to execute tasks or workloads. There are two […]
To convert the type of a column in Apache Spark, you use cast, not convert.
The cast function allows you to change the data type of a column in a DataFrame […]
In Apache Spark, do executors accept jobs or tasks from the driver?
In Apache Spark, executors accept and execute tasks from the driver. Here’s a breakdown of […]
Apache Spark Glossary
Slot: a slot is often associated with a CPU core. Each physical core of […]
Redshift LOCK
Overview: there are three lock modes. AccessExclusiveLock: acquired primarily during DDL operations, such as ALTER TABLE, DROP, or TRUNCATE. […]
Understanding Columnar Databases
A columnar (or column-oriented) database is a storage architecture for […]
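The difference between row-oriented and column-oriented layouts can be sketched in plain Python. This is only an in-memory illustration, not how any particular database engine stores data; the sample records and the helper `to_columns` are hypothetical:

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "city": "Paris", "amount": 10.0},
    {"id": 2, "city": "Lyon", "amount": 20.5},
    {"id": 3, "city": "Paris", "amount": 7.5},
]


def to_columns(records):
    """Regroup row records into one list of values per column."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


# Column-oriented layout: all values of one column sit together.
columns = to_columns(rows)

# An aggregate over a single column only touches that column's values.
total = sum(columns["amount"])
print(columns["city"])
print(total)
```

The aggregate reads only the `amount` list and never looks at `id` or `city`, which is the property that tends to make columnar storage attractive for analytical queries.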
Understanding Pivoting in SQL with the PIVOT Keyword
Pivoting in SQL is a powerful technique for transforming row data into […]
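To make the row-to-column idea concrete, here is a small plain-Python sketch of what a pivot with a SUM aggregate computes. It is an analogy rather than SQL syntax; the `sales` records and the `pivot` helper are hypothetical:

```python
from collections import defaultdict

# Hypothetical long-format input: one row per (region, quarter, amount).
sales = [
    ("north", "Q1", 100),
    ("north", "Q2", 150),
    ("south", "Q1", 80),
    ("south", "Q2", 120),
    ("north", "Q1", 50),
]


def pivot(records):
    """Turn quarter values into columns, summing amounts per region."""
    table = defaultdict(lambda: defaultdict(int))
    for region, quarter, amount in records:
        table[region][quarter] += amount
    # Convert the nested defaultdicts to plain dicts for readability.
    return {region: dict(cols) for region, cols in table.items()}


pivoted = pivot(sales)
for region, quarters in pivoted.items():
    print(region, quarters)
```

Each distinct quarter becomes a key (a "column") under its region, and duplicate (region, quarter) pairs are aggregated, which mirrors the role of the aggregate function in a pivot.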