NEO_AKSA – Page 2

neo_aksa January 10, 2022 How to break down databricks DBU cost to the pipeline level

One day we found databricks DBU cost surged, but we didn’t know which ADF job caused this issue. Then I […]

neo_aksa November 13, 2021 A real case of optimazing spark notebook

At the beginning of my optimazation, I tried to find some standard principles that can quickly and smoothly help me. […]

neo_aksa August 17, 2021 Some working performance/cost improvement tips applying to ADF and Databricks recently

While switching to the cloud, we found some pipelines running slowly and cost increased rapidly. To solve the problems, we […]

neo_aksa June 30, 2021 First Glance on GPU Accelerated Spark

Since I started to play with cluster, I thought there was no mission which was not able to be completed […]

neo_aksa April 16, 2021 SCD II or Snapshot for Dimension

SCD II is widely used to process dimensional data with all historical information. Each change in dimensions will be recorded […]

neo_aksa March 20, 2021 Parallel and Redundancy

1990, engineers were fighting for optimizing code performance and increasing CPU speeds. 1994, MPI started to be the dominant model […]

neo_aksa February 13, 2021 Optimize concurrency for merge operation in delta table

Concurrency control is normal in OLTP operations, but for OLAP, not really. So I didn’t take care of it until […]

neo_aksa January 25, 2021 Some features need to be improved in Azure Data Products

Azure Storage Explorer/Data Lake Ghost file In some rare case, if you delete files in ASE, then you call APIs […]

neo_aksa December 25, 2020 I bought a raspberry Pi 4, and tried to replace my PC>:<

Since I got a RP4 8GB from microcenter with a fan case, I tried to use it to replace pc […]

neo_aksa October 21, 2020 Windows package manager(winget)

Since Microsoft moves from windows to cloud in the last 10 years, he is more welcome to opensource especially Linux. […]