Concurrency control is routine in OLTP workloads, but much less so in OLAP. So I didn’t pay attention to it until […]
Category: Big Data
Azure Storage Explorer/Data Lake Ghost file: In some rare cases, if you delete files in ASE and then call APIs […]
Dr. Kazuaki Ishizaki gives a great summary of Spark 3.0 features in his presentation “SQL Performance Improvements at a Glance in […]
Columnstore is the most popular storage technology in big data. We have probably all heard of Parquet and Delta Lake. They are […]
Big Data • Computer Science • ETL&DW
Azure Pipelines consists of two parts: pipeline and release. They represent CI and CD respectively. Build Pipeline – to […]
First things first: I should remind all visitors that I am not a Kafka expert. Actually, I am just a […]
It’s a little bit late to talk about Kafka, since this technology has been widely used for a long time. […]
Azure provides Data Factory and Azure Databricks for handling ELT pipelines in a scalable environment. Data Factory provides a more integrated solution […]
After much discussion, evaluation, and testing, we finally launched a basic architecture for Azure cloud. I hid some key […]
I will combine three parts: Create Ubuntu VM & attach data disk, Install and configure MySQL, Performance comparison with Azure […]