Some features need to be improved in Azure Data Products

Azure Storage Explorer/Data Lake Ghost file In some rare case, if you delete files in ASE, then you call APIs or use browser data explorer, you will find 0 byte file is still there which we should already deleted. I reported Microsoft last year, they said they fixed it, which was true until I found… Continue reading Some features need to be improved in Azure Data Products

Our new cloud architecture launched!

After so many discussion, evaluation and testing, we finally launched a basic architecture for Azure cloud. I hid some key words that explain the business flows underneaths. Basically, however it is good for all similar scenarios. We tried to use tableau to read data from datalake directly, bot spark sql or native databrick JDBC are… Continue reading Our new cloud architecture launched!

Set up MySQL on Azure Ubuntu and compare with Azure SQL

I will combine three parts: Create Ubuntu VM & attach data disk, Install and configure MySQL, Performance comparison with Azure SQL. Create Ubuntu VM Choose your size of VM. Here I used D4s_v3, which has 4 cores and 16GB memory. You need to choose disk for storage and set the initial admin password, I recommend… Continue reading Set up MySQL on Azure Ubuntu and compare with Azure SQL

ADF tends to replace SSIS

Essentially, ADF(Azure Data Factory) only takes responsibility of Extraction and Load in the ELT pipeline. Then we have to use another tool, like databrick to write jupyter notebook to manipulate dataframe or RDD to complete transform activities. However, as Microsoft launched data “Data Flow” in ADF, it becomes more and more similar to SSIS. Most… Continue reading ADF tends to replace SSIS