A simple mistake when using Spark with Python multiprocessing

Look at this snippet first:

def processEachCustomer(client_id):
    df_test = sql('')

num_cores = multiprocessing.cpu_count()
with ThreadPoolExecutor(max_workers=num_cores) as executor:

It looks fine at first glance. However, validation showed that the output written to the Delta table was incomplete. In the end, the issue was traced to df_test, which is not a local variable of the function. So, when the function ran in parallel on several threads, df_test was overwritten.

The best way to avoid this issue is to use PySpark code only. If you have to combine the two in the same notebook, here is a possible workaround: store each result under its own key.

df_test = {}
def processEachCustomer(client_id):
    df_test[client_id] = sql('')

num_cores = multiprocessing.cpu_count()
with ThreadPoolExecutor(max_workers=num_cores) as executor:
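A minimal runnable sketch of this workaround follows. The `fake_sql` helper and the sample client IDs are hypothetical stand-ins for the real `sql('')` call and data, which the post does not show. The only point is that each thread writes to its own dictionary key.

```python
import multiprocessing
from concurrent.futures import ThreadPoolExecutor

df_test = {}  # shared dict, but each client_id owns exactly one key

def fake_sql(client_id):
    # hypothetical stand-in for the Spark sql('') call in the post
    return [f"row-for-{client_id}"]

def processEachCustomer(client_id):
    # each thread writes only to its own key, so nothing is overwritten
    df_test[client_id] = fake_sql(client_id)
    return client_id

client_ids = ["c1", "c2", "c3", "c4"]  # assumed sample IDs
num_cores = multiprocessing.cpu_count()
with ThreadPoolExecutor(max_workers=num_cores) as executor:
    processed = list(executor.map(processEachCustomer, client_ids))
```

After the pool finishes, `df_test` holds one entry per client, which can then be validated or merged in a single thread.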

Multi Processing in Python

Recently we were asked to improve ETL efficiency. Most of the latency comes from I/O and file translation, since we use only one thread to handle them. So we want to use multiprocessing to speed up this task.

Before anything else, I need to introduce the difference between multiprocessing and multithreading in Python. The major difference is that multithreading is not parallel: because of the GIL, only one thread can execute Python code at any given time. The points below summarize the consequences.

  • For CPU-heavy tasks, multithreading is of no use. However, it's perfect for I/O
  • Multiprocessing is generally faster than serial for CPU-heavy work, but don't spawn more processes than the number of cores
  • Multiprocessing is for increasing speed. Multithreading is for hiding latency
  • Multiprocessing is best for computations. Multithreading is best for I/O
  • If you have CPU-heavy tasks, use multiprocessing with n_process = n_cores and never more. Never!
  • If you have I/O-heavy tasks, use multithreading with n_threads = m * n_cores, with m a number bigger than 1 that you can tweak on your own. Try many values and choose the one with the best speedup, because there isn't a general rule. For instance, the default value of m in ThreadPoolExecutor was set to 5 in Python 3.5 to 3.7 [Source], which honestly feels quite random in my opinion.
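The I/O case can be illustrated with a small sketch. Here `fake_io_task` is a hypothetical task in which `time.sleep` stands in for a blocking read or network call; the number of tasks and workers is arbitrary.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io_task(i):
    # time.sleep stands in for a blocking read or network call
    time.sleep(0.05)
    return i * 2

tasks = list(range(8))

# serial: the waits add up
start = time.perf_counter()
serial_results = [fake_io_task(i) for i in tasks]
serial_seconds = time.perf_counter() - start

# threaded: the waits overlap, because a sleeping thread releases the GIL
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    threaded_results = list(executor.map(fake_io_task, tasks))
threaded_seconds = time.perf_counter() - start

print(f"serial: {serial_seconds:.2f}s, threaded: {threaded_seconds:.2f}s")
```

Replacing the sleep with a pure-Python computation would be expected to remove most of the gain, which is the case where multiprocessing is the better fit.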

In this task, we used multiprocessing.

Advantages:

  1. 5x-10x faster than a single core when using 20 cores (6087 files: 27s on 20 cores vs 180s on a single core).
  2. No need to change the current coding logic; we only adapt it to the multiprocessing pattern.
  3. No effect on upstream or downstream.

Disadvantages:

  1. The code has to be changed script by script; there is no general framework or library.
  2. We have to decide manually which parts to run in parallel.

How it works:

  • Library: you need to import these libraries to enable multiprocessing in Python
from multiprocessing import Pool
from functools import partial
import numpy as np
import pandas as pd
import time # optional
  • Function: you need two functions. parallelize_dataframe splits the dataframe and starts the worker processes; multiprocess does the work inside each process. Both functions must be defined outside the main block. The task-specific parameters (cust_id, prodcut_fk, first_last_data here) can be changed as needed.
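# split the dataframe into parts for the parallel tasks and call the multiprocess function
def parallelize_dataframe(df, func, num_cores, cust_id, prodcut_fk, first_last_data):
    df_split = np.array_split(df, num_cores)
    # bind the extra parameters so that pool.map only has to pass the dataframe slice
    parm = partial(func, cust_id=cust_id, prodcut_fk=prodcut_fk, first_last_data=first_last_data)
    # the with block releases the pool once all slices are processed
    with Pool(num_cores) as pool:
        return_df = pool.map(parm, df_split)
    # flatten the per-process results into a single dataframe
    df = pd.DataFrame([item for items in return_df for item in items])
    return df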

# multi processing
# each process handles parts of files imported
def multiprocess(dcu_files_merge_final_val_slice, cust_id,prodcut_fk,first_last_data):
  • Put the rest of the script into the main block, and use parallelize_dataframe to execute it.
if __name__ == '__main__':
    # set the number of processes
    core_num = 20
    start = time.time()
    rsr_df = parallelize_dataframe(dcu_files_merge_final_val, multiprocess, core_num, cust_id, prodcut_fk, first_last_data)
    end = time.time()

I can't share the complete code, but the pattern should be clear. There are some issues for which I have not figured out a solution. The major one is how to pass parameters to the multiprocessing function more flexibly, rather than hard-coding them each time.
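One possible direction for the parameter problem, offered only as a sketch: accept arbitrary keyword arguments and bind them with `partial`, so the helper does not need to know the parameter names. The names `parallelize`, `tag_rows` and `pool_cls` are hypothetical. The demo uses `ThreadPool`, which has the same `map()` API as `multiprocessing.Pool`, so that it runs anywhere without a main guard.

```python
from functools import partial
from multiprocessing.pool import ThreadPool  # same map() API as multiprocessing.Pool

def parallelize(items, func, num_workers, pool_cls=ThreadPool, **params):
    # split the work into num_workers slices (round-robin)
    slices = [items[i::num_workers] for i in range(num_workers)]
    # bind whatever keyword parameters the caller supplied
    worker = partial(func, **params)
    with pool_cls(num_workers) as pool:
        parts = pool.map(worker, slices)
    # flatten the per-worker results into one list
    return [row for part in parts for row in part]

def tag_rows(rows, cust_id, suffix=""):
    # hypothetical worker; stands in for the post's multiprocess function
    return [f"{cust_id}:{row}{suffix}" for row in rows]

result = parallelize(list(range(6)), tag_rows, 3, cust_id="A", suffix="!")
```

For CPU-heavy work, `pool_cls=multiprocessing.Pool` could be passed instead, provided the worker is defined at module level and the call sits under `if __name__ == '__main__':`.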


We added a feature that unzips a single zip file with multiprocessing, using the concurrent.futures library.

import os
import zipfile
import concurrent.futures

def _count_file_object(f):
    # Note that this iterates on 'f'.
    # You *could* do 'return len(f.read())'
    # which would be faster but potentially memory
    # inefficient and unrealistic in terms of this
    # benchmark experiment.
    total = 0
    for line in f:
        total += len(line)
    return total

def _count_file(fn):
    with open(fn, 'rb') as f:
        return _count_file_object(f)

def unzip_member_f3(zip_filepath, filename, dest):
    with open(zip_filepath, 'rb') as f:
        zf = zipfile.ZipFile(f)
        zf.extract(filename, dest)
    fn = os.path.join(dest, filename)
    return _count_file(fn)

def unzipper(fn, dest):
    with open(fn, 'rb') as f:
        zf = zipfile.ZipFile(f)
        futures = []
        with concurrent.futures.ProcessPoolExecutor() as executor:
            for member in zf.infolist():
                # submit one extraction job per archive member
                futures.append(executor.submit(unzip_member_f3, fn, member.filename, dest))
            total = 0
            for future in concurrent.futures.as_completed(futures):
                total += future.result()
    return total

Ref :

Multithreading VS Multiprocessing in Python, https://medium.com/contentsquare-engineering-blog/multithreading-vs-multiprocessing-in-python-ece023ad55a

Fastest way to unzip a zip file in Python, https://www.peterbe.com/plog/fastest-way-to-unzip-a-zip-file-in-python

Share some useful/special MS SQL tips as a data engineer

If you are a data scientist, you may never need to do data preprocessing work such as ETL/ELT, performance tuning or OLTP database design. Everything is already prepared in a structured data warehouse or flat file, neat and tidy. Regarding data quality, all a data scientist needs to do is handle some missing or wrong values, then clarify the relationships and do the analysis. I am not saying the work is easy after preprocessing; what I mean is that data engineers do a lot of time-consuming work for the final success. So I want to summarize and share some of my experience, which may save data engineers a lot of time. Corrections and additions are welcome; please send me an email: neo_aksa@hotmail.com

1. Incremental Loading. There are three methods to do incremental loading.

A. MERGE statement. It's very simple: just a combination of update and insert.

MERGE target_table tg_table
USING source_table src_table
ON ( src_table.id = tg_table.id )
WHEN MATCHED
THEN UPDATE SET tg_table.name = src_table.name
WHEN NOT MATCHED
THEN INSERT ( id, name ) VALUES ( src_table.id, src_table.name );

B. CDC (change data capture) in SSIS. For more information, see my other post: “Incremental Load DW by using CDC in SSIS”.

C. Lookup + conditional split in SSIS. Essentially it is the same as method A: rows that are not found go to “Insert”, rows that are found go to “Update”.

2. CTE. Before CTEs were available, we wrote SQL with many subqueries, which is a little hard to read since the logic is reversed. Now, with the help of CTEs, we can make our code more readable and get rid of functions in GROUP BY.

-- return the customers who had over $10,000 in purchases for their first three transactions.
with OrderRank as (
    select custID, row_number() over(partition by custID order by orderID) as rn, amount from SalesOrder
),
OrderOver as (
    select custID, sum(amount) as totalAmount from OrderRank where rn<=3 group by custID
)
select custID, totalAmount from OrderOver where totalAmount>10000
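To try the chained-CTE idea without a SQL Server instance, here is a small sketch that runs the same kind of query from Python. It swaps in SQLite (via the standard `sqlite3` module, assuming an SQLite build with window functions, 3.25 or later) and uses made-up sample rows, so it illustrates the query shape rather than MS SQL behavior.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table SalesOrder (orderID integer primary key, custID text, amount real)")
conn.executemany(
    "insert into SalesOrder (orderID, custID, amount) values (?, ?, ?)",
    [
        (1, "A", 6000), (2, "A", 3000), (3, "A", 2000), (4, "A", 9000),
        (5, "B", 1000), (6, "B", 2000), (7, "B", 3000), (8, "B", 50000),
    ],  # made-up sample data
)

query = """
with OrderRank as (
    select custID, amount,
           row_number() over (partition by custID order by orderID) as rn
    from SalesOrder
),
OrderOver as (
    select custID, sum(amount) as totalAmount
    from OrderRank where rn <= 3 group by custID
)
select custID, totalAmount from OrderOver where totalAmount > 10000
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Customer B's large fourth order is ignored because only the first three orders per customer are summed.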

3. Delete duplicate rows. This is a very common job, as a lot of data is entered manually. Here are two simple ways to handle it.
A. Use the “Sort” component in SSIS and check the box that removes rows with duplicate sort values.
B. Write a script: use a CTE to assign row numbers, then delete the rows whose row number is greater than 1.

with RemoveDuplicate as (
-- partition and order by the columns which decide duplication
select ROW_NUMBER() over (partition by id,name.. order by id,name) as row_id, column ....... from tablename
)
delete from RemoveDuplicate where row_id > 1

4. Faster Loading. SQL Server's default isolation level is “Read committed”. But in most cases we don't need it, as we only need to load all the data from OLTP. There are two ways to make loading faster without blocking other jobs; both allow dirty reads.
A. Use “WITH (NOLOCK)” at the statement level.

SELECT FirstName, LastName
FROM dbo.Employees WITH (NOLOCK) -- placeholder table name
WHERE EmpID = 1;

B. Use “SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED” at the session level.


5. Use columnstore indexes in the data warehouse. A columnstore index is very helpful for select performance, since the values of a column are stored together in the same pages, but it is bad for inserts and updates. For a fact table with many different values, it is very good for full table scans. Just remember: if we create a clustered columnstore index, we cannot create a primary key, and all columns are included in the clustered columnstore index.

--BASIC EXAMPLE: Create a nonclustered index on a clustered columnstore table.  
--Create the table  
CREATE TABLE t_account (  
    AccountKey int NOT NULL,  
    AccountDescription nvarchar (50),  
    AccountType nvarchar(50),  
    UnitSold int  
);  
--Store the table as a clustered columnstore.  
CREATE CLUSTERED COLUMNSTORE INDEX taccount_cci ON t_account;  
--Add a nonclustered index for table seek.
CREATE UNIQUE INDEX taccount_nc1 ON t_account (AccountKey);

6. Bulk data load. We may find it very slow to bulk load huge tables into the data warehouse. This is because we missed some steps around the load.
A. Drop the clustered index of the large table before loading.
B. Recreate the index of the large table after loading.
C. Update statistics.

7. Update view. Typically, we use a view to hide the logic and the tables behind it, and to make loading easier. But in some cases we need to update through a view (yes, we don't want to know the details of the view, we just need to update some data). SQL Server provides the ability to update a view directly and indirectly.
A. If the view meets the following limitations, you can do DML operations directly: a. no subquery, only a select; b. no DISTINCT or GROUP BY (no aggregation); c. no ORDER BY; d. if the view contains multiple tables, you can only insert into or update one table; e. use ‘WITH CHECK OPTION’, otherwise you may update data outside of your expectation.
B. Use an INSTEAD OF trigger to update the tables related to the view.

CREATE TRIGGER trigUnion ON vwUnionCustomerSupplier
INSTEAD OF UPDATE
AS
BEGIN
DECLARE @DelName nvarchar(50)

-- nothing to do when the row carries no type
IF (SELECT inserted.Type FROM inserted) Is Null
RETURN

-- remember the old name so the base table row can be located
SELECT @DelName = deleted.CompanyName FROM deleted

IF (SELECT inserted.Type FROM inserted) = 'Company'
UPDATE Customers
SET CompanyName =
  (SELECT CompanyName
  FROM inserted)
  WHERE Customers.CompanyName = @DelName
ELSE
UPDATE Suppliers
SET CompanyName =
  (SELECT CompanyName
  FROM inserted)
  WHERE Suppliers.CompanyName = @DelName
END

8. Deadlock or long-running query. It is not normal, but if you find your ELT or ETL running for a long time, it may be caused by blocking or a deadlock. Check it with sys.dm_tran_locks, or use sys.dm_exec_query_stats to get information about running queries.

9. Use window functions for rolling aggregation. We can set the ROWS or RANGE option to achieve running aggregation in MS SQL. RANGE is the default option.

-- running total (default RANGE frame, worked out in tempdb)
select customer_id, orderId, amount, sum(amount) over (order by orderid) runningtotal from sales_order
-- revised running total (ROWS frame, in memory)
select customer_id, orderId, amount, sum(amount) over (order by orderid rows unbounded preceding) runningtotal from sales_order
-- total over all rows (in memory), very useful with partition by for subtotals
select customer_id, orderId, amount, sum(amount) over (order by orderid rows between unbounded preceding and unbounded following) total from sales_order
-- rolling total over 3 rows: previous, current and next (in memory), e.g. 3 months when each row is one month
select customer_id, orderId, amount, sum(amount) over (order by orderid rows between 1 preceding and 1 following) rollingtotal from sales_order
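The ROWS frames can be checked on a tiny data set. The sketch below again swaps in SQLite through Python's `sqlite3` module (assuming window-function support, SQLite 3.25 or later) with made-up rows; it shows what the frames compute, not how SQL Server executes them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table sales_order (orderid integer primary key, amount integer)")
conn.executemany(
    "insert into sales_order (orderid, amount) values (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40)],  # made-up sample data
)

rows = conn.execute("""
select orderid, amount,
       -- running total: everything up to and including the current row
       sum(amount) over (order by orderid
                         rows between unbounded preceding and current row) as running_total,
       -- rolling total: previous, current and next row
       sum(amount) over (order by orderid
                         rows between 1 preceding and 1 following) as rolling_total
from sales_order
order by orderid
""").fetchall()

for row in rows:
    print(row)
```

The first and last rows have only two rows in their rolling frame, which is why their rolling totals are smaller than their neighbours'.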

10. Covering Index. An index that contains all the information required to resolve the query is known as a “covering index”. If the fields in the “select” are not in the nonclustered or clustered index, a “key lookup” will appear in the execution plan.

To get a covering index without moving the new column into the key of the nonclustered index, we can use “included columns”. They keep the non-key columns in the leaf nodes of the index.

CREATE NONCLUSTERED INDEX [ix_Customer_Email] ON [dbo].[Customers]
(
            [Last_Name] ASC,
            [First_Name] ASC
)
INCLUDE ([Email]) -- placeholder column name

11. Schema Binding. Schema binding is used for views and functions.
Objects that are referenced by schema-bound objects cannot have their definition changed. It can also significantly increase the performance of user-defined functions.

CREATE FUNCTION dbo.GetProductStatusLabel
(@StatusID tinyint)
RETURNS nvarchar(32)
WITH SCHEMABINDING
AS BEGIN
  RETURN (SELECT Label FROM dbo.ProductStatus WHERE StatusID = @StatusID);
END

12. Table/Index partitioning. If you are working on Azure or a cluster platform, please skip this; HDFS already does a similar thing for you. But if you still work on-prem, table partitioning will help improve performance a lot. Essentially, table partitioning spreads a table over more than one filegroup to improve its I/O. There are four steps to create a partition for a table or index.
A. Add filegroups and files

-- Adds new filegroups to the AdventureWorks2012 database  
ALTER DATABASE AdventureWorks2012  
ADD FILEGROUP test1fg;  
ALTER DATABASE AdventureWorks2012  
ADD FILEGROUP test2fg;  
-- Adds one file for each filegroup.  
ALTER DATABASE AdventureWorks2012   
ADD FILE   
(  
    NAME = test1dat1,  
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\t1dat1.ndf',  
    SIZE = 5MB,  
    MAXSIZE = 100MB  
)  
TO FILEGROUP test1fg;  
ALTER DATABASE AdventureWorks2012   
ADD FILE   
(  
    NAME = test2dat2,  
    FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA\t2dat2.ndf',  
    SIZE = 5MB,  
    MAXSIZE = 100MB  
)  
TO FILEGROUP test2fg;  

B. Add a partition function: it defines how rows map to the partitions based on a column’s value

-- Creates a partition function called myRangePF1 that will partition a table into four partitions  

C. Add a partition scheme: it maps the partitions of the partition function to filegroups.

-- Creates a partition scheme called myRangePS1 that applies myRangePF1 to the filegroups created above  
CREATE PARTITION SCHEME myRangePS1  
    AS PARTITION myRangePF1  
    TO (test1fg, test2fg) ;  

D. Create the table on the partitioning column: the partition function uses this column to perform the partitioning

-- Creates a partitioned table called PartitionTable that uses myRangePS1 to partition col1  
CREATE TABLE PartitionTable (col1 int PRIMARY KEY, col2 char(10))  
    ON myRangePS1 (col1) ;  

13. Defragmentation. According to Microsoft's suggestion, if fragmentation is greater than 30%, we need to rebuild the index; if it is between 5% and 30%, we need to reorganize the index. We can use sys.dm_db_index_physical_stats to check avg_fragmentation_in_percent.

-- use sys.dm_db_index_physical_stats to check fragmentation
select * from   sys.dm_db_index_physical_stats(DB_ID(N'AdventureWorks2017'),OBJECT_id(N'AdventureWorks2017.Person.Person'),-1,null,'detailed')
-- Check avg_fragmentation_in_percent
-- if this percent is
--> 5% and < = 30%: reorganize the index
--> 30%: rebuild the index

14. Other convenient code.

-- get all column names of the specified table
sp_columns table_name, table_owner

-- Object Dependencies
sp_depends table_name, table_owner

-- convert, returning NULL if the conversion fails
Try_Convert(data_type(length), expression, style)

-- split a string by a delimiter.
SELECT * FROM STRING_split('A,B,B',',')
select column1, column2 from table1
cross apply string_split(column3,',')

--  returns the last day of the month containing a specified date, with an optional offset.
EOMONTH ( start_date [, month_to_add ] ) 

-- check the object
IF OBJECT_ID('Sales.uspGetEmployeeSalesYTD', 'P') IS NOT NULL

-- dynamic SQL
-- use sp_executesql
SET @ParmDefinition = N'@BusinessEntityID tinyint'; /* Execute the string with the first parameter value. */ 
SET @IntVariable = 197; 
EXECUTE sp_executesql @SQLString, @ParmDefinition, @BusinessEntityID = @IntVariable;
-- use exec
SET @columnList = 'AddressID, AddressLine1, City'
SET @city = '''London'''
SET @sqlCommand = 'SELECT ' + @columnList + ' FROM Person.Address WHERE City = ' + @city
EXEC (@sqlCommand)

Learn Django with me(part 3)

Handle view and templates

A view consists of a set of functions which handle the different URL requests matched by specific URL patterns.
Each of them either returns an HttpResponse or raises Http404.

Firstly, let’s update webapp/views.py:

from django.http import HttpResponse
from django.shortcuts import get_object_or_404, render
from .models import Question

def index(request):
    # get the latest 5 questions
    latest_question_list = Question.objects.order_by('-pub_date')[:5]
    # create context
    context = {'latest_question_list': latest_question_list}
    # a shortcut for rendering the request with a template
    return render(request, 'webapp/index.html', context)

def detail(request, question_id):
    # return the question, or raise Http404 if it does not exist
    question = get_object_or_404(Question, pk=question_id)
    return render(request, 'webapp/detail.html', {'question': question})

def results(request, question_id):
    response = "You're looking at the results of question %s."
    return HttpResponse(response % question_id)

def vote(request, question_id):
    return HttpResponse("You're voting on question %s." % question_id)

Here we used the template webapp/index.html, which is located at webapp/templates/webapp/index.html. So let’s create a folder templates and its subfolder webapp. The code of index.html:

{# list of all items from question object #}
{% if latest_question_list %}
    <ul>
    {% for question in latest_question_list %}
        {# webapp is the namespace, detail is the name #}
        <li><a href="{% url 'webapp:detail' question.id %}">{{ question.question_text }}</a></li>
    {% endfor %}
    </ul>
{% else %}
    <p>No polls are available.</p>
{% endif %}

The tricky point is that when we refer to the details, we use {% url 'webapp:detail' question.id %} instead of an absolute path. Here webapp is the namespace and detail is the name; both can be found in the updated webapp/urls.py:

from django.urls import path

from . import views

# namespace
app_name = 'webapp'
urlpatterns = [
    # ex: /webapp/
    path('', views.index, name='index'),
    # ex: /webapp/5/
    path('<int:question_id>/', views.detail, name='detail'),
    # ex: /webapp/5/results/
    path('<int:question_id>/results/', views.results, name='results'),
    # ex: /webapp/5/vote/
    path('<int:question_id>/vote/', views.vote, name='vote'),
]

Similar to the template `index.html`, we should add the template `webapp/templates/webapp/detail.html`:
``` html
<h1>{{ question.question_text }}</h1>
<ul>
{% for choice in question.choice_set.all %}
    <li>{{ choice.choice_text }}</li>
{% endfor %}
</ul>
```

All the dynamic code in the HTML is easy to understand, so I won't spend time explaining it.

Now you can access http://localhost:8000/webapp/ to display the results. The whole process can be described like this:
1. The client sends a request to the server.
2. Django parses the URL starting from ROOT_URLCONF = 'mysite.urls' in mysite/settings.py, which points to mysite/urls.py.
3. Based on urlpatterns in mysite/urls.py, the request is transferred to the webapp folder.
4. The request is handled by webapp/urls.py, which points to the different functions in webapp/views.py. Here the second parameter of a view function such as results (question_id) comes from the URL pattern.
5. The view parses and handles the request, then retrieves the template in templates/webapp/.
6. An HttpResponse or Http404 goes back to the client.

In a nutshell, urls.py handles URL patterns and sends the request to views.py; views.py calls models.py and the templates to send the response back.
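The dispatch step can be pictured with a toy model in plain Python. This is not Django's implementation, only a sketch of the idea that a pattern list maps a path to a view and that captured parts of the path become view arguments; the `dispatch` function and the regex patterns are hypothetical.

```python
import re

def index(request):
    return "index page"

def detail(request, question_id):
    return f"detail for question {question_id}"

def results(request, question_id):
    return f"results of question {question_id}"

# (pattern, view) pairs, playing the role of urlpatterns in webapp/urls.py
urlpatterns = [
    (r"^$", index),
    (r"^(?P<question_id>\d+)/$", detail),
    (r"^(?P<question_id>\d+)/results/$", results),
]

def dispatch(path, request=None):
    # try each pattern in order; captured groups become view arguments
    for pattern, view in urlpatterns:
        match = re.match(pattern, path)
        if match:
            kwargs = {name: int(value) for name, value in match.groupdict().items()}
            return view(request, **kwargs)
    return "404 Not Found"

print(dispatch("5/results/"))
```

A path that matches no pattern falls through to the 404 branch, which mirrors step 6 above.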

Singleton Pattern in Java, Python and C++

An implementation of the singleton pattern must:

  1. ensure that only one instance of the singleton class ever exists
  2. have the class create its own single instance
  3. provide global access to that instance.

Typically, this is done by:

  1. declaring all constructors of the class to be private
  2. providing a static method that returns a reference to the instance.


Java

//Lazy loading (not recommended)
public class Singleton {  
    private static Singleton instance;  
    private Singleton (){}  
    public static synchronized Singleton getInstance() {  
        if (instance == null) {  
            instance = new Singleton();  
        }
        return instance;  
    }
}
//Non-lazy loading (recommended)
public final class Singleton {
    // create instance
    private static final Singleton INSTANCE = new Singleton();
    // private constructor, so that no other instance can be created 
    private Singleton() {}
    // return this singleton instance 
    public static Singleton getInstance() {
        return INSTANCE;
    }
}
//Enum
public enum SingletonEnum {  
    INSTANCE;
    public void dummy() {  
    }
}
// how to use
 SingletonEnum singleton = SingletonEnum.INSTANCE;


Python

# Lazy loading
class Singleton(object):
    def __new__(cls, *args, **kw):
        if not hasattr(cls, '_instance'):
            # object.__new__ accepts no extra arguments, so pass only cls
            cls._instance = super(Singleton, cls).__new__(cls)
        return cls._instance

class MyClass(Singleton):
    a = 1

# Metaclass
class Singleton(type):
    _instances = {}
    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            cls._instances[cls] = super(Singleton, cls).__call__(*args, **kwargs)
        return cls._instances[cls]

class MyClass(metaclass=Singleton):
    a = 1

# Decorator
from functools import wraps

def singleton(cls):
    instances = {}
    @wraps(cls)
    def getinstance(*args, **kw):
        if cls not in instances:
            instances[cls] = cls(*args, **kw)
        return instances[cls]
    return getinstance

@singleton
class MyClass(object):
    a = 1
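A short usage check for the metaclass variant, as a sketch. The `Config` class is a hypothetical example; the point is that two calls return the same object, so state set through one name is visible through the other.

```python
class Singleton(type):
    _instances = {}

    def __call__(cls, *args, **kwargs):
        # create the instance on first use, then always return the cached one
        if cls not in cls._instances:
            cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Config(metaclass=Singleton):
    # hypothetical class used only for this demonstration
    def __init__(self):
        self.values = {}

first = Config()
second = Config()
first.values["debug"] = True

print(first is second)
print(second.values)
```

Note that `__init__` runs only once here, because the metaclass skips construction entirely on later calls.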


C++

#include <iostream>

class Singleton
{
    private:
        /* Here will be the instance stored. */
        static Singleton* instance;

        /* Private constructor to prevent instancing. */
        Singleton() {}

    public:
        /* Static access method. */
        static Singleton* getInstance();
};

/* Null, because instance will be initialized on demand. */
Singleton* Singleton::instance = 0;

Singleton* Singleton::getInstance()
{
    if (instance == 0)
    {
        instance = new Singleton();
    }

    return instance;
}

int main()
{
    //new Singleton(); // Won't work
    Singleton* s = Singleton::getInstance(); // Ok
    Singleton* r = Singleton::getInstance();

    /* The addresses will be the same. */
    std::cout << s << std::endl;
    std::cout << r << std::endl;
}

Mandelbrot set

Definition of the Mandelbrot set

According to Wikipedia, the Mandelbrot set is the set of complex numbers c for which the function f_{c}(z)=z^{2}+c does not diverge when iterated from z=0.

What magic does it have

If we plot the set of c in the complex plane, here is the magic:

where the x axis is the real part of c and the y axis is the imaginary part of c. A point c is colored black if it belongs to the set.

How to find the set

Here is the pseudocode from Wikipedia:

For each pixel (Px, Py) on the screen, do:
  x0 = scaled x coordinate of pixel (scaled to lie in the Mandelbrot X scale (-2.5, 1))
  y0 = scaled y coordinate of pixel (scaled to lie in the Mandelbrot Y scale (-1, 1))
  x = 0.0
  y = 0.0
  iteration = 0
  max_iteration = 1000

  // once the sum of the squares of the real and imaginary parts reaches 4, the point has escaped.
  while (x*x + y*y < 2*2  AND  iteration < max_iteration) {
    xtemp = x*x - y*y + x0
    y = 2*x*y + y0
    x = xtemp
    iteration = iteration + 1
  }
  color = palette[iteration]
  plot(Px, Py, color)
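The pseudocode translates almost line by line into Python. The sketch below returns the iteration count for one point and prints a coarse ASCII picture; the function name `escape_iterations` and the 40 by 11 character grid are arbitrary choices for illustration.

```python
def escape_iterations(x0, y0, max_iteration=1000):
    """Count iterations of z -> z^2 + c (c = x0 + i*y0) before |z| reaches 2."""
    x = y = 0.0
    iteration = 0
    while x * x + y * y < 4.0 and iteration < max_iteration:
        # real and imaginary parts of z^2 + c, updated together
        x, y = x * x - y * y + x0, 2 * x * y + y0
        iteration += 1
    return iteration

# coarse ASCII rendering: '#' marks points that never escaped within 50 iterations
for row in range(11):
    y0 = -1 + 2 * row / 10
    line = ""
    for col in range(40):
        x0 = -2.5 + 3.5 * col / 39
        line += "#" if escape_iterations(x0, y0, 50) == 50 else "."
    print(line)
```

Points that reach `max_iteration` are treated as members of the set; everything else can be colored by its iteration count, as the palette lookup in the pseudocode does.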

for complex numbers:

z = x + iy
z^{2} = x^{2} - y^{2} + 2ixy
c = x_{0} + iy_{0}

Hence x is the real part of z^{2}+c and y is its imaginary part: x={Re}(z^{2}+c)=x^{2}-y^{2}+x_{0} and y={Im}(z^{2}+c)=2xy+y_{0}

How about 3D

I am going to try my own 3D version later, but please see the details on Skytopia first; I cite the formula directly.

3D formula is defined by:

z -> z^n + c where z and c are defined by {x,y,z}

{x,y,z}^n = r^n { sin(theta*n) * cos(phi*n) , sin(theta*n) * sin(phi*n) , cos(theta*n) }

r = sqrt(x^2 + y^2 + z^2)
theta = atan2( sqrt(x^2+y^2), z )
phi = atan2(y,x)

// z^n + c is similar to standard complex addition

{x,y,z}+{a,b,c} = {x+a, y+b, z+c}

//The rest of the algorithm is similar to the 2D Mandelbrot!

//Here is some pseudo code of the above:

r = sqrt(x*x + y*y + z*z )
theta = atan2(sqrt(x*x + y*y) , z)
phi = atan2(y,x)

newx = r^n * sin(theta*n) * cos(phi*n)
newy = r^n * sin(theta*n) * sin(phi*n)
newz = r^n * cos(theta*n)
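As a sketch, the pseudocode above maps onto Python's `math` module directly. The function names `triplex_power` and `iterate_once` and the default power n = 8 are assumptions made for illustration; the cited formula leaves n open.

```python
import math

def triplex_power(x, y, z, n):
    """Raise the point {x, y, z} to the n-th power using the formula above."""
    r = math.sqrt(x * x + y * y + z * z)
    theta = math.atan2(math.sqrt(x * x + y * y), z)
    phi = math.atan2(y, x)
    rn = r ** n
    return (
        rn * math.sin(theta * n) * math.cos(phi * n),
        rn * math.sin(theta * n) * math.sin(phi * n),
        rn * math.cos(theta * n),
    )

def iterate_once(point, c, n=8):
    """One step of z -> z^n + c, with component-wise addition of c."""
    px, py, pz = triplex_power(point[0], point[1], point[2], n)
    return (px + c[0], py + c[1], pz + c[2])

start = (0.0, 0.0, 0.0)
c = (0.1, 0.2, 0.3)
print(iterate_once(start, c))
```

With n = 1 the power function returns the input point (up to rounding), which is a quick sanity check that the spherical conversion is consistent.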

Interesting Python I ( function )

how to pass-by-reference?

Many languages, like C/C++, support both pass by value and pass by reference. Pass by reference copies the address of an argument into the formal parameter. Inside the function, the address is used to access the actual argument used in the call, which means that changes made to the parameter affect the passed argument. In Python, pass by reference is tricky. There are two kinds of objects: mutable and immutable. Strings, tuples and numbers are immutable; lists, dicts and sets are mutable. When we try to change the value of an immutable object inside a function, Python binds the name to a new object rather than changing the original object. Let us see the code:

    def ref_demo(x):
        print("x=", x, " id=", id(x))
        x = 42
        print("x=", x, " id=", id(x))

    >>> x = 9
    >>> id(x)
    41902552
    >>> ref_demo(x)
    x= 9  id= 41902552
    x= 42  id= 41903752
    >>> id(x)
    41902552
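We can see that after x = 42, the address of x inside the function has changed, while the x outside still points to the original object.

On the other hand, if we pass a mutable object into a function, we can change its value in place, as with pass-by-reference.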
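The two cases can be put side by side in a small sketch. The function names `rebind` and `mutate` are chosen only for this illustration.

```python
def rebind(number):
    # assignment binds the local name to a new int object;
    # the caller's variable is untouched
    number = 42
    return number

def mutate(items):
    # append changes the list object itself, which the caller also sees
    items.append(42)

x = 9
rebind(x)

values = [1, 2]
mutate(values)

print(x, values)  # 9 [1, 2, 42]
```

Rebinding a mutable parameter inside the function (for example `items = [42]`) would behave like the integer case: the caller's list would stay unchanged.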

*args and **kwargs

Before I explain them, I want to mention that * is used to unpack a tuple or list into positional arguments, and ** is used to unpack a dictionary into named arguments.

In a function definition, * defines a variable number of positional arguments. The asterisk has to precede a variable identifier in the parameter list.

>>> def print_everything(*args):
...     for count, thing in enumerate(args):
...         print('{0}. {1}'.format(count, thing))
...
>>> print_everything('apple', 'banana', 'cabbage')
0. apple
1. banana
2. cabbage

** defines an arbitrary number of keyword parameters.

>>> def table_things(**kwargs):
...     for name, value in kwargs.items():
...         print('{0} = {1}'.format(name, value))
...
>>> table_things(apple = 'fruit', cabbage = 'vegetable')
apple = fruit
cabbage = vegetable

A * can appear in function calls as well. The semantics in this case are the “inverse” of a star in a function definition: an argument is unpacked rather than packed. In other words, the elements of the list or tuple are passed as separate arguments:

>>> def f(x,y,z):
...     print(x,y,z)
...
>>> p = (47,11,12)
>>> f(*p)
47 11 12

The same mechanism exists for keyword arguments. To unpack a dictionary into named arguments, we use the double asterisk “**” notation:

>>> def f(a,b,x,y):
...     print(a,b,x,y)
...
>>> d = {'a':'append', 'b':'block','x':'extract','y':'yes'}
>>> f(**d)
append block extract yes

Efficiency of different programming languages

Almost every programmer knows Python is a low-efficiency language, but how slow is it? See this picture:


The author compared virtually all languages on three variables: energy consumption, memory consumption and execution time. Java and C do very well in energy and time. As for Python, the numbers are not ideal, as Python is an interpreted language and uses the GIL.

I also did an experiment myself, in which I picked k numbers from n numbers (n is much larger than k). I compared three languages, Python, C and CUDA, with single and multiple threads. The column name (k|n) means k from n. Here is my result:

C is much faster than Python, especially as the data size grows quickly. Python's multithreaded version is significantly better than its single-threaded one. However, I don't know why multithreaded C is almost the same as single-threaded C. In multithreaded C, I found that CPU usage was lower than 30% for each thread. Perhaps creating threads takes time. Also, since I used clock() in C to measure the elapsed time, I got CPU time summed over all threads rather than wall-clock time (real < user when running in parallel). To get the real time, we should use clock_gettime(CLOCK_MONOTONIC, &start); (credit: [Dr. Greg Wolffe]). As for CUDA, in this case the time did not change much no matter how the data size grew, although for small data sizes it was not very fast compared with C.
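The difference between CPU time and wall-clock time can be seen from Python as well. In this sketch `time.process_time` is used as a rough analogue of C's clock() and `time.perf_counter` as a rough analogue of a monotonic wall clock; the 0.2 second sleep is an arbitrary stand-in for waiting.

```python
import time

wall_start = time.perf_counter()   # wall-clock ("real") time
cpu_start = time.process_time()    # CPU time consumed by this process

time.sleep(0.2)                    # waiting costs wall time but almost no CPU

wall_elapsed = time.perf_counter() - wall_start
cpu_elapsed = time.process_time() - cpu_start

print(f"wall: {wall_elapsed:.3f}s, cpu: {cpu_elapsed:.3f}s")
```

In a parallel program the relation flips: CPU time is accumulated by every running thread, so it can exceed the wall-clock time, which is why a CPU-time clock makes a multithreaded run look no faster.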

You can find more detail on my Github.
