Math in Machine Learning

Linear Algebra

  • mathematics of data: multivariate statistics, least squares, variance, covariance, PCA
  • equation: y = A \cdot b, where A is a matrix, b is a vector of coefficients, and y is the vector of the dependent variable
  • application in ML
    1. Dataset and Data Files
    2. Images and Photographs
    3. One Hot Encoding: a one hot encoding is a representation of categorical variables as binary vectors. python (Keras): encoded = to_categorical(data)
    4. Linear Regression
    5. Regularization. L1 and L2
    6. Principal Component Analysis (PCA)
    7. Singular-Value Decomposition (SVD). M = U \cdot S \cdot V^T
    8. Latent Semantic Analysis (LSA). Typically we use tf-idf rather than raw term counts. Through SVD, we can find which documents share a topic and which terms share a topic.
    9. Recommender Systems.
    10. Deep Learning
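
A minimal sketch of the one hot encoding from item 3, written with plain NumPy instead of Keras' to_categorical so it runs without a deep learning framework. The helper name `one_hot` and the sample labels are made up for illustration.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Encode integer class labels as rows of a binary matrix."""
    labels = np.asarray(labels)
    if num_classes is None:
        num_classes = int(labels.max()) + 1
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=int)[labels]

data = [1, 3, 2, 0, 3]          # hypothetical category ids
encoded = one_hot(data)
print(encoded)
```

Each row contains a single 1, and its column index is the original category, so `encoded.argmax(axis=1)` recovers the labels.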

Numpy

  • array broadcasting
    1. Add a scalar or a one-dimensional array to a matrix: y = A + b, where b is broadcast.
    2. It only works when the sizes of each dimension in the two arrays are equal or one of them is 1.
    3. The dimensions are compared in reverse order, starting with the trailing dimension.
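
The rules above can be tried out directly. This is a small sketch with invented arrays: one case where the trailing dimensions match, one scalar case, and one case where broadcasting fails.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])        # shape (2, 3)
b = np.array([10, 20, 30])       # shape (3,)

# Trailing dimensions match (3 and 3), so b is stretched across both rows.
y = A + b

# A scalar is broadcast to every element.
z = A + 1

# Shapes (2, 3) and (2,) do not line up: the trailing dimensions are 3 and 2.
try:
    A + np.array([1, 2])
    compatible = True
except ValueError:
    compatible = False

print(y)
print(z)
print(compatible)
```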

Matrices

  • Vector
    1. Lowercase letter. \upsilon = (\upsilon_1, \upsilon_2, \upsilon_3)
    2. Addition, subtraction (same length)
    3. Multiplication, division (same length, element-wise): a * b or a / b
    4. Dot product: a\cdot b
  • Vector Norm
    1. Definition: the length of a vector
    2. L1. Manhattan norm. L_1(\upsilon) = |\upsilon_1| + |\upsilon_2| + |\upsilon_3| python: norm(vector, 1). Used in regularization to keep the coefficients of a model small
    3. L2. Euclidean norm. L_2(\upsilon) = \sqrt{\upsilon_1^2 + \upsilon_2^2 + \upsilon_3^2} python: norm(vector)
    4. Max norm. L_{max}(\upsilon) = \max(|\upsilon_1|, |\upsilon_2|, |\upsilon_3|) python: norm(vector, inf)
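
As a quick check of the three norms, here is a short sketch using `numpy.linalg.norm` on a made-up vector. Note that the max norm takes absolute values, so a negative entry can be the largest.

```python
import numpy as np
from numpy.linalg import norm

v = np.array([3.0, -4.0, 0.0])   # hypothetical vector

l1 = norm(v, 1)          # |3| + |-4| + |0|
l2 = norm(v)             # sqrt(3^2 + (-4)^2 + 0^2)
lmax = norm(v, np.inf)   # max(|3|, |-4|, |0|)
print(l1, l2, lmax)
```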
  • Matrices
    1. Uppercase letter. A = ((a_{1,1}, a_{1,2}), (a_{2,1}, a_{2,2}))
    2. Addition, subtraction (same dimensions)
    3. Element-wise multiplication and division (same dimensions)
    4. Matrix dot product. If C = A \cdot B, the number of columns (n) of A must equal the number of rows (m) of B. python: A.dot(B) or A @ B
    5. Matrix-Vector dot product. C=A\cdot \upsilon
    6. Matrix-Scalar. element-wise multiplication
    7. Type of Matrix
      1. Square matrix. m = n. Easy to add, multiply, and rotate
      2. Symmetric matrix. M = M^T
      3. Triangular matrix. python: tril(matrix) or triu(matrix) give the lower or upper triangular matrix
      4. Diagonal matrix. Only the main diagonal has non-zero values; it does not have to be square. python: diag(vector)
      5. Identity matrix. Does not change a vector when multiplied with it. Notation: I^n. python: identity(dimension)
      6. Orthogonal matrix. Two vectors are orthogonal when their dot product is zero: \upsilon \cdot \omega = 0, or \upsilon^T \omega = 0 in matrix notation, which means the projection of \upsilon onto \omega is zero. An orthogonal matrix is a square matrix Q for which Q^T \cdot Q = Q \cdot Q^T = I
    8. Matrix Operation
      1. Transpose. A^T: rows and columns are flipped. python: A.T
      2. Inverse. A^{-1}, where A \cdot A^{-1} = I^n. python: inv(A)
      3. Trace. tr(A): the sum of the values on the main diagonal of the matrix. python: trace(A)
      4. Determinant. det(A) or |A|: a scalar that describes how a square matrix scales volume. The matrix is invertible only when the determinant is non-zero. python: det(A)
      5. Rank. The number of linearly independent rows or columns (at most the smaller of the two counts), i.e. the number of dimensions spanned by the vectors in the matrix. python: matrix_rank(A)
    9. Sparse matrix
      1. sparsity score = \frac{\text{count of zero elements}}{\text{total elements}}
      2. example: word2vector
      3. space and time complexity
      4. Data and preparation
        1. Records of activity counts: watching a movie, listening to a song, buying a product. They are usually encoded as one-hot, count encoding, or TF-IDF
      5. Areas: NLP, recommender systems, computer vision with lots of black pixels.
      6. Ways to represent a sparse matrix
        1. Dictionary of keys:  (row, column)-pairs to the value of the elements.
        2. List of Lists: stores one list per row, with each entry containing the column index and the value.
        3. Coordinate List: a list of (row, column, value) tuples.
        4. Compressed Sparse Row: three (one-dimensional) arrays (A, IA, JA).
        5. Compressed Sparse Column: same as CSR, but compressed by column instead of by row
      7. example
        1. convert to a sparse matrix. python: csr_matrix(dense_matrix)
        2. convert to a dense matrix. python: sparse_matrix.todense()
        3. sparsity = 1.0 - count_nonzero(A) / A.size
    10. Tensor
      1. multidimensional array.
      2. arithmetic is similar to that of matrices
      3. dot product: python: tensordot()
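
To make the Compressed Sparse Row layout from the sparse matrix notes concrete, here is a pure-Python sketch that builds the three arrays (A, IA, JA) from a dense list of rows and computes the sparsity. The function names `to_csr` and `sparsity` and the sample matrix are invented for this example; in practice scipy's csr_matrix does this work.

```python
def to_csr(dense):
    """Return (A, IA, JA) for a dense matrix given as a list of rows."""
    A = []     # non-zero values, stored row by row
    IA = [0]   # IA[i + 1] - IA[i] is the number of non-zeros in row i
    JA = []    # column index of each value in A
    for row in dense:
        for col, value in enumerate(row):
            if value != 0:
                A.append(value)
                JA.append(col)
        IA.append(len(A))
    return A, IA, JA

def sparsity(dense):
    """Fraction of elements that are zero."""
    total = sum(len(row) for row in dense)
    zeros = sum(1 for row in dense for value in row if value == 0)
    return zeros / total

dense = [
    [1, 0, 0, 2],
    [0, 0, 3, 0],
    [0, 0, 0, 0],
]
A, IA, JA = to_csr(dense)
print(A, IA, JA)
print(sparsity(dense))
```

Only the three non-zero values are stored, plus one row pointer per row, which is where the space savings come from when most entries are zero.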

Factorization

  • Matrix Decompositions
    1. LU Decomposition
      1. square matrix
      2. A = P \cdot L \cdot U, where L is a lower triangular matrix, U is an upper triangular matrix, and P is a permutation matrix that restores the original row order after pivoting.
      3. python: P, L, U = lu(square_matrix)
    2. QR Decomposition
      1. m*n matrix
      2. A = Q \cdot R, where Q is a matrix of size m*m and R is an upper triangular matrix of size m*n.
      3. python: qr(matrix)
    3. Cholesky Decomposition
      1. square symmetric matrix that is positive definite (all eigenvalues are greater than zero)
      2. A = L \cdot L^T = U^T \cdot U, where L is a lower triangular matrix and U is an upper triangular matrix.
      3. roughly twice as fast as LU decomposition.
      4. python: cholesky(matrix)
    4. Eigendecomposition
      1. eigenvector: A \cdot \upsilon = \lambda \cdot \upsilon, where A is the matrix we want to decompose, \upsilon is an eigenvector, and \lambda is the corresponding eigenvalue (a scalar)
      2. A matrix can have one eigenvector and eigenvalue for each dimension, so the matrix A can be written as a product of its eigenvalues and eigenvectors: A = Q \cdot \Lambda \cdot Q^{-1}, where Q is the matrix of eigenvectors and \Lambda is the diagonal matrix of eigenvalues (for a symmetric A, Q can be chosen orthogonal so that Q^{-1} = Q^T). This equation also means that if we know the eigenvalues and eigenvectors, we can reconstruct the original matrix.
      3. python: eig(matrix)
    5. SVD (singular value decomposition)
      1. A = U \cdot \Sigma \cdot V^T, where A is m*n, U is an m*m matrix, \Sigma is an m*n diagonal matrix holding the singular values, and V^T is an n*n matrix.
      2. python: svd(matrix)
      3. reduce dimension
        1. select the k largest singular values in \Sigma
        2. B = U \cdot \Sigma_k \cdot V_k^T, where \Sigma_k keeps the first k columns of \Sigma and V_k^T keeps the first k rows of V^T. B is an approximation of the original matrix A.
        3. python: TruncatedSVD(n_components=2)
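
The dimension reduction steps for SVD can be sketched with `numpy.linalg.svd`. The matrix here is made up, and I use the compact form (`full_matrices=False`), so U comes out as 4x3 rather than the full 4x4 described above; treat this as an illustration, not the only way to do it.

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0],
              [10.0, 11.0, 12.0]])

# Compact SVD: U is 4x3, s holds 3 singular values, Vt is 3x3.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
# NumPy returns singular values in descending order, so the first k are the largest.
B = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Reduced representation of the rows of A with only k columns.
T = U[:, :k] @ np.diag(s[:k])

print(B.shape, T.shape)
print(np.allclose(A, B))
```

This particular A has rank 2 (its rows differ by a constant step), so keeping two singular values already reproduces it up to rounding error.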

Stats

  • Multivariate statistics
    1. variance: s^2 = \frac{1}{n-1} \sum_{i=1}^{n}(x_i-\bar{x})^2, python: var(vector, ddof=1)
    2. standard deviation: s = \sqrt{s^2}, python: std(M, ddof=1, axis=0)
    3. covariance: cov(x,y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y}), python: cov(x,y)[0,1]
    4. correlation: cor(x,y) = \frac{cov(x,y)}{s_x \cdot s_y}, normalized to a value between -1 and 1. python: corrcoef(x,y)[0,1]
    5. PCA
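      1. project high-dimensional data onto a lower-dimensional subspace
      2. steps:
        1. M = mean(A), the mean of each column
        2. C = A - M
        3. V = cov(C), the covariance matrix of the columns
        4. values, vectors = eig(V)
        5. B = select(values, vectors), the eigenvectors ordered by eigenvalue, keeping the largest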
      3. scikit learn

        from sklearn.decomposition import PCA

        pca = PCA(2) # keep two components
        pca.fit(A)
        print(pca.components_) # principal axes (eigenvectors)
        print(pca.explained_variance_) # variance along each axis (eigenvalues)
        B = pca.transform(A) # project A onto the components
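
The manual PCA steps can also be followed line by line in NumPy. This is a sketch on a tiny made-up matrix; the final projection `C @ components` is my reading of the "select" step, not something spelled out in the notes.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])

M = A.mean(axis=0)                    # mean of each column
C = A - M                             # centered data
V = np.cov(C.T)                       # covariance matrix of the columns (divides by n - 1)
values, vectors = np.linalg.eig(V)    # eigenvectors are the columns of `vectors`

# Order the eigenpairs by descending eigenvalue and keep the first k.
order = np.argsort(values)[::-1]
k = 1
components = vectors[:, order[:k]]

B = C @ components                    # projected data, shape (3, 1)
print(values[order])
print(B)
```

The sign of an eigenvector is arbitrary, so the projected values may come out negated compared to another library's result.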
  • Linear Regression
    1. y = X \cdot b, where b is the unknown vector of coefficients
    2. linear least squares (similar to MSE): minimize ||X \cdot b - y||^2 = \sum_{i=1}^{m}\left(\sum_{j=1}^{n}X_{i,j} \cdot b_j - y_i\right)^2, which gives b = (X^T \cdot X)^{-1} \cdot X^T \cdot y. Issue: computing the inverse is very slow for large matrices
    3. MSE with SGD (stochastic gradient descent)
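
A small sketch of the normal equation from item 2, next to `numpy.linalg.lstsq`, which solves the same least squares problem without forming the inverse explicitly. The data is invented and generated from y = 1 + 2x, so both methods should recover those two coefficients.

```python
import numpy as np

# Toy data from y = 1 + 2*x; the first column of X is the intercept term.
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x

# Normal equation: b = (X^T X)^{-1} X^T y
b_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# lstsq avoids the explicit inverse.
b_lstsq, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(b_normal)
print(b_lstsq)
```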

Reference: Basics of Linear Algebra for Machine Learning, Jason Brownlee, https://machinelearningmastery.com/linear_algebra_for_machine_learning/

Learn Django with me(part 2)

Change Database Setting

Open up mysite/settings.py and find the snippet below:

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': os.path.join(BASE_DIR, 'db.sqlite3'),
    }
}

Here you can switch to another database if needed:
* ENGINE – one of 'django.db.backends.sqlite3', 'django.db.backends.postgresql', 'django.db.backends.mysql', or 'django.db.backends.oracle'.
* NAME – the name of the database (for SQLite, the full path to the database file)
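
For example, a PostgreSQL configuration might look like the sketch below. All the values (database name, user, password, host, port) are placeholders I made up, and a PostgreSQL driver such as psycopg2 would need to be installed as well.

```python
# Hypothetical values: replace the name, user, password, host and port with your own.
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydatabase',
        'USER': 'mydatabaseuser',
        'PASSWORD': 'mypassword',
        'HOST': '127.0.0.1',
        'PORT': '5432',
    }
}
```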

We can also change the time zone in the settings file

TIME_ZONE = 'America/Chicago'

To create the related tables in the database, we need to run python manage.py migrate. It creates the tables required by the apps listed in INSTALLED_APPS in settings.py.

To list the newly created tables in SQLite:

python manage.py dbshell
# inside the sqlite shell
.tables

The result will look like this:

sqlite> .tables
auth_group                  auth_user_user_permissions
auth_group_permissions      django_admin_log
auth_permission             django_content_type
auth_user                   django_migrations
auth_user_groups            django_session
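
If the sqlite3 command-line client is not available, the same list can be read from Python with the standard sqlite3 module. This is only a sketch: the helper name `list_tables` is mine, and the demo runs against an in-memory database with two stand-in tables rather than the project's db.sqlite3.

```python
import sqlite3

def list_tables(connection):
    """Return the names of all tables in an SQLite database, sorted alphabetically."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]

# Demo on an in-memory database; for the project, connect to the
# db.sqlite3 file named in settings.py instead.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE auth_user (id INTEGER PRIMARY KEY)")
connection.execute("CREATE TABLE django_session (session_key TEXT)")
tables = list_tables(connection)
print(tables)
connection.close()
```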

Create a model

In the official definition, a model is “the single, definitive source of truth about your data”. In my opinion, this means the data model is defined in one place only, rather than both in the database and in your code.

Let's copy this into webapp/models.py. We create two classes, which become two tables in the database. Each class variable is a field name with its data type: models.CharField is a character field, and models.DateTimeField is a datetime. We can also see a ForeignKey in class Choice, which points to Question.

from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')

    def __str__(self):
        return self.question_text

class Choice(models.Model):
    question = models.ForeignKey(Question, on_delete=models.CASCADE)
    choice_text = models.CharField(max_length=200)
    votes = models.IntegerField(default=0)

    def __str__(self):
        return self.choice_text

To activate the model, we need to add its config class to INSTALLED_APPS. webapp.apps.WebappConfig refers to the WebappConfig class in apps.py in the webapp folder.

INSTALLED_APPS = [
    'webapp.apps.WebappConfig',
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
]

Then we run makemigrations to create migration files.

python manage.py makemigrations webapp
# then you will see something like the following:
Migrations for 'webapp':
  webapp/migrations/0001_initial.py
    - Create model Choice
    - Create model Question
    - Add field question to choice

The migration operations are stored in webapp/migrations/. After running python manage.py migrate, the two new tables appear in the database (check with .schema {tablename}).

Summary of the 3 steps for making model changes:

  • Change your models (in models.py).
  • Run python manage.py makemigrations to create migrations for those changes
  • Run python manage.py migrate to apply those changes to the database.

Play with Shell to add some records into db

To get into the shell, we need to execute

python manage.py shell

Then add some questions and choices

from webapp.models import Choice, Question
# show all questions
Question.objects.all()

# add a new question
from django.utils import timezone
q = Question(question_text="What's new?", pub_date=timezone.now())

# save into database
q.save()
q.id  # id is an attribute, not a method

# search in database, similar to where in sql
Question.objects.filter(id=1) # id=1
Question.objects.filter(question_text__startswith='What')
Question.objects.get(pk=1) # filter with pk

# add some choices, here Django creates a set to hold the "other side" of ForeignKey relation
q.choice_set.create(choice_text='Not much', votes=0)
q.choice_set.create(choice_text='The sky', votes=0)
q.choice_set.create(choice_text='The moon', votes=0)

# delete records
d = q.choice_set.filter(choice_text__startswith='The moon')
d.delete()

Django Admin

Create an admin user with python manage.py createsuperuser; the system will ask you to enter a username, email, and password. After this, we can access the admin website at http://localhost:8000/admin/

To let the admin account add new questions on the website, we need to add the following snippet to admin.py

from django.contrib import admin

from .models import Question

admin.site.register(Question)

Some issues

When I tried to save a question on the admin page, an error popped up: no such table: main.auth_user__old. I am noting it here until I find the reason.

Learn Django with me (part 1)

Although I have used Python for a while, most of the time I use it only for data analysis with pandas or some machine learning packages. Django, one of the most famous web frameworks, has existed for over 12 years. So I decided to learn it step by step with the official tutorial and share my experience with you.

Prepare Django

# install django
sudo pip install Django
# build project
django-admin startproject {name of site}
# run test
# goto project folder, you will find manage.py, run
python manage.py runserver {port}

A quick explanation of some of the files:

  • manage.py: A command-line utility that lets you interact with this Django project in various ways.
  • {name of site}: the Python package for the project
  • mysite/__init__.py: An empty file that tells Python that this directory should be considered a Python package. If you’re a Python beginner, read more about packages in the official Python docs.
  • mysite/settings.py: Settings/configuration for this Django project. Django settings will tell you all about how settings work.
  • mysite/urls.py: The URL declarations for this Django project; a “table of contents” of your Django-powered site. You can read more about URLs in URL dispatcher.
  • mysite/wsgi.py: An entry-point for WSGI-compatible web servers to serve your project. See How to deploy with WSGI for more details.

Create a new app

# add a new app
python manage.py startapp {name of app}

In each app folder, there are three important Python files:

  • urls.py: controls what is served based on URL patterns (not generated by startapp; we create it below)
  • models.py: database structures and metadata
  • views.py: handles what the end user “views” or interacts with

Then we need to add the app to settings.py under the site folder {name of site}

INSTALLED_APPS = [
    '{name of app}',
]

And update urls.py under the same folder

from django.contrib import admin
from django.urls import path, include

urlpatterns = [
    path('admin/', admin.site.urls),
    path('webapp/', include('webapp.urls')),
]

We have completed all file modifications under the site folder. Next we go to the app folder to create urls.py and change views.py.

# create a file named `urls.py` under the app folder
touch urls.py

# copy this into the file; it directs requests to views.py
# path(route, view, kwargs, name)
# @route: URL pattern
# @view: function to be called
# @kwargs: extra arguments passed to the view as a dictionary
# @name: a name for referring to the URL elsewhere
from django.urls import path
from . import views
urlpatterns = [
    path('', views.index, name='index'),
]

# change views.py to
from django.http import HttpResponse
def index(request):
    return HttpResponse("Hello, world. You're at the polls index.")

After all these done, we can access webapp by http://localhost:8000/{name of app}/

Only Programmers Understand

when a new intern debugging

You set a breakpoint successfully, then

Start a unit test

Forget “where” in a SQL

A tiny bug for 10 hours

There is no bug, we pretend

Last mins of the project

Let’s start a multi-thread program

You thought you caught all exceptions

Reduce some unused code

First time you presented a demo to boss

pair programming

Review some code you created one month ago

project went online after a perfect final test

Hello world!

Well, I have to agree that “Hello world” is one of my favorite titles. I bought a Raspberry Pi that is supposed to do some deep learning work, like object detection, so I have to do something with it before my camera arrives from Amazon.

So I guess creating a pro personal website would be a good choice. I referred to “Build a LAMP Web Server with WordPress” and the “official website”. In a nutshell, it is not hard, but some points need care.

Installation:

    1. If you change a password through MySQL directly, you have to convert the password with an MD5 tool first.
    2. If you are unable to upload files, install the GD extension with “sudo apt-get install php7.0-gd” and restart Apache with “sudo service apache2 restart”.
    3. I initially started with the official website guide, but then realized I first needed to install LAMP (Linux, Apache, MySQL, PHP).
    4. Installing plugins might prompt for FTP settings; this can be solved by editing wp-config.php and adding “define('FS_METHOD', 'direct');”
    5. Give www-data ownership of the WordPress folder: “sudo chown -R www-data:www-data /var/www/html”, where /var/www/html is my WordPress folder
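
For the MD5 step in point 1, an online tool is not strictly needed; Python's standard hashlib can produce the same hex string. A minimal sketch, with a helper name (`md5_hex`) and sample password of my own choosing:

```python
import hashlib

def md5_hex(password):
    """Return the MD5 digest of a password as a 32-character hex string."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()

print(md5_hex("hello"))
```

The resulting string is what goes into the password column. MD5 is used here only because the installation note calls for it; it is not a strong way to store passwords in general.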

Backup:

  1. UpdraftPlus would be a good choice; it provides several remote storage options, including Google Drive.

After almost 2 hours of messing around, I finally created this website on the Raspberry Pi!