This post explains how to schedule your code to run monthly, weekly, or daily, which is very useful for data monitoring, scraping, or automatic reports. On Mac and Linux, cron is usually used for this purpose. Windows users can work with Task Scheduler, using the command line or dedicated R packages.
Full information on the theory of principal component analysis can be found here. This article is about practice in R. It covers the main steps of data preprocessing, compares R results with theoretical calculations, shows how to analyze principal components, and uses them for dimensionality reduction. The last section is devoted to modelling with principal components and comparing it to LDA.
Principal component analysis (PCA) is an unsupervised method that generates, from a large set of variables in a data set, components that represent combinations of features capturing as much information in the data as possible. Or, in Wikipedia's words:
A statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.
If A is symmetric, then A is orthogonally diagonalizable and has only real eigenvalues. In other words, there exist real numbers λ_1, …, λ_n (the eigenvalues) and orthogonal, non-zero vectors v_1, …, v_n (the eigenvectors) such that for each i = 1, 2, …, n: A v_i = λ_i v_i.
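This property is easy to verify numerically. A minimal sketch (the 2×2 matrix below is an illustrative choice): for a symmetric matrix, NumPy's np.linalg.eigh returns real eigenvalues and orthonormal eigenvectors, and each pair satisfies A v_i = λ_i v_i.

```python
import numpy as np

# A symmetric matrix: orthogonally diagonalizable with real eigenvalues.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh is specialized for symmetric (Hermitian) matrices.
eigenvalues, eigenvectors = np.linalg.eigh(A)

# Check A v_i = lambda_i v_i for every eigenpair.
for i in range(len(eigenvalues)):
    v = eigenvectors[:, i]
    assert np.allclose(A @ v, eigenvalues[i] * v)

print(eigenvalues)  # real eigenvalues, in ascending order
```

For this matrix the eigenvalues are 1 and 3, and the eigenvectors are orthogonal, exactly as the theorem promises.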
import numpy as np
a = np.array([[1, 2, 3], [2, 5, 6], [6, 7, 4]])
b = np.eye(3) # 3x3 identity matrix, matching a's shape so a.dot(b) works below
c = np.ones((7, 5))
d = np.zeros((7, 5)) # tuple as argument!
v = np.arange(0, 24, 2) # start, stop, step as arguments
d = v.reshape((3, 4)) # reshape tells dim of matrix
print (d[2, 1]) # print an element
print (d[[1, 0], [2, 3]]) # print two elements
print(d[1,:]) # print a row
print(d[:, 3]) # print a column
r1 = np.dot(a, b)
r2 = a.dot(b)
r = a * b # elementwise (coordinate-wise) multiplication, not a matrix product
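These NumPy basics are enough to sketch PCA itself. The outline below is a minimal illustration on synthetic data (the random data and the choice of two components are assumptions for the example, not part of the article's R workflow): standardize the columns, build the covariance matrix, take its eigendecomposition, and project onto the leading eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: 100 observations, 3 features; the third is nearly a copy of the first.
X = rng.normal(size=(100, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=100)

# 1. Center and scale each column.
Z = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized data.
C = np.cov(Z, rowvar=False)

# 3. Eigendecomposition: eigenvectors are the principal directions.
eigenvalues, eigenvectors = np.linalg.eigh(C)
order = np.argsort(eigenvalues)[::-1]            # sort by explained variance
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 4. Project the data onto the first two principal components.
scores = Z @ eigenvectors[:, :2]
print(eigenvalues / eigenvalues.sum())  # proportion of variance per component
```

Because the first and third features are strongly correlated, the first component absorbs most of the variance, which is exactly the dimensionality-reduction effect discussed above.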