<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<link rel="stylesheet" href="styles.css">
<script src="page-updater.js"></script>
<script src="libraries/p5.min.js"></script>
<script src="libraries/load-mathjax.js" async></script>
<script src="main.js"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjs/9.5.1/math.min.js" integrity="sha512-7+fUzDKxopLeVKiXTdoQQZBl6Zh9Bbl/NrZoowiddStpj7GXTUCM+LOPay4Wzxz14HazsoSsO96UFvvZqAH5rw==" crossorigin="anonymous" referrerpolicy="no-referrer"></script>
<link href="https://fonts.googleapis.com/css2?family=Roboto:ital,wght@0,100;0,300;0,400;0,500;0,700;1,300;1,400;1,500;1,700&display=swap" rel="stylesheet">
<link href="https://fonts.googleapis.com/css2?family=Open+Sans:ital,wght@0,300;0,400;0,500;1,300;1,400;1,500&family=Roboto:ital,wght@0,100;0,300;0,400;0,500;0,700;1,300;1,400;1,500;1,700&display=swap" rel="stylesheet">
<title>Track an object's direction using PCA</title>
</head>
<body>
<div class="header" style="background:#067790;width:100%;height:200px;margin-bottom:50px;display:flex;flex-direction: column;justify-content: center;align-items: center;font-family: Open Sans;">
<h1 style="letter-spacing: -1px;padding-bottom: 6px;font-size: 35px;color: #fff;">Track an object's direction using PCA</h1>
<h2 style="font-weight: 400;color: hsl(202, 85%, 87%);">Iwan Bnlf's Github</h2>
</div>
<div class="blog_wrapper">
<div class="introduction_wrapper">
<h2>Introduction</h2>
<p style="font-style:italic;color:#3d3d3d">
"It all begins with a simple picture and its subject. The later, lost in the image can't seem to find its direction.
As it yells for help, a bold hero emerges from the land of mathematics to rescue him, and his name was PCA."
</p>
<p>
All jokes aside, mathematics tends to have a beautiful relationship with image processing, and PCA-based image tracking is one remarkable example.
In today's post, we'll try to understand the intuition behind PCA and its application in image processing to find an object's direction.
</p>
</div>
<div class="isolating_pixel" style="position: relative;">
<h2>Isolating the relevant data</h2>
<p>Let's set our problem in the middle of a parking lot with one parked red car.</p>
<img src="Pictures/red_car.jpg" style="width:60%;height:60%;position:relative;margin-left:50%;transform:translate(-50%);padding: 15px 0 15px 0;">
<p>Our goal here is to find this car's direction and position and, further, to track those two values over time (if we take
a video of the moving car as an input).
</p>
<p>Therefore, our first task is to isolate the pixels that are meaningful to us, here the car's pixels. There exist many methods to do so,
like <a href="https://en.wikipedia.org/wiki/Otsu%27s_method">Otsu Thresholding</a>, <a href="https://en.wikipedia.org/wiki/K-means_clustering">K-means Clustering</a>, <a href="https://en.wikipedia.org/wiki/Canny_edge_detector">Canny Edge Detection</a>, or even <a href="https://en.wikipedia.org/wiki/Blob_detection">Blob Detection</a>.
The only issue with these algorithms is that we still have to manually point out which of the isolated regions is the one we want to track. This can easily be overcome with any kind of classification algorithm.
</p>
<p>
However, if we take as an input a video where the background is static, isolating the moving element can be easily done with the algorithm described below.
</p>
<img src="Pictures/two_cars.gif" style="width:75%;height:75%;position:relative;margin-left:50%;transform:translate(-50%);padding: 15px 0 5px 0;">
<button type="button" class="collapsible">Video Isolating Algorithm</button>
<div class="content" style="display: none;">
If our input is a video composed of \(n\) frames, we can refer to each of those frames as \(F_i, \ i \in [1,n]\). First, we convert each frame to its grayscale image \(G_i\) using the following formula:
$$
\normalsize{
\begin{gather}
G_i(x,y) = 0.2126 \cdot R(F_i(x,y)) + 0.7152 \cdot G(F_i(x,y)) + 0.0722 \cdot B(F_i(x,y)) \\
x \in [1,width], \quad y \in [1,height]
\end{gather}
}
$$
Now, to compute the motion \(M_i\) between two consecutive frames, we only have to compute their absolute difference pixel by pixel:
$$
M_i = \lvert G_{i+1} - G_i \rvert
$$
Finally, to remove noise and keep only the most relevant values, a simple threshold can be applied to find our motion regions \(R_i\):
\begin{equation}
R_i(x,y) =
\begin{cases}
M_i(x,y), \quad M_i(x,y) > t \\
0, \quad M_i(x,y) \leq t
\end{cases}
\end{equation}
</div>
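<p style="padding-top:15px">
To make this concrete, here is a minimal sketch of the isolating algorithm in plain JavaScript. It assumes each frame is available as an <code>ImageData</code> object (grabbed from a canvas, for instance); the function names and the threshold value are illustrative, not part of this page's scripts.
</p>
<pre><code>
// Convert one RGBA ImageData frame to a grayscale array G_i.
function toGrayscale(frame) {
  const { data, width, height } = frame;
  const gray = new Float32Array(width * height);
  for (let i = 0; i < gray.length; i++) {
    const r = data[4 * i], g = data[4 * i + 1], b = data[4 * i + 2];
    gray[i] = 0.2126 * r + 0.7152 * g + 0.0722 * b;
  }
  return gray;
}

// Motion region R_i between two consecutive frames: absolute
// difference M_i, then a threshold t to remove noise.
function motionRegion(frameA, frameB, t = 30) {
  const gA = toGrayscale(frameA);
  const gB = toGrayscale(frameB);
  const region = new Float32Array(gA.length);
  for (let i = 0; i < gA.length; i++) {
    const m = Math.abs(gB[i] - gA[i]);
    region[i] = m > t ? m : 0;
  }
  return region;
}
</code></pre>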
</div>
<div class="covariance_wrapper">
<h2>What is covariance, and what does it tell us?</h2>
<p>Before going any further, we need to introduce and understand some very useful statistical tools.</p>
<h3>Variance</h3>
<p>
Let's say you are looking at the grades students received on their last exam. Those grades are saved in a 1D array, which you can plot on a single line.
</p>
<img src="Pictures/grade_1.png" style="width:60%;height:60%;position:relative;margin-left:50%;transform:translate(-50%);padding: 10px;">
<p>
Now we ask ourselves the following question: how can we compute how well the students performed on this test? The first idea would be to compute the grades' mean value with the following formula.
</p>
<span>$$\large{\bar{x} = \frac{1}{n}\sum_{i = 1}^n x_i}$$</span>
<p>
However, this value isn't relevant enough on its own to show how well the students performed, because the same mean can be obtained from these two very different distributions:
</p>
<img src="Pictures/grade_2.png" style="width:60%;height:60%;position:relative;margin-left:50%;transform:translate(-50%);padding: 10px">
<p>
So, to really get a picture of the classroom's performance, you also need to compute how the grades are distributed around the mean value: that is variance.
</p>
<span>
$$\large{Var(X) = \color{orange}\frac{1}{n} \color{green}\sum_{i = 1}^{n} \color{red} (\bar{x} - x_i)\color{blue}^2}$$
</span>
<p style="padding-bottom: 20px">
To construct variance, we compute for each point its <span style="color:#ff0000;font-style: italic;font-weight: bold;">"error" relative to the mean value</span>. We then <span style="color: blue;font-style: italic;font-weight: bold;">square</span> this value and <span style="color: green;font-style:italic;font-weight: bold;">sum</span> the results. Squaring each error first ensures that all the terms are positive (and thus that none of them
cancel each other out), but mostly, it "penalizes" the biggest errors.
Finally, we <span style="color:orange;font-style: italic;font-weight: bold;">divide the total by the number of points</span> in the array.
</p>
<h3>Covariance</h3>
<p>
Now, let's say that you are looking at two linked sets of data, for instance the age of a person and the amount of money they spend on groceries.
This data can be stored in a 2D array, or plotted on a 2D plane.
</p>
<img src="Pictures/money_age.png" style="width:65%;height:65%;position:relative;margin-left:50%;transform:translate(-50%);padding: 10px">
<p>
If variance tells you how values are distributed around their mean, then the covariance of distributions \(X\) and \(Y\) tells you how variations in \(X\) affect \(Y\) (and vice versa).
</p>
<div style="display:flex;flex-direction:column;align-items: center;">
<span style="font-family: Roboto;font-size:18px;font-weight:400;padding:15px 0 15px 0">↓ Draw a distribution below and watch its covariance ↓</span>
<div class="covariance_p5_wrapper" style="display:flex; flex-direction: row">
<div id="computing_covariance"></div>
<div style="display:flex;flex-direction:row;align-items: center;margin-left:15px">
<span>$$ \large Cov = $$</span>
<span id="covariance_output" style="font-size: 21px;padding-left:8px"></span>
</div>
</div>
<button style="margin-top:10px" onclick="reset_covariance_boolean = true">Reset</button>
</div>
<p>
The covariance formula doesn't differ much from the variance formula: instead of squaring each point's error term for a single distribution, we multiply each point's \(X\) error term by its \(Y\) error term.
</p>
<span>
$$\large{Cov(X,Y) = \frac{1}{n}\sum_{i = 1}^{n}(\bar{x} - x_i)(\bar{y} - y_i)}$$
</span>
<p>
Silly but important fact: \(Var(X) = Cov(X,X)\)
</p>
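<p>
As a quick sanity check, here is a minimal sketch of these formulas in plain JavaScript (the function names are illustrative, not part of this page's scripts):
</p>
<pre><code>
const mean = (a) => a.reduce((s, v) => s + v, 0) / a.length;

// Cov(X,Y): for each point, multiply the X error term by the Y error term.
function cov(xs, ys) {
  const xBar = mean(xs), yBar = mean(ys);
  let sum = 0;
  for (let i = 0; i < xs.length; i++) {
    sum += (xBar - xs[i]) * (yBar - ys[i]);
  }
  return sum / xs.length;
}

// The silly but important fact: Var(X) = Cov(X,X).
const variance = (xs) => cov(xs, xs);

// Ages vs. grocery spending: a positive covariance means the two
// quantities tend to grow together.
console.log(cov([20, 30, 40, 50], [150, 180, 220, 260])); // 462.5
</code></pre>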
</div>
<div class="linear_algebra_class">
<h2>Computing the direction</h2>
<p>
Now, let's go back to our picture of the red car in a parking lot. What we want to compute is the direction and position of this cloud of pixels. Now that we know about variance and covariance, we can treat that cloud of pixels as two joint distributions (the distribution
of the x-coordinates, and the one of the y-coordinates). Therefore, computing their variances and covariances will tell us how a change in the x-axis distribution affects the y-axis distribution (and vice versa), which is another way to formulate a direction!
</p>
<p>
So now the problem boils down to: how do we compute the direction of our "pixel blob" from variance and covariance alone?
</p>
<h3>The power of linear algebra</h3>
<p>The answer is to "throw" our covariances into a matrix and compute its eigenvectors. If you are not familiar with linear algebra, those words
might seem overwhelming; and if you are, you might not see the intuition behind the process. In either case, the next paragraphs will explain everything.
</p>
<h4>Matrices & linear maps</h4>
<p>
A matrix can be written as an array of values. Matrices can be of any shape, but the most common and useful ones are square matrices. Here, we will only look at 2-by-2 matrices.
</p>
<span>
$$\large{
\begin{pmatrix}
a & b \\ c & d
\end{pmatrix}
}$$
$$\textrm{Example of a simple 2 by 2 square matrix}$$
</span>
<p>
Now, let's picture a 2D plane where vectors can be drawn. These vectors can be represented as a 2-by-1 array containing their \(x\) and \(y\) coordinates.
But most importantly, you can think of each point of the plane as the vector going from \((0,0)\) to that point's coordinates.
</p>
<p>
Now if you want, you can create a function (or linear map) \(f\) that takes a vector \(V\) as input and gives \(f(V)\) as output:
</p>
<span>
\begin{gather}
\large
f : (x,y) \rightarrow
\begin{pmatrix}
x + y \\
2x - 3y
\end{pmatrix}
\end{gather}
</span>
<button class="collapsible">Further explanations</button>
<div class="content" style="display: none;">
Before discussing vectors and matrices, we must first introduce the <a href="https://en.wikipedia.org/wiki/Vector_space">vector space</a> in which we will work. Here, we pick \(\mathbb{R}^2\), meaning that the elements of our
vector space are "2D vectors":
$$
\text{let } v \in \mathbb{R}^2, v =
\begin{pmatrix}
v_1 \\ v_2
\end{pmatrix}, (v_1,v_2) \in \mathbb{R} \times \mathbb{R}
$$
Just like vectors, linear maps also have a formal definition: let \(E\) and \(F\) be two vector spaces, and \(f:E \rightarrow F\) a function. \(f\) is a linear map if:
\begin{gather}
\forall x,y \in E, \quad \forall \alpha,\beta \in \mathbb{R} \\
f(\alpha x + \beta y) = \alpha f(x) + \beta f(y)
\end{gather}
</div>
<p>
While this definition of a linear map works very well, another way to represent it is by building one of its representative matrices.
</p>
<span>
$$
\text{Let } B = (x_1,x_2) \ \text{where } x_1 =
\begin{pmatrix}
1 \\ 0
\end{pmatrix},
x_2 =
\begin{pmatrix}
0 \\ 1
\end{pmatrix}
$$
$$
A = (f(x_1),f(x_2)) =
\begin{pmatrix}
1 & 1 \\
2 & -3
\end{pmatrix}
$$
</span>
<button class="collapsible">Further explanations</button>
<div class="content" style="display:none">
Let \(E\) be a vector space such that \(\dim(E) = n\) \((n \in \mathbb{N})\), and let \(V = (v_i)_{i \in [1,n]}\) be a basis of \(E\). The fact that \(V\) is a basis of \(E\) means in particular that:
$$
\forall v \in E, \exists (\alpha_1, \alpha_2, ..., \alpha_n) \in \mathbb{C}^n, \quad v = \alpha_1 v_1 + ... + \alpha_n v_n
$$
Now that everything is clearly defined, we can introduce the concept of a representative matrix. Let \(f:E \rightarrow E\) be a linear map. A simpler and more universal way to represent \(f\) is by computing one of its representative matrices.
I specify <span style="font-weight: 600;">"one of its representative matrices"</span> because every representative matrix is linked with one and only one basis of \(E\).
<p style="padding-bottom: 0;">
Therefore, the representative matrix \(F\) of \(f\) in the \(V\) basis is defined as follows:
$$
F = Mat_V(f) = (f(v_1),\dots,f(v_n))
$$
Moreover, if we have \(F\) and \(F'\), two different representative matrices of \(f\), the two matrices are linked by the following formula:
\begin{gather}
F = P^{-1} \cdot F' \cdot P \\
\small{
F = Mat_V(f), \quad F' = Mat_{V'}(f)
}
\end{gather}
Where \(P = P_V^{V'}\) is the transfer matrix between the bases \(V\) and \(V'\). This matrix simply expresses each vector of the basis \(V\) as a linear combination of the vectors of the basis \(V'\).
</p>
</div>
<p>
So now, to compute a vector's image through the linear map \(f\), you only have to multiply the vector by the linear map's representative matrix:
</p>
<span>
\begin{gather}
\large
\text{Let } v \in E, \text{ then: }
A \cdot v = f(v)
\end{gather}
</span>
<p>
Therefore, matrices can be thought of as transformation operators that can change an object's shape and orientation.
</p>
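<p>
Since this page already loads <a href="https://mathjs.org/">math.js</a>, applying such a matrix to a vector takes a single call. A minimal sketch, using the linear map \(f\) from above:
</p>
<pre><code>
// Representative matrix of f : (x,y) -> (x + y, 2x - 3y)
// in the standard basis.
const A = [[1, 1], [2, -3]];

// Multiplying the matrix by a vector evaluates the linear map.
const v = [2, 1];
console.log(math.multiply(A, v)); // [3, 1], i.e. f(2,1)
</code></pre>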
<span style="font-family: Roboto;font-size:18px;font-weight:400;display: flex;justify-content: center;">↓ Change the matrix and watch the shape change ↓</span>
<div id="apply_matrix_wrapper" style="display:flex;flex-direction:row;padding: 15px 0 20px 0">
<div id="apply_matrix"></div>
<div style="width:100vw;display:flex;flex-direction:column;justify-content: center;align-items: center;">
<div style="display: flex;flex-direction: column;" class="matrix_wrapper">
<div style="display: flex;flex-direction: row;">
<input id="a1" type="text" value="2">
<input id="b1" type="text" value="0">
</div>
<div style="display: flex;flex-direction: row;">
<input id="c1" type="text" value="0">
<input id="d1" type="text" value="1">
</div>
</div>
<button class="apply" style="margin-top:15px">Apply matrix</button>
<button class="reset" style="margin-top:10px">Reset circle</button>
</div>
</div>
<h4>Eigenvectors and diagonalizable matrices</h4>
<p>
But now, what are eigenvectors? Well, eigenvectors are defined by the following expression:
</p>
<span>
$$\large{
\begin{gather}
A \cdot V = \lambda V \\
V \in \mathbb{R}^n \setminus \{0\}, \quad A \in \mathcal{M}_{n,n}, \quad \lambda \in \mathbb{C}
\end{gather}}$$
</span>
<p>
If \(V\) is a non-null eigenvector of the matrix \(A\), and \(\lambda\) a real or complex value, then passing \(V\) through \(A\) doesn't change the axis \(V\) points along:
it only scales \(V\) by a factor of \(\lambda\), which we call an eigenvalue.
</p>
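<p>
math.js can find these for us too. A minimal sketch, assuming a math.js version whose <code>math.eigs</code> returns <code>{values, vectors}</code> with the eigenvectors as matrix columns (as documented for the 9.x build loaded here):
</p>
<pre><code>
// A matrix that stretches the x-axis by 2 and leaves the y-axis alone.
const M = [[2, 0], [0, 1]];

const { values, vectors } = math.eigs(M);
console.log(values);  // [1, 2]: the eigenvalues
console.log(vectors); // columns (0,1) and (1,0): the eigenvectors
</code></pre>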
<p>
Now, if we go back to the example above and highlight the eigenvectors of the matrix in red, we can see that they are the only vectors that do not change direction but only stretch.
</p>
<span style="font-family: Roboto;font-size:18px;font-weight:400;display: flex;justify-content: center;padding-top: 10px">↓ Change the matrix and watch the eigenvectors ↓</span>
<div id="apply_matrix_arrow_wrapper" style="display: flex;flex-direction: row; padding: 15px 0 20px 0">
<div id="apply_matrix_arrow"></div>
<div style="display: flex;flex-direction:column;justify-content: center;align-items: center;width: 100vw;">
<div style="display: flex;flex-direction: column;" class="matrix_wrapper">
<div style="display: flex;flex-direction: row;">
<input id="a2" type="text" value="2">
<input id="b2" type="text" value="0">
</div>
<div style="display: flex;flex-direction: row;">
<input id="c2" type="text" value="0">
<input id="d2" type="text" value="1">
</div>
</div>
<button class="apply" style="margin-top:15px">Apply matrix</button>
<button class="reset" style="margin-top:10px">Reset circle</button>
<div style="margin-top: 10px;">
<input type="range" id="speed" name="speed" min="1" max="10" onchange="animation_speed=this.value">
<label for="speed" style="font-family: Roboto;font-size: 14px;">Speed</label>
</div>
<div style="margin-top: 10px">
<input type="checkbox" id="eigenvectors" name="eigenvectors" onclick="eigenvectors_check = !eigenvectors_check">
<label for="eigenvectors" style="font-family: Roboto;font-size: 14px;">Show Eigenvectors</label>
</div>
</div>
</div>
<h4>The spectral theorem</h4>
<p>
One last important fact about square matrices is what we call the spectral theorem. This theorem states:
\begin{gather}
\text{Every real symmetric matrix is diagonalizable,} \\
\text{and its eigenvectors are pairwise orthogonal.}
\end{gather}
</p>
<button class="collapsible">Further explanations</button>
<div class="content" style="display:none">
To get a broader understanding of the spectral theorem, we will first define all the terms used above.
First, the formal definition of a symmetric matrix is the following: if $A$ is a symmetric matrix, then:
\begin{gather}
A = A^\intercal
\end{gather}
Second, a matrix $A$ is called diagonalizable if the eigenvectors of $A$ form a basis of $E$ ($E$ being the vector space in which $A$ is "defined").
<p>
Finally, two distinct vectors $v$ and $u$ are orthogonal if:
\begin{gather}
\langle u , v \rangle = 0
\end{gather}
In two dimensions, this relation means that the two vectors are perpendicular.
</p>
</div>
<p>
To put it in a nutshell, this means that every real symmetric matrix is guaranteed to have eigenvectors, that those eigenvectors are orthogonal, and that the matrix transforms the space along them.
</p>
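<p>
We can watch the spectral theorem at work with a quick console check, under the same <code>math.eigs</code> assumptions as above:
</p>
<pre><code>
// A real symmetric matrix that is not already diagonal.
const A = [[2, 1], [1, 2]];

const { vectors } = math.eigs(A);
const v1 = [vectors[0][0], vectors[1][0]]; // first eigenvector (column 0)
const v2 = [vectors[0][1], vectors[1][1]]; // second eigenvector (column 1)

// Orthogonal eigenvectors: their inner product is (numerically) zero.
console.log(math.dot(v1, v2)); // ~0
</code></pre>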
<h3 style="padding-top: 20px">Putting it all together</h3>
<p>
Now, let's go back to our variance and covariance. As we've seen before, variance and covariance tell us how a cloud of points is shaped. Thus, thanks to our understanding of linear algebra, we can define the covariance matrix as follows:
</p>
<span>
$$
\large{
C_{X,Y} =
\begin{pmatrix}
Cov(X,X) & Cov(X,Y)\\
Cov(Y,X) & Cov(Y,Y)
\end{pmatrix} =
\begin{pmatrix}
Cov(X,X) & Cov(X,Y)\\
Cov(X,Y) & Cov(Y,Y)
\end{pmatrix}}
$$
</span>
<p>
This covariance matrix can be thought of as the transformation that will "shape" any circle distribution into the distribution that defined the covariance matrix.
</p>
<button class="collapsible">Further explanations</button>
<div class="content" style="display:none">
Let's denote $D$ our circle distribution (meaning all the vectors that compose the circle) and $T$ the matrix that will transform the circle distribution into the target distribution $D'$.
What we would like is to decompose $T$ as the product of a rotation matrix $R$ and a scaling matrix $S$:
\begin{gather}
D' = T \cdot D \\
T = R \cdot S \\
\text{with} \
R =
\begin{pmatrix}
\cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta)
\end{pmatrix}
\ \text{and} \
S =
\begin{pmatrix}
\alpha_1 & 0 \\
0 & \alpha_2
\end{pmatrix}
\end{gather}
Now, if we compute the covariance matrix $C$ of the target distribution $D'$, thanks to the spectral theorem, we can rewrite it as:
\begin{gather}
C = A B A^{-1} \\
\text{with } B =
\begin{pmatrix}
\lambda_1 & 0 & 0 \\
0 & \ddots & 0 \\
0 & 0 & \lambda_n
\end{pmatrix}
\end{gather}
Thanks to our previous knowledge, we can intuitively see the link between $C$ and $T$:
\begin{gather}
C = R \cdot S^2 \cdot R^{-1} \\
\small{A = R, \quad B = S^2}
\end{gather}
<span style="font-style:italic;color:#3d3d3d">The square comes from analysing distributions where covariance was null: there, the scaling factors are the standard deviations, i.e. the square roots of the variances on the diagonal of $C$.</span>
<p>
Therefore, we can simply compute the transformation matrix $T$ by finding $C$'s eigenvalues and eigenvectors (using an SVD algorithm, for instance) and then constructing $R = A$ and $S = \sqrt{B}$.
</p>
<p>
In a nutshell, $C$ is not, properly speaking, the matrix "that shapes any circle distribution into the distribution it's defined from", but it is the matrix from which we can derive the transformation matrix $T$.
</p>
</div>
<p style="padding-top:25px">
Thus the eigenvectors of the covariance matrix give us the axes along which the transformation happens, also called the direction (because \(C_{X,Y}\) is a real symmetric matrix).
</p>
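<p>
Putting everything into code: given the coordinates of the isolated pixels, the direction is the eigenvector of the covariance matrix with the largest eigenvalue. A minimal sketch under the same <code>math.eigs</code> assumptions (the function name is illustrative):
</p>
<pre><code>
// xs, ys: coordinates of the isolated pixels (from motionRegion, for instance).
function objectDirection(xs, ys) {
  const n = xs.length;
  const xBar = xs.reduce((s, v) => s + v, 0) / n;
  const yBar = ys.reduce((s, v) => s + v, 0) / n;

  // Build the (symmetric) covariance matrix of the pixel cloud.
  let cxx = 0, cxy = 0, cyy = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - xBar, dy = ys[i] - yBar;
    cxx += dx * dx; cxy += dx * dy; cyy += dy * dy;
  }
  const C = [[cxx / n, cxy / n], [cxy / n, cyy / n]];

  // Eigenvalues come back in ascending order, so the last column
  // of `vectors` is the main axis of the cloud.
  const { vectors } = math.eigs(C);
  const dir = [vectors[0][1], vectors[1][1]];

  return {
    position: [xBar, yBar],            // centroid of the cloud
    angle: Math.atan2(dir[1], dir[0]), // direction in radians
  };
}
</code></pre>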
<span style="font-family: Roboto;font-size:18px;font-weight:400;display: flex;justify-content: center;padding-top: 15px;">↓ Draw a filled shape to find its direction ↓</span>
<div style="align-items: center;display: flex;flex-direction:column;padding:15px">
<div id="drawing_script"></div>
<button style="margin-top:10px" id="reset_canvas" onclick="reset_canvas_boolean = true">Reset canvas</button>
</div>
</div>
<div class="conclusion_wrapper">
<h2>Conclusion</h2>
</div>
</body>
</html>