\chapter{Matrix multiplication}
So far, we've multiplied scalars by vectors
(p.~\pageref{scalarVectorMultiplication}), vectors by other vectors
(p.~\pageref{dotProduct}), scalars by matrices
(p.~\pageref{scalarMatrixMultiplication}), and even matrices by vectors
(p.~\pageref{matrixVectorMultiplication}). The only thing we haven't done yet
is multiply one entire matrix by another. That mysterious operation is the
subject of this chapter.
Luckily, we've already set ourselves up for success. As it will turn out,
matrix-matrix multiplication is really just matrix-\textit{vector}
multiplication ``in a loop''; \textit{i.e.}, repeated several times.
\section{When it's legal and what you get}
But let's not get ahead of ourselves. First, let's outline the very curious
rules for (1) when two matrices \textit{can} be multiplied at all (often they
can't), and (2) if they can, what the dimensions of the result are. These rules
will surprise you at first (they certainly did me).
Let's say we have two matrices called $A$ and $B$. Suppose that $A$ is an
$m\times n$ matrix ($m$ rows and $n$ columns), and that $B$ is a $p\times q$
matrix. Visually, here's what we've got:
\makeatletter
\newcommand\makebig[2]{%
\@xp\newcommand\@xp*\csname#1\endcsname{\bBigg@{#2}}%
\@xp\newcommand\@xp*\csname#1l\endcsname{\@xp\mathopen\csname#1\endcsname}%
\@xp\newcommand\@xp*\csname#1r\endcsname{\@xp\mathclose\csname#1\endcsname}%
}
\makeatother
\makebig{Biggg} {3.5}
\makebig{Bigggg} {4.0}
\makebig{Biggggg} {4.5}
\vspace{-.15in}
\begin{align*}
m \Biggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\vdots & \vdots & \cdots & \vdots \\
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\end{bmatrix}}^n \ \ \smallblackcircle \ \
p \Biggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\vdots & \vdots & \cdots & \vdots \\
\smallblacksquare & \smallblacksquare & \cdots & \smallblacksquare \\
\end{bmatrix}}^q
\ \ = \ \ \text{\LARGE ?}
\end{align*}
\vspace{-.15in}
\smallskip
Here are the rules:
\begin{center}
\begin{framed}
\begin{compactenum}
\label{matMultRules}
\item $n$ must be equal to $p$, or you can't multiply the matrices at all.
\item If $n$ does equal $p$, then you'll get an $m\times q$ matrix when you
multiply them.
\end{compactenum}
\end{framed}
\end{center}
Those rules are so strange and unexpected that it's worth taking a long moment
to stare at both the matrices and the rules and try to digest them.
\smallskip
Some concrete examples:
\begin{enumerate}
\itemsep.5em
\item Can we multiply a $3\times 2$ matrix by a $2\times 4$? Yes, since $n=2$
and $p=2$. And our result will be a $3\times 4$:
\vspace{-.15in}
\begin{align*}
3 \Biggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^2 \ \ \cdot \ \
2 \Bigg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^4
\ \ = \ \
3 \Biggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^4.
\end{align*}
\vspace{-.15in}
\item Can we multiply a $2\times 5$ matrix by a $5\times 3$? Yes, since $n=5$
and $p=5$. And we get a $2\times 3$:
\vspace{-.15in}
\begin{align*}
2 \Bigg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare& \smallblacksquare& \smallblacksquare& \smallblacksquare \\
\smallblacksquare & \smallblacksquare& \smallblacksquare& \smallblacksquare& \smallblacksquare \\
\end{bmatrix}}^5 \ \ \cdot \ \
5 \Biggggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^3
\ \ = \ \
2 \Bigg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^3.
\end{align*}
\vspace{-.15in}
\item Can we multiply a $4\times 3$ matrix by another $4\times 3$? No, since
$n=3$ but $p=4$. Sorry.
\vspace{-.15in}
\begin{align*}
4 \Bigggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^3 \ \ \smallblackcircle \ \
4 \Bigggg\{
\overbrace{
\begin{bmatrix}
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\smallblacksquare & \smallblacksquare & \smallblacksquare \\
\end{bmatrix}}^3
\ \ = \ \ \text{NOPE}.
\end{align*}
\vspace{-.15in}
\end{enumerate}
It's sooo bizarre. Sometimes you multiply two biggish matrices together and get
a small one; sometimes you multiply narrow ones and get a tall one; sometimes
it seems like you'd get a valid answer and yet there is none.
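By the way, if you know a little Python, the NumPy library enforces exactly
these rules, so you can experiment with them yourself. A quick sketch (assuming
NumPy is installed; \texttt{@} is Python's matrix multiplication operator):
\begin{verbatim}
import numpy as np

A = np.random.rand(3, 2)   # a 3x2 matrix
B = np.random.rand(2, 4)   # a 2x4 matrix
print((A @ B).shape)       # (3, 4) -- rule 2 in action

C = np.random.rand(4, 3)
D = np.random.rand(4, 3)
C @ D                      # ValueError! n=3 but p=4, so NOPE
\end{verbatim}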
Anyway, now that we have the ground rules for what the resulting matrix will be
shaped like (if there even is one), let's talk about actually calculating the
entries. I'm going to give you \textit{three} different ways to think about
this, each of which sheds a different light on the operation.
\section{Way \#1: Lather, rinse, repeat}
\index{matrix-vector multiplication}
The first way is to view the matrix multiplication $A \cdot B$ as
\textbf{repeated matrix-vector multiplication}, where the matrix is $A$ and the
vectors are the \textbf{columns} of $B$. The final answer is formed by
stitching together the results of the individual matrix-vector multiplications.
Let's see it in action. If you remember the procedure on
p.~\pageref{matrixVectorMultiplication}, you can confirm that if we perform
this matrix-vector multiplication:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
0 \\ 0 \\ 7 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
we'll get the answer
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
35 \\ -14 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
\smallskip
And if we do this:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
9 \\ 99 \\ 999 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
we'll get this:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
5112 \\ -1701 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
\smallskip
Finally, if we do this:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
-13 \\ -13 \\ -13 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
we'll get this:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
-104 \\ -13 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
\smallskip
Notice what I did there. I took the \textit{same} $2\times 3$ matrix each time,
and multiplied it by some vector -- a weird one, to help jog your memory in a
moment -- to get an answer.
All right. Now let's see what happens if I perform the following
matrix-\textit{matrix} multiplication:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
0 & 9 & -13 \\
0 & 99 & -13 \\
7 & 999 & -13 \\
\end{bmatrix} \ = \ {\LARGE ?}
\end{align*}
\vspace{-.15in}
Examine the columns of the right-hand matrix: they should ring a bell. Each
\textit{column} is one of the \textit{vectors} that we just multiplied our
matrix by to get a columnar answer. The result of this operation is achieved by
simply putting all those columnar answers together:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
0 & 9 & -13 \\
0 & 99 & -13 \\
7 & 999 & -13 \\
\end{bmatrix} =
\begin{bmatrix}
35 & 5112 & -104 \\
-14 & -1701 & -13 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
\index{Bond, James}
See how that works? The result of the multiplication is just the three
individual matrix-vector products, all concatenated together in an ``answer
matrix.'' The left column of our answer is ${\scriptsize \begin{bmatrix} 35 \\
-14 \\ \end{bmatrix}}$, which is exactly what we got when we multiplied that
left-hand matrix by James Bond. The right column of our answer is ${\scriptsize
\begin{bmatrix} -104 \\ -13 \\ \end{bmatrix}}$, which is what we got when we
multiplied the matrix by triple $-13$'s. And the middle column of the answer is
the matrix times the stack of nines. So you can see that matrix-\textit{matrix}
multiplication is really just repeated matrix-\textit{vector} multiplication.
This way of thinking about matrix multiplication might be the one that
resonates most strongly with you. (It did for me.)
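In fact, you can carry out this ``lather, rinse, repeat'' procedure directly in
code. Here's a sketch in Python/NumPy (again, assuming NumPy): multiply $A$ by
each column of $B$ separately, then stitch the columnar answers together:
\begin{verbatim}
import numpy as np

A = np.array([[2, 1,  5],
              [0, 3, -2]])
B = np.array([[0,   9, -13],
              [0,  99, -13],
              [7, 999, -13]])

# one matrix-vector multiplication per column of B...
cols = [A @ B[:, j] for j in range(B.shape[1])]

# ...then concatenate the columnar answers into the answer matrix
print(np.column_stack(cols))
# [[   35  5112  -104]
#  [  -14 -1701   -13]]
\end{verbatim}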
\section{Way \#2: All possible dot products}
\label{matMultWay2}
On the other hand, maybe you'll like this one better. Matrix-matrix
multiplication can also be viewed as \textbf{all possible dot products} between
the \textbf{rows} of $A$ and the \textbf{columns} of $B$.
Flash back for a moment to \textit{A Cool, Brisk Walk} chapter 6, and the
Fundamental Theorem of Counting. Answer this question: ``You have two choices
of appetizer, and three choices of entr\'{e}e. How many different dinner
combinations are possible?''
The answer is six, since each of the two appetizers can go with any of the
three entr\'{e}es. So you could choose:
\begin{compactenum}
\item shrimp cocktail, filet mignon
\item shrimp cocktail, chicken pesto
\item shrimp cocktail, eggplant parmigiana
\item artichoke dip, filet mignon
\item artichoke dip, chicken pesto
\item artichoke dip, eggplant parmigiana
\end{compactenum}
Now back to matrices. If I multiply these two matrices together:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 1 & 5 \\
0 & 3 & -2 \\
\end{bmatrix} \ \text{and} \
\begin{bmatrix}
0 & 9 & -13 \\
0 & 99 & -13 \\
7 & 999 & -13 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
how many possible dot products are there between \textit{rows} of $A$ and
\textit{columns} of $B$?
The answer is six, since each of the two $A$ rows can go with any of the three
$B$ columns. The possibilities are:
\begin{compactenum}
\item $[\ 2\ \ 1\ \ 5\ ]$ and $[\ 0\ \ 0\ \ 7\ ]$
\item $[\ 2\ \ 1\ \ 5\ ]$ and $[\ 9\ \ 99\ \ 999\ ]$
\item $[\ 2\ \ 1\ \ 5\ ]$ and $[\ -13\ \ -13\ \ -13\ ]$
\item $[\ 0\ \ 3\ \ -2\ ]$ and $[\ 0\ \ 0\ \ 7\ ]$
\item $[\ 0\ \ 3\ \ -2\ ]$ and $[\ 9\ \ 99\ \ 999\ ]$
\item $[\ 0\ \ 3\ \ -2\ ]$ and $[\ -13\ \ -13\ \ -13\ ]$
\end{compactenum}
(Note \textit{very} carefully that we use the \textit{columns} of $B$, not the
rows!)
\smallskip
Very well. Let's compute all those dot products then:
\begin{compactitem}
\item $[\ 2\ \ 1\ \ 5\ ] \cdot [\ 0\ \ 0\ \ 7\ ] = 35$
\item $[\ 2\ \ 1\ \ 5\ ] \cdot [\ 9\ \ 99\ \ 999\ ] = 5112$
\item $[\ 2\ \ 1\ \ 5\ ] \cdot [\ -13\ \ -13\ \ -13\ ] = -104$
\item $[\ 0\ \ 3\ \ -2\ ] \cdot [\ 0\ \ 0\ \ 7\ ] = -14$
\item $[\ 0\ \ 3\ \ -2\ ] \cdot [\ 9\ \ 99\ \ 999\ ] = -1701$
\item $[\ 0\ \ 3\ \ -2\ ] \cdot [\ -13\ \ -13\ \ -13\ ] = -13$
\end{compactitem}
\smallskip
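Spot-checking the first of these, just to jog your dot-product memory:
\begin{align*}
[\ 2\ \ 1\ \ 5\ ] \cdot [\ 0\ \ 0\ \ 7\ ] = 2 \cdot 0 + 1 \cdot 0 + 5 \cdot 7 = 35.
\end{align*}
The other five work out the same way.
\smallskip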
Those six dot products are precisely the entries in our answer matrix:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
35 & 5112 & -104 \\
-14 & -1701 & -13 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
The only thing you have to be careful of is which answer goes in which place.
The rule is:
\begin{framed}
The dot product of row $i$ of $A$ and column $j$ of $B$ goes in
row $i$, column $j$ of the answer.
\end{framed}
A sensible arrangement, I think you'll agree. Multiplying row 0 with column 0
will give us the entry in row 0, column 0 of our answer. Multiplying row 14
with column 9 will give us the entry in row 14, column 9 of our answer. And so
forth.
In terms of our current example, the reason that the number 5112 goes in the
\textit{top middle} of our answer (as opposed to the bottom left, or anywhere
else) is that 5112 is the dot product of the \textit{top} row of $A$ ($[\ 2\ \
1\ \ 5\ ]$) with the \textit{middle} column of $B$ ($[\ 9\ \ 99\ \ 999\ ]$).
Be sure to practice with this so you don't get numbers out of place.
\smallskip
\index{matchmaker@\texttt{matchmaker.com}}
It might help to keep in mind possible applications here. Why would we ever
want to compute ``all possible dot products?'' Well, think back to our
matchmaker example. Let's say we have 4 women and 5 men, each of whom has
completed a survey. Finding all the compatibilities -- \textit{i.e.},
predicting the dating success of all possible pairings -- is precisely
computing the dot product of every gal with every guy (assuming
heterosexuality). That's 20 possible dot products, which we can calculate with
a single matrix multiplication.
(One detail to be careful of: to make the dimensions work, we need the women's
matrix times the \textit{transpose} of the men's matrix, not ``women times
men.'' If each woman's survey answers form a row of $W$, and each man's a row
of $M$, then $W \cdot M^\intercal$ turns the men's rows into columns, and the
resulting $4\times 5$ answer matrix holds every compatibility score.)
\section{Way \#3: Several linear combinations}
Our third and final way to think about matrix multiplication is in terms of
linear combinations. Remember (from p.~\pageref{linearComboOfColumns}) that
every matrix-\textit{vector} multiplication $A \cdot
\overrightarrow{\textbf{x}}$ is essentially specifying some linear combination
of $A$'s \textit{columns}. If we multiply $A$ by the vector ${\scriptsize
\begin{bmatrix} 3 \\ 5 \\ \end{bmatrix}}$, we're saying ``I'd like 3 copies of
$A$'s first column, plus 5 copies of its second column, please.''
Matrix multiplication is simply asking for several \textit{different} linear
combinations. If we multiply a matrix $A$ by this one:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
0 & 9 & -13 \\
0 & 99 & -13 \\
7 & 999 & -13 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
we're requesting the following:
\begin{quote}
\index{credit card}
\textit{
``Hello, I'd like to put in an order for three things. First, I'd like 7 copies
of $A$'s third column (ignore the first two). Additionally, I'd like 9 copies
of its first column, 99 copies of its second column, and 999 copies of its
third column, all added together. Finally, please give me $-13$ copies of each
of its columns, again added together. Thanks! You should have my credit card
number on file.''
}
\end{quote}
To fulfill this order, we compute each of the three linear combinations
requested. Using the same $A$ matrix we've been using (${\scriptsize
\begin{bmatrix} 2 & 1 & 5 \\ 0 & 3 & -2 \end{bmatrix}}$) this amounts to:
\vspace{-.15in}
\begin{align*}
\text{First combination: } &7
\begin{bmatrix}
5 \\
-2 \\
\end{bmatrix} =
\begin{bmatrix}
35 \\
-14 \\
\end{bmatrix} \\
\text{Second combination: } &9
\begin{bmatrix}
2 \\
0 \\
\end{bmatrix} + 99
\begin{bmatrix}
1 \\
3 \\
\end{bmatrix} + 999
\begin{bmatrix}
5 \\
-2 \\
\end{bmatrix} =
\begin{bmatrix}
5112 \\
-1701 \\
\end{bmatrix} \\
\text{Third combination: } &-13
\begin{bmatrix}
2 \\
0 \\
\end{bmatrix} -13
\begin{bmatrix}
1 \\
3 \\
\end{bmatrix} -13
\begin{bmatrix}
5 \\
-2 \\
\end{bmatrix} =
\begin{bmatrix}
-104 \\
-13 \\
\end{bmatrix}
\end{align*}
\vspace{-.15in}
Packaging up all those results again gives us:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
35 & 5112 & -104 \\
-14 & -1701 & -13 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
Same answer no matter which of the three ways we think about it.
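If you'd like to see that confirmed in code, here's a quick NumPy sketch of Way
\#3: building each answer column as a linear combination of $A$'s columns, with
the coefficients drawn from the matching column of $B$, and then comparing
against NumPy's own built-in matrix multiplication:
\begin{verbatim}
import numpy as np

A = np.array([[2, 1,  5],
              [0, 3, -2]])
B = np.array([[0,   9, -13],
              [0,  99, -13],
              [7, 999, -13]])

# column j of the answer = B[0,j] copies of A's column 0, plus
# B[1,j] copies of column 1, plus B[2,j] copies of column 2
cols = [sum(B[i, j] * A[:, i] for i in range(3)) for j in range(3)]
print(np.column_stack(cols))   # the answer matrix we computed by hand

print(A @ B)                   # NumPy's built-in @ operator agrees
\end{verbatim}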
\section{Outer and inner products}
All right. Now for some surprises.
\index{row vector}
\index{column vector}
Remember (p.~\pageref{rowAndColVectors}) that we will sometimes want to treat a
vector as a sort of degenerate matrix: a matrix with only one row, or only one
column. And we will sometimes want to do this matrix multiplication thing with
two vectors, treating one of them as a row vector and the other as a column
vector. Which one is which makes a tremendous difference.
As an illustration, I'm going to define vectors $\overrightarrow{\textbf{x}}$
and $\overrightarrow{\textbf{y}}$ this way:
\vspace{-.15in}
\begin{align*}
\overrightarrow{\textbf{x}} =
\begin{bmatrix}
3 & 1 & 2 \\
\end{bmatrix}, \quad
\overrightarrow{\textbf{y}} =
\begin{bmatrix}
5 \\ 4 \\ -3 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
So $\overrightarrow{\textbf{x}}$ is a row vector, and
$\overrightarrow{\textbf{y}}$ is a column vector. Put another way,
$\overrightarrow{\textbf{x}}$ can be thought of as a $1\times 3$ matrix, and
$\overrightarrow{\textbf{y}}$ as a $3\times 1$ matrix.
Now if we do treat these as matrices, then performing the operation
$\overrightarrow{\textbf{x}} \cdot \overrightarrow{\textbf{y}}$ gives us:
\vspace{-.15in}
\begin{align*}
\overrightarrow{\textbf{x}} \cdot \overrightarrow{\textbf{y}} =
\begin{bmatrix}
3 & 1 & 2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
5 \\ 4 \\ -3 \\
\end{bmatrix} = 13.
\end{align*}
\vspace{-.15in}
It's just the dot product, of course, calculated in the usual way.
Now suppose I swap the order, and compute $\overrightarrow{\textbf{y}}$ times
$\overrightarrow{\textbf{x}}$ instead. What would I get? The answer will surely
surprise you:
\vspace{-.15in}
\begin{align*}
\overrightarrow{\textbf{y}} \cdot \overrightarrow{\textbf{x}} =
\begin{bmatrix}
5 \\ 4 \\ -3 \\
\end{bmatrix} \cdot
\begin{bmatrix}
3 & 1 & 2 \\
\end{bmatrix} =
\begin{bmatrix}
15 & 5 & 10 \\
12 & 4 & 8 \\
-9 & -3 & -6 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
Hooooooo...wut?! $\overrightarrow{\textbf{x}}$ times
$\overrightarrow{\textbf{y}}$ is the \textit{number} 13, but
$\overrightarrow{\textbf{y}}$ times $\overrightarrow{\textbf{x}}$ is an entire
\textit{grid} full of numbers?
Yes it is. Here's why.
Remember our rules from p.~\pageref{matMultRules}. First, we can only multiply
two matrices if $n=p$. And that's true here whether we do
$\overrightarrow{\textbf{x}} \cdot \overrightarrow{\textbf{y}}$ or
$\overrightarrow{\textbf{y}} \cdot \overrightarrow{\textbf{x}}$. But the second
rule tells us that the answer will be an $m\times q$ matrix. If we put
$\overrightarrow{\textbf{x}}$ on the left, then $\overrightarrow{\textbf{x}}
\cdot \overrightarrow{\textbf{y}}$ will give us a $1\times 1$ matrix. But if we
put $\overrightarrow{\textbf{y}}$ on the left, then
$\overrightarrow{\textbf{y}} \cdot \overrightarrow{\textbf{x}}$ must give us a
$3\times 3$ matrix. Strange but true.
\index{inner product}
\index{outer product}
\label{outerProduct}
Btw, when we do the first thing -- treat the vector on the left-hand side of
the multiplication as a row vector, and the other as a column vector -- it's
called the \textbf{inner product} of the two vectors. The other way is called
the \textbf{outer product}.
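NumPy, for what it's worth, has both operations built in. A small sketch
(assuming NumPy, as before):
\begin{verbatim}
import numpy as np

x = np.array([3, 1, 2])
y = np.array([5, 4, -3])

print(np.inner(x, y))   # 13 -- a single number
print(np.outer(y, x))   # an entire 3x3 grid:
# [[15  5 10]
#  [12  4  8]
#  [-9 -3 -6]]
\end{verbatim}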
\section{$A\cdot A^\intercal$ vs.~$A^\intercal \cdot A$}
\index{transpose}
Here's another interesting consequence of our operation. As we've seen, you
certainly can't multiply a matrix $A$ by just ``any old thing,'' since rule 1
on p.~\pageref{matMultRules} says that $n$ must equal $p$.
You can, however, \textit{always} perform the operation $A\cdot A^\intercal$,
no matter what dimensions $A$ is. That's because if $A$ is, say, $17\times 28$,
then $A^\intercal$ will be $28\times 17$, and $n=p$ (both are 28) as required.
You'll get a $17\times 17$ matrix if you do that.
You also can \textit{always} perform the operation $A^\intercal \cdot A$.
Again, if $A$ is $17\times 28$, then $A^\intercal$ will be $28\times 17$, and
so again $n=p$ (both are 17). You'll get a $28\times 28$ matrix if you do that.
Here's an example, smaller so it fits on the page. Let's say $A$ is
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 20 & 3 & -2 & -4 \\
-5 & 1 & 4 & 1 & 9 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
The two operations give us:
\vspace{-.15in}
\begin{align*}
A\cdot A^\intercal =
\begin{bmatrix}
2 & 20 & 3 & -2 & -4 \\
-5 & 1 & 4 & 1 & 9 \\
\end{bmatrix} \cdot
\begin{bmatrix}
2 & -5 \\
20 & 1 \\
3 & 4 \\
-2 & 1 \\
-4 & 9 \\
\end{bmatrix} =
\begin{bmatrix}
433 & -16 \\
-16 & 124 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
and
\vspace{-.15in}
\begin{align*}
A^\intercal \cdot A =
\begin{bmatrix}
2 & -5 \\
20 & 1 \\
3 & 4 \\
-2 & 1 \\
-4 & 9 \\
\end{bmatrix} \cdot
\begin{bmatrix}
2 & 20 & 3 & -2 & -4 \\
-5 & 1 & 4 & 1 & 9 \\
\end{bmatrix} = \\
\begin{bmatrix}
29 & 35 & -14 & -9 & -53 \\
35 & 401 & 64 & -39 & -71 \\
-14 & 64 & 25 & -2 & 24 \\
-9 & -39 & -2 & 5 & 17 \\
-53 & -71 & 24 & 17 & 97 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
\index{symmetric matrix}
Two other intriguing facts are worth noting here, one of which you may have
noticed. If you run your eyeballs carefully over those two results, you'll see
that both of them are \textbf{symmetric matrices}. This is always true of a
matrix times its transpose, and that turns out to be important for some
applications.
The other fact -- certainly not ascertainable by my eyeballs, at least -- is
that both of these matrices have only \textit{rank 2}. That's not surprising
about $A\cdot A^\intercal$, since it's just a $2\times 2$ anyway. But it is
very surprising about $A^\intercal \cdot A$. That's a $5\times 5$ matrix with
only \textit{two} linearly independent columns!
And this is always true. When you multiply a matrix by its transpose, either
way you do it, the rank can be at most the \textit{lower} of the two
dimensions.
The way I think of it is this. When you take a tall matrix (like our $5\times
2$, above) and multiply it by a wide one (the $2\times 5$), yes you're going to
get a result with large dimensions. Sure. But in a way, you only put ``two
columns' worth'' of information into the operation. The result is a large
$5\times 5$, but that's misleading, because there just isn't enough information
present in those 25 entries to represent five independent directions. The
$5\times 5$ result is brittle, containing only two columns' worth of
information spread out over a large landscape. It's almost as if I wrote a
fourteen-sentence paragraph with only three sentences repeated again and again,
with the words rescrambled a bit each time. Sure, it looks like a long
paragraph at first glance, but try reading it and you'll recognize how little
information it really contains.
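Don't take my word for the rank claim, though; NumPy can check it for you. A
quick sketch using \texttt{np.linalg.matrix\_rank}:
\begin{verbatim}
import numpy as np

A = np.array([[ 2, 20, 3, -2, -4],
              [-5,  1, 4,  1,  9]])

print(np.linalg.matrix_rank(A @ A.T))  # 2 (a 2x2, so no shock)
print(np.linalg.matrix_rank(A.T @ A))  # 2 (a 5x5, yet still rank 2!)
\end{verbatim}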
\section{Associative, but not commutative}
\label{associative}
\index{commutative}
\index{associative (operation)}
The next surprising thing I'll point out is that matrix multiplication does
\textit{not} follow one of the laws of plain-Jane multiplication that you're
used to counting on. Namely, matrix multiplication is \textit{not} commutative.
You're so accustomed to this being true that it's positively jarring. Ever
since Mrs.~Jones taught you in second grade, you've safely relied on the fact
that:
\vspace{-.15in}
\begin{align*}
115 \cdot 272 = 272 \cdot 115.
\end{align*}
\vspace{-.15in}
It might be a pain to work out the answer, but at least you know, without even
thinking, that it doesn't matter which order you do the multiplication in.
But \textit{oh!}~this totally does not work with matrix multiplication. Taking
two matrices at random:
\vspace{-.15in}
\begin{align*}
A =
\begin{bmatrix}
4 & 2 \\
2 & 3 \\
\end{bmatrix},
B =
\begin{bmatrix}
1 & -1 \\
2 & 0 \\
\end{bmatrix}\\
\end{align*}
\vspace{-.55in}
\begin{align*}
A \cdot B &=
\begin{bmatrix}
8 & -4 \\
8 & -2 \\
\end{bmatrix}, \\
B \cdot A &=
\begin{bmatrix}
2 & -1 \\
8 & 4 \\
\end{bmatrix}\ {\footnotesize \textit{surprise!}}
\end{align*}
\vspace{-.15in}
Not even close to the same thing. And in general, matters are even worse
because you normally can't even \textit{do} the operation both ways! (If $A$
were a $2\times 4$, for example, and $B$ a $4\times 3$, then $A \cdot B$ would
give you a $2\times 3$ matrix but $B \cdot A$ is impossible.) You actually have
to really work at it to come up with two matrices whose product is the same
both ways.
Nothing much to say here except to stay on your toes.
Another ``given,'' however, \textit{is} true of matrices, and a good thing,
too. That's the \textbf{associative} property. This means that if you're
multiplying together a string of matrices, it doesn't matter which
multiplication you perform first. In other words, for any three matrices $A$,
$B$, and $C$:
\vspace{-.15in}
\begin{align*}
(A \cdot B) \cdot C =
A \cdot (B \cdot C).
\end{align*}
\vspace{-.15in}
\label{associativityExample}
Again, an example to illustrate:
\vspace{-.15in}
\begin{align*}
A =
\begin{bmatrix}
4 & 2 \\
2 & 3 \\
\end{bmatrix},
B =
\begin{bmatrix}
1 & -1 \\
2 & 0 \\
\end{bmatrix},
C =
\begin{bmatrix}
3 & 4 \\
1 & 5 \\
\end{bmatrix}\\
\end{align*}
\vspace{-.55in}
\begin{align*}
(A \cdot B) \cdot C =
\begin{bmatrix}
8 & -4 \\
8 & -2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
3 & 4 \\
1 & 5 \\
\end{bmatrix} &=
\begin{bmatrix}
20 & 12 \\
22 & 22 \\
\end{bmatrix}, \\
A \cdot (B \cdot C) =
\begin{bmatrix}
4 & 2 \\
2 & 3 \\
\end{bmatrix} \cdot
\begin{bmatrix}
2 & -1 \\
6 & 8 \\
\end{bmatrix} &=
\begin{bmatrix}
20 & 12 \\
22 & 22 \\
\end{bmatrix}. \quad \checkmark
\end{align*}
\vspace{-.15in}
\index{Cartesian plane}
And this always works.
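Both facts are easy to check in code, too. A quick NumPy sketch using the same
three matrices as above:
\begin{verbatim}
import numpy as np

A = np.array([[4, 2], [2, 3]])
B = np.array([[1, -1], [2, 0]])
C = np.array([[3, 4], [1, 5]])

print(np.array_equal(A @ B, B @ A))              # False: not commutative
print(np.array_equal((A @ B) @ C, A @ (B @ C)))  # True: associative
\end{verbatim}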
One reason this is nice has to do with linear operators. Recall from
section~\ref{linearOps} (pp.~\pageref{linearOps}--\pageref{endLinearOps}) that
we can create operators to scale, flip, and rotate points in the Cartesian
plane. To review:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 0 \\
0 & 2 \\
\end{bmatrix}
\quad & \text{stretch points twice as far from the origin}\\
\begin{bmatrix}
.866 & -.5 \\
.5 & .866 \\
\end{bmatrix}
\quad & \text{rotate points 30\degree\ counterclockwise}\\
\begin{bmatrix}
-1 & 0 \\
0 & 1 \\
\end{bmatrix}
\quad & \text{flip points horizontally across the $y$-axis}\\
\end{align*}
\vspace{-.15in}
Multiplying any of these matrices by a vector has the desired effect.
\medskip
Now suppose we wanted to perform \textit{several} of these operations on a
vector. For example, take a random point $(8.5, 19)$. To stretch, rotate,
\textit{and} flip it, we'd do this:
\vspace{-.15in}
\begin{align*}
\begin{bmatrix}
2 & 0 \\
0 & 2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
.866 & -.5 \\
.5 & .866 \\
\end{bmatrix} \cdot
\begin{bmatrix}
-1 & 0 \\
0 & 1 \\
\end{bmatrix} \cdot
\begin{bmatrix}
8.5 \\
19 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
This works out to ${\scriptsize \begin{bmatrix} -33.722 \\ 24.408 \\
\end{bmatrix}}$ if you're keeping score at home. (Note that the matrix
\textit{nearest} the vector acts first: the point gets flipped, then rotated,
then stretched.)
\index{Bowser}
Now often instead of transforming a single point, we want to transform an
entire image, which contains multiple points. Imagine calculating every pixel
of Bowser as his Kart trips over a green shell and spins towards the side of
the screen. We can take advantage of the associativity of matrix multiplication
to pre-compute a \textit{single} matrix that will
stretch/flip/rotate/squish/whatever any point we care to multiply it by:
\vspace{-.15in}
\begin{align*}
T =
\begin{bmatrix}
2 & 0 \\
0 & 2 \\
\end{bmatrix} \cdot
\begin{bmatrix}
.866 & -.5 \\
.5 & .866 \\
\end{bmatrix} \cdot
\begin{bmatrix}
-1 & 0 \\
0 & 1 \\
\end{bmatrix} =
\begin{bmatrix}
-1.732 & -1 \\
-1 & 1.732 \\
\end{bmatrix}.
\end{align*}
\vspace{-.15in}
This makes our game engine run a lot faster, since we don't have to do all
those calculations separately for every point in the image. Instead, we
calculate our $T$ matrix (for \textbf{t}ransformation) just once, and then
multiply it by every point.
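In code, the precomputation might look like this (a NumPy sketch; the variable
names here are mine, not anything official):
\begin{verbatim}
import numpy as np

stretch = np.array([[2, 0], [0, 2]])
rotate  = np.array([[0.866, -0.5], [0.5, 0.866]])
flip    = np.array([[-1, 0], [0, 1]])

T = stretch @ rotate @ flip      # computed once, up front
print(T @ np.array([8.5, 19]))   # [-33.722  24.408] -- same as before
\end{verbatim}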
In fact, if we have all of Bowser's pixels in a $2\times 1000$ matrix:
\vspace{-.15in}
\begin{align*}
B =
\begin{bmatrix}
18 & 19 & 22 & 32 & 34 & \cdots & 195 \\
9 & 9 & 11 & 14 & 19 & \cdots & 212 \\
\end{bmatrix},
\end{align*}
\vspace{-.15in}
then we can compute what pixels to draw in the next frame with just one
operation: $T \cdot B$!
\vspace{-.15in}
\begin{align*}
B_{\text{next}} = T \cdot B =
\begin{bmatrix}
-1.732 & -1 \\
-1 & 1.732 \\
\end{bmatrix} \cdot