\documentclass[12pt]{article}
 
\usepackage[margin=1in]{geometry}
\usepackage{amsmath,enumitem,amsthm,amssymb,graphicx,mathtools,tikz,hyperref,yfonts,tikz-cd, subfiles, datetime, adjustbox, extarrows,old-arrows}
\usepackage[nottoc]{tocbibind}

\usepackage[backend=biber,style=alphabetic,sorting=ynt]{biblatex}
\addbibresource{bob.bib}
\DeclareFieldFormat{postnote}{#1}
\DeclareFieldFormat{multipostnote}{#1}

\usetikzlibrary{positioning,lindenmayersystems}

\usepackage{graphicx}
\graphicspath{ {./images/} }
\usepackage[rightcaption]{sidecap}
\usepackage{wrapfig}

\setcounter{tocdepth}{2}

\usepackage{hyperref}
\hypersetup{
    colorlinks  = true,
    linkcolor   = blue,
    filecolor   = magenta,      
    urlcolor    = cyan,
}

\newtheorem{theorem}{Theorem}[section]
\newtheorem{corollary}{Corollary}[theorem]
\newtheorem{lemma}[theorem]{Lemma}

\theoremstyle{definition}
\newtheorem{definition}{Definition}[section]

\theoremstyle{definition}
\newtheorem*{example}{Example}

\theoremstyle{definition}
\newtheorem*{examples}{Examples}

\theoremstyle{definition}
\newtheorem*{remark}{Remark}

\theoremstyle{definition}
\newtheorem*{remarks}{Remarks}

\theoremstyle{definition}
\newtheorem*{conventions}{Conventions}

\DeclareMathOperator{\spec}{Spec}
\DeclareMathOperator{\Frac}{Frac}
\DeclareMathOperator{\nil}{nil}
\DeclareMathOperator{\rad}{rad}
\DeclareMathOperator{\ann}{Ann}
\DeclareMathOperator{\cok}{cok}
\DeclareMathOperator{\im}{im}
\DeclareMathOperator{\Hom}{Hom}
\DeclareMathOperator{\End}{End}
\DeclareMathOperator{\id}{id}
\DeclareMathOperator{\tor}{Tor}
\DeclareMathOperator{\ext}{Ext}
\DeclareMathOperator{\codim}{codim}
\DeclareMathOperator{\trdeg}{tr deg}
\DeclareMathOperator{\tot}{Tot}
\DeclareMathOperator{\qf}{qf}
\DeclareMathOperator{\mor}{Mor}
\DeclareMathOperator*{\colim}{colim}
\newcommand{\R}{\mathbb R}
\newcommand{\C}{\mathbb C}
\newcommand{\Z}{\mathbb Z}
\newcommand{\N}{\mathbb N}
\newcommand{\Q}{\mathbb Q}
\renewcommand{\P}{\mathbb P}
\newcommand{\A}{\mathbb A}
\newcommand{\F}{\mathbb F}
\newcommand{\p}{\mathfrak p}
\newcommand{\q}{\mathfrak q}
\newcommand{\m}{\mathfrak m}
\renewcommand{\a}{\mathfrak a}
\renewcommand{\b}{\mathfrak b}
\newcommand{\rp}{\R\P}
\newcommand{\cp}{\C\P}
\newcommand{\transverse}{\mathrel{\text{\tpitchfork}}}
\makeatletter
\newcommand{\tpitchfork}{%
  \vbox{
    \baselineskip\z@skip
    \lineskip-.52ex
    \lineskiplimit\maxdimen
    \m@th
    \ialign{##\crcr\hidewidth\smash{$-$}\hidewidth\crcr$\pitchfork$\crcr}
  }%
}
\makeatother
\newcommand{\diff}[2]{\frac{\partial{#1}}{\partial{#2}}}
\newcommand{\parens}[1]{{\left(#1\right)}}
\newcommand{\bracket}[1]{{\left[#1\right]}}
\newcommand{\curly}[1]{{\left\{#1\right\}}}
\newcommand{\Angle}[1]{{\left\langle#1\right\rangle}}
\newcommand{\pipe}[1]{{\left|#1\right|}}

\newcommand{\harpoon}{\overset{\rightharpoonup}}

\begin{document}

\begin{center}
    33A\\
    Change of Coordinates\\
    Author: Jas Singh
\end{center}

\section{Introduction}
I'll try to present here a (more coherent) explanation of a way to think of change of basis/coordinates.
As always, I highly suggest 3Blue1Brown videos on linear algebra, and he has an excellent one on change of basis \href{https://www.youtube.com/watch?v=P2LTAUO1TdA}{here}.

Changing basis is an essential topic in linear algebra, and mastering it allows for a fruitful extension of one's computational abilities.
It also explains a deeper symmetry in the structure of linear maps, one which could be explored in greater depth in a class like 115A.
On a purely aesthetic level, working with more general coordinate systems allows for certain scenarios to be phrased in a more natural way.
Essentially, a basis puts a grid on our space.
But space itself has no preferred grid.
There are many grids we can put on, say, the plane, so why focus our attention on any specific one?

\textbf{Warning.} I will introduce new notation in this little document.
Be warned that this notation does not align exactly with how the book does it.
So treat this as a supplement.
When I was first learning about change of bases, this sort of notation made a lot more sense to me.

\section{Recalling how we usually do things}
A critical idea in linear algebra is the dictionary between matrices and linear transformations.
If we have an $n \times m$ matrix $A$, we associate to it a linear transformation $T: \R^m \longrightarrow \R^n$ by the formula
\[
    T(\vec{x}) = A\vec{x}
\]
where the left hand side is applying a function $T$ to an input $\vec{x}$ in $\R^m$ and the right hand side is multiplying a matrix by a vector.
The right hand side yields a formula for evaluating $T$.

On the other hand, if we are given a linear transformation $T$, how can we recover the matrix defining it?
This is the reverse problem to the above.
Note that this isn't a contrived question.
Often we will be able to reason that a function is linear more easily than we can find its matrix.
For instance, try to think geometrically about why a rotation in the $xy$ plane is linear, by which I mean the following two equations are satisfied
\begin{align*}
    T(\vec{x} + \vec{y}) &= T(\vec{x}) + T(\vec{y})\\
    T(c\vec{x}) &= c T(\vec{x})
\end{align*}
for all vectors $\vec{x}$ and $\vec{y}$ and scalars $c$.
Essentially, these say that $T$ preserves linear combinations, which appears visually as taking an evenly spaced grid to an evenly spaced grid.

Anyways, how do we find this matrix $A$?
Well that matrix has to satisfy $T(\vec{x}) = A\vec{x}$ for all $\vec{x}$ in $\R^m$.
What if we plugged in the \emph{standard basis vectors} $\vec{e_i}$?
Then we'd have $T(\vec{e_i}) = A \vec{e_i}$, and $A \vec{e_i}$ is exactly the $i^{th}$ column of $A$.
So our hands are tied.
The only possible matrix that works is the matrix $A$ whose $i^{th}$ column is $T(\vec{e_i})$.
Notice then the incredible fact that to define $T$, we only needed to know what its outputs were on the standard basis vectors!
To try applying this fact, derive the matrix representing the $90^\circ$ counterclockwise rotation in $\R^2$ by geometrically determining how $\vec{e_1}$ and $\vec{e_2}$ behave under such a rotation.
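
As a small illustration of this recipe on a different map (an example chosen just for illustration, not one from the book), take the transformation $T: \R^2 \longrightarrow \R^2$ that swaps the two coordinates of a vector.
It sends $\vec{e_1}$ to $\vec{e_2}$ and $\vec{e_2}$ to $\vec{e_1}$, so placing these outputs in the columns gives
\[
    A =
    \begin{bmatrix}
        0 & 1\\
        1 & 0
    \end{bmatrix},
    \qquad
    A
    \begin{bmatrix}
        x\\
        y
    \end{bmatrix}
    =
    \begin{bmatrix}
        y\\
        x
    \end{bmatrix}
\]
which does swap the coordinates, as it should.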

\section{What's a basis?}
A key fact that we implicitly used in the above section is the following.

\textbf{Fact.} Every vector $\vec{x}$ can be written uniquely as a linear combination of the standard basis vectors $\vec{e_i}$.

This means that there are some scalars $x_1, \dots, x_m$ so that
\[
    \vec{x} = x_1 \vec{e_1} + x_2 \vec{e_2} + \dots + x_m \vec{e_m} 
\]
Furthermore, these coefficients $x_i$ are unique, meaning that if we had another representation
\[
    \vec{x} = x'_1 \vec{e_1} + x'_2 \vec{e_2} + \dots + x'_m \vec{e_m} 
\]
then we'd have $x'_i = x_i$ for all $i$.
That is, this is the only representation of $\vec{x}$ as a linear combination of the $\vec{e_i}$.
This is not a tremendously deep fact.
Indeed, if we wrote
\[
    \vec{x} = 
    \begin{bmatrix}
        x_1\\
        x_2\\
        \vdots\\
        x_m
    \end{bmatrix}
\]
then we've found our coefficients!

The key is that we can generalize this notion to collections of vectors other than the $\vec{e_i}$.
Indeed, a collection of vectors $\vec{v_1}, \dots, \vec{v_m}$ is called a basis if every vector $\vec{x}$ can be uniquely written as a linear combination $x_1 \vec{v_1} + x_2 \vec{v_2} + \dots + x_m \vec{v_m}$.
Being able to write everything as a linear combination of the $\vec{v_i}$ says that $\vec{v_1}, \dots, \vec{v_m}$ is a spanning set.
Uniqueness says that $\vec{v_1}, \dots, \vec{v_m}$ is linearly independent.
Note that I've written a collection of $m$ vectors here.
Could I have two bases of the same space with a different number of elements?
It turns out the answer is no, but this requires proof.

Now, let me give some examples.
I won't prove right now why these examples are bases.
Can you think of a way to determine if a set of vectors is a basis?
\begin{examples}
    \begin{enumerate}
        \item
        \[
            \begin{bmatrix}
                1\\
                1
            \end{bmatrix},
            \begin{bmatrix}
                -1\\
                1
            \end{bmatrix}
        \]      
        is a basis for $\R^2$.

        \item
        \[
            \begin{bmatrix}
                1\\
                2\\
                3
            \end{bmatrix},
            \begin{bmatrix}
                566\\
                12\\
                1
            \end{bmatrix},
            \begin{bmatrix}
                11037\\
                0\\
                \pi
            \end{bmatrix}
        \]
        is a basis for $\R^3$.

        \item
        We can refine our notion of bases to have bases for subspaces of $\R^n$ too.
        \[
            \begin{bmatrix}
                \sqrt{3}/2\\
                1/2\\
                0
            \end{bmatrix},
            \begin{bmatrix}
                -1/2\\
                \sqrt{3}/2\\
                0
            \end{bmatrix}
        \]
        is a basis for the subspace of $\R^3$ consisting of vectors with third component $0$.
        This is the $xy$ plane in $\R^3$.
    \end{enumerate}
\end{examples}
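
As a sketch of how one might check the first example directly (the scalars $a$, $b$ and the entries $x$, $y$ of the target vector are names introduced just for this check), we ask whether
\[
    a
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}
    + b
    \begin{bmatrix}
        -1\\
        1
    \end{bmatrix}
    =
    \begin{bmatrix}
        x\\
        y
    \end{bmatrix}
\]
has exactly one solution for every choice of $x$ and $y$.
Comparing components gives $a - b = x$ and $a + b = y$, and adding and subtracting these two equations yields
\[
    a = \frac{x + y}{2}, \qquad b = \frac{y - x}{2}
\]
So a solution always exists (spanning) and there is only one (linear independence), which is what the definition of a basis asks for.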

\section{Starting to use bases}
Let's say we have a basis $\vec{v_1}, \dots, \vec{v_m}$ of $\R^m$.
Let me denote this as $\mathcal{B} = (\vec{v_1}, \dots, \vec{v_m})$, noting that I do care about the order these vectors are written in.
By definition, that means that any vector $\vec{x}$ in $\R^m$ can be written uniquely as a linear combination of the $\vec{v_i}$.
That is, there are some unique coefficients $a_i$ so that $\vec{x} = a_1 \vec{v_1} + \dots + a_m \vec{v_m}$.
We want to concisely express this phenomenon, and we choose the following notation.
We write
\[
    \vec{x} =
    \begin{bmatrix}
        a_1\\
        a_2\\
        \vdots\\
        a_m
    \end{bmatrix}_{\mathcal B}
\]
Note the analogy with how we wrote things in the ``Fact'' in section 3.
There, we wrote
\[
    \begin{bmatrix}
        x_1\\
        x_2\\
        \vdots\\
        x_m
    \end{bmatrix}
    =
    x_1 \vec{e_1} + x_2 \vec{e_2} + \dots + x_m \vec{e_m}
\]
We want to incorporate this into our above notation.
As such, we let $\mathcal{S} = (\vec{e_1}, \dots, \vec{e_m})$ be the standard basis.
Then we have the following two statements
\begin{align*}
    \begin{bmatrix}
        a_1\\
        a_2\\
        \vdots\\
        a_m
    \end{bmatrix}_{\mathcal B}
    &= a_1 \vec{v_1} + \dots + a_m \vec{v_m}\\
    \begin{bmatrix}
        x_1\\
        x_2\\
        \vdots\\
        x_m
    \end{bmatrix}_{\mathcal S}
    &= x_1 \vec{e_1} + \dots + x_m \vec{e_m}
\end{align*}
So we have generalized our above ``Fact'' to other bases.
Let's notice some key remarks.
For one, we now have a way to write down the same vector $\vec{x}$ in multiple different ways.
We can write it as a column vector in the $\mathcal B$ basis, in the $\mathcal S$ basis, or in any other basis!
We will express these representations as $[\vec{x}]_{\mathcal B}$ or $[\vec{x}]_{\mathcal S}$ respectively.
So when we see a column vector, we can think of it as having different interpretations depending on the context, as determined by what basis we are working with.
This is sort of like how different languages, say German and English, can be written with the same script.
The very same column vector can be interpreted as representing two different vectors in space, depending on what basis you are working in.
For example, let's take the basis $\mathcal B = (\vec{v_1}, \vec{v_2})$ where
\begin{align*}
    \vec{v_1} &=
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}
    \\
    \vec{v_2} &=
    \begin{bmatrix}
        -1\\
        1
    \end{bmatrix}
\end{align*}
Then we have the following results.
\begin{align*}
    [\vec{v_1}]_{\mathcal S}
    &=
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}_{\mathcal S}
    \\
    [\vec{v_1}]_{\mathcal B}
    &=
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}_{\mathcal B}
    \\
    [\vec{e_1}]_{\mathcal S}
    &=
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}_{\mathcal S}
    \\
    [\vec{e_1}]_{\mathcal B}
    &=
    \begin{bmatrix}
        \frac{1}{2}\\
        -\frac{1}{2}
    \end{bmatrix}_{\mathcal B}
\end{align*}
First, notice that
\[
    [\vec{v_i}]_{\mathcal B} = [\vec{e_i}]_{\mathcal S}
\]
is always true, as $\vec{v_i} = 0 \vec{v_1} + \dots + 1 \vec{v_i} + \dots + 0 \vec{v_m}$.
Let me also explain this last result in greater depth.
Indeed,
\[
    \begin{bmatrix}
        \frac{1}{2}\\
        -\frac{1}{2}
    \end{bmatrix}_{\mathcal B}
    =
    \frac{1}{2} \vec{v_1} - \frac{1}{2} \vec{v_2}
    = \frac{1}{2}
    \begin{bmatrix}
        1 - (-1)\\
        1 - 1
    \end{bmatrix}
    =
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}
\]
which is $\vec{e_1}$.
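
The same kind of check works for $\vec{e_2}$.
As a sketch, assuming the same basis $\mathcal B$ as above, one can verify that both of its $\mathcal B$ coordinates are $\frac{1}{2}$:
\[
    \begin{bmatrix}
        \frac{1}{2}\\
        \frac{1}{2}
    \end{bmatrix}_{\mathcal B}
    =
    \frac{1}{2} \vec{v_1} + \frac{1}{2} \vec{v_2}
    = \frac{1}{2}
    \begin{bmatrix}
        1 + (-1)\\
        1 + 1
    \end{bmatrix}
    =
    \begin{bmatrix}
        0\\
        1
    \end{bmatrix}
\]
which is $\vec{e_2}$.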

This reveals an unfortunate issue with this notion of using different bases.
Here, we are trying to make a distinction between vectors $\vec{x}$ in $\R^m$ and how they are represented in coordinates.
This way we can work with different representations of the same vector, which we will see later is fruitful.
But there's a problem here -- a vector $\vec{x}$ is by its very definition a length $m$ column vector.
So vectors are, per our definition, always given in their representation in the standard basis $\mathcal S$.
This leads to a confusing conflation at times.
For instance, I defined the vectors $\vec{v_i}$ above by giving a column vector.
In that sense, the standard basis vectors $\mathcal S$ really do hold a privileged position above the other bases.
This is why we try to write the subscript $\mathcal B$ and $\mathcal S$ whenever possible.
But as a general rule, if no subscript is given, we mean to use the standard basis vectors $\mathcal S$.
Note by the way that this issue can become more clear when studying abstract vector spaces, where there really is no privileged representation of a vector.
This sort of thing appears in Math 115A and Math 115AH.

It may be helpful then to think of vectors visually as arrows in space, and our column vectors as ways of writing down the coordinates of said arrow.
Different people may use different coordinates.
For example, if our vectors represent velocity, one person may measure things in meters per second and another in miles per hour.
That specific example is the most basic coordinate change -- a simple rescaling -- but it at least reflects that we do often think of the very same object as having multiple distinct representations.

\section{Changing basis for vectors}
Now that we have determined that there are multiple ways to write down the same object, we are led to ask how we translate from one representation to another.
If I write down some column vector in the basis I use, how do you determine what arrow in space I am talking about, and how do you translate it into your basis?
We can think of this language analogy here too.
If I write something down in my basis, it's like I'm using my language.
We want to figure out a way to communicate between my language and yours.

In less fanciful terms, let's say I have a vector $\vec{x}$ and a basis $\mathcal B$ of $\R^m$.
How do I relate the column vectors $[\vec{x}]_{\mathcal B}$ and $[\vec{x}]_{\mathcal S}$?
More generally, if we had yet another basis $\mathcal B'$, we could ask for a relationship between $[\vec{x}]_{\mathcal B}$ and $[\vec{x}]_{\mathcal B'}$.
So far, in this course, only the former notion is covered.
I'll therefore focus mostly on translating between the $\mathcal B$ and $\mathcal S$ languages, but I'll mention the more general case as it will be useful later on.

It turns out that there is a very precise relationship between these different representations of $\vec{x}$.
Indeed, there is a matrix $_{\mathcal S}P_{\mathcal B}$ so that
\[
    _{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal S}
\]
So $_{\mathcal S}P_{\mathcal B}$ gives us a way to translate between the representation of $\vec{x}$ in $\mathcal B$ coordinates into its representation in $\mathcal S$ coordinates.
In the book, this is written as $S[\vec{x}]_{\mathcal B} = \vec{x}$.
This is the first instance of my \textbf{warning} above: mine is not the standard notation!
I added a lot more subscripts (and changed $S$ to $P$ because I use $\mathcal S$ for the standard basis).
The way it's written in the book is an instance of the privileged position that the standard basis vectors hold in this theory.

I choose to write the subscripts $\mathcal S$ because to me, it makes things more clear.
Notice that the way the subscripts line up indicates to us how the matrix translates between coordinate systems.
Read the subscripts from right to left.
$_{\mathcal S}P_{\mathcal B}$ therefore takes in vectors in $\mathcal B$ coordinates on the right, and it spits out the representation of that vector in $\mathcal S$ coordinates.
Furthermore, it allows us to generalize this result to say that there is a matrix $_{\mathcal B'} P_{\mathcal B}$ so that
\[
    _{\mathcal B'}P_{\mathcal B} [\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal B'}    
\]
The double subscript on the $P$ is meant to represent the direction in which $P$ translates.
In the above, $_{\mathcal S}P_{\mathcal B}$ takes inputs in $\mathcal B$ language on the right and spits out the corresponding vector in $\mathcal S$ coordinates.

So how do we find our translator $P$?
By the way, we often call $P$ the change of basis matrix.
I'll do it in the case $_{\mathcal S}P_{\mathcal B}$.
Indeed, write $\mathcal B = (\vec{v_1}, \dots, \vec{v_m})$.
Then we define $_{\mathcal S}P_{\mathcal B}$ to be the matrix whose $i^{th}$ column is $[\vec{v_i}]_{\mathcal S}$.
Remember that the subscript $\mathcal S$ is essentially superfluous beyond my personal feeling that it makes things more clear.
In the book, they write this as the $i^{th}$ column being $\vec{v_i}$.
Let's try to make sense of this.
What happens if we take $\vec{x} = \vec{v_i}$?
Then $[\vec{x}]_{\mathcal B} = [\vec{v_i}]_{\mathcal B} = [\vec{e_i}]_{\mathcal S}$.
So when we take the product $_{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B}$, this becomes $_{\mathcal S}P_{\mathcal B}[\vec{e_i}]_{\mathcal S}$, which is just the $i^{th}$ column of this matrix.
And by definition, this is $[\vec{v_i}]_{\mathcal S}$.
So we have shown
\[
    _{\mathcal S}P_{\mathcal B} [\vec{v_i}]_{\mathcal B} = [\vec{v_i}]_{\mathcal S}
\]
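Both sides of this equation are linear in the vector, so it continues to hold when we replace $\vec{v_i}$ by any linear combination of $\vec{v_1}, \dots, \vec{v_m}$.
As $\mathcal B$ is a basis, every vector $\vec{x}$ is such a linear combination, which proves the equation for all vectors $\vec{x}$.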
In the general case, we can similarly define $_{\mathcal B'}P_{\mathcal B}$ having $i^{th}$ column be $[\vec{v_i}]_{\mathcal B'}$.
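
To spell out the linear combination step in the $_{\mathcal S}P_{\mathcal B}$ case, here is a sketch, where the coefficients $a_1, \dots, a_m$ denote the $\mathcal B$ coordinates of $\vec{x}$, so that $\vec{x} = a_1 \vec{v_1} + \dots + a_m \vec{v_m}$.
Then
\begin{align*}
    {}_{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B}
    &= {}_{\mathcal S}P_{\mathcal B}\parens{a_1 [\vec{e_1}]_{\mathcal S} + \dots + a_m [\vec{e_m}]_{\mathcal S}}\\
    &= a_1 [\vec{v_1}]_{\mathcal S} + \dots + a_m [\vec{v_m}]_{\mathcal S}\\
    &= [a_1 \vec{v_1} + \dots + a_m \vec{v_m}]_{\mathcal S}
    = [\vec{x}]_{\mathcal S}
\end{align*}
The second equality uses that multiplying by $[\vec{e_i}]_{\mathcal S}$ picks out the $i^{th}$ column, and the third uses that taking standard coordinates respects linear combinations.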

Ok, this was a whole lot of alphabet soup I just did there.
Let me do an example.
Take $\mathcal B = (\vec{v_1}, \vec{v_2})$ with
\begin{align*}
    \vec{v_1} &=
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}
    \\
    \vec{v_2} &=
    \begin{bmatrix}
        -1\\
        1
    \end{bmatrix}
\end{align*}
as above.
Then per our definition, we have
\[
    _{\mathcal S}P_{\mathcal B} = 
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
\]
We claimed above that $[\vec{e_1}]_{\mathcal B} =
\begin{bmatrix}
    \frac{1}{2}\\
    -\frac{1}{2}
\end{bmatrix}_{\mathcal B}$.
Let's interpret this with our change of basis matrix.
Indeed, we should have
\[
    _{\mathcal S}P_{\mathcal B} [\vec{e_1}]_{\mathcal B} = [\vec{e_1}]_{\mathcal S}
\]
Well, we know $[\vec{e_1}]_{\mathcal S} = \begin{bmatrix} 1\\0\end{bmatrix}$.
So if we're trying to find $[\vec{e_1}]_{\mathcal B}$, we have to multiply by the inverse to get
\begin{align*}
    [\vec{e_1}]_{\mathcal B} &={} _{\mathcal S}P_{\mathcal B}^{-1} [\vec{e_1}]_{\mathcal S}\\
    &=
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}^{-1}
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}
\end{align*}
and we can compute this inverse matrix as
\[
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}^{-1}
    =
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        -\frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
\]
so indeed, we get
\begin{align*}
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}^{-1}
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}
    &= \begin{bmatrix}
        \frac{1}{2}\\
        -\frac{1}{2}
    \end{bmatrix}
\end{align*}
as we said above.
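
Going in the other direction, from $\mathcal B$ coordinates to $\mathcal S$ coordinates, needs no inverse.
As a sketch with made-up coordinates (the entries $3$ and $2$ are chosen only for illustration), suppose $[\vec{x}]_{\mathcal B}$ has entries $3$ and $2$.
Then
\[
    [\vec{x}]_{\mathcal S}
    = {}_{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B}
    =
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
    \begin{bmatrix}
        3\\
        2
    \end{bmatrix}
    =
    \begin{bmatrix}
        3 - 2\\
        3 + 2
    \end{bmatrix}
    =
    \begin{bmatrix}
        1\\
        5
    \end{bmatrix}
\]
which agrees with computing $3 \vec{v_1} + 2 \vec{v_2}$ directly.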

This also brings us to an important point.
We can find a corresponding change of basis matrix $_{\mathcal B}P_{\mathcal S}$ satisfying
\[
    _{\mathcal B}P_{\mathcal S}[\vec{x}]_{\mathcal S} = [\vec{x}]_{\mathcal B}
\]
which is just given by
\[
    _{\mathcal B}P_{\mathcal S}^{-1} = {}_{\mathcal S}P_{\mathcal B}
\]
But wait, why is this change of basis matrix invertible?
Because its columns form a basis!
By the way, the general formula then becomes
\[
    _{\mathcal B}P_{\mathcal B'}^{-1} = {}_{\mathcal B'}P_{\mathcal B}
\]
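
For the running example with $\vec{v_1}$ and $\vec{v_2}$, the inverse relationship can be checked by hand.
Taking ${}_{\mathcal B}P_{\mathcal S}$ to be the inverse matrix computed above, a quick sketch of the verification is
\[
    ({}_{\mathcal B}P_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})
    =
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        -\frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
    =
    \begin{bmatrix}
        \frac{1}{2} + \frac{1}{2} & -\frac{1}{2} + \frac{1}{2}\\
        -\frac{1}{2} + \frac{1}{2} & \frac{1}{2} + \frac{1}{2}
    \end{bmatrix}
    =
    \begin{bmatrix}
        1 & 0\\
        0 & 1
    \end{bmatrix}
\]
so translating from $\mathcal B$ to $\mathcal S$ and back again changes nothing.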

This concept has the potential to be very confusing, and in this case the best option really is to try a bunch of exercises.
The book will provide lots of these, and another good option is to come up with your own examples.
Try writing example exercises by writing down some bases and vectors and converting to and from the standard basis vectors.
There's no substitute for getting your hands dirty, especially here.

\section{Matrix representations}
Now we're capable of approaching the key notion, representing matrices in other bases.
Let's think back to how we did this before.
A matrix $A$ induced a linear transformation $T$ defined by $T(\vec{x}) = A \vec{x}$.
Conversely, given a linear transformation $T$, we can write the matrix $A$ whose $i^{th}$ column is $T(\vec{e_i})$.
Together, these yield a correspondence between linear transformations $\R^m \longrightarrow \R^n$ and $n \times m$ matrices.
But now that we know about other bases, and how we can represent vectors in these bases, let's try to understand how we can modify this correspondence to be more general.
Indeed, we will see that linear transformations have a myriad of matrix representations.
The one we used to call ``the'' matrix representation is simply the one in the standard basis.
Different bases yield different matrix representations.

We will consider the case that $T: \R^m \longrightarrow \R^m$ is a linear transformation.
Note that the dimensions on both sides of the arrow are the same.
If we were to study this in greater generality, we'd necessarily need to put different bases on both sides.
So we restrict to this special case for now.

Written in our new (perhaps tedious) notation, we can write the matrix representation for $T$ as
\[
    [T(\vec{x})]_{\mathcal S} = A[\vec{x}]_{\mathcal S}
\]
for some $m \times m$ matrix $A$.
As such, we are motivated to now denote $A = {}_{\mathcal S}[T]_{\mathcal S}$.
This is nothing more than a change of notation.
We write this equation now as
\[
    [T(\vec{x})]_{\mathcal S} = {}_{\mathcal S}[T]_{\mathcal S}[\vec{x}]_{\mathcal S}
\]
so that as before, the positioning of the subscripts on $_{\mathcal S}[T]_{\mathcal S}$ indicates that we input vectors in the $\mathcal S$ language on the right and spit out vectors in the $\mathcal S$ language.
What if we instead wanted to use the basis $\mathcal B$?
Then we'd want an equation like
\[
    [T(\vec{x})]_{\mathcal B} = {}_{\mathcal B}[T]_{\mathcal B}[\vec{x}]_{\mathcal B}
\]
The book calls the matrix $_{\mathcal B}[T]_{\mathcal B} = B$ and writes this equation as
\[
    [T(\vec{x})]_{\mathcal B} = B [\vec{x}]_{\mathcal B}
\]
in Theorem 3.4.3.
I like the trillion subscripts I'm writing because it helps me keep track of how we are interpreting these vectors in their respective coordinate systems.
Furthermore, it allows for the more general statement
\[
    [T(\vec{x})]_{\mathcal B'} = {}_{\mathcal B'}[T]_{\mathcal B}[\vec{x}]_{\mathcal B}
\]
which holds for more general linear transformations, so that ${}_{\mathcal B'}[T]_{\mathcal B}$ need not be square.

Anyways, our natural questions now are: do these matrices $_{\mathcal B}[T]_{\mathcal B}$ exist, and how do we find them?
Indeed, let's say $\mathcal B = (\vec{v_1}, \dots, \vec{v_m})$.
We found $_{\mathcal S}[T]_{\mathcal S}$ by evaluating $T$ at the standard basis vectors $\vec{e_i}$ and writing them in $\mathcal S$ coordinates.
That is, the $i^{th}$ column of $_{\mathcal S}[T]_{\mathcal S}$ is given by $[T(\vec{e_i})]_{\mathcal S}$.
We are motivated now to define the matrix $_{\mathcal B}[T]_{\mathcal B}$ as the matrix whose $i^{th}$ column is $[T(\vec{v_i})]_{\mathcal B}$.
More generally, $_{\mathcal B'}[T]_{\mathcal B}$ has $i^{th}$ column $[T(\vec{v_i})]_{\mathcal B'}$.
Indeed, let's check that this definition satisfies the equation we want, that is
\[
    [T(\vec{x})]_{\mathcal B} = {}_{\mathcal B}[T]_{\mathcal B}[\vec{x}]_{\mathcal B} \tag{$\ast$}
\]
I'll leave the more general case as an exercise.
Anyways, let's consider $\vec{x} = \vec{v_i}$ above.
The left hand side is $[T(\vec{v_i})]_{\mathcal B}$.
By definition, this was the $i^{th}$ column of $_{\mathcal B}[T]_{\mathcal B}$.
On the other hand, the right side is
\begin{align*}
    _{\mathcal B}[T]_{\mathcal B}[\vec{v_i}]_{\mathcal B} = {}_{\mathcal B}[T]_{\mathcal B}
    \begin{bmatrix}
        0\\
        \vdots\\
        0\\
        1\\
        0\\
        \vdots\\
        0
    \end{bmatrix}
\end{align*}
with $1$ in the $i^{th}$ position.
This is also the $i^{th}$ column of $_{\mathcal B}[T]_{\mathcal B}$.
So both sides of the equation $(*)$ agree when $\vec{x} = \vec{v_i}$.
As the $\vec{v_i}$ form a basis, this is enough to conclude the equation for all $\vec{x}$.
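
Here is a sketch of that last step, with $a_1, \dots, a_m$ denoting the $\mathcal B$ coordinates of $\vec{x}$, so that $\vec{x} = a_1 \vec{v_1} + \dots + a_m \vec{v_m}$.
Using the linearity of $T$ and of taking $\mathcal B$ coordinates,
\begin{align*}
    [T(\vec{x})]_{\mathcal B}
    &= [a_1 T(\vec{v_1}) + \dots + a_m T(\vec{v_m})]_{\mathcal B}\\
    &= a_1 [T(\vec{v_1})]_{\mathcal B} + \dots + a_m [T(\vec{v_m})]_{\mathcal B}\\
    &= {}_{\mathcal B}[T]_{\mathcal B}
    \begin{bmatrix}
        a_1\\
        \vdots\\
        a_m
    \end{bmatrix}
    = {}_{\mathcal B}[T]_{\mathcal B}[\vec{x}]_{\mathcal B}
\end{align*}
where the third equality holds because a matrix times a column vector is the linear combination of the matrix's columns weighted by the entries of that vector.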

Let me now present an example that exemplifies why we are doing all this work.
First of all, let's recall orthogonal projection onto the $x$-axis in $\R^2$.
This is given by $T: \R^2 \longrightarrow \R^2$ via $T\begin{bmatrix}x\\y\end{bmatrix} = \begin{bmatrix}x\\0\end{bmatrix}$.
Its matrix representation in the standard basis $\mathcal S$ is given by
\[
    _{\mathcal S}[T]_{\mathcal S} =
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
\]
So now let's consider instead $U: \R^2 \longrightarrow \R^2$ given by orthogonal projection onto the line $\{y = x\}$, i.e. the line spanned by $\begin{bmatrix}1\\1\end{bmatrix}$.
You may recall that this is given by the matrix
\[
    _{\mathcal S}[U]_{\mathcal S} =
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        \frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
\]
in the standard basis.
Ew.
Why is this ew?
The standard basis does not align with the geometry of the problem.
To find the columns of this matrix, we have to do some amount of geometric work to determine how $\vec{e_1}$ and $\vec{e_2}$ map under the projection.
But we are projecting onto the line spanned by $\begin{bmatrix} 1\\1\end{bmatrix}$, and the orthogonal line to this is spanned by $ \begin{bmatrix}-1\\1\end{bmatrix}$.
So let's take this to be our basis instead.
That is,
\begin{align*}
    \vec{v_1} &= 
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}
    \\
    \vec{v_2} &=
    \begin{bmatrix}
        -1\\
        1
    \end{bmatrix}
\end{align*}
and let $\mathcal B =(\vec{v_1}, \vec{v_2})$.
Then we know $U(\vec{v_1}) = \vec{v_1}$ and $U(\vec{v_2}) = \vec{0}$, since $\vec{v_1}$ lies on the line and $\vec{v_2}$ is perpendicular to it.
That is,
\begin{align*}
    [U(\vec{v_1})]_{\mathcal B} &= 
    \begin{bmatrix}
        1\\
        0
    \end{bmatrix}
    \\
    [U(\vec{v_2})]_{\mathcal B} &= 
    \begin{bmatrix}
        0\\
        0
    \end{bmatrix}
\end{align*}
so we determine that the matrix for $U$ in $\mathcal B$ is
\[
    _{\mathcal B}[U]_{\mathcal B} =
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
\]
This is exactly as simple as the projection onto the $x$-axis!
And why shouldn't it have been?
This problem of projecting onto the line $\{y = x\}$ is the same problem as projecting onto the $x$-axis, just after rotating our head $45^\circ$.
By allowing ourselves flexibility in which coordinates we choose, we have made life a whole lot easier when computing.
And in fact, we have discovered a pattern.
We could do this same analysis for projection onto any line through the origin in $\R^2$, by taking our first basis vector to be on the line and our second basis vector to be perpendicular to the line.
In said basis, the projection onto said line will look exactly like the above.
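
As a sanity check on the matrix for projection onto $\{y = x\}$, here is a sketch with a sample vector (the coordinates $3$ and $2$ are chosen only for illustration).
Take $\vec{v_1}$ and $\vec{v_2}$ as above and $\vec{x} = 3 \vec{v_1} + 2 \vec{v_2}$, which in standard coordinates is $\begin{bmatrix}1\\5\end{bmatrix}$.
Then
\[
    [U(\vec{x})]_{\mathcal B}
    = {}_{\mathcal B}[U]_{\mathcal B}[\vec{x}]_{\mathcal B}
    =
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
    \begin{bmatrix}
        3\\
        2
    \end{bmatrix}
    =
    \begin{bmatrix}
        3\\
        0
    \end{bmatrix},
    \qquad \text{so} \qquad
    U(\vec{x}) = 3 \vec{v_1} + 0 \vec{v_2} =
    \begin{bmatrix}
        3\\
        3
    \end{bmatrix}
\]
This matches the geometry: the difference $\vec{x} - U(\vec{x}) = \begin{bmatrix}-2\\2\end{bmatrix} = 2 \vec{v_2}$ is perpendicular to the line.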

As an exercise, try doing this sort of analysis for your other favorite transformations.
For instance, for reflections.
The first step is to find a basis that matches the geometry of the problem.
So for instance, try reflecting about the $x$-axis first.
When reflecting about a general line, what basis should we choose so the computation is exactly the same as the $x$-axis case?

\section{Changing basis for matrices}
As we saw above, choosing appropriate bases often simplifies calculations and explains certain patterns.
We can basically write all projections with the same matrix, with the caveat that we are doing so in different bases.
But suppose we want to write our projections in the standard basis.
How do we do this?
Let's recall how we changed bases for vectors above.
We had a matrix $_{\mathcal S}P_{\mathcal B}$ which satisfied the property
\[
    _{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal S}
\]
or more generally that
\[
    _{\mathcal B'}P_{\mathcal B}[\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal B'}
\]
So how do we translate between two different matrix representations of a linear transformation $T: \R^m \longrightarrow \R^m$?
Let's focus on translating between a basis $\mathcal B$ and the standard basis $\mathcal S$.
Remember that the matrices $_{\mathcal S}[T]_{\mathcal S}$ and $_{\mathcal B}[T]_{\mathcal B}$ are defined by:
\begin{align*}
    [T(\vec{x})]_{\mathcal S} &= {}_{\mathcal S}[T]_{\mathcal S}[\vec{x}]_{\mathcal S}\\
    [T(\vec{x})]_{\mathcal B} &= {}_{\mathcal B}[T]_{\mathcal B}[\vec{x}]_{\mathcal B}
\end{align*}
If we multiply our second equation by $_{\mathcal S}P_{\mathcal B}$ then we are changing from the $\mathcal B$ language to the $\mathcal S$ language.
That is, we can rewrite $[T(\vec{x})]_{\mathcal B}$ in terms of $[T(\vec{x})]_{\mathcal S}$.
\begin{align*}
    [T(\vec{x})]_{\mathcal S} &= {}_{\mathcal S}P_{\mathcal B}[T(\vec{x})]_{\mathcal B}\\
    &= ({}_{\mathcal S}P_{\mathcal B})({}_{\mathcal B}[T]_{\mathcal B})[\vec{x}]_{\mathcal B}
\end{align*}
Plugging this into the first equation above yields
\[
    ({}_{\mathcal S}P_{\mathcal B})({}_{\mathcal B}[T]_{\mathcal B})[\vec{x}]_{\mathcal B} = {}_{\mathcal S}[T]_{\mathcal S}[\vec{x}]_{\mathcal S}
\]
We want both sides to be on common footing, but the $\vec{x}$ terms are written in different bases.
So for instance, we can use our change of basis matrix to write
\[
    _{\mathcal S}P_{\mathcal B}[\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal S}
\]
and plug this in to the previous equation to get
\[
    ({}_{\mathcal S}P_{\mathcal B})({}_{\mathcal B}[T]_{\mathcal B})[\vec{x}]_{\mathcal B} = ({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})[\vec{x}]_{\mathcal B}
\]
In conclusion, as this held for all vectors $\vec{x}$, we have that
\[
    ({}_{\mathcal S}P_{\mathcal B})({}_{\mathcal B}[T]_{\mathcal B}) = ({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})
\]
Or that
\[
    _{\mathcal B}[T]_{\mathcal B} = ({}_{\mathcal S}P_{\mathcal B})^{-1}({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})
\]
Here, the book writes $S = {}_{\mathcal S}P_{\mathcal B}$, $A = {}_{\mathcal S}[T]_{\mathcal S}$, and $B = {}_{\mathcal B}[T]_{\mathcal B}$ so that the above becomes
\[
    B = S^{-1} A S
\]
This is much more concise, but I find the subscript bash more expressive.

Note that via the identity
\[
    ({}_{\mathcal S}P_{\mathcal B})^{-1} = {}_{\mathcal B}P_{\mathcal S}
\]
shown above, we can rewrite this equation as
\[
    _{\mathcal B}[T]_{\mathcal B} = ({}_{\mathcal B}P_{\mathcal S})({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})
\]
This is sort of clean looking, as the subscripts align very nicely.
That property is why the notation is chosen as it is: to keep track of all the coordinate changes, you basically just have to keep track of the subscripts.
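
To see the formula in action, here is a sketch using the orthogonal projection $U$ onto the line $\{y = x\}$ and the basis $\mathcal B = (\vec{v_1}, \vec{v_2})$ from the previous section.
It assumes that the standard matrix of $U$ has all four entries equal to $\frac{1}{2}$ (each column is the projection of a standard basis vector onto the line).
In the book's notation, with $S = {}_{\mathcal S}P_{\mathcal B}$,
\begin{align*}
    S^{-1} A S
    &=
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        -\frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        \frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
    \\
    &=
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        -\frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
    \begin{bmatrix}
        1 & 0\\
        1 & 0
    \end{bmatrix}
    =
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
\end{align*}
which is the matrix $_{\mathcal B}[U]_{\mathcal B}$ found in the previous section.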

Now, to make an interesting remark, we can try to understand $_{\mathcal S}P_{\mathcal B}$ in our new language.
The defining equation for this is
\[
    _{\mathcal S}P_{\mathcal B} [\vec{x}]_{\mathcal B} = [\vec{x}]_{\mathcal S}
\]
Let's take the trivial step and write $\vec{x} = I(\vec{x})$, where $I$ is the identity linear transformation.
Then this says
\[
    _{\mathcal S}P_{\mathcal B} [\vec{x}]_{\mathcal B} = [I(\vec{x})]_{\mathcal S}
\]
This allows us to see that the change of basis matrix $P$ is basically just the matrix of the identity transformation, written with respect to these two bases.
Indeed, the matrix $_{\mathcal S}[I]_{\mathcal B}$ is defined by
\[
    _{\mathcal S}[I]_{\mathcal B} [\vec{x}]_{\mathcal B} = [I(\vec{x})]_{\mathcal S}
\]
which shows us that
\[
    _{\mathcal S}[I]_{\mathcal B} = {}_{\mathcal S}P_{\mathcal B}
\]
And hence, we get the formula
\[
    _{\mathcal B}[T]_{\mathcal B} = ({}_{\mathcal B}[I]_{\mathcal S})({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}[I]_{\mathcal B}) = ({}_{\mathcal S}[I]_{\mathcal B})^{-1}({}_{\mathcal S}[T]_{\mathcal S})({}_{\mathcal S}[I]_{\mathcal B})
\]

Some food for thought:
\begin{enumerate}
    \item Can you generalize this formula for translating between two bases $\mathcal B$ and $\mathcal B'$?
    \item Can you write a formula for matrix multiplication within these basis changes?
    Try to look at the above formula with the identity matrix, and recall the subscript patterns that keep arising here.
\end{enumerate}

Finally, I will apply this work to compute the matrix in the standard basis for the orthogonal projection $U: \R^2 \longrightarrow \R^2$ onto the line $\{y = x\}$, which we considered above.
Let $\mathcal B = (\vec{v_1}, \vec{v_2})$ where
\begin{align*}
    \vec{v_1} &=
    \begin{bmatrix}
        1\\
        1
    \end{bmatrix}
    \\
    \vec{v_2} &=
    \begin{bmatrix}
        -1\\
        1
    \end{bmatrix}
\end{align*}
Indeed, we derived here the formula
\[
    _{\mathcal B}[U]_{\mathcal B} = ({}_{\mathcal S}P_{\mathcal B})^{-1}({}_{\mathcal S}[U]_{\mathcal S})({}_{\mathcal S}P_{\mathcal B})
\]
Rearranging this yields
\[
    _{\mathcal S}[U]_{\mathcal S} = ({}_{\mathcal S}P_{\mathcal B})({}_{\mathcal B}[U]_{\mathcal B})({}_{\mathcal S}P_{\mathcal B})^{-1}
\]
We computed above that
\[
    _{\mathcal B}[U]_{\mathcal B} = 
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
\]
and we certainly have
\[
    _{\mathcal S}P_{\mathcal B} = 
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
\]
as its columns are $\vec{v_1}$ and $\vec{v_2}$.
By the way, this is a rescaling (by $\sqrt 2$) of the matrix for rotation by $45^\circ$.
Then we compute
\begin{align*}
    _{\mathcal S}[U]_{\mathcal S} &= 
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
    \begin{bmatrix}
        1 & -1\\
        1 & 1
    \end{bmatrix}^{-1}
\end{align*}
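Which we can bash out, multiplying the first two matrices and using the inverse computed earlier, to get
\[
    \begin{bmatrix}
        1 & 0\\
        1 & 0
    \end{bmatrix}
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        -\frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
    =
    \begin{bmatrix}
        \frac{1}{2} & \frac{1}{2}\\
        \frac{1}{2} & \frac{1}{2}
    \end{bmatrix}
\]
as expected: each column is the projection of the corresponding standard basis vector onto the line $\{y = x\}$.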

This yields us a general procedure to find the matrix of some transformation.
First, pick a relevant basis in which the computation is easy.
Then, find the change of basis matrix $_{\mathcal S}P_{\mathcal B}$, which I prefer to think of as $_{\mathcal S}[I]_{\mathcal B}$.
Write the change of basis equation relating $_{\mathcal S}[T]_{\mathcal S}$ and $_{\mathcal B}[T]_{\mathcal B}$ and solve for $_{\mathcal S}[T]_{\mathcal S}$.
Then bash out some matrix multiplication and inversion.
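
As a sketch of this procedure on one more example (the line $\{y = 2x\}$, the name $V$, the basis $\mathcal C$, and the vectors $\vec{w_1}$, $\vec{w_2}$ are chosen here for illustration and do not appear in the book), let $V: \R^2 \longrightarrow \R^2$ be orthogonal projection onto the line $\{y = 2x\}$.
Take $\mathcal C = (\vec{w_1}, \vec{w_2})$ with $\vec{w_1} = \begin{bmatrix}1\\2\end{bmatrix}$ on the line and $\vec{w_2} = \begin{bmatrix}-2\\1\end{bmatrix}$ perpendicular to it, so that $_{\mathcal C}[V]_{\mathcal C}$ has a $1$ in the top left corner and zeros elsewhere, just as before.
Then
\begin{align*}
    {}_{\mathcal S}[V]_{\mathcal S}
    &=
    \begin{bmatrix}
        1 & -2\\
        2 & 1
    \end{bmatrix}
    \begin{bmatrix}
        1 & 0\\
        0 & 0
    \end{bmatrix}
    \begin{bmatrix}
        1 & -2\\
        2 & 1
    \end{bmatrix}^{-1}
    \\
    &=
    \begin{bmatrix}
        1 & 0\\
        2 & 0
    \end{bmatrix}
    \cdot \frac{1}{5}
    \begin{bmatrix}
        1 & 2\\
        -2 & 1
    \end{bmatrix}
    = \frac{1}{5}
    \begin{bmatrix}
        1 & 2\\
        2 & 4
    \end{bmatrix}
\end{align*}
One can check that this matrix fixes $\vec{w_1}$ and sends $\vec{w_2}$ to $\vec{0}$, which is what a projection onto the line should do.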
\end{document}
