When I started using the RoboMaster Development Board (C Type) as my embedded microcontroller platform under Linux, I ran into a problem: I did not know how to flash the board, because every flashing guide on the web assumed Windows. So I investigated on my own and learned a few things about STM32, ST-Link, and J-Link along the way. Here is how to set up the RoboMaster Dev Board development environment under Linux.

- ST-Link / J-Link (I used the Waveshare ST-Link v2 mini and a J-Link EDU Mini)

- RoboMaster Development Board (Type A / Type C)

- SWD cable (4-pin)

- A Linux host (a Raspberry Pi works, but note that the ST-Link v2 mini does not work on arm64)

The figure above shows how to connect the development board to the ST-Link/J-Link flasher. For more information, refer to the RoboMaster Development Board Type C User Manual.

Using the ST-Link v2 mini is very simple, but it only works under Windows and Linux amd64, so I will focus on using J-Link with the dev board, since J-Link also supports macOS and Linux arm64.

The J-Link EDU Mini uses a 9-pin JTAG connector.

On the mini board, pin 1 is VTref. You need a 9-pin ribbon cable to connect the J-Link EDU Mini, because the pin pitch is 1.27 mm, and an SWD (2x5, 1.27 mm) cable breakout to connect the RoboMaster dev board.

Let us connect the pins between the J-Link mini and the dev board, as shown in the figure below.

For example, I used a Raspberry Pi as my Linux host, with the connections shown below.

- Download and install **STM32CubeMX** for Linux
- Download and install **OpenOCD**
- Download and install **gcc-arm-none-eabi-10.3-2021.10-aarch64-linux.tar.bz2**
- Download and install the **J-Link Software for arm64**
- git clone https://github.com/RoboMaster/Development-Board-C-Examples.git
- Open the example project's .ioc file in STM32CubeMX
- In the Project Manager tab, change the Toolchain/IDE to Makefile, then click GENERATE CODE
- Run make in the example project you want; the ELF binary will be in the build directory

Next, I will flash the **1.light_led** ELF file to the dev board using the openocd command.

Jlink openocd config file (jlink.cfg)

```
source [find interface/jlink.cfg]
transport select swd
source [find target/stm32f4x.cfg]
program build/light_led.elf verify reset exit
```

STLink openocd config file (stlink.cfg)

```
source [find interface/stlink.cfg]
source [find target/stm32f4x.cfg]
program build/light_led.elf verify reset exit
```

Flash command line:

```
# Use STLink
openocd -f ./stlink.cfg
# Use Jlink
openocd -f ./jlink.cfg
```

To check that it works, change the main loop in Src/main.c so that only the red LED lights up:

```
while (1)
{
  /* USER CODE END WHILE */
  /* USER CODE BEGIN 3 */
  // Drive the red LED high and the green/blue LEDs low
  HAL_GPIO_WritePin(LED_R_GPIO_Port, LED_R_Pin, GPIO_PIN_SET);
  HAL_GPIO_WritePin(LED_G_GPIO_Port, LED_G_Pin, GPIO_PIN_RESET);
  HAL_GPIO_WritePin(LED_B_GPIO_Port, LED_B_Pin, GPIO_PIN_RESET);
}
/* USER CODE END 3 */
}
```

Now the red LED is lit.

Last year, when OpenAI released ChatGPT, the large language model (LLM) made waves in the AI community. Why does ChatGPT work so well? Can we understand the internals of GPT?

A deep neural network is like a black box: we can observe its outputs, but not how the data is processed internally. For example, when designing an MLP, do we need 5 layers, 6, or more to get more accurate answers? There is no theory to tell us, because deep neural networks resist explanation. So let us turn to ChatGPT and dive into its internals to understand its operational mechanism.

The Transformer is the basic architecture of GPT, so first things first: we will use mathematical language to understand the Transformer model.

The Transformer model has four parts:

- **Embedding**
- **Encoder**
- **Decoder**
- **Softmax**

We will use the Wolfram Language and mathematics to construct these four parts.

When we feed **text sequences to a transformer net**, we need to convert the strings into a **NumericArray**. But how can we convert language text to a NumericArray? The answer is the subword-token method. Suppose we have a big dictionary containing all the meta tokens that represent all the meta words; we call this dictionary the **vocabulary**. If we define the vocabulary size as N, a sequence of text can be represented as a set of points in the N-dimensional vocabulary space. Each vocabulary element is assigned a unique index:

\text{[N] = } \left\{1,2,\text{...},N_v\right\}

A piece of text is represented as a sequence of indices, called **Token IDs**, corresponding to its subwords. The vocabulary contains three special tokens:

- **mask token** (used in masked language modeling)
- **bos token** (beginning of sequence)
- **eos token** (end of sequence)

For example:

Wolfram

```
voca = NetExtract[ResourceFunction["GPTTokenizer"][], "Tokens"];
(* tokenizer encoder: convert text strings to tokenIDs *)
netTextEncoder = ResourceFunction["GPTTokenizer"][]
(* tokenizer decoder: convert tokenIDs to string sequence*)
netTextDecoder =
NetDecoder[{"Class",
NetExtract[ResourceFunction["GPTTokenizer"][], "Tokens"]}]
seq = netTextEncoder["what is transformer language model"]
(* represent sequence tokenIDs *)
(* output is {10920, 319, 47386, 3304, 2747}*)
(* convert tokenIDs to sequence *)
(* map tokenIDs to vector of vocabulary space *)
tokenids = Map[UnitVector[Length@voca, #] &, seq]
StringReplace[StringJoin@Map[netTextDecoder[#] &, tokenids],
"Ġ" -> " "]
(* output is "what is transformer language model" *)
(* Ġ represent whitespace *)
```

Now we can represent string sequences as token IDs, which are easier to feed into our neural network.

The embedding layer converts an input sequence of tokens into a sequence of **embedding vectors**, which we call the **Context**.

- V represents the **vocabulary**
- N represents the **vocabulary index**
- e represents an **embedding token**
- W represents a **matrix**
- W(All, i) represents the i-th column of the matrix
- W(i, All) represents the i-th row of the matrix
- t represents the index of a token in a sequence
- l represents the length of a token sequence

**Token Embedding Algorithm:**

\text{Input: } v \in V = \left[N_v\right] \text{, a token ID} \\ \text{Output: } e \in \mathbb{R}^{d_e} \text{, the vector representation of the token} \\ \text{Parameters: } W_e \in \mathbb{R}^{d_e \times N_v} \text{, the token embedding matrix} \\ \text{Return: } e = W_e(\text{All}, v)
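Outside the Wolfram code, this lookup is just column selection. A minimal NumPy sketch with made-up dimensions (note that Python indexes from 0, while the algorithm above indexes from 1):

```python
import numpy as np

rng = np.random.default_rng(0)
N_v, d_e = 10, 4                    # toy vocabulary size and embedding dimension
W_e = rng.normal(size=(d_e, N_v))   # token embedding matrix

def embed_token(v):
    """Return the embedding of token ID v: column v of W_e."""
    return W_e[:, v]

e = embed_token(7)
assert e.shape == (d_e,)
# Selecting a column is the same as multiplying by a one-hot vector:
assert np.array_equal(e, W_e @ np.eye(N_v)[7])
```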

Because the transformer model uses only feed-forward computation, with no recurrence, the model by itself cannot know the position of a token. So we need to feed some positional information into the network; this is called a **positional embedding**.

**Hard-coded Positional Embedding Algorithm:**

W_p \text{ : $\mathbb{N}$ $\rightarrow $ } \mathbb{R}^{d_e} \text{, use equations:} \\ W_p \text{[2i - 1, t] = } \sin \left(\frac{t}{\ell _{\max }^{\frac{2 i}{d_e}}}\right) \\ W_p \text{[2i , t] = } \cos \left(\frac{t}{\ell _{\max }^{\frac{2 i}{d_e}}}\right) \\ \text{0 $<$ i $\leq $ } \frac{d_e}{2} \\ \text{For the t-} \text{th} \text{ token , the embedding is:} \\ \pmb{\text{e = }} W_e(\text{All},x(t))+W_p(\text{All},t)
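The W_p equations above can be evaluated directly. A NumPy sketch (dimensions arbitrary; the matrix is returned transposed, one row per position, to match the row-per-token layout of the ArrayPlots in this post):

```python
import numpy as np

def positional_encoding(length, d_e, l_max=10000.0):
    """W_p[2i-1, t] = sin(t / l_max^(2i/d_e)), W_p[2i, t] = cos(same)."""
    t = np.arange(1, length + 1)                  # token positions t = 1..l
    i = np.arange(1, d_e // 2 + 1)                # 0 < i <= d_e / 2
    angles = t[:, None] / l_max ** (2 * i / d_e)  # shape (length, d_e/2)
    W_p = np.empty((length, d_e))
    W_p[:, 0::2] = np.sin(angles)                 # rows 2i - 1 (sin)
    W_p[:, 1::2] = np.cos(angles)                 # rows 2i     (cos)
    return W_p

pe = positional_encoding(5, 64)
assert pe.shape == (5, 64)
assert np.all(np.abs(pe) <= 1.0)  # sin/cos values stay in [-1, 1]
```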

Wolfram

```
(* learned positional embedding *)
embedding[embedDim_, vocabulary_] :=
 NetInitialize@NetGraph@FunctionLayer[
   (* learned position embedding *)
Block[{emb1, emb2, posembed},
emb1 = EmbeddingLayer[embedDim][#Input];
posembed = SequenceIndicesLayer[embedDim][#Input];
emb2 = EmbeddingLayer[embedDim][posembed];
emb1 + emb2 ] &,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]}]
```

Token embedding + trainable positional embedding net graph (context dimension is 128):

For example, we set the embedding dimension to 64:

Wolfram

```
ArrayPlot[
embedding[64, voca][
StringSplit["what is transformer language model"]]]
```

The sequence “what is transformer language model” is represented by a NumericArray: each row represents one word, and the black, white, and grayscale squares show the values of the embedding output. The output dimensions are (5, 64).

Wolfram

```
(* hard-coded positional embedding *)
coeffsPositionalEncoding[embedDim_] :=
NetMapOperator[
LinearLayer[embedDim/2,
"Weights" ->
List /@ Table[1./10000.^(2 i/embedDim ), {i, embedDim/2}],
"Biases" -> None, "Input" -> "Integer",
LearningRateMultipliers -> 0
]
];
embeddingSinusoidal[embedDim_, vocabulary_,dropout_ : 0.1] := NetInitialize@NetGraph[
<|
"sequenceLength" -> SequenceIndicesLayer[],
"coeffs" -> coeffsPositionalEncoding[embedDim],
"sin" -> ElementwiseLayer[Sin],
"cos" -> ElementwiseLayer[Cos],
"catenate" -> CatenateLayer[2],
"+" -> ThreadingLayer[Plus],
"dropout" -> DropoutLayer[dropout],
"embeddingTokenID" ->
EmbeddingLayer[embedDim,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]}]
|>,
{
NetPort["Input"] ->
"sequenceLength" ->
"coeffs" -> {"sin", "cos"} -> "catenate" -> "+" -> "dropout",
NetPort["Input"] -> "embeddingTokenID" -> "+"}
]
```

**Positional Learned Embedding Algorithm:**

\text{Input: } \ell \in \left[\ell_{\max}\right] \text{, position of a token in the sequence} \\ \text{Output: } e_p \in \mathbb{R}^{d_e} \text{, the vector representation of the position} \\ \text{Parameters: } W_p \in \mathbb{R}^{d_e \times \ell_{\max}} \text{, the positional embedding matrix} \\ \text{Return: } e_p = W_p(\text{All}, \ell)

Token embedding + hard-coded positional embedding net graph (context dimension is 128):

Wolfram

```
ArrayPlot[
embeddingSinusoidal[64, voca][
StringSplit["what is transformer language model"]]]
```

We can see that the squares of the rows are easy to distinguish, just like the learned embedding.

Now we combine all the functions into embeddingBlock:

Wolfram

```
Options[embeddingBlock] = {"depth" -> None, "voca" -> None,
"hardCode" -> False};
embeddingBlock[OptionsPattern[]] := Block[
{embedding, embeddingSinusoidal, posencoding,
embedDim = OptionValue["depth"],
vocabulary = OptionValue["voca"],
posWeightHardCode = OptionValue["hardCode"]},
posencoding[embedDim_] := NetMapOperator[
LinearLayer[embedDim/2,
"Weights" ->
List /@ Table[1./10000.^(2 i/embedDim ), {i, embedDim/2}],
"Biases" -> None, "Input" -> "Integer",
LearningRateMultipliers -> 0
]
];
embedding[embedDim_, vocabulary_] :=
NetInitialize@NetGraph@FunctionLayer[
(* learned position embedding *)
Block[{emb1, emb2, posembed, add, dropout},
emb1 = EmbeddingLayer[embedDim][#Input];
posembed = SequenceIndicesLayer[embedDim][#Input];
emb2 = EmbeddingLayer[embedDim][posembed];
add = emb1 + emb2;
dropout = DropoutLayer[0.1][add]] &,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]}];
embeddingSinusoidal[embedDim_, vocabulary_, dropout_ : 0.1] :=
NetInitialize@NetGraph[
<|
"sequenceLength" -> SequenceIndicesLayer[],
"coeffs" -> posencoding[embedDim],
"sin" -> ElementwiseLayer[Sin],
"cos" -> ElementwiseLayer[Cos],
"catenate" -> CatenateLayer[2],
"+" -> ThreadingLayer[Plus],
"dropout" -> DropoutLayer[dropout],
"embeddingTokenID" ->
EmbeddingLayer[embedDim,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]}]
|>,
{
NetPort["Input"] ->
"sequenceLength" ->
"coeffs" -> {"sin", "cos"} ->
"catenate" -> "+" -> "dropout",
NetPort["Input"] -> "embeddingTokenID" -> "+"}
];
If[posWeightHardCode,
embeddingSinusoidal[embedDim, vocabulary],
embedding[embedDim, vocabulary]]
]
embeddingBlock["depth" -> 128, "voca" -> voca, "hardCode" -> False]
```

Attention is the main architectural component of the transformer. It enables the neural network to use context information when predicting the current token.

**Basic Single Query Attention Algorithm**:

\text{Input} \text{: e $\in $ } \mathbb{R}^{d_{\text{in}}} \text{ ,} \text {vector represent of the current token, Q in figure-2} \\ \text{Input} \text{: } e_t \text{ $\in $ } \mathbb{R}^{d_{\text{in}}} \text{, } \text {vector represent of context tokens, K, V in figure-2} \\ \text{Output} \text{: } \tilde{v} \text{$\in $ } \mathbb{R}^{d_{\text{out}}} , \text {vector representation of the token and context combined} \\ \text{Parameters} \text{: } W_q,W_k \text{$\in $ } \mathbb{R}^{d_{\text{atten}} * d_{\text{in}}} \text{, } b_q,b_k \text{$\in $ } \mathbb{R}^{d_{\text{atten}}} \text { the query and key linear projections} \\ \text{Parameters} \text{: } W_v \text{ $\in $ } \mathbb{R}^{d_{\text{in}} *d_{\text{out}}} \text{, } b_v \text{$\in $ } \mathbb{R}^{d_{\text{out}}}, \text {the value linear projection}

**Attention Pseudo Code: (UnMasked SelfAttention)**

q \leftarrow e W_q + b_q \\ \forall t: k_t \leftarrow e_t W_k + b_k \\ \forall t: v_t \leftarrow e_t W_v + b_v \\ \forall t: \alpha_t = \frac{e^{k_t q^T / \sqrt{d_{\text{atten}}}}}{\sum_u e^{k_u q^T / \sqrt{d_{\text{atten}}}}} \\ \tilde{v} = \sum_{t=1}^{T} \alpha_t v_t \\ \text{or, in matrix form: } \text{Attention}(Q, K, V) = \text{softmax}\left(\frac{Q K^T}{\sqrt{d_k}}\right) V

- Q, K, and V are linear projections: Q maps the current token to a query vector, K maps the context tokens to key vectors, and V maps the context tokens to value vectors
- The matrix product of Q and K is interpreted as the degree to which context token t matters for predicting the current token
- The softmax output is combined with V by matrix multiplication
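The pseudo code above translates almost line for line into NumPy; this sketch uses random weights and toy dimensions, purely to check shapes and the softmax normalization:

```python
import numpy as np

rng = np.random.default_rng(1)
d_in, d_attn, d_out, T = 8, 4, 8, 5
e     = rng.normal(size=d_in)         # current token
e_ctx = rng.normal(size=(T, d_in))    # context tokens e_t
W_q, b_q = rng.normal(size=(d_in, d_attn)), np.zeros(d_attn)
W_k, b_k = rng.normal(size=(d_in, d_attn)), np.zeros(d_attn)
W_v, b_v = rng.normal(size=(d_in, d_out)), np.zeros(d_out)

q   = e @ W_q + b_q                   # query for the current token
K   = e_ctx @ W_k + b_k               # keys,   one row per context token
V   = e_ctx @ W_v + b_v               # values, one row per context token
s   = K @ q / np.sqrt(d_attn)         # scaled scores
a   = np.exp(s) / np.exp(s).sum()     # softmax weights alpha_t
v_t = a @ V                           # weighted sum of values

assert np.isclose(a.sum(), 1.0)       # softmax weights sum to one
assert v_t.shape == (d_out,)
```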

**Masked SelfAttention** **Algorithm**:

\text{Input: X $\in $ } \mathbb{R}^{d_x * \ell _x}, Z \text{$\in $ } \mathbb{R}^{d_z * \ell _z}, \text {vector representations of primary and context sequence.} \\ \text{Output: } \tilde{V} \text{ $\in $ } \mathbb{R}^{d_{\text{out}} * \ell _x} \text {updated representations of tokens in X, folding in information from tokens in Z} \\ \text{Parameters: } W'_{\text{qkv}} \text{ consisting of:} \\ \text{$\quad \quad |$ } W_q \text{$\in $ } \mathbb{R}^{d_{\text{atten}} * d_x} \text{, } b_q \text{ $\in $ } \mathbb{R}^{d_{\text{atten}}} \\ \text{$\quad \quad |$ } W_k \text{$\in $ } \mathbb{R}^{d_{\text{atten}}* d_z} \text{, } b_k \text{ $\in $ } \mathbb{R}^{d_{\text{atten}}} \\ \text{$\quad \quad |$ } W_v \text{$\in $ } \mathbb{R}^{d_{\text{out}}* d_z} \text{, } b_v \text{ $\in $ } \mathbb{R}^{d_{\text{out}}} \\ \text{Hyperparameters: Mask $\in $ } \{0,1\}^{\ell _x \ell _z} \\ \text{Mask}\left[t_z,t_x\right] \text{=1}, \text {for bidirectional attention} \\ \text{Mask}\left[t_z,t_x\right] \text{ = 0,} \text {for unidirectional attention when } t_z < t_x

**Masked SelfAttention pseudo code**:

\text{Q $\leftarrow $ } 1^T b_q+X W_q \text{ [[Query $\in $ } \mathbb{R}^{d_{\text{atten}} * \ell _x} \text{]]} \\ \text{K $\leftarrow $ } 1^T b_k+Z W_k \text{ [[Key $\in $ } \mathbb{R}^{d_{\text{atten}} * \ell _z} \text{]]} \\ \text{V $\leftarrow $ } 1^T b_v+Z W_v \text{ [[Value $\in $ } \mathbb{R}^{d_{\text{atten}}* \ell _z} \text{]]} \\ \text{S $\leftarrow $ } Q K^T \text{ [[Score $\in $ } \mathbb{R}^{\ell _x * \ell _z} \text{]]} \\ \text{For All } t_z, t_x, \text{if} \text{ not } \text{Mask}\left[t_z,t_x\right], \text{then } S\left[t_z\right. \text{, } t_x \text{] $\leftarrow $ -$\infty $} \\ \tilde{V} \text{ = V $\cdot $ softmax(S/} \sqrt{d_{\text{attn}}} )
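A NumPy sketch of the masked variant, with the causal mask built from a lower-triangular matrix (self-attention case, so Z = X; dimensions are arbitrary):

```python
import numpy as np

def masked_self_attention(X, W_q, W_k, W_v, causal=True):
    """X: (T, d). The causal mask forbids attending to future positions."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    d_attn = Q.shape[-1]
    S = Q @ K.T / np.sqrt(d_attn)                 # (T, T) score matrix
    if causal:
        # Where Mask == 0, set the score to -inf before the softmax
        S = np.where(np.tril(np.ones_like(S)) == 1, S, -np.inf)
    A = np.exp(S - S.max(axis=-1, keepdims=True))
    A = A / A.sum(axis=-1, keepdims=True)         # row-wise softmax
    return A @ V, A

rng = np.random.default_rng(2)
T, d = 4, 6
X = rng.normal(size=(T, d))
W = [rng.normal(size=(d, d)) for _ in range(3)]
out, A = masked_self_attention(X, *W)
assert out.shape == (T, d)
assert np.allclose(np.triu(A, k=1), 0.0)  # no attention to future tokens
```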

**Classes of SelfAttention:**

**Unmasked SelfAttention** - Attention is applied to each token, and the context contains all tokens of the sequence.

**Masked SelfAttention** - Attention is applied to each token, but the context contains only the preceding tokens. A causal mask ensures that each location can only access the locations that come before it, so this version can be used for online prediction.

**Cross Attention** - Often used in sequence-to-sequence tasks. Given two sequences of token representations X and Z, use Z as the context sequence and set Mask = 1. The output has the same length as the input X, but Z's length can differ from X's.

\text{MultiHead(Q, K, V) = } \text{Concat}\left(\text{head}_1,\text{head}_2,\text{...}.,\text{head}_n\right. ) W^O

**Multi head self attention Algorithm**

The structure of MHA is the same as Basic Single Query Attention, but it has multiple attention heads with separate learnable parameters. (figure-2)

\text{Input} \text{: X $\in $ } \mathbb{R}^{d_x * \ell _x} \text{, Z $\in $ } \mathbb{R}^{d_z * \ell _z} \text {vector representation of primary and context sequence} \\ \text{output} \text{: } \tilde{v} \text{$\in $ } \mathbb{R}^{d_{\text{out}} * \ell _x}\text {updated representation of tokens in X, folding in information from tokens in Z} \\ \text {Hyperparameters: H, number of attention heads} \\ \text{HyperParameters: } \text{Mask $\in $ } \{0,1\}^{\ell _x * \ell _z} \\ \text{Parameters} \text{: } W' \text{consisting} \text{of}: \\ \text{For h $\in $ [H], } \left(W'\right)_{\text{qkv}}^h \text{ consisting of :} \\ \text{$\quad \quad |$ } W_q^h \text{$\in $ } \mathbb{R}^{d_{\text{atten}}* d_x} \text{, } b_q^h \text{ $\in $ } \mathbb{R}^{d_{\text{atten}}} \\ \text{$\quad \quad |$ } W_k^h \text{$\in $ } \mathbb{R}^{d_{\text{atten}}* d_z} \text{, } b_k^h \text{ $\in $ } \mathbb{R}^{d_{\text{atten}}} \\ \text{$\quad \quad |$ } W_v^h \text{$\in $ } \mathbb{R}^{d_{\text{mid}}*d_z} \text{, } b_v^h \text{ $\in $ } \mathbb{R}^{d_{\text{mid}}} \\ W_O \text{ $\in $ } \mathbb{R}^{d_{\text{out}} *\text{Hd}_{\text{mid}}} \text{, } b_O \text{$\in $ } \mathbb{R}^{d_{\text{out}}}

**Multi head self attention Pseudo Code**

\text{For } h \in [H]: \\ Y^h \leftarrow \text{Attention}(X, Z \mid (W')_{\text{qkv}}^h, \text{Mask}) \\ Y \leftarrow \left[Y^1; Y^2; \ldots; Y^H\right] \\ \tilde{V} = Y W_O + 1^T b_O
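The head loop and concatenation look like this in NumPy (random weights per head, toy shapes, biases omitted for brevity; this is a sketch of the algorithm, not the Wolfram block below):

```python
import numpy as np

def multi_head_attention(X, Z, heads, d_mid, d_out, rng):
    """For each head: attend X over context Z, then concat and project."""
    d_x, d_z = X.shape[1], Z.shape[1]
    Ys = []
    for _ in range(heads):
        W_q = rng.normal(size=(d_x, d_mid))
        W_k = rng.normal(size=(d_z, d_mid))
        W_v = rng.normal(size=(d_z, d_mid))
        S = (X @ W_q) @ (Z @ W_k).T / np.sqrt(d_mid)  # scores (len_x, len_z)
        A = np.exp(S - S.max(-1, keepdims=True))
        A = A / A.sum(-1, keepdims=True)              # row-wise softmax
        Ys.append(A @ (Z @ W_v))                      # Y^h: (len_x, d_mid)
    Y = np.concatenate(Ys, axis=1)                    # [Y^1; ...; Y^H]
    W_O = rng.normal(size=(heads * d_mid, d_out))
    return Y @ W_O                                    # output projection

rng = np.random.default_rng(3)
out = multi_head_attention(rng.normal(size=(5, 16)),
                           rng.normal(size=(7, 16)), 4, 4, 16, rng)
assert out.shape == (5, 16)  # same length as X, projected to d_out
```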

Wolfram

```
selfAttentionBlock[embedDim_, heads_, masking_ : None] :=
NetInitialize[
NetGraph[
FunctionLayer[
Block[{keys, queries, values, seq, attention, merge},
(* pre layer normalization*)
seq =
NormalizationLayer[2 ;;, "Same",
"Epsilon" -> 0.0001 ][#Input];
keys = NetMapOperator[{heads, embedDim/heads}][seq];
queries = NetMapOperator[{heads, embedDim/heads}][seq];
values = NetMapOperator[{heads, embedDim/heads}][seq];
attention =
AttentionLayer["Dot", "MultiHead" -> True, "Mask" -> masking,
"ScoreRescaling" -> "DimensionSqrt"][<|"Key" -> keys,
"Query" -> queries, "Value" -> values|>];
merge = NetMapOperator[embedDim][attention];
seq = DropoutLayer[0.1][merge];
seq = seq + #Input
] &, "Input" -> {"Varying", embedDim}
]
]
]
```

If the model's embedding dimension is 128 and it uses 8 attention heads, then the key, query, and value dimensions are n * 8 * 16. In selfAttentionBlock we use a LinearLayer to merge all attention heads back to n * 128 dimensions.

Wolfram

```
feedForwardBlock[embedDim_] := NetInitialize[
NetGraph[
FunctionLayer[
Block[{seq},
seq =
NormalizationLayer[2 ;;, "Same", "Epsilon" -> 0.0001][#Input];
seq = NetMapOperator[4 embedDim][seq];
seq = ElementwiseLayer["GELU"][seq];
seq = NetMapOperator[embedDim][seq];
seq = DropoutLayer[0.1][seq];
seq = seq + #Input
] &,
"Input" -> {"Varying", embedDim}
]
]
]
Information[feedForwardBlock[128], "SummaryGraphic"]
```

In the feed-forward network there are two LinearLayers: the first one's output dimension is n * 512, followed by a GELU activation; then the data is processed by the second one, whose output dimension is n * 128. Notice that in both the attention and feed-forward networks we use the **pre-layer-normalization** arrangement; the same holds in the decoder network.
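Stripped of the normalization and residual wiring, the feed-forward block is just two matrix multiplications around a GELU. A NumPy sketch (the tanh approximation of GELU is used here; the 0.02 weight scale is an arbitrary choice to keep activations tame):

```python
import numpy as np

def gelu(x):
    """tanh approximation of GELU, as used in GPT-style models."""
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def feed_forward(X, W1, b1, W2, b2):
    """Position-wise FFN: expand to 4*d, apply GELU, project back to d."""
    return gelu(X @ W1 + b1) @ W2 + b2

rng = np.random.default_rng(4)
d = 128
X = rng.normal(size=(5, d))                          # 5 tokens, width d
W1, b1 = rng.normal(size=(d, 4 * d)) * 0.02, np.zeros(4 * d)
W2, b2 = rng.normal(size=(4 * d, d)) * 0.02, np.zeros(d)
assert feed_forward(X, W1, b1, W2, b2).shape == (5, d)
```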

The decoder network's components are the same as the encoder's, but the attention layer is replaced by a **masked attention layer** and a **cross attention layer**. (figure-1)

Wolfram

```
crossAttentionBlock[embedDim_, heads_] := NetInitialize[
NetGraph[
FunctionLayer[
Block[{keys, queries, values, seq},
seq =
NormalizationLayer[2 ;;, "Same",
"Epsilon" -> 0.0001 ][#Input];
keys = NetMapOperator[{heads, embedDim/heads}][seq];
values = NetMapOperator[{heads, embedDim/heads}][seq];
queries = NetMapOperator[{heads, embedDim/heads}][ #Query];
seq =
AttentionLayer["Dot", "MultiHead" -> True,
"ScoreRescaling" -> "DimensionSqrt"][<|"Key" -> keys,
"Query" -> queries, "Value" -> values|>];
seq = NetMapOperator[embedDim][seq];
seq = DropoutLayer[0.1][seq];
seq = #Query + seq
] &,
"Input" -> {"Varying", embedDim},
"Query" -> {"Varying", embedDim}
]
]
]
Information[crossAttentionBlock[128, 8], "SummaryGraphic"]
```

The input socket (Key, Value) of the **cross attention layer** comes from the output of the **encoder stack**; the query socket (Query) comes from the output of the **decoder's masked attention layer**.

Now that we have the encoder and decoder blocks, we can stack them into a bigger neural network.

Wolfram

```
encoderStack[embedDim_, heads_, blocks_Integer, vocabulary_] :=
Block[
{emBlock, encoderBlock},
emBlock =
embeddingBlock["depth" -> embedDim, "voca" -> vocabulary,
"hardCode" -> False];
encoderBlock =
NetChain[{selfAttentionBlock[embedDim, heads],
feedForwardBlock[embedDim]}];
NetInitialize@NetGraph@FunctionLayer[
Block[{embedding, block},
embedding = emBlock[#Input];
block = encoderBlock[embedding];
Do[block = encoderBlock[block], {blocks - 1}];
block
] &,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]}
]
]
encoderStack[128, 8, 1, voca]
```

Wolfram

```
decoderStack[embedDim_, heads_, blocks_Integer, vocabulary_] :=
Block[
{emBlock, decoderBlock},
emBlock =
embeddingBlock["depth" -> embedDim, "voca" -> vocabulary,
"hardCode" -> False];
decoderBlock = NetFlatten@NetGraph[
<|"maskedSelfAtt" ->
selfAttentionBlock[embedDim, heads, "Causal"],
"crossatt" -> crossAttentionBlock[embedDim, heads],
"FFN" -> feedForwardBlock[embedDim]|>,
{NetPort["Input"] -> NetPort["maskedSelfAtt", "Input"],
"maskedSelfAtt" -> NetPort["crossatt", "Query"],
NetPort["EncoderInput"] -> NetPort["crossatt", "Input"],
"crossatt" -> "FFN"}
];
NetInitialize@NetGraph@FunctionLayer[
Block[{emb, block, linear, softmax},
emb = emBlock[#Input];
block = decoderBlock[emb];
Do[block = decoderBlock[block], {blocks - 1}];
linear = LinearLayer[] /@ block;
softmax = SoftmaxLayer[] /@ linear
] &,
"Input" -> {"Varying", NetEncoder[{"Class", vocabulary}]},
"Output" -> {"Varying", NetDecoder[{"Class", vocabulary}]}
]
]
decoderStack[128, 8, 1, voca]
```

Finally, we merge the encoder stack and decoder stack (the complete transformer model):

We will train a sequence-to-sequence transformer model to translate from English to French. So how do we train a seq-to-seq transformer model?

**Training Algorithm**:

\text{Input: } \left\{x_n, y_n\right\} \text{, a dataset of sequence pairs of size } N_{\text{data}} \\ \text{Input: } \theta \text{, initial transformer parameters} \\ \text{Output: } \tilde{\theta} \text{, the trained parameters} \\ \text{for } i = 1, 2, 3, \ldots, N_{\text{epochs}} \text{ do} \\ \quad \text{for } n = 1, 2, 3, \ldots, N_{\text{data}} \text{ do} \\ \quad\quad \ell \leftarrow \text{length}\left(y_n\right) \\ \quad\quad P(\theta) \leftarrow \text{TransformerNet}\left(x_n, y_n \mid \theta\right) \\ \quad\quad \text{loss}(\theta) = -\sum_{t=1}^{\ell-1} \log P(\theta)\left[y_n(t+1), t\right] \\ \quad\quad \theta \leftarrow \theta - \eta \cdot \nabla \text{loss}(\theta) \\ \quad \text{end for} \\ \text{end for} \\ \text{return } \tilde{\theta} = \theta

The algorithm is illustrated by the diagram below: **SourceInput is X, Target Input is Y (shifted left by 1), Labels are Y (shifted right by 1)**; **[start] and [end] represent BOS (beginning of sequence) and EOS (end of sequence)**.
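The shifting itself is just slicing. A small Python sketch using the same 50257 marker ID as the Wolfram preprocessing (the other token IDs are made up):

```python
# Teacher forcing: the decoder input is y without its last token,
# and the labels are y without its first token (shifted by one).
BOS = EOS = 50257               # the post reuses one ID for both markers
y = [BOS, 11, 42, 7, EOS]       # target sequence with markers (toy IDs)

target_input = y[:-1]           # [BOS, 11, 42, 7]   fed to the decoder
labels       = y[1:]            # [11, 42, 7, EOS]   compared to the output

assert len(target_input) == len(labels)
assert target_input[0] == BOS and labels[-1] == EOS
```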

**Prepare Datasets:**

Wolfram

```
(* Dataset *)
(* please download the database from https://www.manythings.org/anki/ *)
dataFilePath = "~/CineNeural/notebooks/Datasets/fra-eng/fra.txt"
text = Import[dataFilePath];
sentencePairs = Rule @@@ Part[StringSplit[StringSplit[text, "\n"], "\t"], All, {1, 2}];
$randomSeed = 1357;
SeedRandom[$randomSeed];
trainingSet = RandomSample[sentencePairs];
netEncoder = ResourceFunction["GPTTokenizer"][];
netDecoder = NetDecoder[{"Class", NetExtract[ResourceFunction["GPTTokenizer"][], "Tokens"]}]
trainingTokens = MapAt[netEncoder, trainingSet, {All, {1, 2}}];
(* in the training Tokens we use 50257 tokenid as bos and eos *)
trainingTokens = MapAt[Join[{50257}, #1, {50257}] &, trainingTokens, {All, {1, 2}}];
x = Keys[trainingTokens]; (* source input *)
y = Values[trainingTokens]; (* target input *)
```

**create training network:**

Wolfram

```
encoder = encoderStack[128, 8, 6, voca];
decoder = decoderStack[128, 8, 6, voca];
transformerNet =
NetGraph@
FunctionLayer[
Block[{encode, decode, most, rest}, most = Most[#TargetSequence];
rest = Rest[#TargetSequence];
encode = encoder[#SourceSequence];
decode = decoder[<|"Input" -> most, "EncoderInput" -> encode|>];
CrossEntropyLossLayer["Index"][{decode, rest}]] &,
"SourceSequence" -> NetEncoder[{"Class", voca}],
"TargetSequence" -> NetEncoder[{"Class", voca}]]
```

**Training Model:**

Wolfram

```
result =
NetTrain[
transformerNet, <|"SourceSequence" -> x, "TargetSequence" -> y|>,
All, MaxTrainingRounds -> 4, ValidationSet -> Scaled[0.1],
BatchSize -> 16,
TargetDevice -> {"GPU", All}]
(*save trained model*)
net = result["TrainedNet"]
Export["~/transformer-128depth.wlnet", net]
```

**seq2seq trained model predicting algorithm**:

\text{Input}: \text{A seq2seq transformer and trained parameters } \tilde{\theta } \text { of transformer} \\ \text{Input: x $\in $ } V^* \text{, input sequence} \\ \text{Output: } \tilde{x} \text{ $\in $ } V^* \text{, output sequence} \\ \text{Hyperparameters: $\tau $ $\in $ (0, $\infty $)} \\ \tilde{x} \text{ $\leftarrow $ [bos$\_$token]} \\ y\leftarrow 0 \\ \text{while } y\neq \text{eos$\_$token} \text{ do} \\ \text{P $\leftarrow $ TransformerNet(x, } \tilde{x} \text{ $|$ } \tilde{\theta } ) \\ \text{p $\leftarrow $ P[All, } \text{length}\left(\tilde{x}\right) ] \\ \text{sample a token y from q $\propto $ } p^{1/\tau } \\ \tilde{x} \text{ $\leftarrow $ } \left[\tilde{x}\right. \text{, y]} \\ \text {end} \\ \text{return} \tilde{x}
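The decoding loop can be sketched independently of the trained net. Here `step_probs` is a stand-in for the transformer, returning next-token probabilities for a given prefix, and the dummy model exists only to exercise the loop:

```python
import numpy as np

def sample_sequence(step_probs, bos, eos, tau=1.0, max_len=20, rng=None):
    """Temperature-sampling loop from the algorithm above.

    step_probs(prefix) stands in for TransformerNet: it must return a
    probability vector over the vocabulary for the next token.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    seq = [bos]
    while seq[-1] != eos and len(seq) < max_len:
        p = step_probs(seq)
        q = p ** (1.0 / tau)
        q = q / q.sum()                          # renormalise after tempering
        seq.append(int(rng.choice(len(q), p=q)))
    return seq

# Dummy "model": emits token 3 until the prefix has 3 tokens, then EOS.
EOS = 4
def dummy_model(prefix):
    p = np.zeros(5)
    p[3 if len(prefix) < 3 else EOS] = 1.0
    return p

assert sample_sequence(dummy_model, bos=0, eos=EOS) == [0, 3, 3, EOS]
```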

Wolfram

```
trainedNet =
Import["/Users/alexchen/CineNeural/neural-models/transformer/\
transformer-v2-128depth.wlnet"]
trainedEncodeNet = NetExtract[trainedNet, "encode"]
trainedDecodeNet = NetExtract[trainedNet, "decode"]
predictor =
NetReplacePart[
trainedDecodeNet, {"Output" -> NetDecoder[{"Class", voca}]}]
translate[sourceSentence_String] :=
Module[{sourceSequence, translationSequence, translationTokens,
tokenEncoder, tokenDecoder},
sourceSequence =
trainedEncodeNet[
Join[{50257}, netEncoder[sourceSentence], {50257}]]; (* add bos and eos for source sequence *)
tokenEncoder = NetEncoder[{"Class", voca}];
tokenDecoder =
NetDecoder[{"Class",
NetExtract[ResourceFunction["GPTTokenizer"][], "Tokens"]}];
translationSequence = NestWhile[
Append[
#,
tokenEncoder[Last[predictor[<|
"Input" -> #,
"EncoderInput" -> sourceSequence|>]]]] &,
{50257},
(* check last token of the sequence, if not eos token, then append to sequence for next prediction. *)
If[Length@# >= 2, Last[#] != 50257, True] &,
1, 512];
translationTokens =
Map[tokenDecoder[UnitVector[Length@voca, #]] &,
translationSequence];
StringReplace[StringJoin[Cases[translationTokens, _String]],
"Ġ" -> " "]
]
translate["thank you"] (* returns French: "Merci." *)
```


The mecanum wheel is an omnidirectional wheel design for land-based vehicles, invented by the Swedish engineer Bengt Erland Ilon.

A series of freely moving rollers is attached around the whole circumference of the wheel. These rollers are typically mounted at 45 degrees to the axle line and rotate freely about axes in the plane of the wheel, while the overall side profile of the wheel remains circular. How does the mecanum wheel drive a mobile robot?

First we define the wheel numbering:

- FrontRightWheel as **w1**
- FrontLeftWheel as **w2**
- RearLeftWheel as **w3**
- RearRightWheel as **w4**

How the vehicle moves:

- Running all four wheels in the same direction at the same speed moves the vehicle forward or backward.
- Running both wheels on one side in one direction and the wheels on the other side in the opposite direction rotates the vehicle in place.
- Running w1 backward, w2 forward, w3 backward, and w4 forward moves the vehicle sideways to the right.
- Running w2 and w4 forward while stopping the other wheels moves the vehicle diagonally to the top-right.
- Running w2 and w3 forward while stopping the other wheels rotates the vehicle around a point on the x-axis.
- Running w2 forward and w1 backward while stopping the other wheels rotates the vehicle around a point on the y-axis.

How do the wheel speeds map to the mobile robot's velocity? If we know the robot's velocity is:

\left\{v_x,v_y,\omega _z\right\} \\ v_x, v_y \text{ - } \text{robot linear} \text{ velocity} \\ \omega _z \text{ - } \text{robot angular} \text{ velocity}

what angular velocities do the wheels need?

To solve this mapping problem, we need a **kinematic model**.

- XY – world: the inertial frame, called frame A
- XY – robot: the robot's base frame, called frame C
- XY – wheel: the wheel's base frame, called frame B; the wheel's center is at (x_i, y_i) in frame C
- The robot's position is (x, y) in frame A, and its orientation angle is 𝜙
- v-x, v-y are the wheel's linear velocity components; v-slide is the sliding speed, v-drive is the driving speed
- 𝛾 is the angle of the free sliding direction relative to the frame B y-axis
- 𝛽 is the angle of frame B (wheel) relative to frame C (robot)
- 𝜔 is the wheel's angular velocity
- v-c is the robot's linear velocity in frame C; v-a is the robot's linear velocity in frame A
- 𝜔i is the angular velocity of the robot's i-th wheel
- x, y are the distances from the robot's geometric center to the wheels' geometric centers along each axis

v_{\text{drive}}=v_x+\tan (\gamma ) v_y \\ v_{\text{slide}}=\frac{v_y}{\cos (\gamma )}

\left( \begin{array}{c} v_x \\ v_y \\ \end{array} \right)=\left( \begin{array}{c} 1 \\ 0 \\ \end{array} \right) v_{\text{drive}}+v_{\text{slide}} \left( \begin{array}{c} -\sin (\gamma ) \\ \cos (\gamma ) \\ \end{array} \right)

\omega =\frac{v_{\text{drive}}}{r}=\frac{v_x+\tan (\gamma ) v_y}{r}

v_a=\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)=\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\phi ) & -\sin (\phi ) \\ 0 & \sin (\phi ) & \cos (\phi ) \\ \end{array} \right).\left( \begin{array}{c} \omega _{\text{cz}} \\ v_{\text{cx}} \\ v_{\text{cy}} \\ \end{array} \right)

v_c=\left( \begin{array}{c} \omega _{\text{cz}} \\ v_{\text{cx}} \\ v_{\text{cy}} \\ \end{array} \right)=\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\phi ) & \sin (\phi ) \\ 0 & -\sin (\phi ) & \cos (\phi ) \\ \end{array} \right).\left( \begin{array}{c} \frac{d\phi }{dt} \\ \frac{dx}{dt} \\ \frac{dy}{dt} \\ \end{array} \right)=\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\phi ) & \sin (\phi ) \\ 0 & -\sin (\phi ) & \cos (\phi ) \\ \end{array} \right).\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

We can decompose the linear velocity from frame A (world) into frame C (robot), then into frame B (wheel), giving:

T_1=\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\phi ) & \sin (\phi ) \\ 0 & -\sin (\phi ) & \cos (\phi ) \\ \end{array} \right).\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

Because our mobile robot performs both translational and rotational movements, the angular velocity must also be projected onto the wheel's linear velocity.

T_1=\left\{\dot{\phi },\dot{x} \cos (\phi )+\dot{y} \sin (\phi ),\dot{y} \cos (\phi )-\dot{x} \sin (\phi )\right\}

\omega _{\text{robot}}=\dot{\phi } \\ v_{\text{xRobot}}=\dot{x} \cos (\phi )+\dot{y} \sin (\phi ) \\ v_{\text{yRobot}}=\dot{y} \cos (\phi )-\dot{x} \sin (\phi )

v_{\text{xwheel}}=v_{\text{xRobot}}-\sin (\beta ) \omega _{\text{robot}} \sqrt{x_i^2+y_i^2}=v_{\text{xRobot}}-\frac{y_i}{\sqrt{x_i^2+y_i^2}} \omega _{\text{robot}} \sqrt{x_i^2+y_i^2}=v_{\text{xRobot}}-\dot{\phi } y_i \\ v_{\text{ywheel}}=\cos (\beta ) \omega _{\text{robot}} \sqrt{x_i^2+y_i^2}+v_{\text{yRobot}}=\frac{x_i}{\sqrt{x_i^2+y_i^2}} \omega _{\text{robot}} \sqrt{x_i^2+y_i^2}+v_{\text{yRobot}}=\dot{\phi } x_i+v_{\text{yRobot}}

So we get the translation matrix from the robot frame to the wheel frame:

T_2=\left( \begin{array}{ccc} -y_i & 1 & 0 \\ x_i & 0 & 1 \\ \end{array} \right).T_1

Then we get the rotation matrix from the robot frame to the wheel frame:

T_3=\left( \begin{array}{cc} \cos \left(\beta _i\right) & \sin \left(\beta _i\right) \\ -\sin \left(\beta _i\right) & \cos \left(\beta _i\right) \\ \end{array} \right).T_2

Projecting onto the roller's drive direction and dividing by the wheel radius gives:

T_4=\left( \begin{array}{cc} \frac{1}{r_i} & \frac{\tan \left(\gamma _i\right)}{r_i} \\ \end{array} \right).T_3

Finally, since T_4 already chains all of the transforms together, the wheel's angular velocity is:

\omega _i=T_4

\omega _i=\left( \begin{array}{cc} \frac{1}{r_i} & \frac{\tan \left(\gamma _i\right)}{r_i} \\ \end{array} \right).\left( \begin{array}{cc} \cos \left(\beta _i\right) & \sin \left(\beta _i\right) \\ -\sin \left(\beta _i\right) & \cos \left(\beta _i\right) \\ \end{array} \right).\left( \begin{array}{ccc} -y_i & 1 & 0 \\ x_i & 0 & 1 \\ \end{array} \right).\left( \begin{array}{ccc} 1 & 0 & 0 \\ 0 & \cos (\phi ) & \sin (\phi ) \\ 0 & -\sin (\phi ) & \cos (\phi ) \\ \end{array} \right).\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

\omega _i=h_i(\phi ).\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

h_i(\phi )=\left( \begin{array}{ccc} \frac{\sec \left(\gamma _i\right) \left(x_i \sin \left(\beta _i+\gamma _i\right)-y_i \cos \left(\beta _i+\gamma _i\right)\right)}{r_i} & \frac{\sec \left(\gamma _i\right) \cos \left(\beta _i+\gamma _i+\phi \right)}{r_i} & \frac{\sec \left(\gamma _i\right) \sin \left(\beta _i+\gamma _i+\phi \right)}{r_i} \\ \end{array} \right)

In particular, when 𝛟 is 0:

h_i(0)=\left( \begin{array}{ccc} \frac{\sec \left(\gamma _i\right) \left(x_i \sin \left(\beta _i+\gamma _i\right)-y_i \cos \left(\beta _i+\gamma _i\right)\right)}{r_i} & \frac{\sec \left(\gamma _i\right) \cos \left(\beta _i+\gamma _i\right)}{r_i} & \frac{\sec \left(\gamma _i\right) \sin \left(\beta _i+\gamma _i\right)}{r_i} \\ \end{array} \right)

Assuming our robot has four wheels, we get the H matrix:

H=\left( \begin{array}{c} h_1(\phi ) \\ h_2(\phi ) \\ h_3(\phi ) \\ h_4(\phi ) \\ \end{array} \right)

Now we take the angle 𝛽 between the robot frame and each wheel frame to be 0, and assign the wheel parameter sets W1, W2, W3, W4, where "->" means substituting that value for the symbol in the formula.

W_1=\left\{\gamma _1\to \frac{\pi }{4},\beta _1\to 0,x_1\to x,y_1\to -y,r_1\to r\right\}\\ W_2=\left\{\gamma _2\to -\frac{\pi }{4},\beta _2\to 0,x_2\to x,y_2\to y,r_2\to r\right\} \\ W_3=\left\{\gamma _3\to \frac{\pi }{4},\beta _3\to 0,x_3\to -x,y_3\to y,r_3\to r\right\} \\ W_4=\left\{\gamma _4\to -\frac{\pi }{4},\beta _4\to 0,x_4\to -x,y_4\to -y,r_4\to r\right\}

Defining x + y = l, we get the inverse kinematics:

\left( \begin{array}{c} \omega _1 \\ \omega _2 \\ \omega _3 \\ \omega _4 \\ \end{array} \right)=\left( \begin{array}{c} \frac{v_{\text{xRobot}}+v_{\text{yRobot}}+(x+y) \omega _{\text{zRobot}}}{r} \\ \frac{v_{\text{xRobot}}-v_{\text{yRobot}}-(x+y) \omega _{\text{zRobot}}}{r} \\ \frac{v_{\text{xRobot}}+v_{\text{yRobot}}-(x+y) \omega _{\text{zRobot}}}{r} \\ \frac{v_{\text{xRobot}}-v_{\text{yRobot}}+(x+y) \omega _{\text{zRobot}}}{r} \\ \end{array} \right)=\frac{1}{r}.\left( \begin{array}{ccc} 1 & 1 & l \\ 1 & -1 & -l \\ 1 & 1 & -l \\ 1 & -1 & l \\ \end{array} \right).\left( \begin{array}{c} \dot{x} \\ \dot{y} \\ \dot{\phi } \\ \end{array} \right)
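The inverse kinematics above reduces to four lines of C. This is a sketch under the wheel layout of the W1–W4 assignments (the function name is mine):

```c
/* Inverse kinematics: body velocities (vx, vy in m/s, wz in rad/s)
 * to wheel angular velocities, wheel order FR, FL, RL, RR.
 * l = x + y (half-length plus half-width), r = wheel radius. */
void mecanum_ik(double vx, double vy, double wz,
                double l, double r, double w[4])
{
    w[0] = (vx + vy + l * wz) / r;  /* omega_1, FrontRight */
    w[1] = (vx - vy - l * wz) / r;  /* omega_2, FrontLeft  */
    w[2] = (vx + vy - l * wz) / r;  /* omega_3, RearLeft   */
    w[3] = (vx - vy + l * wz) / r;  /* omega_4, RearRight  */
}
```

A pure sideways command vy = 1 m/s with l = 1 m and r = 0.25 m gives (4, -4, 4, -4) rad/s: adjacent wheels counter-rotate, which is exactly the Mecanum strafing pattern.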

Since we know the inverse kinematics, we can compute the forward kinematics through the left pseudoinverse (written T^{-1} below):

T=H(0)

\omega =T.\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

T=\left( \begin{array}{ccc} \frac{l}{r} & \frac{1}{r} & \frac{1}{r} \\ -\frac{l}{r} & \frac{1}{r} & -\frac{1}{r} \\ -\frac{l}{r} & \frac{1}{r} & \frac{1}{r} \\ \frac{l}{r} & \frac{1}{r} & -\frac{1}{r} \\ \end{array} \right)=\frac{1}{r}.\left( \begin{array}{ccc} l & 1 & 1 \\ -l & 1 & -1 \\ -l & 1 & 1 \\ l & 1 & -1 \\ \end{array} \right)

T^{-1}.\omega =\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

T^{-1}=\left(T^T.T\right)^{-1}.T^T

T^{-1}=\left( \begin{array}{cccc} \frac{r}{4 l} & -\frac{r}{4 l} & -\frac{r}{4 l} & \frac{r}{4 l} \\ \frac{r}{4} & \frac{r}{4} & \frac{r}{4} & \frac{r}{4} \\ \frac{r}{4} & -\frac{r}{4} & \frac{r}{4} & -\frac{r}{4} \\ \end{array} \right)=\frac{r}{4}.\left( \begin{array}{cccc} \frac{1}{l} & -\frac{1}{l} & -\frac{1}{l} & \frac{1}{l} \\ 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ \end{array} \right)

Finally, we get the forward kinematics:

\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)=\frac{r}{4}.\left( \begin{array}{cccc} \frac{1}{l} & -\frac{1}{l} & -\frac{1}{l} & \frac{1}{l} \\ 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ \end{array} \right).\left( \begin{array}{c} \omega _1 \\ \omega _2 \\ \omega _3 \\ \omega _4 \\ \end{array} \right)
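The forward kinematics is just the pseudoinverse rows written out. A minimal C sketch under the same conventions as the inverse kinematics (function name mine):

```c
/* Forward kinematics: wheel speeds (FR, FL, RL, RR order) back to
 * body rates (phidot in rad/s, xdot and ydot in m/s). */
void mecanum_fk(const double w[4], double l, double r,
                double *phidot, double *xdot, double *ydot)
{
    *phidot = (r / (4.0 * l)) * (w[0] - w[1] - w[2] + w[3]);
    *xdot   = (r / 4.0) * (w[0] + w[1] + w[2] + w[3]);
    *ydot   = (r / 4.0) * (w[0] - w[1] + w[2] - w[3]);
}
```

Feeding the strafing pattern (4, -4, 4, -4) rad/s with l = 1 m and r = 0.25 m back through this recovers φ̇ = 0, ẋ = 0, ẏ = 1 m/s, confirming FK and IK are consistent.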

In this article we derived the forward kinematics (FK) and inverse kinematics (IK) of Mecanum wheels. Using FK and IK, we can drive a Mecanum-wheeled robot in a much more controllable way.

The robot’s Mecanum wheel parameters:

| wheel id | wheel name | 𝛾 | 𝛽 | X | Y | R |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | FrontRight (FR) | 𝜋/4 | 0 | x | -y | r |
| 2 | FrontLeft (FL) | -𝜋/4 | 0 | x | y | r |
| 3 | RearLeft (RL) | 𝜋/4 | 0 | -x | y | r |
| 4 | RearRight (RR) | -𝜋/4 | 0 | -x | -y | r |

\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)=\frac{r}{4}.\left( \begin{array}{cccc} \frac{1}{l} & -\frac{1}{l} & -\frac{1}{l} & \frac{1}{l} \\ 1 & 1 & 1 & 1 \\ 1 & -1 & 1 & -1 \\ \end{array} \right).\left( \begin{array}{c} \omega _1 \\ \omega _2 \\ \omega _3 \\ \omega _4 \\ \end{array} \right)

\left( \begin{array}{c} \omega _1 \\ \omega _2 \\ \omega _3 \\ \omega _4 \\ \end{array} \right)=\frac{1}{r}.\left( \begin{array}{ccc} l & 1 & 1 \\ -l & 1 & -1 \\ -l & 1 & 1 \\ l & 1 & -1 \\ \end{array} \right).\left( \begin{array}{c} \dot{\phi } \\ \dot{x} \\ \dot{y} \\ \end{array} \right)

- Refining your idea
- To have a clear path forward
- Inspiration
- Reference
- Blockout studies
- Photobash && Concept

- To have a rough idea of what the final layout and camera angle will be
- Ref man
- Move fast and make big changes

- Finishing the blockout
- To detail out the final blockout and solidify the camera angle
- Add simple details
- Play with camera angles

- Find the crack: to avoid getting overwhelmed with our scene, start designing and creating some props
- Decide which prop to make
- Using Reference, design and create our props
- Consider functionality and efficiency

- To go over useful steps for efficiently creating props
- Reference searching basics
- Reusing props and elements of props
- Replacing blockout shapes with props

- Initial lighting set up
- Speeding up the rendering processes
- Initial things to consider
- Very basic render settings

- Render pass and composite
- Proper way to composite layers
- Decals and paint adjustments
- Glow, flares, DOF (depth of field), finishing touches


1902 – Georges Méliès films *A Trip to the Moon*, using actors in front of painted backdrops to create a fanciful journey.

1903 – Edwin S. Porter directs *The Great Train Robbery*. Porter creates some of the first matte composites by rewinding the film in camera.

1905 – Norman Dawn, commercial artist and photographer for the Thorpe Engraving Company, experiments with the glass paintings on still photographs on advice of his boss, Max Handschiegl.

1907 – With *The Missions of California*, Norman Dawn produces the first known example of the glass shot. Using the technique to “restore” damage caused by weather to the neglected missions, he places a glass with the painted corrections between the camera and existing buildings.

1912 – Edward Rogers produces what is possibly the first glass shot in England.

1913 – Norman Dawn employs one of the first known uses of “rear projection” by projecting a still film image on a frosted glass plate behind an actor during photography for his western, *The Drifter*.

1914 – Norman Dawn purchases the new Bell & Howell 2709 camera that is precise enough to do convincing multiple exposures. The camera helps Dawn to develop original negative matte painting technique.

1916 – Walter Hall, the English art director of D. W. Griffith’s *Intolerance*, develops his own method of creating the glass shot. He paints the additions to the scene on composition board, cuts them out with a beveled edge, and mounts them in front of the camera. He patented this variation of the glass shot technique, known as “The Hall Process” in 1921.

1921 – Ferdinand Pinney Earle directs and paints mattes for *The Rubaiyat of Omar Khayyam*. Paul Detlefsen assists.

1922 – Walter Percy (“Pop”) Day introduces the “The Hall Process” to the French film industry in *Les Opprimés*.

1925 – Warren Newcombe becomes head of the MGM matte department. Ralph Hammeras paints mattes for *The Lost World*. Ferdinand P. Earle paints mattes, which include a shooting star over Bethlehem, in *Ben-Hur*.

1927 – Clarence Slifer arrives in Hollywood to become an assistant cameraman after winning a contest in *Screenland* magazine. Still in Paris, Percy Day uses the “The Hall Process” for *Napoleon*.

1928 – Linwood G. Dunn joins the visual effects department at RKO.

1929 – Bud Thackery and Paul Grimm paint glass shots of the ark, photographed at the Iverson Ranch for *Noah’s Ark*.

1930 – Percy Day develops his version of the latent image technique and applies it in *Au Bonheur des Dames*.

1933 – Mario and Juan Larrinaga, Byron L. Crabbe, and Henri Hillinck paint the ominous Skull Island and views of New York City for *King Kong*.

1934 – Returning to England, Percy Day and his assistant and stepson Peter Ellenshaw paint mattes for producer Alexander Korda. Day will head the visual effects departments at Denham Studio and later at the Shepperton Studio. Jack Cosgrove and Russell Lawson paint mattes for *The Black Cat*. They team up at the beginning of the 1930s, establishing headquarters at Universal, among other studios.

1935 – Director Alfred Hitchcock has illustrator Fortunino Matania create a matte painting for the trap sequence at the Royal Albert Hall in *The Man Who Knew Too Much*.

1936 – Clarence Slifer and Jack Cosgrove paint mattes for *Garden of Allah* the first Technicolor film to use original negative matte paintings.

Jack Cosgrove becomes head of the Selznick International visual effects department.

Ray Kellogg becomes the chief matte painter at Twentieth Century Fox. Emil Kosa, Jr., is his assistant.

Percy Day and assistant Peter Ellenshaw paint mattes for *Things to Come*.

1937 – Albert Maxwell Simpson and Byron Crabbe paint mattes for *The Prisoner of Zenda*.

1939 – Jack Cosgrove supervises and paints mattes along with Albert Maxwell Simpson, Jack Shaw, and Fitch Fulton to create the establishing shots of Scarlett’s Tara and views of Atlanta under siege in *Gone With the Wind*. Clarence Slifer supervises matte camera effects and opticals.

Chesley Bonestell paints mattes for *The Hunchback of Notre Dame* and *Only Angels Have Wings*.

Fred Sersen supervises and paints mattes along with Ray Kellogg on *The Rains Came*.

Warren Newcombe and his department paint mattes for *The Wizard of Oz*, including one of the most famous matte shots: the Emerald City.

*Matte World Digital © 2002*

“The invading army was the technical people who built the machines. At first we [artists] were all confused; traditional matte painting and digital was a head-on collision. There was lots of carnage. Then, eventually, the smoke cleared and it became clear what to do. What happened was artists who were afraid of the thing eventually said, ‘Step aside, let me take a look at that.'”

Robert Stromberg, digital matte painter

Although computer-generated effects had begun appearing in the 1980s, notably with ILM’s “Genesis Effect” of a barren planet becoming transformed into a garden world for Star Trek II, it was not until a decade later that digital technology became reliable and cost effective. The turning point was ILM’s creation of the realistic computer-generated dinosaurs for the 1993 release *Jurassic Park*.

Predictions of the time, which prophesied the end of all traditional visual effects, were greatly exaggerated. Makeup, creature costumes, animatronic effects, miniatures, and scale models all remain vital crafts, although every one of those disciplines has been changed by computer technology.

But the computer did have a sudden impact on other aspects of the craft. Almost overnight, optical printers were replaced because of the new freedom to scan images into a computer and seamlessly create final composites free of image degradation. And traditional matte painting was soon transformed, with digital paint programs allowing for new freedoms and, potentially, more complex shots.

But for the new breed of digital matte painter, the transition from brush and oils and canvas to software and computer monitors has not altered the irreducible essence at the heart of any creative equation: the inventive mind and talent of the individual artist.

When director James Cameron was making this film about the doomed 1912 maiden voyage of the *Titanic*, the production logistics included a nearly full-scale re-creation of the luxury ship and a special studio, built in the Mexican seaside town of Rosarito, that included an eight-acre water tank and three large stages. The production’s scale, and the price tag that went with it, had Hollywood and movie critics primed for the kind of legendary box-office failure that bankrupts studios. What Cameron delivered was the most successful box-office film of all time, with eleven Oscars awarded at the Academy’s annual ceremony.

While the effects-heavy film took full advantage of digital technology, this climactic image of the crew of the *Carpathia* searching the icy waters for survivors was a fusion of traditional and digital techniques. The shot, created by Matte World Digital, combined a live plate of lifeboats shot in Mexico, physical and computer-generated models of floating icebergs, and a live-action smoke element added to the painted smokestack of the *Carpathia*. The rescue ship itself was created by Chris Evans as an old-fashioned, acrylic-on-Masonite board painting. The painting was then photographed and scanned into the computer along with the other elements, including a digitally painted dawn sky, in the final composite image.

The *Carpathia* painting marked a full circle for Evans, the first artist to take a matte painting into the digital realm (created at Industrial Light + Magic for a scene of a stained-glass knight magically coming to life in the 1985 release *Young Sherlock Holmes*). Although Matte World Digital had originally considered doing the ship with computer graphics, the looming deadline allowed only two weeks for creating all the elements and a final composite. It was Evans who suggested it would be quicker to create the ship as a traditional matte painting, a rare recourse to brush and paints in the digital age.

For Evans, the *Titanic* assignment had a personal echo. His great-grandfather, John Bartholomew, worked for the White Star Line as chief victuals officer and was scheduled to sail on the *Titanic* maiden voyage as a company VIP. The night before the launch, however, Bartholomew was stricken with an illness and canceled his trip. The notice came so late his luggage was already aboard the ship, and early reports on the disaster listed Bartholomew as one of the casualties. “When he heard that the *Titanic* went down with so many of the friends he’d worked with for thirty or forty years, he was heartbroken,” Evans recalled in a December 1997 *Cinefex* special issue on the making of *Titanic*.

The 1990s was an intriguing decade for matte painting, a time when new digital tools appeared and began to be applied, but also a time when entire productions still embraced traditional effects. One such was *Bram Stoker’s Dracula*, with director Francis Ford Coppola contracting Matte World specifically to create matte shots the old-fashioned way. The film, set in the Victorian times that coincided with the earliest days of movies, inspired Coppola to attempt to use effects appropriate to that era. (Matte World did, however, dissuade the director from shooting glass shots on location, which, although a seminal effect, had always been laborious and time-consuming even under the best conditions.)

In this shot of a horse-driven carriage approaching Dracula’s castle, the live-action matte element was combined with artist Bill Mather’s painting on the same strip of film, with the camera rewound to film each new element. The film was then put into a high-speed camera to shoot several passes of “snow,” actually baking soda shaken through a wire mesh screen.

This shot also demonstrates the subliminal effects that can be achieved by a matte painting. Beginning with a production sketch by artist Jim Steranko, Matte World concept artist Sean Joyce worked with the director to develop the initial idea of the vampire’s castle being shaped like a body on a throne. “Francis wanted this subconscious effect of a tortured man, screaming some kind of plea to Heaven,” Joyce noted.

This Martin Scorsese film was set in the Las Vegas of 1974, a time period when the fabled Strip was dominated by the Tropicana and Flamingo hotels and such iconic structures as the glittering, 180-foot-tall Dunes sign. But twenty years later, when Scorsese was making *Casino*, those landmarks had been demolished. Enter Matte World Digital, the traditional matte-painting company having adapted to the new digital verities in both name and technology. Scorsese’s assignment for the effects house was to re-create the Strip’s period look and add the fictitious Tangiers Hotel to the mix.

For the shot pictured here, Matte World Digital combined a live-action plate with computer-generated images of the Dunes sign and the Tangiers Hotel, with the glittering neon itself created through radiosity lighting software developed by Lightscape, a Silicon Valley firm. Prior to radiosity, the rendering of a 3-D computer model only accounted for light coming from a specific source, ignoring the way light actually interacts and breaks up. The complex interplay of direct illumination and “bounce light” is the way the real world looks, which is why the earliest computer-generated models with only direct light sources look so flat and unrealistic.

Using the 3-D wireframe models that Matte World Digital built in the computer, the radiosity software allowed for the computer-generated surfaces to incorporate a 2-D “mesh,” made up of triangles and rectangles, which helped to automatically determine and represent all illuminative gradations, from strong light sources to diffuse bounce-light effects. The Tangiers sign here is composed of some 158,000 mesh elements, the Dunes sign a staggering 2.5 million.

“What’s great about matte painting is you get to control a little bit of the movie. Sure, you’ve got everybody telling you how to do it, but you get to bring across some narration, maybe even an emotion, and that’s heady stuff. It’s your moment. It can be intoxicating, can make you feel powerful. You’re fighting for control over this image!”

Harrison Ellenshaw, Disney Studio/Industrial Light + Magic matte painter

In 1975 a young director named Steven Spielberg saw his movie *Jaws* become a national phenomenon, while another young director named George Lucas was in production on a little film called *Star Wars*. This would be the beginning of a new era, with genres resurrected from science fiction themes to adventures patterned after old Saturday matinee serials and pumped up into effects-fueled spectacles with crossover appeal and boffo box office. Although Lucas had always imagined *Star Wars* as a saga requiring a number of films, in the decades to come any successful movie might produce potential sequels. Marketing would become more sophisticated, with the phrase “summer movie” understood to mean a potential blockbuster. And with the billions that *Star Wars* licensed products have generated (the rights to which George Lucas shrewdly kept during his first *Star Wars* negotiation with Twentieth Century Fox), once marginal, ancillary marketing tie-ins became potentially more lucrative than the box office itself.

The new blockbuster era was also a time of transition. Lucas’ Industrial Light + Magic (ILM) effects house, organized to create the effects for *Star Wars*, was soon being hired out for effects assignments at other studios, and other independent effects houses such as Apogee and Boss Films entered the field. Meanwhile, behind the scenes, the think tanks within ILM and other effects shops were busily making the first feature films to venture into the digital realm.

With fantasy and adventure themes so popular, matte painting was more important than ever. Thus, in a time of change the tradition continued, with brush and oils truly conjuring worlds.

Many *Star Wars* fans find this sequel to the phenomenally popular first film their favorite chapter, with dramatic plot turns including Luke Skywalker’s first encounter and apprenticeship with Jedi master Yoda and the dramatic final confrontation between the aspiring Jedi Knight and the evil Darth Vader. The fantastic ILM effects ranged from the stop-motion animation of the Imperial Walkers during their attack on the Rebel base on ice planet Hoth to matte paintings creating everything from an asteroid field in space, to the swamp planet of Dagobah and entrancing visions of Cloud City.

Here we see a shot from the dramatic duel between Luke and Vader in an air shaft on Cloud City. ILM’s matte department composited the live action of the doorway and actors with a Ralph McQuarrie matte painting, via a “front-projection system” developed for *Empire* by Richard Edlund and Neil Krepela. The light-saber beams themselves were rotoscoped animation elements provided by ILM’s animation department, which Krepela’s matte-camera assistant Craig Barron, who was working on his first movie, composited into the scene using the front-projection system.

For McQuarrie, who had developed the look of *Star Wars* back when Lucas was first dreaming everything up, *Empire* was a chance not only to create production designs of characters and environments, but to follow through and bring them to life as final matte-painted shots. McQuarrie laughed as he recalled that the exact nature of the environment pictured here was never totally explained before his department set to work: “I never quite figured it out, frankly. That’s one of the things that was total fantasy. I’ve forgotten whether it was my idea, or George’s, or somebody else’s to use an air shaft. Basically, the point was George wanted a cliffhanger location for the duel.”

In this David Lynch production of the Frank Herbert novel, the planet Arrakis is a desert world in which water is more precious than gold and Melange, the “spice” vital for interstellar travel, is mined. Against this backdrop a Holy War is brewing, as the people long for a messiah to lead them against the evil Harkonnen empire.

*Dune*, released by Universal, was another challenge for Albert Whitlock’s matte department. Whitlock gave his apprentice Syd Dutton (who would cofound Illusion Arts with Universal matte cameraman Bill Taylor) complete freedom to create this shot of a cable car passing over the labyrinthine city of Giedi Prime, the domain of Baron Harkonnen and a center of spice processing. This shot followed the Whitlock philosophy of shooting matte paintings onto original negative.

Dutton’s work was inspired by a sketch provided by *Dune* production designer Tony Masters. It was also a unique effect on the production, as Bill Taylor described in an article documenting the making of the film for the April 1985 issue of *Cinefex*: “The Giedi Prime shot was unusual for several reasons…. First, because there was no set involved at all. The set had long been struck by the time we got the assignment. So it’s a full painting, with just a couple of live-action inserts. Second, the painting was actually begun before the live-action elements were shot. Third, it also marked the first time we photographed our own motion-control miniature, the cable car, for incorporation into a matte shot.”

One of Disney’s classic live-action, family fantasy films, *The Love Bug* was released in a downtime for visual effects. It was the twilight of the studio system, as most studios were in the process of selling their backlots, auctioning off their assets, and closing their production departments. There were rare exceptions, notably Albert Whitlock’s matte-painting department at Universal. But it was Disney Studio, having won its enduring fame with cartoon animation, that maintained the tradition of a backlot and soundstages and in-house effects departments for live-action films.

In this *Love Bug* scene we see Herbie, the magical Volkswagen, in front of an old San Francisco firehouse overlooking the bay. Although painter Alan Maley visited San Francisco for research, this and other scenes were created in Burbank on the Disney lot. It is a city of the imagination, with a fanciful firehouse on a street that doesn’t exist, created with a bit of soundstage set and Maley’s masterful glass painting.

Matte painter Harrison Ellenshaw commented on this unsung example of matte-painting magic: “I wish I could have been there to watch Alan work on this painting. Very few matte artists would be so daring and clever. The composition is brilliant: the idea of putting the telephone pole in the foreground helps balance the shot, and it’s a nice touch adding the two traffic cones. But when we watch the film we look past the foreground to see Herbie in front of this wonderful old firehouse, which is what the shot is all about.

“Note how Alan even incorporated some lens distortion into his painting: the horizontals and verticals near the edges curve slightly, which is a subtle yet effective touch. We know it’s late in the day because of the long shadows, another daring and clever idea. Alan did this painting a few years before I joined the department as an apprentice, but I recall him telling me, years later, that he’d started the painting to match the live-action plate as if it were shot in bright sunlight. But since the set was inside a soundstage with stage lighting, he decided, after much struggling, to try it as if the live action were in shadow. Alan was very proud of this shot and actually kept it intact, a rarity in those days when a finished painting on glass was scraped off so the glass could be used again.”


“Al Whitlock taught me things mostly by osmosis. It was about being around him, seeing him. Al didn’t believe in drawing out a shot. It was about the energy of the moment when he was painting. He believed you had to come to a matte painting with focus and a certain energy.”

Syd Dutton, Whitlock protégé and cofounder, Illusion Arts

As film production began changing with the digital breakthroughs of the 1990s, it became necessary to make a distinction between digital and “traditional” effects. While computer graphics entailed complex new digital technology, traditional effects artists worked in a hands-on world with hallowed tools of the trade, traditions, and techniques passed down the lineage of their craft.

Although Albert Whitlock always worked in the traditional era, he stands in the first rank of the pantheon of matte painters, be they traditional or digital artists. Like others from those predigital days, he could wield his brush expertly, each stroke leaving impressionistic dabs of paint that “read” as real when a final painting was filmed. But Whitlock was also a master at designing and enhancing his paintings with special effects, optical illusions, and effects photography. During his reign as head of the Universal matte department he won two Academy Awards, became the trusted effects guru for Alfred Hitchcock, and was in demand by such directors as John Huston, Robert Wise, and Hal Ashby.

This film about a wealthy eccentric (played by Lucille Ball) whose adventures span the flapper era of the Roaring Twenties, the stock market crash, and the Depression, featured this fantasy scene, created by Albert Whitlock, in which Mame and her young ward Patrick (Kirby Furlong) sit on the most precarious of perches. The stage setup needed only one practical spike of the Statue of Liberty’s crown, and the painted blue floor was not a “bluescreen” effect but a guide for the ocean that would be part of Whitlock’s final painting.

Although the shot seems impossible, Whitlock actually designed the scene as it could potentially have been filmed, he revealed: “I don’t like the omnipotent viewpoint, those kinds of shots never feel real to me. So I designed it to look as if the camera had been set up on the torch of the statue’s raised arm. I think taking a realistic approach to how you could really shoot something like this helps make the shot seem more real. Of course, they would never let you shoot it like that for real, and the actors would refuse, anyway. I remember that although the little boy had a safety belt on him, he was nervous at first, but got more comfortable after the first take.”

This film, released by Universal the year men first walked on the moon, still seemed like science fiction in a world in which personal computers didn’t exist. But big mainframes, kept in the province of scientists and academics, did exist, as did fears that those mysterious machines might someday supplant humans. That fear fuels this film’s premise, with Dr. Forbin working in a mountain fortress research center where he has developed Colossus, a thinking machine that decides human beings are The Enemy.

In this ominous shot, crafted by Albert Whitlock, Dr. Forbin turns on the seemingly endless computer banks that comprise Colossus. Whitlock was always an advocate for an original-negative approach, trying to get a shot “in-camera” and thus avoid the inevitable degradation of a filmed image that is rephotographed many times in the optical duping process. Here, Whitlock utilized painted cel animation overlays of silver light panels to create the illusion of his computer turning on in stages. The effect was captured on the original negative, with the camera rewound several times for each new exposure.

The shot also had to match the practical lighting effect for the live action element of Dr. Forbin walking down the vast computer corridor. “I thought we’d get lucky and smoothly match my animation in the painting to the on-set lighting,” Whitlock recalled. “What helped was Forbin standing exactly between the lighting effect, near the matte line so you didn’t notice the changeover. And he’s wearing a white suit that distracts your eye just at the right time. I remember this shot went over very big with the brass in the front office. They loved the way the lights turned on.”


“It was always effective, when you went to the big films in those days, that people actually were so moved by the impact of these tremendous big screen cataclysms and effects… But the secrets [of their creation] were well kept.”

Jesse Lasky, Jr., screenwriter, The Ten Commandments

The decade of the 1950s could be called the Big Picture era. “Wide-screen” debuted in 1952 with Cinerama, which utilized three cameras for filming, three electronically synchronized projectors running at twenty-four frames per second, and a gigantic curved screen, creating a feeling of limitless space. In 1953, Twentieth Century Fox ushered in CinemaScope, and a host of other big-screen innovations followed from the various studios.

The term Big Picture also sums up the type of production then in vogue. While the movies had always been enamored of epics, wide-screen technology provided an irresistible staging ground for blockbuster productions and the matte paintings needed to bring those grand visions to life.

*The Great Race* is a madcap chase movie and a particularly complex production, given its period setting at the turn of the twentieth century and its premise of an around-the-world race between rival automobile companies. This production was another challenge for Linwood G. Dunn’s Film Effects of Hollywood, with matte artists Albert Simpson and Cliff Silsby creating more than twenty-five matte paintings for the show. *The Great Race* also marked a reunion for Simpson and Dunn, both of them veterans of RKO’s glory days and such productions as *The Devil and Daniel Webster*.

The effect seen here is a classic “jeopardy shot,” successfully and safely created by combining Albert Simpson’s matte-painted building with live action of this performer hanging from a ledge set that was, in reality, only a few feet off a soundstage floor.

Lew Wallace’s story of Judah Ben-Hur, set in the time of ancient Rome, was first adapted for the screen in 1927. The early film was already legendary for its sea battle and chariot race by the time this remake went into production. But audiences were ready for an updated version, and the new blockbuster swept the Oscars with an astounding eleven golden statuettes, including Best Picture, Best Actor (Charlton Heston in the title role), and Best Visual Effects.

Matte paintings were vital in re-creating the grandeur of the lost world in *Ben-Hur*. For this image of the emperor and an enthusiastic crowd greeting legions of soldiers returning from another victorious campaign, MGM matte painter Matthew Yuricich not only painted Rome in all its imperial splendor, but added that old matte painter’s trick: dabs of paint to represent people, from “marching” soldiers to waving crowds. Yuricich didn’t paint on glass but on Masonite board; he then poked holes behind the painted people and, by moving another painting behind the holes, created a flickering effect and the illusion of movement.

In the “before” image we see the painting as Yuricich worked on it, complete with a black-and-white photograph taken on the small live-action set and pasted onto his Masonite. Painting to a photographic reference helped MGM matte painters to line up their paintings perfectly with live-action sets. This finished painting was then combined with the live-action element in the optical printer, the live film simply replacing the photographic reference.

Yuricich explained that his creative partner Clarence Slifer (who had moved to MGM in the last days of Selznick International) used an optical printer to replicate photography of a group of live-action soldiers, rephotographing the same element in smaller perspective to create the image of columns of soldiers marching in review. But on the first optical composite test combining the new dupe negative of a legion of soldiers with the matte painting, things got a little out of whack, Yuricich recalled: “The foremost legions were to reach the base of the steps [of the emperor’s reviewing stand], turn screen right and exit frame. Unfortunately, because our timing was off, the entire legion turned at once, marched under the matte line and disappeared! We had a good laugh seeing that. By take two we had it figured right.”

Director Stanley Kramer’s all-star comedy, with a cast ranging from Spencer Tracy and Milton Berle to Jonathan Winters and Phil Silvers, plus a Three Stooges cameo thrown in for good measure, is a madcap dash for buried cash. In this penultimate scene, the *Mad* gang are trying to get off the collapsing fire escape of a dilapidated building, but have tumbled onto a fire engine ladder that becomes overweighted and starts swaying dangerously back and forth.

“What we were doing here was trying to make things look real and scary,” explained Linwood G. Dunn, who created this effect with matte artist Cliff Silsby at Film Effects of Hollywood, the independent company Dunn formed after RKO closed in the 1950s. “Our matte-painted building completes a fictitious twelve-story building that had, as its base, a two-story set shot on location. It was our job to make this sequence look convincing, and who’s going to know it’s painted if it’s a good job? That’s the matte painter’s job on a shot like this: to be invisible.”

Dunn, ever the innovator, years later noted this shot was an example of what would become known in the digital age as “previsualization,” the rough computer graphics imagery that works out the look of a shot or effects element. Dunn’s version was a crude test of a swaying, three-foot miniature fire ladder, which became a running joke between Dunn and Kramer, the director harboring a worried suspicion that the rough test was going to be as good as it got.

This *Mad* shot was also part of a traveling presentation in which Dunn revealed secrets of visual effects, in the process inspiring many a young person to enter the business, including Syd Dutton, a future matte painter and cofounder of Illusion Arts.


“We had a sign over our barracks: ‘Is This Trip Really Necessary?'”

Lou Lichtenfield, matte artist and WW II B-17 bomber pilot

United States participation in World War II, with war declared on the Axis powers after the Japanese bombing of Pearl Harbor on December 7, 1941, was the country’s last total war, a conflict demanding home-front sacrifices and support for the soldiers risking their lives and spilling their blood abroad.

Hollywood played a major role in the war effort, from creating gung-ho live-action films that hit the nerve of awakened patriotism to producing special animated training films that prepared recruits for combat. But moviemakers also had to deal with domestic shortages, particularly the scarcity of basic materials needed to build sets, making matte painting more important than ever to the success of a production.

After the disaster of Pearl Harbor and an early string of military setbacks, U.S. morale needed a victory. It got one when Colonel Jimmy Doolittle, a former stunt pilot, commanded a 1942 air raid on the heart of Japan, a daring maneuver at sea in which sixteen B-25 bombers were launched from the deck of the carrier *Hornet*. The Doolittle raid hit targets from Tokyo to Kobe, and while damage was minimal, Japan was dealt a severe psychological blow.

The MGM feature *Thirty Seconds Over Tokyo* celebrated the mission’s danger and valor. In this scene, an actor playing Admiral William “Bull” Halsey, commander of Carriers, Pacific Fleet, salutes the Raiders as the first bomber (with Doolittle at the controls) takes off from the *Hornet*. Amazingly, the production never went to sea; this image was created through the magic of Warren Newcombe’s matte department. The elements combined in the optical printer by effects cameraman Mark Davis included performers shot on an MGM soundstage (a partial carrier set with three B-25 mock-ups), an ocean element filmed at the studio’s outdoor water tank, and a matte painting (the rest of the carrier deck and the squadron of B-25s).

*Thirty Seconds Over Tokyo* won Warren Newcombe and his department an Academy Award for Visual Effects in 1945.

While most wartime films captured the grit and blood of combat, this David Selznick production celebrated the challenges and stoicism of those left behind on the home front. One of the film’s most famous shots was this scene of a young couple saying good-bye at a train station. The matte painting, deftly executed by rookie artist Spencer Bagdatopoulos, provided the dramatic backdrop for the tender interlude performed by actors Robert Walker and Jennifer Jones.

Selznick stalwart Clarence Slifer recalled the making of the scene: “After Selznick got the performance he wanted from the actors, we shot on another set the background of the people and their long shadows. The long, cast shadows were a theme of the film to show sadness, such as here with the boy leaving for the war. The final step to finish the shot was to put in the matte painting of the train station ceiling. That was the first painting Spencer Bagdatopoulos did for me when he came to the Selznick matte department.”
