Intermediate Activations in Llama 2 7B
There is a "country" layer in the Llama 2 transformer.
I found these parts of this LessWrong analysis of Llama 2's attention outputs interesting:
“By layer 24, the model is quite certain about the correct answer, and the remaining computations are mostly redundant, mainly re-weighting alternative less obvious completion paths such as ‘The capital of Germany is {a, the, one, home, located…}’. Interestingly, the model becomes less certain about ‘Berlin’ from layers 24-31 as it figures out more alternative options.”
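This kind of layer-by-layer certainty can be checked with a "logit lens" style probe: project each layer's hidden state through the unembedding and watch the probability of " Berlin". The sketch below is my own minimal illustration, assuming the HuggingFace transformers Llama implementation; the checkpoint name, prompt, and the choice to reuse the final RMSNorm before unembedding are assumptions, not details from the original post.

```python
# Minimal logit-lens sketch: track P(" Berlin") at every layer for the
# prompt "The capital of Germany is". Illustrative only; the original
# analysis may have measured this differently.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires HF access approval
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

prompt = "The capital of Germany is"
inputs = tokenizer(prompt, return_tensors="pt")
berlin_id = tokenizer.encode(" Berlin", add_special_tokens=False)[0]

with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors:
# the embeddings plus the residual stream after each layer.
for layer, h in enumerate(out.hidden_states):
    last = h[0, -1]                                 # hidden state at the final token position
    logits = model.lm_head(model.model.norm(last))  # project into vocabulary space
    p_berlin = torch.softmax(logits.float(), dim=-1)[berlin_id].item()
    print(f"layer {layer:2d}: P(' Berlin') = {p_berlin:.3f}")
```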
“The attention output of layer 24 of the llama 2 transformer consistently represents relevant information related to countries, even when neither the prompt nor the higher probability completions are related to countries.”
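To look at the attention output of a single layer in isolation, one way is to capture it with a forward hook and project it through the unembedding, the same way as above. This is a minimal sketch under the same assumptions (HuggingFace Llama module layout, projecting through the final RMSNorm and lm_head); layer index 24, the prompt, and the top-k value are illustrative choices.

```python
# Capture the attention output of layer 24 with a forward hook and see
# which tokens it points toward when projected into vocabulary space.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires HF access approval
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
model.eval()

captured = {}

def hook(module, inputs, output):
    # The Llama self-attention module returns a tuple; element 0 is the attention output.
    captured["attn_out"] = output[0].detach()

handle = model.model.layers[24].self_attn.register_forward_hook(hook)

prompt = "The capital of Germany is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    model(**inputs)
handle.remove()

# Project the captured attention output at the last token position.
attn_out = captured["attn_out"][0, -1]               # (hidden_size,)
logits = model.lm_head(model.model.norm(attn_out))
top = torch.topk(logits.float(), k=10)
print(tokenizer.convert_ids_to_tokens(top.indices.tolist()))
```

Printing the top-k tokens for prompts that have nothing to do with geography is one way to probe the post's claim that this layer's attention output keeps surfacing country-related directions.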