Visualizing Self Attention

Three implementations of self attention: one for visualizing the self attention between two tensors, one for language, and one in PyTorch.
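
For orientation, here is a minimal sketch of the computation that implementations like these typically share, scaled dot-product self attention, written in plain Python so that it runs without dependencies. It is not taken from any of the three implementations; the names `self_attention`, `matmul`, `softmax`, and the toy input and identity projection matrices are all assumptions made for illustration.

```python
import math


def softmax(row):
    # Subtract the max before exponentiating for numerical stability.
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    # Nested-list matrix product: (n x k) @ (k x m) -> (n x m).
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(col) for col in zip(*m)]


def self_attention(x, w_q, w_k, w_v):
    # Project the same input into queries, keys, and values.
    q = matmul(x, w_q)
    k = matmul(x, w_k)
    v = matmul(x, w_v)
    d_k = len(k[0])
    # Score every position against every other, scaled by sqrt(d_k).
    scores = matmul(q, transpose(k))
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    # Each output row is a weighted average of the value rows.
    return matmul(weights, v), weights


# Hypothetical toy input: 3 positions, 2 features each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections (an assumption) keep the example easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]

output, weights = self_attention(x, identity, identity, identity)
for row in weights:
    print([round(w, 3) for w in row])
```

The `weights` matrix is the part usually visualized: row i shows how strongly position i attends to each position, and every row sums to 1.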