https://jalammar.github.io/illustrated-transformer/
https://goyalpramod.github.io/blogs/Transformers_laid_out/
https://data-science-blog.com/blog/2021/04/07/multi-head-attention-mechanism/
https://www.youtube.com/watch?v=OyFJWRnt_AY
https://aman.ai/primers/ai/transformers/
https://github.com/Denis2054/Transformers-for-NLP-2nd-Edition