Blogs:

https://medium.com/@ottaviocalzone/an-intuitive-explanation-of-lstm-a035eb6ab42c (contains a few links)
https://weberna.github.io/blog/2017/11/15/LSTM-Vanishing-Gradients.html
https://smerity.com/articles/2016/orthogonal_init.html (references a few papers)
https://data-science-blog.com/blog/2020/09/07/back-propagation-of-lstm/

Books:

Alex Graves's PhD thesis on RNNs: https://www.cs.toronto.edu/~graves/phd.pdf