streamlit
langdetect
transformers
nltk
numpy
python-docx
gensim
scikit-learn
torch
sentencepiece
langchain
langchain-groq
langchain-core
python-dotenv
Pillow
pytesseract
Python-IO