streamlit
PyPDF2
transformers
huggingface_hub
datasets