---
title: README
emoji: 🏃
colorFrom: indigo
colorTo: indigo
sdk: static
pinned: false
---

🏡 Homepage 👋 Discord 💬 WeChat Group

**English**🌎|[简体中文](https://github.com/opendatalab/opendatalab-datasets/blob/main/introduction%20CN.md)🀄

> [!NOTE]
> 📚 In 2025, we open-sourced **WanJuan 3.0 (WanJuan Silu)**, a high-quality multilingual dataset comprising over 1.2TB of local text corpora from five countries. Each subset includes seven major categories and 34 subcategories covering a wide range of local topics, such as history, politics, culture, real estate, shopping, weather, dining, encyclopedic knowledge, and professional expertise. The download links for the five subsets are listed below, and everyone is welcome to download and use them.
>
> WanJuan3.0 [Korean](https://opendatalab.com/OpenDataLab/WanJuan-Korean) • [Arabic](https://opendatalab.com/OpenDataLab/WanJuan-Arabic) • [Vietnamese](https://opendatalab.com/OpenDataLab/WanJuan-Vietnamese) • [Russian](https://opendatalab.com/OpenDataLab/WanJuan-Russian) • [Thai](https://opendatalab.com/OpenDataLab/WanJuan-Thai)

---

**🔥🔥🔥OpenDataLab provides an ecosystem of high-quality datasets for the community.** It offers:

# 🌟Extensive open data resources for AI models

● A fast, simple way to access open datasets

● 7700+ large-scale, high-quality open datasets for large models

● 1200+ open datasets for computer vision
● 200+ open datasets from CVPR

● Categorized datasets for hot topics

# ✨Open-source data processing toolkits

● Data acquisition toolkits that support large datasets

● Data acquisition toolkits that support many kinds of tasks

● An open-source intelligent toolbox for labeling

# 💫Dataset description language

● Format standardization

● DSDL: Dataset Description Language

● Define a CV dataset with DSDL

● 100+ CV datasets standardized by OpenDataLab

Check out our [tutorial videos](https://www.youtube.com/watch?v=LjbRt7uddyw) (in Chinese) to get started.

---

📣 We have launched an upgraded feature that lets authors upload datasets themselves. We invite you to use it to promote your open-source datasets, AI research results, and more, so that more people can find, obtain, and use your datasets. For an introduction to the self-service upload feature, see the [help doc](https://github.com/opendatalab/opendatalab-datasets/blob/main/help%20doc.md). You can create and share your dataset by following our guidelines.

💪 If you have any questions or run into obstacles, please feel free to contact us at OpenDataLab@pjlab.org.cn.

[![](https://github.com/opendatalab/opendatalab-datasets/blob/main/%E9%A1%B6%E4%BC%9A%E9%A1%B6%E5%88%8A%E6%95%B0%E6%8D%AE%E9%9B%86/ECCV/img/create%20your%20dataset.png?raw=true)](https://opendatalab.com/create?source=R2l0aHVi)