nyuuzyou PRO

AI & ML interests

None yet

Recent Activity

Organizations

Social Post Explorers, AI Starter Pack

nyuuzyou's activity

reacted to BrigitteTousi's post with 🚀 about 23 hours ago
reacted to JingzeShi's post with 🚀 4 days ago
posted an update 4 days ago
🐴 Fimfiction.net Writings Dataset - nyuuzyou/fimfiction

Collection of 815,740+ stories from Fimfiction.net featuring:
- Full story content from diverse fanfiction authors across the platform
- Complete metadata including titles, unique identifiers, and publication details
- Rich structural information preserving story formatting and author notes
- English-language content with diverse writing styles and narrative approaches
reacted to clem's post with ❤️ 4 days ago
posted an update 10 days ago
🌐 Public MediaWiki Collection Dataset - nyuuzyou/wikis

Collection of 1.66M+ articles from 930 public MediaWiki instances featuring:

- Full article content from diverse public wikis across the internet
- Complete metadata including templates, categories, and section structure
- Rich structural information preserving wiki organization and links
- Multilingual content across 35+ languages including English, Chinese, Spanish, and more
- Regional language variants including US/UK English, Brazilian Portuguese, and Traditional/Simplified Chinese

Key contents:
- 1,662,448 wiki articles with full text
- Extensive metadata including templates, categories, sections
- Internal wikilinks and external reference information
- Cross-domain knowledge spanning multiple topics and fields
posted an update 13 days ago
📚 Historical Russian Technical Journal Images Dataset - nyuuzyou/journals

Collection of digitized pages from vintage Russian technical journals featuring:

- 7.47k high-quality images
- Machine-generated descriptions in Russian
- Valuable historical technical content for image-to-text applications

Content descriptions are dedicated to the public domain under the CC0 1.0 license, allowing unrestricted use without attribution.
New activity in nyuuzyou/journals 13 days ago
posted an update 14 days ago
🌐 Grustnogram Social Media Dataset - nyuuzyou/grustnogram

A comprehensive collection of 597K posts from Grustnogram.ru featuring:

- 597K social media posts with full text and image content (all images are black and white)
- Rich metadata including user IDs, post interactions (likes, comments)
- Content from anonymous text-only posts
- Approximately 278.9 GB of content

Content is dedicated to the public domain under the CC0 1.0 license, allowing unrestricted reuse without attribution or share-alike requirements.
reacted to ngxson's post with 🚀 14 days ago
A comprehensive matrix showing which model format you should use on each kind of hardware.

Read more on my blog post: https://huggingface.co/blog/ngxson/common-ai-model-formats

| Hardware        | GGUF      | PyTorch             | Safetensors           | ONNX |
|-----------------|-----------|---------------------|-----------------------|------|
| CPU             | ✅ (best) | 🟡                  | 🟡                    | ✅   |
| GPU             | ✅        | ✅                  | ✅                    | ✅   |
| Mobile          | ✅        | 🟡 (via executorch) | ❌                    | ✅   |
| Apple silicon   | ✅        | 🟡                  | ✅ (via MLX framework) | ✅   |
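
The matrix above can also be read as a lookup table. The following is a minimal Python sketch, not something from the blog post: the names `SUPPORT` and `formats_for` are hypothetical, and the middle (yellow) marker is assumed here to mean partial or conditional support.

```python
# Support levels transcribed from the table above.
# Assumption: the yellow marker means "partial" support.
FULL, PARTIAL, NONE = "full", "partial", "none"

# Hypothetical lookup table: hardware -> format -> support level.
SUPPORT = {
    "CPU": {"GGUF": FULL, "PyTorch": PARTIAL, "Safetensors": PARTIAL, "ONNX": FULL},
    "GPU": {"GGUF": FULL, "PyTorch": FULL, "Safetensors": FULL, "ONNX": FULL},
    "Mobile": {"GGUF": FULL, "PyTorch": PARTIAL, "Safetensors": NONE, "ONNX": FULL},
    "Apple silicon": {"GGUF": FULL, "PyTorch": PARTIAL, "Safetensors": FULL, "ONNX": FULL},
}

# Ordering used to compare support levels.
_RANK = {NONE: 0, PARTIAL: 1, FULL: 2}


def formats_for(hardware, minimum=FULL):
    """Return the formats whose support on `hardware` is at least `minimum`."""
    row = SUPPORT[hardware]
    return [fmt for fmt, level in row.items() if _RANK[level] >= _RANK[minimum]]


if __name__ == "__main__":
    print(formats_for("Mobile"))
    print(formats_for("Mobile", minimum=PARTIAL))
```

Calling `formats_for("Mobile")` keeps only the fully supported formats, while passing `minimum=PARTIAL` also includes formats that work through an extra runtime such as executorch.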
posted an update 16 days ago
🛫 AEX.ru Aviation News Dataset - nyuuzyou/aex

Key contents:
- 249,149 aviation news articles with full text
- Metadata including tags, image captions, and attributions
- URL information for reference
- Russian language content focusing on aviation topics