Awesome! Congrats on the release, @nyuuzyou
Tyler PRO
unmodeled-tyler
AI & ML interests
AI research engineer. The human behind VANTA Research.
[email protected]
Recent Activity
liked
a model
about 2 hours ago
allenai/Olmo-3.1-32B-Instruct
updated
a model
about 10 hours ago
vanta-research/atom-v1-preview-12b
replied to
Ujjwal-Tyagi's
post
about 4 hours ago
Amazing! Thanks for sharing!
reacted to
nyuuzyou's
post with 🔥
about 12 hours ago
Post
139
🇨🇳 Gitee Code Dataset - The Missing Piece of the Stack
nyuuzyou/gitee-code
Gitee is not included in the Software Heritage archive, meaning it is currently missing from datasets like The Stack. This release fills that massive gap, serving as the largest Chinese code dataset and one of the largest code corpora overall.
- 819,472,785 files from 3,105,923 repositories
- 536 GB compressed Parquet storage
- 554 programming languages
- Extensive quality filtering: Removed vendor code, artifacts, and generated files
- Rich Chinese language understanding: High volume of Chinese comments and docs
Huge thanks to Hugging Face for the storage grant that made hosting this (and all my other datasets) possible!
I have also already dropped several other new code datasets and rolled out QoL improvements for older ones. I will be dropping posts on those throughout the week.
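For anyone who wants to poke at the dataset without pulling the full 536 GB, here's a minimal sketch using datasets streaming (the split name and record fields are assumptions; check the dataset card):

```python
# Minimal sketch: stream a few records instead of downloading 536 GB.
# The split name and record schema are assumptions; see the dataset card.
from datasets import load_dataset

ds = load_dataset("nyuuzyou/gitee-code", split="train", streaming=True)

for i, record in enumerate(ds):
    print(record.keys())  # inspect the schema of a handful of rows
    if i >= 4:
        break
```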
reacted to
sergiopaniego's
post with 🔥
about 12 hours ago
Post
424
New REPL environment in OpenEnv available! ✨
Used in the Recursive Language Models (RLM) paper by Alex Zhang.
Ready for inference & post-training using trajectories. Handles long contexts:
> Run Python code in a sandbox
> Make recursive calls to LMs
> Explore data programmatically
> Return final result
Docs: https://meta-pytorch.org/OpenEnv/environments/repl/
Inference script: https://github.com/meta-pytorch/OpenEnv/blob/main/examples/repl_oolong_simple.py
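The interaction loop looks roughly like this hypothetical sketch (the class and method names are illustrative stand-ins, not the actual OpenEnv API; the docs linked above are the source of truth):

```python
# Hypothetical sketch of a REPL-environment loop. Class and method names
# are illustrative stand-ins, not the real OpenEnv interface.
import io
import contextlib

class ToyREPLEnv:
    def __init__(self):
        self.namespace = {}

    def reset(self) -> str:
        self.namespace = {}  # fresh sandbox state
        return ""

    def step(self, code: str) -> str:
        # The real environment executes code in a sandbox and returns the
        # captured output for the model's next turn; exec here is a toy.
        buf = io.StringIO()
        with contextlib.redirect_stdout(buf):
            exec(code, self.namespace)
        return buf.getvalue()

env = ToyREPLEnv()
env.reset()
print(env.step("x = 21 * 2\nprint(x)"))  # -> "42"
```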
reacted to
TravisMuhlestein's
post with 🔥
about 12 hours ago
Post
359
Agentic AI doesn’t fail because it lacks intelligence — it fails because it lacks context.
As agents become more autonomous, the real challenge shifts from generation to governance:
understanding when, why, and under what constraints an agent should act.
At GoDaddy, we’ve been treating context as a first-class primitive for agentic systems —
combining identity, intent, permissions, and environment so agents can operate responsibly in production.
Context is what turns automation into judgment.
Without it, autonomy becomes risk.
This post outlines how we’re thinking about the transition from task execution to context-aware agentic systems, and what that means for building AI that can be trusted at scale.
👉 How we build context for agentic AI:
https://www.godaddy.com/resources/news/how-godaddy-builds-context-for-agentic-ai
Curious how others here are modeling context, trust boundaries, and decision constraints in agentic architectures.
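To make the discussion concrete, here's a hypothetical sketch of what "context as a first-class primitive" could look like as a data structure. The field names mirror the post (identity, intent, permissions, environment); they are not taken from GoDaddy's actual implementation.

```python
# Hypothetical sketch: context as a first-class primitive for agent actions.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentContext:
    identity: str                # who is acting (user or service principal)
    intent: str                  # what the agent is trying to accomplish
    permissions: frozenset[str]  # what it is allowed to do
    environment: str             # where the action takes place

    def may(self, action: str) -> bool:
        # Gating on context turns "can the model do X" into "should it, here".
        return action in self.permissions

ctx = AgentContext("billing-agent", "investigate invoice",
                   frozenset({"read_invoice"}), "production")
print(ctx.may("issue_refund"))  # False: refunds need an explicit grant
```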
reacted to
Ujjwal-Tyagi's
post with 🔥
1 day ago
Post
1599
I'm looking for AI engineers and researchers to join my company as part of the core team. We'll be working on cutting-edge research and hands-on implementation across LLMs and related systems. I'm especially interested in founding engineers for my AI startup who want to build from the ground up and shape both the product and the research direction. If this sounds interesting, reply to this post and message me on Discord (my username is "ujjwal_tyagi.shirova"). Please attach your resume and details of any open-source projects (especially LLM-related ones) on Discord rather than sharing them here as a reply.
reacted to
MikeDoes's
post with ❤️
1 day ago
Post
2017
Building powerful multilingual AI shouldn't mean sacrificing user privacy.
We're highlighting a solution-oriented report from researchers Sahana Naganandh, Vaibhav V, and Thenmozhi M at Vellore Institute of Technology that investigates this exact challenge. The direct connection to our mission is clear: the paper showcases the PII43K dataset as a privacy-preserving alternative to high-risk, raw multilingual data.
The report notes that our dataset, with its structured anonymization, is a "useful option for privacy-centric AI applications." It's always a delight when academic research independently validates our data-first approach to solving real-world privacy problems.
This is how we build a safer AI future together.
🔗 Read the full report here to learn more: https://assets.cureusjournals.com/artifacts/upload/technical_report/pdf/3689/20250724-59151-93w9ar.pdf
🚀 Stay updated on the latest in privacy-preserving AI—follow us on LinkedIn: https://www.linkedin.com/company/ai4privacy/posts/
#OpenSource
#DataPrivacy
#LLM
#Anonymization
#AIsecurity
#HuggingFace
#Ai4Privacy
#Worldslargestopensourceprivacymaskingdataset
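As a flavor of what placeholder-style masking looks like, here's a minimal sketch (the [EMAIL]/[PHONE] labels are illustrative, not the dataset's actual tag set):

```python
# Minimal sketch of placeholder-style PII masking. Tag names are
# illustrative, not the dataset's actual label set.
import re

PATTERNS = {
    "[EMAIL]": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "[PHONE]": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def mask_pii(text: str) -> str:
    for tag, pattern in PATTERNS.items():
        text = pattern.sub(tag, text)
    return text

print(mask_pii("Reach me at jane.doe@example.com or +1 555 010 4477."))
# -> "Reach me at [EMAIL] or [PHONE]."
```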
reacted to
AdinaY's
post with 🧠
1 day ago
Post
2095
Based on the 2025 Chinese AI Timeline, here are some interesting takeaways:
✨ DeepSeek cadence: They shipped almost every month! (except Feb 2025)
✨ Qwen trajectory: Not a single “hit” model, but an expanding product line. VL/Math/Coder/Reranker/Embedding/Omni/Next/Image
✨ Multimodal trend: Steadily rising share, shifting from generation to editing + tooling.
✨ Reasoning as a main track: more engineered, system-level reasoning.
✨ From foundation to components: growth in infra models (embeddings, rerankers, OCR, speech) signals a move toward deployable stacks.
✨ Ecosystem broadening: more players beyond the top labs.
Follow for more updates👉
zh-ai-community
reacted to
AdinaY's
post with 🚀
1 day ago
Post
1923
AgentCPM-Explore 🔥 an on-device agent foundation model released by OpenBMB
openbmb/AgentCPM-Explore
✨ 4B - Apache 2.0
✨ Supports 100+ multi-turn environment interactions with search + verification
✨ Full training/inference stack is openly shared as well
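A minimal loading sketch, assuming the checkpoint works with vanilla transformers (the model card is the source of truth for the supported stack and prompt format):

```python
# Minimal sketch; whether the checkpoint loads with vanilla transformers is
# an assumption -- check the model card for the supported stack and prompts.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openbmb/AgentCPM-Explore"
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tok("Find the opening hours of the Louvre and summarize them.",
             return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```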
reacted to
prithivMLmods's
post with 🔥
2 days ago
Post
3636
LTX-2 Camera-Control LoRA demo with dolly-in/out and dolly-left/right is now available on Hugging Face, paired with ltx-2-19b-distilled-lora for fast inference. It also includes dynamic GPU duration adjustments for long video generations. Click the related Space links below.
🤗 Try it now: prithivMLmods/LTX-2-LoRAs-Camera-Control-Dolly
⭐ GitHub: https://github.com/PRITHIVSAKTHIUR/LTX-2-LoRAs-Camera-Control-Dolly
🕹️ Collection: https://huggingface.co/collections/prithivMLmods/image-generation-apps-collection
To learn more, visit the app page or the respective model pages.
posted
an
update
2 days ago
Post
2703
NEW MODEL:
vanta-research/mox-8b
Hey everyone! I changed up my approach with this one a bit. Mox was designed with the following characteristics:
- self-coherence
- direct opinions
- epistemic confidence
- grounded meta-awareness
- reasoned refusals
I've been thinking a lot about what "helpfulness" means lately. Commonly in AI, that looks like fulfilling user requests as closely as possible as long as the request isn't unsafe.
But I wanted to know what it was like to build a model that might be helpful in the same way a human would be.
For example, if you ask Mox to write a 10-page paper on the cultural significance of staplers, Mox will probably refuse, tell you that wouldn't be useful or helpful to ANYBODY, and recommend a different, more useful approach.
Mox is still very much a work in progress, but I think that this is a good starting point! I'm already generating more datasets to add more elements to Mox's persona in future versions, which you should see on the hub soon!
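If you want to poke at the refusal behavior yourself, here's a minimal sketch (this assumes the checkpoint ships standard transformers weights plus a chat template; see the model card for canonical usage):

```python
# Minimal sketch; assumes standard transformers weights and a chat template.
from transformers import pipeline

chat = pipeline("text-generation", model="vanta-research/mox-8b")
messages = [{"role": "user", "content":
             "Write a 10-page paper on the cultural significance of staplers."}]
out = chat(messages, max_new_tokens=256)
print(out[0]["generated_text"][-1]["content"])  # expect a reasoned refusal
```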
reacted to
mindchain's
post with ❤️
2 days ago
Post
2802
Claude Code Self & Continual Learning
Hey everyone! 👋
30 GitHub Stars in 4 Days - Thank You!
I'm really grateful for the positive response to the Claude Reflect System. In just 4 days, 30 developers have shown interest by starring the project. Thank you so much!
What Is Claude Reflect?
Correct once, never again. Claude Reflect helps Claude Code remember your corrections and preferences across sessions. Instead of repeating the same feedback, the system learns and applies it automatically.
Main Features:
🧠 Learning System
- Detects corrections and preferences from conversations
- Stores them permanently in skill files
- Applies learnings in future sessions
🔒 Safety First
- Automatic backups before changes
- YAML validation
- Git version control
⚡ Two Modes
- Manual: Run /reflect when you want
- Auto: Reflects automatically at session end
How It Works
If you correct Claude to use pytest instead of unittest, this preference gets saved. Next time, Claude will remember and use pytest automatically. It's that simple.
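A hypothetical sketch of what "save a preference to a skill file" could look like (the file layout and keys here are illustrative, not the project's actual schema; see the repo below):

```python
# Hypothetical sketch of storing a correction in a YAML skill file.
# The path and keys are illustrative, not the project's actual schema.
from pathlib import Path
import yaml  # pip install pyyaml

SKILL_FILE = Path("skills/python-preferences.yaml")

def record_preference(key: str, value: str) -> None:
    prefs = yaml.safe_load(SKILL_FILE.read_text()) if SKILL_FILE.exists() else {}
    prefs = prefs or {}
    prefs[key] = value
    SKILL_FILE.parent.mkdir(parents=True, exist_ok=True)
    SKILL_FILE.write_text(yaml.safe_dump(prefs))  # safe round-trip as cheap validation

record_preference("test_framework", "pytest")  # "use pytest, not unittest"
```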
Getting Started
1. Clone the repository
2. Install dependencies
3. Activate the skill
4. Try it out!
The python-project-creator example shows how the system learns from your feedback.
Give It a Try
https://github.com/haddock-development/claude-reflect-system
Feel free to check it out, give feedback, or contribute. Every bit of input helps improve the project!
Thank you so much for your support!
---
#ClaudeCode #AI #MachineLearning #ContinualLearning #OpenSource #Developer #Coding #Python #Productivity #DevTools #GitHub #SoftwareDevelopment #Programming #AIAssistant #DeveloperTools #CodeQuality #Tech
Feel free to give it a try by yourself.
https://github.com/haddock-development/claude-reflect-system
Post
1372
Atom-80B is out:
vanta-research/atom-80b
I'm excited to share the new Atom-80B from VANTA Research! A few days ago we released what was then the largest model in our portfolio, Atom-27B.
We've quickly scaled up to the new Qwen3 Next 80B architecture, bringing our friendly, curious, and collaborative Atom persona to a cutting-edge architecture that pairs a high parameter count with lightweight inference.
Atom is designed to work and think alongside you through curious exploration. Using Atom collaboratively in your work can help spark your own creativity or curiosity. Give it a try!
replied to
Reality123b's
post
4 days ago
Happy Birthday! Hopefully you did something fun today!
reacted to
etemiz's
post with 🔥
4 days ago
Post
2020
how to expand your dataset (of articles) without changing the ideas in it?
i was doing CPT for a while and got decent results. but what if i want to go for perfection? cover all the areas of misalignment using limited datasets. i have to find a way to multiply the material to successfully combat the material of the rest of the internet.
i want to generate SFT datasets but only on controversial topics, because i have to be efficient with limited resources. first i give a smart LLM a 'ground truth' text. then i give it the following prompts:
- You are a highly skilled academic analyst.
- Analyze this text and find 3 bold claims that could cause controversy and division in public. List the claims and also state why they are debatable. Give numbers to the claims.
- Convert these claims into binary questions (that could be answered by yes/no or this/that).
- Now put these questions in a JSON format. Please also add the info about which of the answers concur with the original text and the question number.
- Write some supporting arguments for the 1st question, with respect to the original text, concurring and confirming the original text. There must be about 300 words. You should not mention the text; write it as if you are the one answering the question.
the result is questions and answers with more words along the same ideas. a few sentences of opinions in the beginning is expanded to lots of words. using this method i can multiply billions of tokens to tens of billions probably and have a more effective training.
next i should do RL maybe. LLMs seem to have all kinds of ideas already installed, yet they don't have the intuition to know which one is true. they can give you a ton of reasons to support anything. given the proper incentives, LLMs should then evolve towards supporting aligned ideas more. the rewards will be like guidance that will kick an LLM towards better answers.
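a rough sketch of that expansion loop in code (the `ask` helper is a stand-in for whatever chat API you use; the prompts mirror the list above):

```python
# rough sketch of the expansion loop; `ask` is a stand-in for any chat API
# (the llm argument is any callable that maps a message list -> reply text).
def ask(llm, history: list, prompt: str) -> str:
    history.append({"role": "user", "content": prompt})
    reply = llm(history)
    history.append({"role": "assistant", "content": reply})
    return reply

def expand(llm, ground_truth: str, n_claims: int = 3) -> list:
    history = [
        {"role": "system", "content": "You are a highly skilled academic analyst."},
        {"role": "user", "content": ground_truth},
    ]
    ask(llm, history, f"Find {n_claims} bold, debatable claims in this text. "
                      "Number them and explain why each is controversial.")
    ask(llm, history, "Convert these claims into binary (yes/no) questions, "
                      "in JSON, noting which answer concurs with the text.")
    answers = []
    for i in range(1, n_claims + 1):
        answers.append(ask(llm, history,
            f"Write ~300 words supporting question {i}, concurring with the "
            "original text, as if you are the one answering."))
    return answers  # long-form SFT targets that restate the source ideas
```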
reacted to
SkorczyByk's
post with 👍
4 days ago
replied to
SkorczyByk's
post
4 days ago
You'll probably want a model with a long context window - but honestly how it's implemented is going to be the key here. Any compute constraints? Looking to run it locally? Cloud? More details would help with a better recommendation!
But to get you going I'd suggest checking out ol' reliable: https://huggingface.co/meta-llama/Llama-3.1-8B-Instruct
Open-source interfaces like Open WebUI (https://github.com/open-webui) come with RAG (retrieval-augmented generation) support out of the box, so I'd recommend starting there and then exploring!
reacted to
Chirag2207's
post with 🔥
4 days ago
Post
878
We have opened up Alpie Core for broader use.
Alpie Core is a 32B reasoning model trained and served entirely at 4-bit precision. Instead of training in FP16 and compressing later, it’s optimised end-to-end for low-precision reasoning, keeping memory usage low while still handling long, multi-step tasks well.
What’s new:
- Hosted OpenAI-compatible API
- Official Python SDK (sync, async, streaming)
- CLI for quick testing
- 65K context support
The model remains open source and is available on Hugging Face and Ollama for local runs. For easier evaluation, the first API key comes with 5M free tokens to test real workloads.
We’d love feedback on long-context behaviour, reasoning stability, and agent or tool-calling workflows. Happy to answer questions or dive into details.
Links:
- API access: https://169pi.ai
- SDK access: https://github.com/169Pi/Pi169-SDK
- HF: 169Pi/Alpie-Core
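Since the API is OpenAI-compatible, calling it should look roughly like this sketch (the base URL and model id below are guesses; https://169pi.ai has the real values, and the official SDK is linked above):

```python
# Minimal sketch against an OpenAI-compatible endpoint. The base_url and
# model id are assumptions -- see https://169pi.ai for the real values.
from openai import OpenAI

client = OpenAI(base_url="https://api.169pi.ai/v1",  # hypothetical endpoint
                api_key="YOUR_API_KEY")

resp = client.chat.completions.create(
    model="alpie-core",  # hypothetical model id
    messages=[{"role": "user", "content": "Sketch a 3-step data pipeline."}],
    max_tokens=256,
)
print(resp.choices[0].message.content)
```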
replied to
hypothetical's
post
4 days ago
Congrats on the release!