Flax Community

non-profit

https://github.com/huggingface/transformers/tree/master/examples/research_projects/jax-projects

Activity Feed

AI & ML interests

JAX, Flax, TPU, 🤗

Recent Activity

yhavinga new activity 6 days ago

flax-community/roberta-base-thai:Adding `safetensors` variant of this model

gagan3012 authored a paper about 2 months ago

DateLogicQA: Benchmarking Temporal Biases in Large Language Models

versae authored a paper about 2 months ago

The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective

View all activity

flax-community's activity

Muennighoff

authored a paper 4 days ago

s1: Simple test-time scaling

Paper • 2501.19393 • Published 7 days ago • 88

yhavinga

in flax-community/roberta-base-thai 6 days ago

Adding `safetensors` variant of this model

#1 opened 16 days ago by

SFconvertbot

g8a9

authored a paper 16 days ago

MSTS: A Multimodal Safety Test Suite for Vision-Language Models

Paper • 2501.10057 • Published 21 days ago • 8

rwightman

posted an update 21 days ago

Post

1299

I re-worked the JuptyerLab Space template recently. It's optimized for timm use, but will work great with transformers and other libs. Updated the base image, Python 3.12, Pillow-SIMD before better CPU use with image preprocessing, and made a number of other tweaks. From the Jupyter launcher you can run the terminal and setup a timm environment in moments with setup_timm_dev or setup_timm_scripts helpers. Give it a try, timm/jupyterlab-timm

rwightman

posted an update 28 days ago

Post

1157

New timm 1.0.13 and OpenCLIP 2.30.0 releases to start the year. Both modest but worthwhile updates.

timm added a number of new model weights, supporting loading of:
* PaliGemma2 encoders (ported from google/paligemma-2-release-67500e1e1dbfdd4dee27ba48)
* AIMv-2 encoders (ported from apple/aimv2-6720fe1558d94c7805f7688c)

A few higher resolution 384x384 ConvNeXt-Nano ImageNet-12k pretrain & finetunes. See other changes here: https://github.com/huggingface/pytorch-image-models/releases/tag/v1.0.13

And support added in both OpenCLIP and timm for two CLIP models that were missed. The DFN L/14 is 🔥
* DFN CLIP L/14 w/ 39B samples seen - apple/DFN2B-CLIP-ViT-L-14-39B, timm/vit_large_patch14_clip_224.dfn2b_s39b
* MetaCLIP H/14 (altogether) - timm/vit_huge_patch14_clip_224.metaclip_altogether

And last, ~70-80 models that were relying on timm remapping from OpenCLIP got their own timm hub instances to allow use with the upcoming Transformers TimmWrapperModel

ncoop57

authored 2 papers about 2 months ago

Stable Code Technical Report

Paper • 2404.01226 • Published Apr 1, 2024 • 1

Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference

Paper • 2412.13663 • Published Dec 18, 2024 • 126

versae

authored a paper about 2 months ago

The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective

Paper • 2412.09460 • Published Dec 12, 2024 • 7

christopher

posted an update 2 months ago

Post

1648

The folks at Foursquare released a dataset of 104.5 million places of interest ( foursquare/fsq-os-places) and here's all of them on a plot

4 replies

christopher

posted an update 2 months ago

Post

2390

The Lichess database of games, puzzles, and engine evaluations is now on the Hub: https://huggingface.co/Lichess

Billions of chess data points to download, query, and stream and we're excited to see what you'll build with it! ♟️ 🤗

- Lichess/positions-datasets-66f50837db5cd3287d60d489
- Lichess/games-datasets-66f508df78f4b43e1bb2d353

paws

authored a paper 2 months ago

Surveying the Effects of Quality, Diversity, and Complexity in Synthetic Data From Large Language Models

Paper • 2412.02980 • Published Dec 4, 2024 • 12

rwightman

posted an update 2 months ago

Post

1421

There's a new timm release, v 1.0.12, with a focus on optimizers. The optimizer factory has been refactored, there's now a timm.optim.list_optimizers() and new way to register optimizers and their attributes. As always you can use an timm optimizer like a torch one, just replace torch.optim with timm.optim

New optimizers include:
* AdafactorBigVision - adfactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification to all of the above, prefix with c as well as cadamw, cnadamw, csgdw, clamb, crmsproptf

I shared some caution comparisons in this model repo: rwightman/timm-optim-caution

For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim

3 replies

rwightman

posted an update 3 months ago

Post

1351

I'm currently on a push to expand the scope of image based datasets on the Hub. There's certainly a lot already, but for anyone who's looked closely, there's not a whole lot of standardization. I am to fix that, datasets under the https://huggingface.co/timm and https://huggingface.co/pixparse orgs will serve as canonical examples for various task / modality combinations and be useable without fuss in libraries like timm, OpenCLIP, and hopefully more.

I just uploaded the first multi-label dataset that I'll support with timm scripts soon: timm/plant-pathology-2021

Next up object detection & segmentation! I've got an annotation spec sorted out, a lot of datasets ready to rip, and yeah that means timm support for object detection, eventually segmentation, is finally under development :O

rwightman

posted an update 3 months ago

Post

1070

Want to validate some hparams or figure out what timm model to use before commiting to download or training with a large dataset? Try mini-imagenet: timm/mini-imagenet

I had this sitting on my drive and forgot where I pulled it together from. It's 100 classes of imagenet, 50k train and 10k val images (from ImageNet-1k train set), and 5k test images (from ImageNet-1k val set). 7.4GB instead of > 100GB for the full ImageNet-1k. This ver is not reduced resolution like some other 'mini' versions. Super easy to use with timm train/val scripts, checkout the dataset card.

I often check fine-tuning with even smaller datasets like:
* timm/resisc45
* timm/oxford-iiit-pet
But those are a bit small to train any modest size model w/o starting from pretrained weights.

rwightman

posted an update 3 months ago

Post

1598

New MobileNetV4 weights were uploaded a few days ago -- more ImageNet-12k training at 384x384 for the speedy 'Conv Medium' models.

There are 3 weight variants here for those who like to tinker. On my hold-out eval they are ordered as below, not that different, but the Adopt 180 epochs closer to AdamW 250 than to AdamW 180.
* AdamW for 250 epochs - timm/mobilenetv4_conv_medium.e250_r384_in12k
* Adopt for 180 epochs - timm/mobilenetv4_conv_medium.e180_ad_r384_in12k
* AdamW for 180 epochs - timm/mobilenetv4_conv_medium.e180_r384_in12k

This was by request as a user reported impressive results using the 'Conv Large' ImagNet-12k pretrains as object detection backbones. ImageNet-1k fine-tunes are pending, the weights do behave differently with the 180 vs 250 epochs and the Adopt vs AdamW optimizer.