AI & ML interests

Document and User Interface Parsing, Understanding, Q&A.

rwightman
in pixparse/cc3m-wds 4 months ago

Original url?

#7 opened 4 months ago by rtqichen
rwightman
in pixparse/cc3m-wds 7 months ago

Url of images

#6 opened 7 months ago by deenmohan
rwightman
posted an update 8 months ago
I re-worked the JupyterLab Space template recently. It's optimized for timm use, but will work great with transformers and other libs. Updated the base image, Python 3.12, Pillow-SIMD for better CPU use with image preprocessing, and made a number of other tweaks. From the Jupyter launcher you can open the terminal and set up a timm environment in moments with the setup_timm_dev or setup_timm_scripts helpers. Give it a try: timm/jupyterlab-timm
rwightman
posted an update 8 months ago
New timm 1.0.13 and OpenCLIP 2.30.0 releases to start the year. Both modest but worthwhile updates.

timm added a number of new model weights, supporting loading of:
* PaliGemma2 encoders (ported from google/paligemma-2-release-67500e1e1dbfdd4dee27ba48)
* AIMv2 encoders (ported from apple/aimv2-6720fe1558d94c7805f7688c)

A few higher-resolution 384x384 ConvNeXt-Nano ImageNet-12k pretrains & fine-tunes were also added. See other changes here: https://github.com/huggingface/pytorch-image-models/releases/tag/v1.0.13

Support was also added in both OpenCLIP and timm for two CLIP models that were missed. The DFN L/14 is 🔥
* DFN CLIP L/14 w/ 39B samples seen - apple/DFN2B-CLIP-ViT-L-14-39B, timm/vit_large_patch14_clip_224.dfn2b_s39b
* MetaCLIP H/14 (altogether) - timm/vit_huge_patch14_clip_224.metaclip_altogether

And lastly, ~70-80 models that were relying on timm remapping from OpenCLIP got their own timm hub instances to allow use with the upcoming Transformers TimmWrapperModel.
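
For reference, here's a minimal loading sketch (not from the original post); the timm model name is from the list above, and it assumes the apple/DFN2B-CLIP-ViT-L-14-39B Hub repo carries an OpenCLIP config:

```python
# Hedged sketch: loading the DFN CLIP L/14 weights mentioned above,
# via timm (vision tower only) and via OpenCLIP (full image + text model).
import timm
import open_clip

# timm: image encoder only, ready for feature extraction or fine-tuning.
encoder = timm.create_model(
    'vit_large_patch14_clip_224.dfn2b_s39b', pretrained=True, num_classes=0)

# OpenCLIP: pull the full model straight from the Hub repo.
model, _, preprocess = open_clip.create_model_and_transforms(
    'hf-hub:apple/DFN2B-CLIP-ViT-L-14-39B')
tokenizer = open_clip.get_tokenizer('hf-hub:apple/DFN2B-CLIP-ViT-L-14-39B')
```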
lhoestq
posted an update 9 months ago
Made an HF Dataset editor à la Google Sheets here: lhoestq/dataset-spreadsheets

With Dataset Spreadsheets:
โœ๏ธ Edit datasets in the UI
๐Ÿ”— Share link with collaborators
๐Ÿ Use locally in DuckDB or Python

Available for the 100,000+ parquet datasets on HF :)
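
A hedged sketch of the "use locally" path; 'username/my-dataset' is a placeholder for any Parquet dataset on the Hub, and the hf:// path support assumes a recent DuckDB version:

```python
# Sketch only: 'username/my-dataset' is a placeholder dataset name.
from datasets import load_dataset
import duckdb

ds = load_dataset('username/my-dataset', split='train')
print(ds[0])

# DuckDB can read Hub parquet files directly via hf:// paths (recent versions).
duckdb.sql(
    "SELECT * FROM 'hf://datasets/username/my-dataset/**/*.parquet' LIMIT 5"
).show()
```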
rwightman
posted an update 9 months ago
There's a new timm release, v1.0.12, with a focus on optimizers. The optimizer factory has been refactored; there's now a timm.optim.list_optimizers() and a new way to register optimizers and their attributes. As always, you can use a timm optimizer like a torch one, just replace torch.optim with timm.optim.

New optimizers include:
* AdafactorBigVision - adafactorbv
* ADOPT - adopt / adoptw (decoupled decay)
* MARS - mars
* LaProp - laprop
* Cautious Optimizers - a modification applicable to all of the above; prefix the name with c, e.g. cadamw, cnadamw, csgdw, clamb, crmsproptf

I shared some caution comparisons in this model repo: rwightman/timm-optim-caution

For details, references, see the code: https://github.com/huggingface/pytorch-image-models/tree/main/timm/optim
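
A minimal sketch of the refactored factory in use (the hyperparameter values are illustrative only):

```python
import timm
from timm.optim import create_optimizer_v2, list_optimizers

print(list_optimizers())  # now includes 'adopt', 'mars', 'laprop', 'cadamw', ...

model = timm.create_model('resnet18')

# Drop-in for a torch.optim optimizer: same training loop, different factory.
optimizer = create_optimizer_v2(model, opt='adopt', lr=1e-3, weight_decay=0.05)
```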

rwightman
posted an update 9 months ago
I'm currently on a push to expand the scope of image-based datasets on the Hub. There's certainly a lot already, but for anyone who's looked closely, there's not a whole lot of standardization. I aim to fix that; datasets under the timm and pixparse orgs will serve as canonical examples for various task / modality combinations and be usable without fuss in libraries like timm, OpenCLIP, and hopefully more.

I just uploaded the first multi-label dataset that I'll support with timm scripts soon: timm/plant-pathology-2021

Next up object detection & segmentation! I've got an annotation spec sorted out, a lot of datasets ready to rip, and yeah that means timm support for object detection, eventually segmentation, is finally under development :O
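
A hedged loading sketch with the datasets library; check the dataset card for the actual splits and column names:

```python
# Sketch: loading the new multi-label set. Column layout is an assumption,
# see the dataset card for the real schema.
from datasets import load_dataset

ds = load_dataset('timm/plant-pathology-2021', split='train')
print(ds.features)  # image + multi-label annotation columns
print(ds[0])
```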
rwightman
posted an update 10 months ago
Want to validate some hparams or figure out what timm model to use before committing to a download or training run with a large dataset? Try mini-imagenet: timm/mini-imagenet

I had this sitting on my drive and forgot where I pulled it together from. It's 100 classes of ImageNet, 50k train and 10k val images (from the ImageNet-1k train set), and 5k test images (from the ImageNet-1k val set). 7.4GB instead of > 100GB for the full ImageNet-1k. This version is not reduced in resolution like some other 'mini' versions. Super easy to use with timm train/val scripts, check out the dataset card.

I often check fine-tuning with even smaller datasets like:
* timm/resisc45
* timm/oxford-iiit-pet

But those are a bit small to train any modest-size model w/o starting from pretrained weights.
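
A quick hedged look via the datasets library (see the dataset card for wiring it into the timm train/val scripts):

```python
# Sketch: pulling mini-imagenet for a quick hparam sanity check.
from datasets import load_dataset

ds = load_dataset('timm/mini-imagenet', split='train')
print(ds)  # ~50k training images over 100 ImageNet classes
```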
rwightman
posted an update 10 months ago
New MobileNetV4 weights were uploaded a few days ago -- more ImageNet-12k training at 384x384 for the speedy 'Conv Medium' models.

There are 3 weight variants here for those who like to tinker. On my hold-out eval they are ordered as below; not that different, but the Adopt 180-epoch run lands closer to the AdamW 250 than to the AdamW 180.
* AdamW for 250 epochs - timm/mobilenetv4_conv_medium.e250_r384_in12k
* Adopt for 180 epochs - timm/mobilenetv4_conv_medium.e180_ad_r384_in12k
* AdamW for 180 epochs - timm/mobilenetv4_conv_medium.e180_r384_in12k

This was by request, as a user reported impressive results using the 'Conv Large' ImageNet-12k pretrains as object detection backbones. ImageNet-1k fine-tunes are pending; the weights do behave differently with the 180 vs 250 epochs and the Adopt vs AdamW optimizer.
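
A hedged sketch of tapping one of these as a feature backbone; the model name is from the list above, and features_only is standard timm API:

```python
# Sketch: multi-scale feature maps from a MobileNetV4 pretrain, the usual
# starting point for a detection neck/head.
import timm
import torch

backbone = timm.create_model(
    'mobilenetv4_conv_medium.e250_r384_in12k',
    pretrained=True, features_only=True)

x = torch.randn(1, 3, 384, 384)
feats = backbone(x)  # list of feature maps at multiple strides
print([f.shape for f in feats])
```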

rwightman
in pixparse/cc3m-wds 10 months ago

Error in downloading

#4 opened 10 months ago by KKNakka
rwightman
posted an update 11 months ago
A new timm release (1.0.11) is out now. I also wrote an article on one of the included models: https://huggingface.co/blog/rwightman/mambaout

Featured in the release are:
* The MambaOut model, a cheeky arch inspired by SSM but without the SSM part, a ConvNeXt with gating.
* Several timm-trained MambaOut variations with arch tweaks and ImageNet-12k pretraining to verify scaling and supplement the ported weights.
* The smallest MobileNetV4, a 0.5x width scaled Conv-Small.
* Two impressive MobileNetV3 Large models outperforming all previous, using MNV4 Small recipe.
* 'Zepto,' a new compact ConvNeXt variant even smaller than the previous Atto, 2.2M params, RMSNorm, and solid results for its size.
* Newly ported SigLIP SO400M/16 ViT multi-lingual weights, the largest i18n weights yet; the previous largest was B/16.
* Two ImageNet-1k fine-tuned SigLIP SO400M models at 378x378
* InternViT 300M weight port. A really solid ViT encoder distilled from OpenGVLab 6B VL model encoder.
* An assortment of very small, sub 1M param pretrained test models to improve library unit tests and serve low-resource applications.
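
A hedged discovery sketch, finding the new arrivals by wildcard rather than hard-coding model names:

```python
import timm

print(timm.list_models('mambaout*', pretrained=True))
print(timm.list_models('test_*', pretrained=True))  # the sub-1M param test models
```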
rwightman
posted an update 11 months ago
A 'small' MobileNet-V4 update: I just pushed weights for the smallest model I've trained in the series, a 0.5 width-multiplier version of the MobileNet-V4 Conv Small.

Now you may look at this and say hey, why is this impressive? 64.8% top-1 and 2.2M params? MobileNetV3-Small 0.75 and MobileNet-V2 0.5 both have fewer params (~2M) and over 65% top-1, what gives? Well, this is where MobileNet-V4 differs from the previous versions of the model family: it trades off (gives up) a little parameter efficiency for some computational efficiency.

So, let's look at the speed. On a 4090 w/ torch.compile:
* 98K img/sec - timm/mobilenetv4_conv_small_050.e3000_r224_in1k
* 58K img/sec - timm/mobilenetv3_small_075.lamb_in1k
* 37K img/sec - timm/mobilenetv2_050.lamb_in1k

And there you go, if you have a need for speed, MNV4 is the better option.
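
A rough, hedged throughput sketch mirroring the numbers above; real measurement needs careful warmup and timing (timm's benchmark.py in the GH repo handles this properly), so treat it as illustrative:

```python
# Back-of-envelope img/sec check; not a rigorous benchmark.
import time
import timm
import torch

model = timm.create_model(
    'mobilenetv4_conv_small_050.e3000_r224_in1k', pretrained=True)
model = torch.compile(model.cuda().eval())

x = torch.randn(256, 3, 224, 224, device='cuda')
with torch.inference_mode():
    model(x)  # warmup / trigger compilation
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(10):
        model(x)
    torch.cuda.synchronize()
    dt = time.perf_counter() - t0
print(f'{256 * 10 / dt:.0f} img/sec')
```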
rwightman
posted an update 12 months ago
The timm leaderboard timm/leaderboard has been updated with the ability to select different hardware benchmark sets: RTX4090, RTX3090, two different CPUs along with some NCHW / NHWC layout and torch.compile (dynamo) variations.

Also worth pointing out, there are three rather newish 'test' models that you'll see at the top of any samples/sec comparison:
* test_vit ( timm/test_vit.r160_in1k)
* test_efficientnet ( timm/test_efficientnet.r160_in1k)
* test_byobnet ( timm/test_byobnet.r160_in1k, a mix of resnet, darknet, effnet/regnet like blocks)

They are < 0.5M params, insanely fast, and originally intended for unit testing w/ real weights. They have awful ImageNet top-1; it's rare for anyone to bother training a model this small on ImageNet (the classifier is roughly 30-70% of the param count!). However, they are FAST on very limited hardware and you can fine-tune them well on small data. Could be the model you're looking for?
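
A hedged sketch of repurposing one for a small fine-tune by swapping the classifier head:

```python
import timm

# num_classes replaces the classifier head for your own small dataset.
model = timm.create_model('test_vit.r160_in1k', pretrained=True, num_classes=10)
print(sum(p.numel() for p in model.parameters()))  # well under 0.5M params
```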
rwightman
posted an update about 1 year ago
The latest timm validation & test set results are now viewable in a leaderboard space: timm/leaderboard

As of yesterday, I updated all of the results for ImageNet, ImageNet-ReaL, ImageNet-V2, ImageNet-R, ImageNet-A, and Sketch sets. The csv files can be found in the GH repo https://github.com/huggingface/pytorch-image-models/tree/main/results
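
A hedged sketch of reading one of those csv files directly; the exact filename is an assumption, browse the results/ folder for the full set:

```python
import pandas as pd

# Filename assumed; see the results/ folder in the GH repo for others.
url = ('https://raw.githubusercontent.com/huggingface/pytorch-image-models/'
       'main/results/results-imagenet.csv')
df = pd.read_csv(url)
print(df.sort_values('top1', ascending=False).head())
```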

Unfortunately the latest benchmark csv files are not yet up to date; there are some gaps between dataset results and throughput/FLOP numbers that impact the plots.

h/t to @MohamedRashad for making the first timm leaderboard.