Vui Seng Chua

This repo contains a sparsity report for each of the pruned models in the table below.

Each report (CSV) gives the layer-wise sparsity, the sparsity of each 128x16 tile, and the column and row sparsity computed across the whole layer.
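As a rough illustration of how these views relate, the sketch below computes them with NumPy on a small random matrix. It is not the script that produced the CSVs: the function name `tile_sparsity`, the 256x64 stand-in weight, and the use of NumPy rather than torch are all assumptions made for the example.

```python
import numpy as np


def tile_sparsity(weight: np.ndarray, tile_shape=(128, 16)) -> np.ndarray:
    """Return a grid holding the fraction of zeros in each tile (hypothetical helper)."""
    rows, cols = weight.shape
    th, tw = tile_shape
    if rows % th or cols % tw:
        raise ValueError("weight shape must be divisible by the tile shape")
    zeros = weight == 0
    # View the matrix as (tile rows, rows per tile, tile cols, cols per tile),
    # then average the zero mask inside each tile.
    tiles = zeros.reshape(rows // th, th, cols // tw, tw)
    return tiles.mean(axis=(1, 3))


# Stand-in for a pruned weight: random values with roughly half set to zero.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 64))
w[rng.random(w.shape) < 0.5] = 0.0

layer_sparsity = float((w == 0).mean())   # one number for the whole layer
grid = tile_sparsity(w)                   # one number per 128x16 tile
col_sparsity = (w == 0).mean(axis=0)      # per column, over all rows of the layer
row_sparsity = (w == 0).mean(axis=1)      # per row, over all columns of the layer

print(grid.shape)  # (2, 4)
print(layer_sparsity, grid.min(), grid.max())
```

Because every tile has the same size, the mean of the tile grid equals the layer-wise sparsity, which is why `sparsity` and `tile_avg` agree in the reports.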

Perplexity over Sparsity

Pruning meta-llama/Meta-Llama-3.1-8B with Wanda

Weight Target Sparsity (%)    Perplexity (lower is better)
0 (dense, baseline)           5.8393
10                            5.8781
20                            6.0102
30                            6.3076
40                            7.0094
50                            9.0642
60                            20.2265
70                            103.5209
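To make the trend easier to read, the following sketch expresses each perplexity as a multiple of the dense baseline. It only re-uses the numbers from the table, reading the sparsity column as percentages; the variable names are made up for the example.

```python
# Perplexity per weight target sparsity (%), copied from the table above.
ppl = {
    0: 5.8393,
    10: 5.8781,
    20: 6.0102,
    30: 6.3076,
    40: 7.0094,
    50: 9.0642,
    60: 20.2265,
    70: 103.5209,
}

dense = ppl[0]
# Ratio of each pruned model's perplexity to the dense baseline.
ratio = {s: p / dense for s, p in ppl.items()}

for s, p in ppl.items():
    print(f"{s:>3}%  ppl={p:9.4f}  {ratio[s]:6.2f}x dense")
```

Perplexity stays within about 1.2x of the dense model up to 40% sparsity and climbs steeply from 60% onward.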

For a more granular sparsity report within a given tile, continue below.


pip install torch ipython pandas

Interactively look up a specific tile of a layer

# Make sure git lfs is installed on your machine
git clone
cd 24-0830-wanda-llama3.1-8B

The expected outcome is as follows: an IPython console opens with the needed functionality loaded.

$ ./ 
Python 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0]
Type 'copyright', 'credits' or 'license' for more information
IPython 8.26.0 -- An enhanced Interactive Python. Type '?' for help.

- Help ------------------

h = SparseBlob("path to sparsity blob")

        preview sparsity dataframe, intend to show row id, short id for look up
        eg. h.preview()

        list all available layer ids for look up
        eg. h.ls_layers()

        return a sparsity stats of a layer via short_id lookup.
        eg. h.get_sparsity_by_short_id('tx.0.attn.v')

        return a sparsity stats of a layer via row id lookup.
        eg. h.get_sparsity_by_row_id(36)

        zoom into a specific layer and a specific tile,
        return the sparsity stats of the tile down to col, row granularity
        eg. h.get_sparsity_by_row_id(36, (5, 6))

        print help for available function of SparseBlob
        eg. h.show_help()

- End of Help ------------------
In [1]: 

Sample usage:

In [1]: ls blob*

In [2]: h = SparseBlob("blob.sparsity._Meta-Llama-3.1-8B-wanda-unstructured-0.5")

In [3]: h.preview()
                             layer_id        short_id  ... row_med row_max
0     model.layers.0.self_attn.q_proj     tx.0.attn.q  ...  0.5000  1.0000
1     model.layers.0.self_attn.k_proj     tx.0.attn.k  ...  0.5000  1.0000
2     model.layers.0.self_attn.v_proj     tx.0.attn.v  ...  0.5000  1.0000
3     model.layers.0.self_attn.o_proj     tx.0.attn.o  ...  0.5000  1.0000
4        model.layers.0.mlp.gate_proj   tx.0.mlp.gate  ...  0.5000  1.0000
5          model.layers.0.mlp.up_proj     tx.0.mlp.up  ...  0.5000  1.0000
6        model.layers.0.mlp.down_proj   tx.0.mlp.down  ...  0.5000  1.0000
7     model.layers.1.self_attn.q_proj     tx.1.attn.q  ...  0.5000  1.0000
8     model.layers.1.self_attn.k_proj     tx.1.attn.k  ...  0.5000  1.0000
9     model.layers.1.self_attn.v_proj     tx.1.attn.v  ...  0.5000  1.0000
10    model.layers.1.self_attn.o_proj     tx.1.attn.o  ...  0.5000  1.0000
11       model.layers.1.mlp.gate_proj   tx.1.mlp.gate  ...  0.5000  1.0000
222       model.layers.31.mlp.up_proj    tx.31.mlp.up  ...  0.5000  1.0000
223     model.layers.31.mlp.down_proj  tx.31.mlp.down  ...  0.5000  1.0000
224                           lm_head         lm_head  ...  0.0000  0.0000

[225 rows x 23 columns]
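The `short_id` column appears to be a compact form of `layer_id`. The sketch below reproduces the mapping as inferred from the preview above; `to_short_id` is a hypothetical helper, not a function of `SparseBlob`, and the list of projection names is taken from the rows shown.

```python
import re

# Matches e.g. "model.layers.1.self_attn.o_proj" or "model.layers.31.mlp.up_proj".
_LAYER_RE = re.compile(r"^model\.layers\.(\d+)\.(self_attn|mlp)\.(\w+)_proj$")


def to_short_id(layer_id: str) -> str:
    """Map a layer_id to the short_id form seen in the preview (inferred, hypothetical)."""
    m = _LAYER_RE.match(layer_id)
    if m is None:
        return layer_id  # e.g. "lm_head" is shown unchanged
    idx, block, proj = m.groups()
    block = "attn" if block == "self_attn" else "mlp"
    return f"tx.{idx}.{block}.{proj}"


# Seven linear layers per transformer block, in the order shown in the preview.
PROJS = [
    "self_attn.q_proj", "self_attn.k_proj", "self_attn.v_proj", "self_attn.o_proj",
    "mlp.gate_proj", "mlp.up_proj", "mlp.down_proj",
]
layer_ids = [f"model.layers.{i}.{p}" for i in range(32) for p in PROJS] + ["lm_head"]
short_ids = [to_short_id(x) for x in layer_ids]

print(len(short_ids))  # 225
print(short_ids[10])   # tx.1.attn.o
```

With 32 blocks of 7 linear layers plus `lm_head`, this yields the 225 rows reported by `h.preview()`, and the list index lines up with the row id used by `h.get_sparsity_by_row_id`.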

In [4]: h.get_sparsity_by_row_id(10)
layer_id        model.layers.1.self_attn.o_proj
short_id                            tx.1.attn.o
layer_type                               Linear
param_type                               weight
shape                              [4096, 4096]
nparam                                 16777216
nnz                                     8388608
sparsity                                 0.5000
tile_shape                            (128, 16)
n_tile                                 32 x 256
n_tile_total                               8192
tile_avg                                 0.5000
tile_min                                 0.2197
tile_med                                 0.5073
tile_max                                 0.9678
col_avg                                  0.5000
col_min                                  0.0312
col_med                                  0.4609
col_max                                  1.0000
row_avg                                  0.5000
row_min                                  0.0000
row_med                                  0.5000
row_max                                  1.0000
Name: 10, dtype: object
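The fields in this record are tied together by simple arithmetic. The sketch below re-derives them from the values printed above, assuming the first tile dimension (128) runs along the rows of the weight; the variable names are chosen for the example only.

```python
# Values copied from the record for row id 10 above.
shape = (4096, 4096)
tile_shape = (128, 16)
nnz = 8388608

nparam = shape[0] * shape[1]              # total number of weights
sparsity = 1 - nnz / nparam               # fraction of weights that are zero
n_tile = (shape[0] // tile_shape[0],      # tiles along the rows
          shape[1] // tile_shape[1])      # tiles along the columns
n_tile_total = n_tile[0] * n_tile[1]

print(nparam)        # 16777216
print(sparsity)      # 0.5
print(n_tile)        # (32, 256)
print(n_tile_total)  # 8192
```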

In [5]: h.get_sparsity_of_tile(10, (30, 245))
                               (30, 245) : tile_id
         model.layers.1.self_attn.o_proj : layer_id
                               (128, 16) : tiled by
                                  0.2861 : tile_sparsity
                                      16 : col_count
                                  0.2861 : col_avg
                                  0.2266 : col_min
                                  0.2734 : col_med
                                  0.3594 : col_max
                                     128 : row_count
                                  0.2861 : row_avg
                                  0.0000 : row_min
                                  0.2500 : row_med
                                  0.6250 : row_max
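For readers who want to reproduce this kind of tile-level summary on their own weights, here is a minimal NumPy sketch. It is an assumption about how such statistics can be computed, not the implementation behind `get_sparsity_of_tile`; `tile_stats` and the random stand-in weight are hypothetical.

```python
import numpy as np


def tile_stats(weight: np.ndarray, tile_id, tile_shape=(128, 16)) -> dict:
    """Summarize the sparsity inside one tile (hypothetical helper)."""
    th, tw = tile_shape
    r, c = tile_id
    # Slice the tile out of the weight and mark the zero entries.
    zeros = weight[r * th:(r + 1) * th, c * tw:(c + 1) * tw] == 0
    col = zeros.mean(axis=0)  # per column: fraction of zeros over the tile's rows
    row = zeros.mean(axis=1)  # per row: fraction of zeros over the tile's columns
    return {
        "tile_sparsity": float(zeros.mean()),
        "col_count": col.size,
        "col_avg": float(col.mean()),
        "col_min": float(col.min()),
        "col_med": float(np.median(col)),
        "col_max": float(col.max()),
        "row_count": row.size,
        "row_avg": float(row.mean()),
        "row_min": float(row.min()),
        "row_med": float(np.median(row)),
        "row_max": float(row.max()),
    }


# Stand-in for a pruned weight: random values with roughly half set to zero.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 64))
w[rng.random(w.shape) < 0.5] = 0.0

stats = tile_stats(w, (1, 3))
for name, value in stats.items():
    print(f"{value!s:>20} : {name}")
```

Within a 128x16 tile, each column sparsity is a multiple of 1/128 and each row sparsity a multiple of 1/16, which matches the values in the sample output (for example `row_med` 0.2500 is 4/16 and `col_max` 0.3594 is 46/128).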

Internal notes

See the patched Wanda branch here. See the raw!