Simple interrogations.
#2 opened by Nexesenex
Hey Jukofyork. I'm being a bit of a maniac, but it's bugging me:
merge_method: linear
parameters:
  weight: 1.0
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 16] # -> shouldn't it be [0, 15]?
      - model: jukofyork/dark-miqu-70b
        layer_range: [0, 16] # -> shouldn't it be [0, 15]?
        parameters:
          weight: 0
  - sources:
      - model: jukofyork/dark-miqu-70b
        layer_range: [16, 64]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [64, 80] # -> shouldn't it be [65, 80], or even [64, 79]? And is "80" the final output weight?
      - model: jukofyork/dark-miqu-70b
        layer_range: [64, 80] # -> shouldn't it be [65, 80], or even [64, 79]? And is "80" the final output weight?
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:miqu-1-70b-sf
Thanks for all the explanations you gave in Dark Miqu's comments.
Hi, it's a half-open range, written as [0, 16)
in mathematical terms:
https://en.m.wikipedia.org/wiki/Interval_(mathematics)
So the bottom number is included, but the top number isn't, e.g.:
layer_range: [0, 16]
is layer 0 up to and including layer 15.
layer_range: [64, 80]
is layer 64 up to and including layer 79.
and so on.
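Python's `range` uses the same half-open convention, so it's a quick way to convince yourself that the three slices cover all 80 layers exactly once, with no gap or overlap at the boundaries (just a sketch to illustrate the convention, not mergekit code):

```python
# Half-open ranges, like mergekit's layer_range: start included, end excluded.
first = list(range(0, 16))    # layers 0..15
middle = list(range(16, 64))  # layers 16..63
last = list(range(64, 80))    # layers 64..79

# The boundary layer belongs to exactly one slice: 15 ends the first,
# 16 starts the middle, so nothing is counted twice or skipped.
assert first[-1] == 15 and middle[0] == 16

# Concatenated, the three slices are exactly layers 0..79 of the model.
assert first + middle + last == list(range(80))
print(len(first + middle + last))  # 80
```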
Hey! Thank you for making it clear to me.
I'm very uneducated in maths beyond arithmetic and Euclidean geometry. ^^