Simple interrogations.

#2
by Nexesenex - opened

Hey Jukofyork. I'm being a bit of a maniac, but it's bugging me:

merge_method: linear
parameters:
  weight: 1.0
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 16]  # -> shouldn't it be [0, 15]  ?
      - model: jukofyork/dark-miqu-70b
        layer_range: [0, 16] # -> shouldn't it be [0, 15]  ?
        parameters:
          weight: 0
  - sources:
      - model: jukofyork/dark-miqu-70b
        layer_range: [16, 64]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [64, 80] # -> shouldn't it be [65, 80] , or even [64, 79]  ? And is "80" the final output weight?
      - model: jukofyork/dark-miqu-70b
        layer_range: [64, 80] # -> shouldn't it be [65, 80] , or even [64, 79]  ? And is "80" the final output weight?
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:miqu-1-70b-sf

Thanks for all the explanations you gave in Dark Miqu's comments.

Hi, it's a half-open range, written as [0, 16) in mathematical terms:

https://en.m.wikipedia.org/wiki/Interval_(mathematics)

So the bottom number is included, but the top number isn't, e.g.:

layer_range: [0, 16] is layer 0 up to and including layer 15.

layer_range: [64, 80] is layer 64 up to and including layer 79.

and so on.
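This is the same convention Python uses for range() and list slicing, which is likely why mergekit adopted it. A quick sketch (the 80-layer count matches the config above; the variable names are just illustrative):

```python
# mergekit's layer_range uses half-open intervals, like Python's range/slicing:
# [start, stop] in the YAML means layers start..stop-1 inclusive.
layers = list(range(80))  # an 80-layer model, indexed 0..79

first = layers[0:16]    # layer_range: [0, 16]  -> layers 0 through 15 (16 layers)
middle = layers[16:64]  # layer_range: [16, 64] -> layers 16 through 63 (48 layers)
last = layers[64:80]    # layer_range: [64, 80] -> layers 64 through 79 (16 layers)

assert first[0] == 0 and first[-1] == 15
assert last[0] == 64 and last[-1] == 79
# The three slices tile the model exactly: no gaps, no overlaps.
assert first + middle + last == layers
```

A nice side effect of this convention is that adjacent slices share their boundary number ([0, 16] then [16, 64]) without any layer being duplicated or skipped.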

Hey! Thank you for making it clear to me.
I'm very uneducated in maths beyond arithmetic and Euclidean geometry. ^^
