Simple interrogations.
#2 opened by Nexesenex
Hey Jukofyork. I'm being a bit of a maniac, but it's bugging me:
merge_method: linear
parameters:
  weight: 1.0
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 16] # -> shouldn't it be [0, 15]?
      - model: jukofyork/dark-miqu-70b
        layer_range: [0, 16] # -> shouldn't it be [0, 15]?
        parameters:
          weight: 0
  - sources:
      - model: jukofyork/dark-miqu-70b
        layer_range: [16, 64]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [64, 80] # -> shouldn't it be [65, 80], or even [64, 79]? And is "80" the final output weight?
      - model: jukofyork/dark-miqu-70b
        layer_range: [64, 80] # -> shouldn't it be [65, 80], or even [64, 79]? And is "80" the final output weight?
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:miqu-1-70b-sf
Thanks for all the explanations you gave in Dark Miqu's comments.
Hi, it's a half-open range, written as [0, 16)
in mathematical terms:
https://en.m.wikipedia.org/wiki/Interval_(mathematics)
So the bottom number is included, but the top number isn't, e.g.:
layer_range: [0, 16]
is layer 0 up to and including layer 15.
layer_range: [64, 80]
is layer 64 up to and including layer 79.
and so on.
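Python's `range` uses the same half-open convention, so it's a quick way to convince yourself that the three slices cover all 80 layers exactly once, with no gap or overlap at the boundaries (just a sketch to illustrate the convention, not mergekit code):

```python
# Half-open ranges, like mergekit's layer_range: start included, end excluded.
first = list(range(0, 16))    # layers 0..15
middle = list(range(16, 64))  # layers 16..63
last = list(range(64, 80))    # layers 64..79

# The boundary layer belongs to exactly one slice: 15 ends the first,
# 16 starts the middle, so nothing is counted twice or skipped.
assert first[-1] == 15 and middle[0] == 16

# Concatenated, the three slices are exactly layers 0..79 of the model.
assert first + middle + last == list(range(80))
print(len(first + middle + last))  # 80
```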
Hey! Thank you for making it clear to me.
I'm very uneducated in maths beyond arithmetic and Euclidean geometry. ^^