
# Create a custom architecture[[create-a-custom-architecture]]

An [`AutoClass`] automatically infers the model architecture and downloads pretrained configuration and weights. Generally, we recommend using an `AutoClass` to produce checkpoint-agnostic code. But users who want more control over specific model parameters can create a custom 🤗 Transformers model from just a few base classes. This could be particularly useful for anyone interested in studying, training, or experimenting with a 🤗 Transformers model. In this guide, dive deeper into creating a custom model without an `AutoClass`. Learn how to:

  • ๋ชจ๋ธ configuration์„ ๊ฐ€์ ธ์˜ค๊ณ  ์‚ฌ์šฉ์ž ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.
  • ๋ชจ๋ธ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • ํ…์ŠคํŠธ์— ์‚ฌ์šฉํ•  ๋Š๋ฆฌ๊ฑฐ๋‚˜ ๋น ๋ฅธ ํ† ํฐํ™”๊ธฐ๋ฅผ ๋งŒ๋“ญ๋‹ˆ๋‹ค.
  • ๋น„์ „ ์ž‘์—…์„ ์œ„ํ•œ ์ด๋ฏธ์ง€ ํ”„๋กœ์„ธ์„œ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • ์˜ค๋””์˜ค ์ž‘์—…์„ ์œ„ํ•œ ํŠน์„ฑ ์ถ”์ถœ๊ธฐ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
  • ๋ฉ€ํ‹ฐ๋ชจ๋‹ฌ ์ž‘์—…์šฉ ํ”„๋กœ์„ธ์„œ๋ฅผ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.

## Configuration[[configuration]]

configuration์€ ๋ชจ๋ธ์˜ ํŠน์ • ์†์„ฑ์„ ๋‚˜ํƒ€๋ƒ…๋‹ˆ๋‹ค. ๊ฐ ๋ชจ๋ธ ๊ตฌ์„ฑ์—๋Š” ์„œ๋กœ ๋‹ค๋ฅธ ์†์„ฑ์ด ์žˆ์Šต๋‹ˆ๋‹ค. ์˜ˆ๋ฅผ ๋“ค์–ด, ๋ชจ๋“  NLP ๋ชจ๋ธ์—๋Š” hidden_size, num_attention_heads, num_hidden_layers ๋ฐ vocab_size ์†์„ฑ์ด ๊ณตํ†ต์œผ๋กœ ์žˆ์Šต๋‹ˆ๋‹ค. ์ด๋Ÿฌํ•œ ์†์„ฑ์€ ๋ชจ๋ธ์„ ๊ตฌ์„ฑํ•  attention heads ๋˜๋Š” hidden layers์˜ ์ˆ˜๋ฅผ ์ง€์ •ํ•ฉ๋‹ˆ๋‹ค.

Get a closer look at DistilBERT's attributes by accessing [`DistilBertConfig`]:

```py
>>> from transformers import DistilBertConfig

>>> config = DistilBertConfig()
>>> print(config)
DistilBertConfig {
  "activation": "gelu",
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.16.2",
  "vocab_size": 30522
}
```

[`DistilBertConfig`] displays all the default attributes used to build a base [`DistilBertModel`]. All attributes are customizable, creating space for experimentation. For example, you can customize a default model to:

- Try a different activation function with the `activation` parameter.
- Use a higher dropout ratio for the attention probabilities with the `attention_dropout` parameter.

```py
>>> my_config = DistilBertConfig(activation="relu", attention_dropout=0.4)
>>> print(my_config)
DistilBertConfig {
  "activation": "relu",
  "attention_dropout": 0.4,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "transformers_version": "4.16.2",
  "vocab_size": 30522
}
```

Pretrained model attributes can be modified in the [`~PretrainedConfig.from_pretrained`] function:

```py
>>> my_config = DistilBertConfig.from_pretrained("distilbert-base-uncased", activation="relu", attention_dropout=0.4)
```

๋ชจ๋ธ ๊ตฌ์„ฑ์ด ๋งŒ์กฑ์Šค๋Ÿฌ์šฐ๋ฉด [~PretrainedConfig.save_pretrained]๋กœ ์ €์žฅํ•  ์ˆ˜ ์žˆ์Šต๋‹ˆ๋‹ค. ์„ค์ • ํŒŒ์ผ์€ ์ง€์ •๋œ ์ž‘์—… ๊ฒฝ๋กœ์— JSON ํŒŒ์ผ๋กœ ์ €์žฅ๋ฉ๋‹ˆ๋‹ค:

>>> my_config.save_pretrained(save_directory="./your_model_save_path")

To reuse the configuration file, load it with [`~PretrainedConfig.from_pretrained`]:

```py
>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
```

You can also save your configuration file as a dictionary, or even just the difference between your custom configuration attributes and the default configuration attributes! See the configuration documentation for more details.
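
If you just need the dictionary form, [`PretrainedConfig`] also exposes these two views directly; a minimal sketch:

```py
>>> # Full set of attributes as a plain Python dictionary
>>> config_dict = my_config.to_dict()

>>> # Only the attributes that differ from the default configuration
>>> diff_dict = my_config.to_diff_dict()
```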

๋ชจ๋ธ[[model]]

๋‹ค์Œ ๋‹จ๊ณ„๋Š” ๋ชจ๋ธ(model)์„ ๋งŒ๋“œ๋Š” ๊ฒƒ์ž…๋‹ˆ๋‹ค. ๋Š์Šจํ•˜๊ฒŒ ์•„ํ‚คํ…์ฒ˜๋ผ๊ณ ๋„ ๋ถˆ๋ฆฌ๋Š” ๋ชจ๋ธ์€ ๊ฐ ๊ณ„์ธต์ด ์ˆ˜ํ–‰ํ•˜๋Š” ๋™์ž‘๊ณผ ๋ฐœ์ƒํ•˜๋Š” ์ž‘์—…์„ ์ •์˜ํ•ฉ๋‹ˆ๋‹ค. configuration์˜ num_hidden_layers์™€ ๊ฐ™์€ ์†์„ฑ์€ ์•„ํ‚คํ…์ฒ˜๋ฅผ ์ •์˜ํ•˜๋Š” ๋ฐ ์‚ฌ์šฉ๋ฉ๋‹ˆ๋‹ค. ๋ชจ๋“  ๋ชจ๋ธ์€ ๊ธฐ๋ณธ ํด๋ž˜์Šค [PreTrainedModel]๊ณผ ์ž…๋ ฅ ์ž„๋ฒ ๋”ฉ ํฌ๊ธฐ ์กฐ์ • ๋ฐ ์…€ํ”„ ์–ดํ…์…˜ ํ—ค๋“œ ๊ฐ€์ง€ ์น˜๊ธฐ์™€ ๊ฐ™์€ ๋ช‡ ๊ฐ€์ง€ ์ผ๋ฐ˜์ ์ธ ๋ฉ”์†Œ๋“œ๋ฅผ ๊ณต์œ ํ•ฉ๋‹ˆ๋‹ค. ๋˜ํ•œ ๋ชจ๋“  ๋ชจ๋ธ์€ torch.nn.Module, tf.keras.Model ๋˜๋Š” flax.linen.Module์˜ ์„œ๋ธŒํด๋ž˜์Šค์ด๊ธฐ๋„ ํ•ฉ๋‹ˆ๋‹ค. ์ฆ‰, ๋ชจ๋ธ์€ ๊ฐ ํ”„๋ ˆ์ž„์›Œํฌ์˜ ์‚ฌ์šฉ๋ฒ•๊ณผ ํ˜ธํ™˜๋ฉ๋‹ˆ๋‹ค.

Load your custom configuration attributes into a PyTorch model:

```py
>>> from transformers import DistilBertModel

>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/config.json")
>>> model = DistilBertModel(my_config)
```

This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.

Create a pretrained model with [`~PreTrainedModel.from_pretrained`]:

```py
>>> model = DistilBertModel.from_pretrained("distilbert-base-uncased")
```

When you load pretrained weights for a model provided by 🤗 Transformers, the default model configuration is automatically loaded. However, you can still replace some or all of the default model configuration attributes with your own if you'd like:

```py
>>> model = DistilBertModel.from_pretrained("distilbert-base-uncased", config=my_config)
```

In TensorFlow, load your custom configuration attributes into the model:

```py
>>> from transformers import TFDistilBertModel

>>> my_config = DistilBertConfig.from_pretrained("./your_model_save_path/my_config.json")
>>> tf_model = TFDistilBertModel(my_config)
```

This creates a model with random values instead of pretrained weights. You won't be able to use this model for anything useful until you train it. Training is a costly and time-consuming process. It is generally better to use a pretrained model to obtain better results faster, while using only a fraction of the resources required for training.

Create a pretrained model with [`~TFPreTrainedModel.from_pretrained`]:

```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert-base-uncased")
```

When you load pretrained weights for a model provided by 🤗 Transformers, the default model configuration is automatically loaded. However, you can still replace some or all of the default model configuration attributes with your own if you'd like:

```py
>>> tf_model = TFDistilBertModel.from_pretrained("distilbert-base-uncased", config=my_config)
```

๋ชจ๋ธ ํ—ค๋“œ[[model-heads]]

At this point, you have a base DistilBERT model that outputs the *hidden states*. The hidden states are passed as inputs to a model head to produce the final output. 🤗 Transformers provides a different model head for each task, as long as the model supports the task (i.e., you can't use DistilBERT for a sequence-to-sequence task like translation).
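
To make the hidden states concrete, here is a minimal sketch (the input sentence is just an example) that runs the base model and inspects the output a head would consume:

```py
>>> import torch
>>> from transformers import DistilBertModel, DistilBertTokenizer

>>> tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
>>> base_model = DistilBertModel.from_pretrained("distilbert-base-uncased")

>>> inputs = tokenizer("Hello, world!", return_tensors="pt")
>>> with torch.no_grad():
...     outputs = base_model(**inputs)

>>> # One hidden-state vector per token; a model head maps these to task outputs
>>> outputs.last_hidden_state.shape
torch.Size([1, 6, 768])
```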

For example, [`DistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.

```py
>>> from transformers import DistilBertForSequenceClassification

>>> model = DistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
```

Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`DistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.

```py
>>> from transformers import DistilBertForQuestionAnswering

>>> model = DistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```
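
The head determines what the model returns: the sequence classification model above outputs a single `logits` tensor, while this question answering model outputs `start_logits` and `end_logits` that score the answer span. A minimal sketch (the question and context are just examples):

```py
>>> from transformers import DistilBertTokenizer

>>> tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
>>> inputs = tokenizer("Who wrote this guide?", "The Transformers team wrote this guide.", return_tensors="pt")

>>> outputs = model(**inputs)
>>> # One score per token for the start and the end of the answer span
>>> print(outputs.start_logits.shape, outputs.end_logits.shape)
```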

In TensorFlow, for example, [`TFDistilBertForSequenceClassification`] is a base DistilBERT model with a sequence classification head. The sequence classification head is a linear layer on top of the pooled outputs.

```py
>>> from transformers import TFDistilBertForSequenceClassification

>>> tf_model = TFDistilBertForSequenceClassification.from_pretrained("distilbert-base-uncased")
```

Easily reuse this checkpoint for another task by switching to a different model head. For a question answering task, you would use the [`TFDistilBertForQuestionAnswering`] model head. The question answering head is similar to the sequence classification head, except it is a linear layer on top of the hidden states output.

```py
>>> from transformers import TFDistilBertForQuestionAnswering

>>> tf_model = TFDistilBertForQuestionAnswering.from_pretrained("distilbert-base-uncased")
```

## Tokenizer[[tokenizer]]

The last base class you need before using a model for textual data is a tokenizer, which converts raw text to tensors. There are two types of tokenizers you can use with 🤗 Transformers:

- [`PreTrainedTokenizer`]: a Python implementation of a tokenizer.
- [`PreTrainedTokenizerFast`]: a tokenizer from the Rust-based 🤗 Tokenizers library. Because it is implemented in Rust, this tokenizer type is significantly faster, especially during batch tokenization. The fast tokenizer also offers additional methods like *offset mapping*, which maps tokens to their original words or characters (see the sketch after this list). Both tokenizers support common methods such as encoding and decoding, adding new tokens, and managing special tokens.
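
For example, a minimal sketch of offset mapping (only fast tokenizers accept `return_offsets_mapping=True`):

```py
>>> from transformers import DistilBertTokenizerFast

>>> fast_tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
>>> encoding = fast_tokenizer("Hello, world!", return_offsets_mapping=True)

>>> # (start, end) character positions of each token in the original string;
>>> # special tokens like [CLS] and [SEP] map to (0, 0)
>>> encoding["offset_mapping"]
[(0, 0), (0, 5), (5, 6), (7, 12), (12, 13), (0, 0)]
```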

Not every model supports a fast tokenizer. Take a look at this table to check whether a model has fast tokenizer support.

If you trained your own tokenizer, you can create one from your vocabulary file:

```py
>>> from transformers import DistilBertTokenizer

>>> my_tokenizer = DistilBertTokenizer(vocab_file="my_vocab_file.txt", do_lower_case=False, padding_side="left")
```

It is important to remember that the vocabulary of a custom tokenizer will be different from the vocabulary generated by a pretrained model's tokenizer. You need to use a pretrained model's vocabulary if you are using a pretrained model, otherwise the inputs won't make sense. Create a tokenizer with a pretrained model's vocabulary using the [`DistilBertTokenizer`] class:

```py
>>> from transformers import DistilBertTokenizer

>>> slow_tokenizer = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
```

Create a fast tokenizer with the [`DistilBertTokenizerFast`] class:

```py
>>> from transformers import DistilBertTokenizerFast

>>> fast_tokenizer = DistilBertTokenizerFast.from_pretrained("distilbert-base-uncased")
```

By default, [`AutoTokenizer`] will try to load a fast tokenizer. You can disable this behavior by setting `use_fast=False` in `from_pretrained`.
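
For example:

```py
>>> from transformers import AutoTokenizer

>>> # Loads a fast tokenizer when one is available
>>> fast = AutoTokenizer.from_pretrained("distilbert-base-uncased")

>>> # Forces the Python (slow) implementation instead
>>> slow = AutoTokenizer.from_pretrained("distilbert-base-uncased", use_fast=False)
```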

## Image processor[[image-processor]]

An image processor processes vision inputs. It inherits from the base [`~image_processing_utils.ImageProcessingMixin`] class.

To use one, create an image processor associated with the model you're using. For example, create a default [`ViTImageProcessor`] if you are using ViT for image classification:

```py
>>> from transformers import ViTImageProcessor

>>> vit_extractor = ViTImageProcessor()
>>> print(vit_extractor)
ViTImageProcessor {
  "do_normalize": true,
  "do_resize": true,
  "feature_extractor_type": "ViTImageProcessor",
  "image_mean": [
    0.5,
    0.5,
    0.5
  ],
  "image_std": [
    0.5,
    0.5,
    0.5
  ],
  "resample": 2,
  "size": 224
}
```

If you aren't looking for any customization, just use the `from_pretrained` method to load a model's default image processor parameters.
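
For example, a minimal sketch with the `google/vit-base-patch16-224` checkpoint (the checkpoint name here is just an example):

```py
>>> from transformers import ViTImageProcessor

>>> # Loads the image processor settings saved alongside the checkpoint
>>> vit_extractor = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224")
```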

Modify any of the [`ViTImageProcessor`] parameters to create your custom image processor:

```py
>>> from transformers import ViTImageProcessor

>>> my_vit_extractor = ViTImageProcessor(resample="PIL.Image.BOX", do_normalize=False, image_mean=[0.3, 0.3, 0.3])
>>> print(my_vit_extractor)
ViTImageProcessor {
  "do_normalize": false,
  "do_resize": true,
  "feature_extractor_type": "ViTImageProcessor",
  "image_mean": [
    0.3,
    0.3,
    0.3
  ],
  "image_std": [
    0.5,
    0.5,
    0.5
  ],
  "resample": "PIL.Image.BOX",
  "size": 224
}
```
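
Once created, an image processor is called directly on images and returns the `pixel_values` tensor the model expects. A minimal sketch using the default processor from above and a blank placeholder image:

```py
>>> import numpy as np
>>> from PIL import Image

>>> # A blank 3-channel image stands in for a real photo here
>>> image = Image.fromarray(np.zeros((300, 400, 3), dtype=np.uint8))

>>> inputs = vit_extractor(image, return_tensors="pt")
>>> inputs["pixel_values"].shape  # resized to the configured 224x224
torch.Size([1, 3, 224, 224])
```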

## Feature extractor[[feature-extractor]]

A feature extractor processes audio inputs. It inherits from the base [`~feature_extraction_utils.FeatureExtractionMixin`] class, and may also inherit from the [`SequenceFeatureExtractor`] class for processing audio inputs.

To use one, create a feature extractor associated with the model you're using. For example, create a default [`Wav2Vec2FeatureExtractor`] if you are using Wav2Vec2 for audio classification:

```py
>>> from transformers import Wav2Vec2FeatureExtractor

>>> w2v2_extractor = Wav2Vec2FeatureExtractor()
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
  "do_normalize": true,
  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "return_attention_mask": false,
  "sampling_rate": 16000
}
```

If you aren't looking for any customization, just use the `from_pretrained` method to load a model's default feature extractor parameters.
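
For example, a minimal sketch with the `facebook/wav2vec2-base-960h` checkpoint (the checkpoint name here is just an example):

```py
>>> from transformers import Wav2Vec2FeatureExtractor

>>> # Loads the feature extractor settings saved alongside the checkpoint
>>> w2v2_extractor = Wav2Vec2FeatureExtractor.from_pretrained("facebook/wav2vec2-base-960h")
```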

Modify any of the [`Wav2Vec2FeatureExtractor`] parameters to create your custom feature extractor:

```py
>>> from transformers import Wav2Vec2FeatureExtractor

>>> w2v2_extractor = Wav2Vec2FeatureExtractor(sampling_rate=8000, do_normalize=False)
>>> print(w2v2_extractor)
Wav2Vec2FeatureExtractor {
  "do_normalize": false,
  "feature_extractor_type": "Wav2Vec2FeatureExtractor",
  "feature_size": 1,
  "padding_side": "right",
  "padding_value": 0.0,
  "return_attention_mask": false,
  "sampling_rate": 8000
}
```
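
A feature extractor is called directly on raw waveforms and returns the `input_values` tensor the model expects. A minimal sketch with one second of silence as placeholder audio:

```py
>>> import numpy as np

>>> # One second of silence at the configured 8 kHz sampling rate
>>> raw_audio = np.zeros(8000, dtype=np.float32)

>>> inputs = w2v2_extractor(raw_audio, sampling_rate=8000, return_tensors="pt")
>>> inputs["input_values"].shape
torch.Size([1, 8000])
```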

## Processor[[processor]]

For models that support multimodal tasks, 🤗 Transformers offers a processor class that conveniently wraps processing classes such as a feature extractor and a tokenizer into a single object. For example, let's use the [`Wav2Vec2Processor`] for an automatic speech recognition (ASR) task. ASR transcribes audio to text, so you will need a feature extractor and a tokenizer.

Create a feature extractor to handle the audio inputs:

```py
>>> from transformers import Wav2Vec2FeatureExtractor

>>> feature_extractor = Wav2Vec2FeatureExtractor(padding_value=1.0, do_normalize=True)
```

Create a tokenizer to handle the text inputs:

```py
>>> from transformers import Wav2Vec2CTCTokenizer

>>> tokenizer = Wav2Vec2CTCTokenizer(vocab_file="my_vocab_file.txt")
```

Combine the feature extractor and tokenizer in [`Wav2Vec2Processor`]:

```py
>>> from transformers import Wav2Vec2Processor

>>> processor = Wav2Vec2Processor(feature_extractor=feature_extractor, tokenizer=tokenizer)
```
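
The processor now exposes both components through one object: calling it on audio routes to the feature extractor, and decoding model outputs routes to the tokenizer. A minimal sketch with placeholder audio (`predicted_ids` in the comment stands for a hypothetical tensor of token ids from a model):

```py
>>> import numpy as np

>>> # Audio inputs go through the wrapped feature extractor
>>> audio = np.zeros(16000, dtype=np.float32)
>>> audio_inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

>>> # Token ids predicted by a model are decoded by the wrapped tokenizer, e.g.:
>>> # transcription = processor.batch_decode(predicted_ids)
```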

With two basic classes, configuration and model, plus an additional preprocessing class (a tokenizer, image processor, feature extractor, or processor), you can create any of the models supported by 🤗 Transformers. Each of these base classes is configurable, allowing you to use the specific attributes you want. You can easily set up a model for training, or modify an existing pretrained model to fine-tune.