
# Pipelines for inference[[pipelines-for-inference]]

The [`pipeline`] makes it easy to use any model from the Hub for inference on language, computer vision, audio, and multimodal tasks. Even if you have no experience with a particular modality or aren't familiar with the code behind the models, you can still use them for inference with the [`pipeline`]! This tutorial will teach you to:

* Use a [`pipeline`] for inference.
* Use a specific tokenizer or model.
* Use a [`pipeline`] for language, computer vision, audio, and multimodal tasks.

Take a look at the [`pipeline`] documentation for a complete list of supported tasks and available parameters.

## Pipeline usage[[pipeline-usage]]

While each task has an associated [`pipeline`], it is generally simpler to use the general [`pipeline`] abstraction, which contains all the task-specific pipelines. The [`pipeline`] automatically loads a default model and a preprocessing class capable of inference for your task.

1. Start by creating a [`pipeline`] and specify the task:

```py
>>> from transformers import pipeline

>>> generator = pipeline(task="automatic-speech-recognition")
```

2. Pass your input to the [`pipeline`]:

```py
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP LIVE UP THE TRUE MEANING OF ITS TREES'}
```

Not the result you had in mind? Check out some of the most downloaded automatic speech recognition models on the Hub to see if you can get a better transcription. Let's try openai/whisper-large:

```py
>>> generator = pipeline(model="openai/whisper-large")
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': ' I have a dream that one day this nation will rise up and live out the true meaning of its creed.'}
```

Now this result looks much better! Models on the Hub cover many different languages and areas of expertise, so we really encourage you to look for a model specialized for your own language or domain. You can check model outputs directly on the Hub without leaving your browser and compare them with other models to see which one handles your situation better or deals with ambiguous inputs better. And if you don't find a model for your use case, you can always train your own!

If you have several inputs, you can pass them as a list:

```py
generator(
    [
        "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac",
        "https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/1.flac",
    ]
)
```

If you want to iterate over a whole dataset, or want to use it for inference in a webserver, check out the dedicated sections:

[Using pipelines on a dataset](#using-pipelines-on-a-dataset)

[Using pipelines for a webserver](#using-pipelines-for-a-webserver)

## Parameters[[parameters]]

[`pipeline`] supports many parameters; some are task specific, and some are general to all pipelines. In general, you can specify parameters anywhere you want:

```py
generator = pipeline(model="openai/whisper-large", my_parameter=1)
out = generator(...)  # This will use `my_parameter=1`.
out = generator(..., my_parameter=2)  # This will override and use `my_parameter=2`.
out = generator(...)  # This will go back to using `my_parameter=1`.
```

Let's check out three important ones:

### Device[[device]]

If you use device=n, the pipeline automatically puts the model on the specified device. This works whether you are using PyTorch or TensorFlow.

```py
generator = pipeline(model="openai/whisper-large", device=0)
```

If the model is too large for a single GPU, you can set device_map="auto" to allow 🤗 Accelerate to automatically determine how to load and store the model weights.

```py
#!pip install accelerate
generator = pipeline(model="openai/whisper-large", device_map="auto")
```

### Batch size[[batch-size]]

By default, pipelines do not batch inference. The short explanation is that batching is not necessarily faster, and can actually be quite a bit slower in some cases.

But if it works in your use case, you can use it like this:

```py
generator = pipeline(model="openai/whisper-large", device=0, batch_size=2)
audio_filenames = [f"audio_{i}.flac" for i in range(10)]
texts = generator(audio_filenames)
```

This runs the pipeline on the 10 provided audio files without requiring any further code from you: they are passed to the model (which is on a GPU, where batching is more likely to help) in batches of 2. The output should be exactly the same as what you would get without batching; it is just one more way to squeeze more speed out of a pipeline.

Pipelines can also take some of the complexity out of batching: for some pipelines, a single item (like a long audio file) needs to be split into multiple parts before a model can process it. This is called chunk batching, and the pipeline performs it for you automatically.

### Task specific parameters[[task-specific-parameters]]

All tasks provide task specific parameters that offer additional flexibility and options. For instance, the [`transformers.AutomaticSpeechRecognitionPipeline.__call__`] method has a return_timestamps parameter which sounds promising for subtitling videos:

```py
>>> # Not using whisper, as it cannot provide timestamps.
>>> generator = pipeline(model="facebook/wav2vec2-large-960h-lv60-self", return_timestamps="word")
>>> generator("https://huggingface.co/datasets/Narsil/asr_dummy/resolve/main/mlk.flac")
{'text': 'I HAVE A DREAM BUT ONE DAY THIS NATION WILL RISE UP AND LIVE OUT THE TRUE MEANING OF ITS CREED', 'chunks': [{'text': 'I', 'timestamp': (1.22, 1.24)}, {'text': 'HAVE', 'timestamp': (1.42, 1.58)}, {'text': 'A', 'timestamp': (1.66, 1.68)}, {'text': 'DREAM', 'timestamp': (1.76, 2.14)}, {'text': 'BUT', 'timestamp': (3.68, 3.8)}, {'text': 'ONE', 'timestamp': (3.94, 4.06)}, {'text': 'DAY', 'timestamp': (4.16, 4.3)}, {'text': 'THIS', 'timestamp': (6.36, 6.54)}, {'text': 'NATION', 'timestamp': (6.68, 7.1)}, {'text': 'WILL', 'timestamp': (7.32, 7.56)}, {'text': 'RISE', 'timestamp': (7.8, 8.26)}, {'text': 'UP', 'timestamp': (8.38, 8.48)}, {'text': 'AND', 'timestamp': (10.08, 10.18)}, {'text': 'LIVE', 'timestamp': (10.26, 10.48)}, {'text': 'OUT', 'timestamp': (10.58, 10.7)}, {'text': 'THE', 'timestamp': (10.82, 10.9)}, {'text': 'TRUE', 'timestamp': (10.98, 11.18)}, {'text': 'MEANING', 'timestamp': (11.26, 11.58)}, {'text': 'OF', 'timestamp': (11.66, 11.7)}, {'text': 'ITS', 'timestamp': (11.76, 11.88)}, {'text': 'CREED', 'timestamp': (12.0, 12.38)}]}
```

As you can see, the model not only inferred the text but also output when each word was spoken.

There are many parameters available for each task, so check out the API reference of the task you are interested in to see what you can tinker with! For instance, the [`~transformers.AutomaticSpeechRecognitionPipeline`] we have been using has a chunk_length_s parameter, which is helpful for working on really long audio files (for example, subtitling an entire movie or an hour-long video) that a model typically cannot handle on its own.
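As a rough illustration, such a pipeline might be created as in the sketch below. This is only a sketch: the file name long_interview.flac is a hypothetical placeholder, and the 30-second chunk length is an assumed value you would tune for your own audio.

```py
# Sketch only: split a long recording into ~30-second chunks before inference.
# "long_interview.flac" is a hypothetical local file used purely for illustration.
generator = pipeline(model="openai/whisper-large", chunk_length_s=30)
transcript = generator("long_interview.flac")
```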

If you can't find a parameter that would really help you out, feel free to request it!

## Using pipelines on a dataset[[using-pipelines-on-a-dataset]]

The pipeline can also run inference on a large dataset. The easiest way we recommend doing this is by using an iterator:

```py
from transformers import pipeline


def data():
    for i in range(1000):
        yield f"My example {i}"


pipe = pipeline(model="gpt2", device=0)
generated_characters = 0
for out in pipe(data()):
    generated_characters += len(out["generated_text"])
```

The iterator data() yields each result one at a time, and the pipeline automatically recognizes that the input is iterable and starts fetching new data while it keeps processing data on the GPU (under the hood this uses a DataLoader). This is important because you don't have to load the whole dataset into memory and you can feed the GPU as fast as possible.

Since batching could speed things up here, it may also be useful to try tuning the batch_size parameter.
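As a rough sketch of what that could look like with the GPT-2 pipeline above (the batch size of 8 is just an assumed starting point to tune, and reusing the end-of-sequence token for padding is a common workaround because GPT-2 ships without a padding token, not something this tutorial prescribes):

```py
# Sketch only: GPT-2 has no padding token, so reuse its end-of-sequence token for padding.
pipe.tokenizer.pad_token_id = pipe.model.config.eos_token_id

# 8 is an arbitrary batch size you would tune for your model and GPU.
for out in pipe(data(), batch_size=8):
    generated_characters += len(out["generated_text"])
```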

The simplest way to iterate over a dataset is to load one with 🤗 Datasets:

```py
# KeyDataset is a util that will just output the item we're interested in.
from transformers.pipelines.pt_utils import KeyDataset
from datasets import load_dataset

pipe = pipeline(model="hf-internal-testing/tiny-random-wav2vec2", device=0)
dataset = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation[:10]")

for out in pipe(KeyDataset(dataset, "audio")):
    print(out)
```

## Using pipelines for a webserver[[using-pipelines-for-a-webserver]]

Creating an inference engine is a complex topic which deserves its own page.

[Link](./pipeline_webserver)

## Vision pipeline[[vision-pipeline]]

Using a [`pipeline`] for vision tasks is practically identical.

Specify your task and pass your image to the classifier. The image can be a link or a local path to the image. For example, what species of cat is shown below?

![pipeline-cat-chonk](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg)

```py
>>> from transformers import pipeline

>>> vision_classifier = pipeline(model="google/vit-base-patch16-224")
>>> preds = vision_classifier(
...     images="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pipeline-cat-chonk.jpeg"
... )
>>> preds = [{"score": round(pred["score"], 4), "label": pred["label"]} for pred in preds]
>>> preds
[{'score': 0.4335, 'label': 'lynx, catamount'}, {'score': 0.0348, 'label': 'cougar, puma, catamount, mountain lion, painter, panther, Felis concolor'}, {'score': 0.0324, 'label': 'snow leopard, ounce, Panthera uncia'}, {'score': 0.0239, 'label': 'Egyptian cat'}, {'score': 0.0229, 'label': 'tiger cat'}]
```

## Text pipeline[[text-pipeline]]

Using a [`pipeline`] for NLP tasks is practically identical.

```py
>>> from transformers import pipeline

>>> # This model is a `zero-shot-classification` model.
>>> # It will classify text, except you are free to choose any label you might imagine
>>> classifier = pipeline(model="facebook/bart-large-mnli")
>>> classifier(
...     "I have a problem with my iphone that needs to be resolved asap!!",
...     candidate_labels=["urgent", "not urgent", "phone", "tablet", "computer"],
... )
{'sequence': 'I have a problem with my iphone that needs to be resolved asap!!', 'labels': ['urgent', 'phone', 'computer', 'not urgent', 'tablet'], 'scores': [0.504, 0.479, 0.013, 0.003, 0.002]}
```

## Multimodal pipeline[[multimodal-pipeline]]

The [`pipeline`] supports more than one modality (that is, data types such as audio, video, and text). For example, a visual question answering (VQA) task combines text and image. Feel free to use any image link you like and a question you want to ask about the image. The image can be a URL or a local path to the image.

For example, to ask for the invoice number from this invoice image:

```py
>>> from transformers import pipeline

>>> vqa = pipeline(model="impira/layoutlm-document-qa")
>>> vqa(
...     image="https://huggingface.co/spaces/impira/docquery/resolve/2359223c1837a7587402bda0f2643382a6eefeab/invoice.png",
...     question="What is the invoice number?",
... )
[{'score': 0.42514941096305847, 'answer': 'us-001', 'start': 16, 'end': 16}]
```