☁️ Google Cloud * Getting to Know Speech to Text

 

 

The examples in this post use Speech To Text version 1.9X.XX with the v1p1beta1 workspace.

The available features differ from version to version, so be sure to check which version you are running!

 

You can use $300 in free credits for 12 months without paying anything up front, and according to Google you are not billed automatically when the free trial ends.

You do have to register a credit card, but this is to prevent automated sign-ups; the sign-up page states that no charges are made unless you manually upgrade to a paid account.

 

 

ν΄λΌμš°λ“œ μŒμ„± ν…μŠ€νŠΈ Cloud Speech-to-Text

μŒμ„± ν…μŠ€νŠΈ(STT) λ³€ν™˜μ€ λ¨Έμ‹ λŸ¬λ‹(κΈ°κ³„ν•™μŠ΅)을 μ‚¬μš©ν•˜λ©° μ§§κ±°λ‚˜ κΈ΄ ν˜•μ‹μ˜ μ˜€λ””μ˜€λ₯Ό μ‚¬μš©ν•  수 μžˆλ‹€.
Speech-to-text conversion powered by machine learning and available for short-form or long-form audio.

STTλ₯Ό μœ„ν•œ λ¬Έμ„œλ³΄κΈ° View Documentation for this product.

 

Powerful speech recognition

Google Cloud Speech-to-Text enables developers to convert audio to text by applying powerful neural network models in an easy-to-use API. The API recognizes 120 languages and variants to support your global user base. You can enable voice command-and-control, transcribe audio from call centers, and more. It can process real-time streaming or prerecorded audio, using Google's machine learning technology.

 

ν΄λΌμš°λ“œ μŒμ„±-ν…μŠ€νŠΈ λ³€ν™˜ κΈ°λŠ₯ Cloud Speech-to-Text features

1. μžλ™ μŒμ„± 인식 Automatic Speech Recognition

μžλ™ μŒμ„± 인식(ARS)은 μŒμ„± λ…ΉμŒ 및 ν•™μŠ΅κ³Ό 같은 μ‘μš©ν”„λ‘œκ·Έλž¨μ— κ°•λ ₯ν•œ μ‹ κ²½ λ„€νŠΈμ›Œν‚Ήμ„ μ œκ³΅ν•œλ‹€.

Automatic Speech Recognition (ASR) powered by deep learning neural networking to power your applications like voice search or speech transcription.

 

2. μ†ŒμŒ μ–΅μ œ Noise Robustness

λ‹€μ–‘ν•œ ν™˜κ²½μ—μ„œ μΆ”κ°€λ‘œ μ†ŒμŒμ„ 제거λ₯Ό μš”κ΅¬ν•˜μ§€ μ•Šκ³  μ†ŒμŒμ΄ μžˆλŠ” μ˜€λ””μ˜€ νŒŒμΌμ„ μΈμ‹ν•œλ‹€.

Handles noisy audio from many environments without requiring additional noise cancellation.

 

3. Global Vocabulary

Recognizes 120 languages and variants with an extensive vocabulary.

 

4. λΆ€μ μ ˆν•œ μ½˜ν…μΈ  필터링 Inappropriate Content Filtering

일뢀 μ–Έμ–΄μ˜ ν…μŠ€νŠΈ κ²°κ³Όμ—μ„œ λΆ€μ μ ˆν•œ μ½˜ν…μΈ λ₯Ό 필터링 ν•œλ‹€.

Filter inappropriate content in text results for some languages.

 

5. Phrase Hints

Speech recognition can be customized to a specific context by providing a set of words and phrases that are likely to be spoken. This is especially useful for adding custom words and names to the vocabulary and in voice-control use cases.
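As a rough illustration of phrase hints, the sketch below builds a recognition config as a plain Python dict. The field names (`speechContexts`, `phrases`, `languageCode`, `sampleRateHertz`) follow the REST API's JSON naming as I understand it and should be checked against the reference for your version; the helper name `build_config` and the sample phrases are made up for this example.

```python
import json


def build_config(phrases, language_code="en-US"):
    """Return a RecognitionConfig-style dict carrying phrase hints.

    build_config is a hypothetical helper, not part of any Google library.
    """
    return {
        "encoding": "LINEAR16",
        "sampleRateHertz": 16000,
        "languageCode": language_code,
        # speechContexts holds the words and phrases likely to be spoken.
        "speechContexts": [{"phrases": list(phrases)}],
    }


# Sample phrases are illustrative only.
config = build_config(["Speech-to-Text", "turn on the lights"])
print(json.dumps(config, indent=2))
```

The dict only describes the request; sending it still requires an authenticated call to the API.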

 

6. Real-time Streaming or Prerecorded Audio Support

Audio input can be streamed from an application's microphone or sent from a prerecorded audio file (inline or through Google Cloud Storage). Multiple audio encodings are supported, including FLAC, AMR, PCMU, and Linear-16.
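The two ways of sending prerecorded audio (inline or through Google Cloud Storage) could be sketched as follows. This assumes the REST request shape in which the `audio` object carries either base64 `content` or a `gs://` `uri`; the helper names and the bucket path are hypothetical.

```python
import base64


def inline_audio(raw_bytes):
    """Embed audio bytes directly in the request as base64 text."""
    return {"content": base64.b64encode(raw_bytes).decode("ascii")}


def gcs_audio(uri):
    """Point the request at an object stored in Google Cloud Storage."""
    if not uri.startswith("gs://"):
        raise ValueError("expected a gs:// URI")
    return {"uri": uri}


def build_request(audio, encoding="FLAC", sample_rate_hertz=16000,
                  language_code="en-US"):
    """Combine a config and an audio source into one request body."""
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": audio,
    }


# b"abc" stands in for real audio bytes; the bucket path is made up.
inline_request = build_request(inline_audio(b"abc"), encoding="LINEAR16")
gcs_request = build_request(gcs_audio("gs://my-bucket/audio.flac"))
```

Inline content suits short clips, while a storage URI avoids pushing large files through the request body.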

 

7. μžλ™ ꡬ두점 Automatic Punctuation BETA

기계 ν•™μŠ΅μ„ 톡해 μ •ν™•ν•œ ꡬ두점을(콀마, λ¬ΌμŒν‘œ 및 λ§ˆμΉ¨ν‘œμ™€ 같은) μƒμ„±ν•œλ‹€.

Accurately punctuates transcriptions (e.g., commas, question marks, and periods) with machine learning.

 

9. Speaker Diarization BETA

Know who said what: you can now get automatic predictions about which of the speakers in a conversation spoke each utterance.
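To show what "who said what" might look like downstream, here is a minimal sketch that groups recognized words into speaker turns. It assumes each word in the response carries a `speakerTag` field, as in the v1p1beta1 REST response as I understand it; the sample data is invented and `group_by_speaker` is a hypothetical helper.

```python
def group_by_speaker(words):
    """Merge consecutive words with the same speakerTag into one turn."""
    turns = []
    for item in words:
        tag = item["speakerTag"]
        if turns and turns[-1][0] == tag:
            # Same speaker is still talking: extend the current turn.
            turns[-1][1].append(item["word"])
        else:
            # Speaker changed: start a new turn.
            turns.append((tag, [item["word"]]))
    return [(tag, " ".join(spoken)) for tag, spoken in turns]


# Invented sample; a real response would come from the API.
sample_words = [
    {"word": "hello", "speakerTag": 1},
    {"word": "there", "speakerTag": 1},
    {"word": "hi", "speakerTag": 2},
    {"word": "thanks", "speakerTag": 1},
]

for speaker, text in group_by_speaker(sample_words):
    print(f"Speaker {speaker}: {text}")
```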

 

10. μžλ™ μ–Έμ–΄ 감지 Auto-Detect Language BETA

λ‹€κ΅­μ–΄ μ‹œλ‚˜λ¦¬μ˜€λ₯Ό 지원해야 ν•  경우, 2~4개의 μ–Έμ–΄ μ½”λ“œλ₯Ό  λͺ…μ‹œν•  수 μžˆλ‹€. ꡬ글 STTλŠ” μ˜¬λ°”λ₯Έ μ–Έμ–΄λ₯Ό μ‹λ³„ν•˜κ³  슀크립트λ₯Ό μ œκ³΅ν•œλ‹€.

When you need to support multilingual scenarios, you can now specify two to four language codes and Cloud Speech-to-Text will identify the correct language spoken and provide the transcript.
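The "two to four language codes" rule can be sketched as one primary code plus up to three alternatives. The field names `languageCode` and `alternativeLanguageCodes` follow the v1p1beta1 REST config as I understand it; `language_fields` is a hypothetical helper and the codes are only examples.

```python
def language_fields(codes):
    """Split two to four language codes into primary and alternatives."""
    if not 2 <= len(codes) <= 4:
        raise ValueError("specify between two and four language codes")
    primary, *alternatives = codes
    return {
        "languageCode": primary,
        "alternativeLanguageCodes": alternatives,
    }


# Example codes; any supported BCP-47 codes could be used instead.
fields = language_fields(["ko-KR", "en-US", "ja-JP"])
print(fields)
```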

 

11. Multichannel Recognition BETA

In multiparticipant recordings where each participant is recorded in a separate channel (e.g., phone call with two channels or video conference with four channels), Cloud Speech-to-Text will recognize each channel separately and then annotate the transcripts so that they follow the same order as in real life.
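A possible shape for a multichannel request and its annotated output is sketched below. It assumes the config fields `audioChannelCount` and `enableSeparateRecognitionPerChannel` and a per-result `channelTag`, which is how I understand the REST API; the helpers and the sample results are invented for illustration.

```python
def multichannel_config(channel_count, language_code="en-US"):
    """Config fragment asking for each channel to be recognized separately."""
    if channel_count < 2:
        raise ValueError("multichannel recognition needs at least two channels")
    return {
        "languageCode": language_code,
        "audioChannelCount": channel_count,
        "enableSeparateRecognitionPerChannel": True,
    }


def label_results(results):
    """Prefix each transcript with its channel, keeping the response order."""
    return [
        f"[channel {r['channelTag']}] {r['alternatives'][0]['transcript']}"
        for r in results
    ]


# Invented two-channel phone call; real results would come from the API.
sample_results = [
    {"channelTag": 1, "alternatives": [{"transcript": "How can I help?"}]},
    {"channelTag": 2, "alternatives": [{"transcript": "My order is late."}]},
]

print(multichannel_config(2))
print("\n".join(label_results(sample_results)))
```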

 

Pricing

| Feature | 0-60 minutes | Over 60 minutes, up to 1 million minutes |
| --- | --- | --- |
| Speech Recognition (all models except video) | Free | $0.006 USD / 15 seconds* |
| Video Speech Recognition | $0.006 | $0.012 USD / 15 seconds* |

 

- μŒμ„± 인식(Speech Recognition)의 경우 60λΆ„κΉŒμ§€λŠ” 무료이며 60λΆ„ 초과 μ‹œ 15초λ₯Ό κΈ°μ€€μœΌλ‘œ 6.67원($0.006 USB)이 λΆ€κ³Όλ˜λ©° ν•œλ‹¬μ— μ‚¬μš©κ°€λŠ₯ν•œ 전체 μ‹œκ°„μ€ μ•½ 16666.6667μ‹œκ°„(1million minutes) μž…λ‹ˆλ‹€.

- 각 μš”μ²­μ€ 15초λ₯Ό κΈ°λ³Έλ‹¨μœ„λ‘œ μΈ‘μ •ν•©λ‹ˆλ‹€.

- μœ„ ν…Œμ΄λΈ”μ˜ 가격은 개인용 μ‹œμŠ€ν…œμ— 적용되며 κΈ°μ—…μ—μ„œ μ‚¬μš©λ˜λŠ” 가격은 가격 κ°€μ΄λ“œλ₯Ό μ°Έμ‘°ν•˜μ—¬ μ—°λ½ν•˜μ—¬μ•Ό ν•œλ‹€.

- 각 μš”μ²­μ€ κ°€μž₯ κ°€κΉŒμš΄ 15초 λ‹¨μœ„λ‘œ μ˜¬λ¦Όλœλ‹€.

예λ₯Ό λ“€μ–΄, 각각 7초의 μ˜€λ””μ˜€λ₯Ό ν¬ν•¨ν•˜λŠ” μ„Έ 가지 κ°œλ³„ μš”μ²­μ„ ν•˜λ©΄ 45( 3x15 )λ™μ•ˆ μ˜€λ””μ˜€ μ΄μš©λ£Œκ°€ μ²¨λΆ€λ©λ‹ˆλ‹€.

15초, 14초λ₯Ό μ‚¬μš©ν•˜μ˜€μ„ 경우 (15+14=29)λ°˜μ˜¬λ¦Όν•˜κ³  30초둜 μ²­κ΅¬ν•©λ‹ˆλ‹€.
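The rounding rule can be checked with a short calculation. This is a minimal sketch using the figures from the table above ($0.006 per 15 seconds, first 60 minutes free); the function names are made up, and applying the free allowance to the rounded total is my assumption about how the tiers combine.

```python
import math

INCREMENT_SECONDS = 15       # billing unit from the pricing notes
FREE_SECONDS = 60 * 60       # first 60 minutes are free
RATE_PER_INCREMENT = 0.006   # USD per 15 seconds, non-video models


def billed_seconds(request_durations):
    """Round every request up to a multiple of 15 seconds, then add them."""
    return sum(
        math.ceil(seconds / INCREMENT_SECONDS) * INCREMENT_SECONDS
        for seconds in request_durations
    )


def monthly_cost(total_billed_seconds):
    """Charge only the billed time beyond the free allowance (assumption)."""
    chargeable = max(0, total_billed_seconds - FREE_SECONDS)
    return (chargeable // INCREMENT_SECONDS) * RATE_PER_INCREMENT


print(billed_seconds([7, 7, 7]))  # three 7-second requests -> 45
print(billed_seconds([15, 14]))   # 15 s and 14 s requests -> 30
```

Rounding happens per request, so many short requests cost more than one long request holding the same audio.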

 

 

