Whisper Speech-To-Text

Highest Nextcloud version

Nextcloud 29

Author

Marcel Klehr

Last updated

1 year ago

Categories

Tools


Speech-To-Text provider running OpenAI Whisper locally

This app is deprecated in favor of stt_whisper2. See the stt_whisper2 documentation instead.

The models run completely on your machine. No private data leaves your servers.

Requirements:

  • Architecture: x86-64 with AVX support
  • OS: Linux
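
A host can be checked against these requirements before installing. The following is a minimal sketch, not part of the app: `check_requirements` is a hypothetical helper name, and it assumes that `uname -m` reports `x86_64` on x86-64 Linux and that the CPU flags are listed space-separated on the `flags` line of `/proc/cpuinfo`.

```shell
#!/bin/sh
# check_requirements OS ARCH CPU_FLAGS   (hypothetical helper, not shipped with the app)
# Returns 0 when the values describe an x86-64 Linux host with AVX, 1 otherwise.
check_requirements() {
    os="$1"
    arch="$2"
    flags="$3"

    [ "$os" = "Linux" ] || return 1
    [ "$arch" = "x86_64" ] || return 1

    # Pad with spaces so "avx" is matched as a whole word, not as part of "avx2".
    case " $flags " in
        *" avx "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example call with the values of the current host:
# check_requirements "$(uname -s)" "$(uname -m)" \
#     "$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)" && echo "requirements met"
```

Passing the values as arguments keeps the check itself free of host access, so it can be tried with made-up flag lists before running it on a server.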

Model sizes:

  • Small: 500MB
  • Medium: 1.5GB
  • Large: 3.1GB

After installing this app, you will need to run:

occ stt_whisper:download-models [model-name]

where [model-name] is one of:

  • small
  • medium (default)
  • large
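
To avoid typos in the model name, the command can be assembled by a small wrapper. This is only a sketch: `build_download_command` is a hypothetical helper, it prints the command instead of running it, and how `occ` is actually invoked (for example as the web server user through `php`) depends on your installation.

```shell
#!/bin/sh
# build_download_command [MODEL]   (hypothetical helper, not shipped with the app)
# Prints the occ command for MODEL, falling back to "medium" when MODEL is omitted.
# Returns 1 and writes a message to stderr for any name outside the documented list.
build_download_command() {
    model="${1:-medium}"
    case "$model" in
        small|medium|large)
            printf 'occ stt_whisper:download-models %s\n' "$model"
            ;;
        *)
            printf 'unknown model: %s (expected small, medium or large)\n' "$model" >&2
            return 1
            ;;
    esac
}
```

Calling `build_download_command large` prints `occ stt_whisper:download-models large`, which can then be run in the Nextcloud directory.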

Ethical AI Rating

Rating: 🟡

Positive:

  • the software for training and inference of this model is open source
  • the trained model is freely available, and thus can be run on-premises

Negative:

  • the training data is not freely available, limiting the ability of external parties to check and correct for bias or optimise the model’s performance and CO2 usage.

Learn more about the Nextcloud Ethical AI Rating in our blog.

NOTE:

A few things to keep in mind.

  • To have calls transcribed by any Speech-To-Text provider (including this app), transcriptions must be enabled in the Talk app. This can be done with the following occ command:
    occ config:app:set spreed call_recording_transcription --value yes
  • This app tends to be heavy on CPU. If that interferes with your normal workflow, you can limit the number of threads used by Whisper in the "Whisper Speech-To-Text" section of the admin settings.
  • The accuracy of the generated transcriptions may vary with the spoken language.
  • Per-participant transcription in calls is not currently available, but PRs are welcome!
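
The Talk setting mentioned in the first note can be applied and then read back to confirm it. The sketch below assumes `occ` is callable under that name; `run_occ`, `enable_call_transcription` and `show_call_transcription` are hypothetical helper names, and `run_occ` is the one place to adjust if `occ` has to be started differently on your server.

```shell
#!/bin/sh
# Hypothetical wrapper: change the body if occ must be invoked another way
# on your installation (for example through php as the web server user).
run_occ() {
    occ "$@"
}

# Enable transcription of Talk call recordings (command taken from the note above).
enable_call_transcription() {
    run_occ config:app:set spreed call_recording_transcription --value yes
}

# Read the stored value back to verify the setting.
show_call_transcription() {
    run_occ config:app:get spreed call_recording_transcription
}
```

Running `enable_call_transcription` followed by `show_call_transcription` should report the value that was just stored.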

Releases

Nextcloud version   Stable channel   Nightly channel   All releases
29                  1.0.8            -                 29
28                  1.0.8            -                 28
27                  1.0.7            -                 27
