Speech to Text Software for Windows

View 37 business solutions

Browse free open source Speech to Text software and projects for Windows below. Use the toggles on the left to filter open source Speech to Text software by OS, license, language, programming language, and project status.

  • SysAid multi-layered ITSM solution Icon
    SysAid multi-layered ITSM solution

    For organizations spanning all industries and sizes from SMBs to Fortune 500 corporations

    SysAid is an ITSM, Service Desk and Help Desk software solution that integrates all of the essential IT tools into one product. Its rich set of features include a powerful Help Desk, IT Asset Management, and other easy-to-use tools for analyzing and optimizing IT performance.
    Free Trial
  • Cloud data warehouse to power your data-driven innovation Icon
    Cloud data warehouse to power your data-driven innovation

    BigQuery is a serverless and cost-effective enterprise data warehouse that works across clouds and scales with your data.

    BigQuery Studio provides a single, unified interface for all data practitioners of various coding skills to simplify analytics workflows from data ingestion and preparation to data exploration and visualization to ML model creation and use. It also allows you to use simple SQL to access Vertex AI foundational models directly inside BigQuery for text processing tasks, such as sentiment analysis, entity extraction, and many more without having to deal with specialized models.
    Try for free
  • 1
    CMU Sphinx

    CMU Sphinx

    Speech Recognition Toolkit

    Thank you for visiting! ----> Maintenance and improvement work has MOVED to https://cmusphinx.github.io/ Please go there for the most recent software and documentation. <---- CMUSphinx is a speaker-independent large vocabulary continuous speech recognizer released under BSD style license. It is also a collection of open source tools and resources that allows researchers and developers to build speech recognition systems.
    Leader badge
    Downloads: 897 This Week
    Last Update:
    See Project
  • 2
    Whisper

    Whisper

    Robust Speech Recognition via Large-Scale Weak Supervision

    Whisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification. A Transformer sequence-to-sequence model is trained on various speech processing tasks, including multilingual speech recognition, speech translation, spoken language identification, and voice activity detection. These tasks are jointly represented as a sequence of tokens to be predicted by the decoder, allowing a single model to replace many stages of a traditional speech-processing pipeline. The multitask training format uses a set of special tokens that serve as task specifiers or classification _targets.
    Downloads: 116 This Week
    Last Update:
    See Project
  • 3
    DeepSpeech

    DeepSpeech

    Open source embedded speech-to-text engine

    DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers. DeepSpeech is an open-source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier. A pre-trained English model is available for use and can be downloaded following the instructions in the usage docs. If you want to use the pre-trained English model for performing speech-to-text, you can download it (along with other important inference material) from the DeepSpeech releases page.
    Downloads: 40 This Week
    Last Update:
    See Project
  • 4
    TTS Voice Wizard

    TTS Voice Wizard

    Speech to Text to Speech, sends text as OSC messages

    Speech to Text to Speech. Song now playing. Sends text as OSC messages to VRChat to display on avatar. (STTTS) (Speech to TTS) (VRC STT System) Use TTS Voice Wizard's accessibility features to improve your VRChat experience (it works outside of VRChat too!) You can convert your Speech-to-Text and back to Speech through various Speech Recognition and Text-to-Speech methods. You can send what you say as OSC messages to VRChat to be displayed on your avatar using KillFrenzyAvatarText or VRChats Chatbox. The app can translate your speech from one language to over 20 other support languages. There are 100+ different voices with various customization options so you can pick a voice that best suits you. Display the current song you are listening to on Spotify or via your browser. Display tracker and controller battery life in conjunction with XSOverlay. Use in conjunction with HRtoVRChat_OSC to enable you to display your heartrate in VRChat's Chatbox.
    Downloads: 19 This Week
    Last Update:
    See Project
  • 5
    Translate-Subtitle-File

    Translate-Subtitle-File

    Subtitle Creation Assistant

    Subtitle group machine translation assistant - [Function 1: Translate subtitle file] .srt .ass .vtt [Function 2: Voice to text] (Drag in video or audio to recognize subtitles) (The latest version v4.1.0 Update time 2021 2 May 23) 12 translation service providers can be configured, such as Google, Baidu, Tencent, Caiyun, IBM, Azure, Amazon, etc. (6 voice service providers can be configured: Alibaba Cloud, Xunfei, Tencent Cloud, IBM, Azure, Amazon ) Advantages: 1. You can use multiple service providers, 2. You can configure your own API Key to use your own account's free quota, such as Tencent's free translation quota of 5 million characters per month, IBM's 500-minute speech-to-text free quota (tern. best The domain name has expired and I don't want to renew it.) Azure speech-to-text and DeepL free version have problems, it is normal to not use it, please wait for the next version to fix. Machine translation of subtitle files, use machine translation to process files.
    Downloads: 14 This Week
    Last Update:
    See Project
  • 6
    sherpa-onnx

    sherpa-onnx

    Speech-to-text, text-to-speech, and speaker recognition

    Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without an Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter.
    Downloads: 11 This Week
    Last Update:
    See Project
  • 7
    Whishper

    Whishper

    Transcribe any audio to text, translate and edit subtitles 100% locall

    Open-source, local-first audio transcription and subtitling suite with a simple web UI. Thanks to open-source technologies, Whishper can run 100% offline. Your data never leaves your computer. Whishper allows you to translate your transcriptions to and from more than 60 languages thanks to Argos Translate and LibreTranslate. Download the transcriptions in many formats (json, txt, vtt, srt). Easily edit your subtitles right in the Web-UI.
    Downloads: 8 This Week
    Last Update:
    See Project
  • 8
    Coqui STT

    Coqui STT

    The deep learning toolkit for speech-to-text

    Coqui STT is a fast, open-source, multi-platform, deep-learning toolkit for training and deploying speech-to-text models. Coqui STT is battle-tested in both production and research. Multiple possible transcripts, each with an associated confidence score. Experience the immediacy of script-to-performance. With Coqui text-to-speech, production times go from months to minutes. With Coqui, the post is a pleasure. Effortlessly clone the voices of your talent and have the clone handle the problems in post. With Coqui, dubbing is a delight. Effortlessly clone the voice of your talent into another language and let the clone do the dub. With text-to-speech, experience the immediacy of script-to-performance. Cast from a wide selection of high-quality, directable, emotive voices or clone a voice to suit your needs. With Coqui text-to-speech, production times go from months to minutes.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 9
    Speech Recognition in English & Polish

    Speech Recognition in English & Polish

    Speech recognition software for English & Polish languages

    Software for speech recognition in English & Polish languages. Basic versions of SkryBot: 1. SkryBot Home Speech (English Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesEnglish/InstalatorSkryBotHomeSpeechDemo-2.6.9.18117.exe/download 2. SkryBot DoMowy (Polish Language) - https://sourceforge.net/projects/skrybotdomowy/files/ReleasesPolish/InstalatorSkryBotDoMowyDemo-2.4.9.18117.exe/download More help: https://sourceforge.net/p/skrybotdomowy/wiki/ Domain advanced versions (Polish Language) 1. SkryBot Prawo - for judicial professionals. 2. SkryBot Administracyjny - for civil and government administration. 3. SkryBot Medycyna Rodzinna - for physicians Professional version of SkryBot (commercial) offers you: 1. Audio conversion and cutting sound files into smaller ones. 2. Searching for words or phrases in sound files (recognized by SkryBot). 3. Editing sound files and automatic cutting off long silence parts in audio file.
    Downloads: 5 This Week
    Last Update:
    See Project
  • The Voice API that just works | Twilio Icon
    The Voice API that just works | Twilio

    Build a scalable voice experience with the API that's connecting millions around the world.

    With Twilio Voice, you can build unique phone call experiences with one API, to create, receive, control and monitor calls with just a few lines of code. Create an engaging voice experience that you can quickly scale and modify with a wide array of customization options and resources.
    Learn More
  • 10
    VATSG

    VATSG

    Video automatic transcribe and translated subtitle generator

    It generates srt format subtitle from videofile which can be any source language that whisper support , and then make translated subtitle file of your _target language which deepl support. This is the subtitle generator(VATSG) which use [moviepy](https://github.com/Zulko/moviepy) to generate mp3 and then use [faster-whisper](https://github.com/guillaumekln/faster-whisper) to get text recognition and then use deepl-api to generate your _target language subtitle file(srt format) If you are a general user who want to view any video file and mp3 file to your language, It will provide way. It's very easy to use because it has simple gui and very intuitive. So you can easily use it for any purpose. Now, you can choose to download either window installer setup type or uninstalled type. Enjoy and support my consistent development!
    Downloads: 6 This Week
    Last Update:
    See Project
  • 11

    SoundTranscriber

    SoundTranscriber can be used to generate automatic transcription / aut

    SoundTranscriber can be used to generate automatic transcription / aut
    Downloads: 3 This Week
    Last Update:
    See Project
  • 12
    ILA - teachable voice assistant

    ILA - teachable voice assistant

    ILA is a fully customizable and teachable voice assistant for Java

    ILA stands for (kind of) intelligent, learning assistant and is a speech recognition system aka voice assistant very similar to Siri, Google Now and Cortana. ILA is fully customizable and you can teach her/him/it new things by yourself like executing system commands, opening web pages, programs and apps or just some basic conversation :-) ILA runs on Java und thus is compatible to Windows, Mac and Linux. It is designed to integrate with your home enviroment and for example build up your own, free and open Amazon Echo replacement ;-) Right now the key components of ILA are the open source speech recognition CMU Sphinx-4, Google (Speech Recognition/Text-To-Speech) and MaryTTS (Text-To-Speech). The goal is to make ILA completely free of Google by improving all aspects of the open source systems. Since version 3.3 users can also write own add-ons to extend ILA. ILA's successor is the SEPIA Framework: https://sepia-framework.github.io/ Hope you enjoy ILA - Florian
    Downloads: 1 This Week
    Last Update:
    See Project
  • 13
    JAVT - Just Another Voice Transformer

    JAVT - Just Another Voice Transformer

    Just Another Speech Recognition and Text to Speech software.

    JAVT or Just Another Voice Transformer (formerly, it is called Just Another Video Transcriber) is a Speech Recognition software that also support text to Speech and simple media conversion. JAVT allows you to convert from video files to audio wav file using ffmpeg, and then transcribe the audio file to text using either Microsoft SAPI or CMU Sphinx. You can also open a text file and allow JAVT to read it out for you through text to speech conversion.
    Downloads: 3 This Week
    Last Update:
    See Project
  • 14
    Responding Partner

    Responding Partner

    Control your PC computer with voice commands

    Responding Partner is a windows application that enables you free talking with your computer which equipped with spoken animation character. You will be surprised how smart responding partner robot is. It also enables voice commands and controls to your computer for small task like open media files, open and close program, shutdown and restart computer,open website, type in editor, text to speech,etc. You can extend the ability by installing new plugin which available at files tab. We will continuous to update new plugin and animation character. Engine inside: - Speech Recognition - Text to Speech Requirements - Microphone
    Downloads: 1 This Week
    Last Update:
    See Project
  • Cybersecurity Search Engine | Criminal IP Icon
    Cybersecurity Search Engine | Criminal IP

    Criminal IP is a comprehensive threat intelligence search engine that detects vulnerabilities of personal and corporate cyber assets in real time

    Originated from the idea that individuals and corporations would be able to strengthen their cyber security by proactively acquiring information about IP addresses attempting to access your network, Criminal IP uses its big data of more than 4.2 billion IP addresses to provide threat-relevant information on malicious IPs and links, phishing sites, certificates, industrial control systems, IoTs, servers, security cameras, and so forth.
    Learn More
  • 15
    Anthromorphic Scribe

    Anthromorphic Scribe

    Provides speech to text gui to sphinx4

    It provides an interactive speech to text application that uses sphinx 4. With this you can use pre-recorded audio, record your own voice and convert incompatible audio/video to be compatible with sphinx 4. It currently supports U.S English by using hub4 acoustic and language model.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 16
    Conversations

    Conversations

    App in java for chatting to a generative A.I. (involving tts and stt)

    Java application for chatting to generative AI Llama3. * The user can speak into the microphone (speechToText), edit the recognized text and send it to the AI. * The AI ​​responds and the server returns that response in real time, and the sentences converted to audio (textToSpeech), and the application broadcasts them through the speaker. The application is prepared so that only one user occupies the server's resources, so if the server is busy, in theory it will not let you connect. There is a demo video that shows how it works: https://frojasg1.com:8443/resource_counter/resourceCounter?operation=countAndForward&url=https%3A%2F%2Ffrojasg1.com%2Fdemos%2Faplicaciones%2Fchat%2F20240815.Demo.Chat.mp4%3Forigin%3Dsourceforge&origin=web
    Downloads: 0 This Week
    Last Update:
    See Project
  • 17
    JKVSRG Eng Amharic Translator

    JKVSRG Eng Amharic Translator

    Translation between English and Amharic

    JKVSRG English and Amharic translator translates English to Amharic and Amharic to English. User can transliterate English to Amharic with the help of Amharic Letter System which is included in this software. In addition, User can also translate English text file to Amharic text file and vice-versa while comparing it with other Amharic Translators. This translator is an interactive and user friendly translator because of its special feature speech synthesize (text to speech). It runs on all windows operation system. It takes less memory and few seconds to install. When using it, users ensure internet connection and speaker in their system.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 18
    JKVSRG Tamil and Amharic Translator

    JKVSRG Tamil and Amharic Translator

    Attractive and user friendly Tamil and Amharic translator

    JKVSRG Tamil and Amharic translator translates between Tamil and Amharic. User can transliterate English to Tamil and English to Amharic with the help of Tamil and Amharic Letter System which is included in this software. In addition, User can also translate multilingual text file to Tamil and Amharic text file. It is an interactive and user friendly translator because of its special feature speech synthesize (text to speech). It runs on windows. It takes a smaller amount of memory and few seconds to install. Users make sure internet connection and speaker in their system while using it.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 19
    MDictate

    MDictate

    Speech to text using python, pocketsphinx, ready to deploy

    Automated speech recognition software is extremely cumbersome. This project's aim is to incrementally improve the quality of an open-source and ready to deploy speech to text recognition system. Runs on Windows using the mdictate.exe, but the core workings are found in the mdictate.py script which should work on Windows/Linux/OS X. In version 1.0, we use pocketsphinx' default setup with a basic graphic interface.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 20
    Sintetizador GMS

    Sintetizador GMS

    Herramienta para Sintetizar voces en la PC

    Herramienta para escuchar documentos, puede copiar y pegar, guardar documentos y generar audio hablado.
    Leader badge
    Downloads: 0 This Week
    Last Update:
    See Project
  • 21

    Sluh

    creating user interface for converting speech to text and voice contro

    Проект по изучению пригодности\простоты к использованию CMUSphinx, pocketsphinx в пользовательских целях. Попытка создания программного интерфейса по распознаванию русской речи (преобразованию в текст) на базе CMUSphinx, pocketsphinx для голосового управления ПО и прочее.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 22
    This software convert speech to text using Java and Android application. With this software you can also search for text in Google. You can use offline speech to text with this application if you don't have Internet, you can find the steps in guide file. How to use: ----------------- 1- Install a software to convert the PC as router (EX: My Wifi Router) then connect your mobile with PC via wifi. 2- Install Smart Text to Speech.apk file on your phone. 3- Open "Smart Speech to Text.jar" java application on PC. 4- Launch Smart Speech to Text on your phone. 5- Click on "Speak Now" button in java application. 6- After you speak click on red circle button on your phone to stop speaking and to convert it to text or you can wait few seconds notice: --------- Speech that will converted relied on the language that installed on your phone. © Copyright Khalid Kareem 2014, All Rights Reserved E-mail: khaled.atroshy@gmail.com Enjoy!
    Downloads: 0 This Week
    Last Update:
    See Project
  • 23
    Plugin for Windows XP that put an Icon in Windows Bar and allow read the copied text(By mean Ctrl+C) using Festival a Speech To Text Synthesis Software.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 24
    Speechalyzer

    Speechalyzer

    Process large speech data wrt transcription, labeling and annotation

    Speechalyzer: a tool for the daily work of a 'speech worker' It is optimized to process large speech data sets with respect to transcription, labeling and annotation. It is implemented as a client server based framework in Java and interfaces software for speech recognition, synthesis, speech classification and quality evaluation. The application is mainly the processing of training data for speech recognition and classification models and performing benchmarking tests on speech-to-text, text-to-speech and speech classification software systems.
    Downloads: 0 This Week
    Last Update:
    See Project
  • 25
    Voice Conference Manager uses VoiceXML and CCXML to control speech recognition, text to speech, and voice biometrics for a telephone conference service. Say the names or numbers of people and VCM places them into the call. Can be hosted on public servers
    Downloads: 0 This Week
    Last Update:
    See Project
  • Previous
  • You're on page 1
  • 2
  • Next
  NODES
admin 2
chat 14
Idea 1
idea 1
innovation 1
INTERN 4
Project 33
USERS 3