ASRT

A Deep-Learning-Based Chinese Speech Recognition System

What is ASRT?

ASRT is a deep-learning-based Chinese speech recognition system. Its speech model is built with TensorFlow.Keras on a deep convolutional neural network with CTC, and its language model is a maximum entropy hidden Markov model. We also provide HTTP-based server software that makes it easy to set up an API server for client applications to send API requests to.

Developed for research

This is a research project on speech recognition. We hope it can become a high-accuracy ASR system.

Time saver

An ASR framework saves you the time of building your own ASR system from scratch. You can also save time by downloading the released software directly and running an ASR server for your applications.

Cross-platform

You can run this ASR server software on Windows, Linux, or macOS as long as Python 3.6 or later is installed on your machine. Your client applications can run on any platform and reach the server over the Internet or your local network.

Deep-Learning-Based

This project uses a deep learning model with CTC and achieves good accuracy.
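
To make the role of CTC more concrete, here is a minimal sketch of greedy CTC decoding: take the best label per frame, merge consecutive repeats, then drop the blank label. It is a generic illustration and not necessarily the decoder ASRT uses; the function name `ctc_greedy_decode`, the toy scores, and the choice of index 2 as the blank label are all assumptions made for this example.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_decode(frame_scores: Sequence[Sequence[float]], blank: int) -> List[int]:
    """Collapse per-frame predictions into a label sequence (greedy CTC decoding)."""
    # 1. Pick the highest-scoring label index for every frame.
    best_path = [max(range(len(frame)), key=frame.__getitem__) for frame in frame_scores]
    # 2. Merge consecutive repeats, then 3. drop the blank label.
    return [label for label, _ in groupby(best_path) if label != blank]


# Toy input: 6 frames, 3 labels (index 2 is the assumed blank label).
scores = [
    [0.1, 0.2, 0.7],  # blank
    [0.8, 0.1, 0.1],  # label 0
    [0.7, 0.2, 0.1],  # label 0 again (repeat, merged with the previous frame)
    [0.1, 0.1, 0.8],  # blank
    [0.6, 0.3, 0.1],  # label 0 (a new symbol, because a blank separates it)
    [0.2, 0.7, 0.1],  # label 1
]
decoded = ctc_greedy_decode(scores, blank=2)
print(decoded)  # [0, 0, 1]
```

The blank label is what lets the model emit the same symbol twice in a row: repeats separated by a blank survive the merge step, while repeats on adjacent frames are treated as one symbol.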

Easy to use

This project is highly encapsulated and all of its parts are modular. You can build an ASR system like assembling building blocks.

High performance

It is a lightweight ASR system that runs fast. You can run prediction with or without a GPU, because there is no obvious difference in many indicators.

Features

Get Started

Clone or Download

If you want to build an ASR system server, please click here to download the latest version.

If you want to train your own models, or modify the models before training, follow these steps.

  • Environment: Python3, Git
  • Packages: TensorFlow, wave, scipy, matplotlib, requests
  • Languages: Python3
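
Before cloning, it can help to confirm that the environment meets these requirements. The following is a small sketch, not part of ASRT itself: it checks for Python 3.6 or later and reports which of the listed packages cannot be found. The helper name `missing_modules` is made up for this example, and `tensorflow` is assumed to be the import name of the TensorFlow package.

```python
import sys
from importlib.util import find_spec

# Import names for the packages listed above
# ("tensorflow" is assumed to be the import name of TensorFlow).
REQUIRED_MODULES = ["tensorflow", "wave", "scipy", "matplotlib", "requests"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


python_ok = sys.version_info >= (3, 6)
missing = missing_modules(REQUIRED_MODULES)

print("Python 3.6 or later:", "yes" if python_ok else "no")
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Note that `wave` ships with the Python standard library, so it should never appear as missing; the other packages need to be installed separately.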

    $ git clone https://github.com/nl8590687/ASRT_SpeechRecognition.git
                     

Download Speech Data Set

This project can use speech datasets such as THCHS30, ST-CMDS, Primewords, AISHELL-1, aiDataTang, MagicData, and so on. You can find them on this project's wiki documentation page.

After cloning the repository with git and downloading the datasets, copy and unzip all the dataset files into the directory `/data/speech_data` or another directory of your choice.

Then download the data lists that you need.
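
The unpacking step can also be scripted. The sketch below is an illustration rather than a tool shipped with ASRT: the helper `unpack_datasets` and the `downloads` folder are hypothetical, and the target path is written relative to the current directory as an assumption, so adjust both to your own layout.

```python
import shutil
from pathlib import Path


def unpack_datasets(download_dir, target_dir):
    """Unpack every archive found in download_dir into target_dir."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    # Collect every file suffix that shutil.unpack_archive can handle here.
    known_suffixes = tuple(
        ext for _, extensions, _ in shutil.get_unpack_formats() for ext in extensions
    )
    unpacked = []
    for archive in sorted(Path(download_dir).iterdir()):
        if archive.is_file() and archive.name.endswith(known_suffixes):
            shutil.unpack_archive(str(archive), str(target))
            unpacked.append(archive.name)
    return unpacked


if __name__ == "__main__":
    # Hypothetical locations: adjust both paths to your own setup.
    downloads = Path("downloads")
    if downloads.is_dir():
        print(unpack_datasets(downloads, Path("data") / "speech_data"))
```

Files that are not recognised archives are skipped, so the download folder can safely contain other files.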

    
    $ python download_default_datalist.py
    

Run

To start training, execute:


    $ python3 train_speech_model.py
                    

To start the ASRT API server, execute:


    $ python3 asrserver.py
                    

Full Documentation

For more information, click the following button to see the full documentation.

More on Wiki Document

License

This project was started by GitHub user nl8590687 in 2016 and is 100% free under the GNU General Public License v3.0 (GPL v3.0).

If you are feeling generous and want to show your support for the ASRT project, you can buy him a beer or coffee via the AliPay and WeChat donation QR codes below. :)

AliPay
WeChat