# Transcription DevTools

Includes:

* Node.js wrapper for the __JiWER__ CLI
* Benchmark tool to test multiple transcription engines
* TypeScript classes to evaluate the word error rate of files generated by transcription

## Build

```sh
npm run build
```

## Benchmark

A benchmark of the available __transcribers__ can be run with:

```sh
npm run benchmark
```

```
┌────────────────────────┬───────────────────────┬───────────────────────┬──────────┬────────┬───────────────────────┐
│ (index) │ WER │ CER │ duration │ model │ engine │
├────────────────────────┼───────────────────────┼───────────────────────┼──────────┼────────┼───────────────────────┤
│ 5yZGBYqojXe7nuhq1TuHvz │ '28.39506172839506%' │ '9.62457337883959%' │ '41s' │ 'tiny' │ 'openai-whisper' │
│ x6qREJ2AkTU4e5YmvfivQN │ '29.75206611570248%' │ '10.46195652173913%' │ '15s' │ 'tiny' │ 'whisper-ctranslate2' │
└────────────────────────┴───────────────────────┴───────────────────────┴──────────┴────────┴───────────────────────┘
```

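The layout of the table above (an `(index)` column and quoted string cells) is what Node.js prints with `console.table` for an object keyed by an identifier. The following is only a sketch of how such a report could be assembled. The `BenchmarkResult` interface, the `toPercent` helper and the sample values are hypothetical and are not taken from the benchmark's source.

```typescript
// Hypothetical result shape: field names mirror the table columns.
interface BenchmarkResult {
  WER: string
  CER: string
  duration: string
  model: string
  engine: string
}

// Hypothetical helper: format a ratio such as 0.5 as the string '50%'.
function toPercent (ratio: number): string {
  return `${ratio * 100}%`
}

// Results keyed by an identifier; the key ends up in the (index) column.
const results: Record<string, BenchmarkResult> = {
  'sample-run-id': {
    WER: toPercent(0.25),
    CER: toPercent(0.125),
    duration: '10s',
    model: 'tiny',
    engine: 'openai-whisper'
  }
}

// console.table prints one row per key and one column per property.
console.table(results)
```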

The benchmark can also be run against several built-in model sizes:

```sh
MODELS=tiny,small,large npm run benchmark
```

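As an illustration of how a comma-separated `MODELS` value could be turned into a list of model names, here is a minimal sketch. The `parseModels` helper and its `tiny` fallback are assumptions made for this example; the benchmark script may read the variable differently.

```typescript
// Hypothetical helper: split a comma-separated value (for example the content
// of process.env.MODELS) into model names, ignoring blanks and empty entries.
function parseModels (value: string | undefined, fallback: string[] = [ 'tiny' ]): string[] {
  if (!value) return fallback

  const models = value
    .split(',')
    .map(model => model.trim())
    .filter(model => model.length > 0)

  // Fall back to the default list when nothing usable was provided.
  return models.length > 0 ? models : fallback
}

console.log(parseModels('tiny,small,large'))
console.log(parseModels(undefined))
```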
## JiWER

> *JiWER is a python tool for computing the word-error-rate of ASR systems.*
> https://jitsi.github.io/jiwer/cli/

__JiWER__ serves as a reference implementation for calculating error rates between two text files:
- WER (Word Error Rate)
- CER (Character Error Rate)

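To make the two metrics concrete: both are commonly defined as the number of substitutions, deletions and insertions needed to turn the reference into the hypothesis, divided by the length of the reference (counted in words for WER, in characters for CER). The sketch below computes them with a plain Levenshtein distance. It is not JiWER's implementation, applies no text normalization, and assumes a non-empty reference. The function names are illustrative only.

```typescript
// Levenshtein distance between two token sequences: the minimum number of
// substitutions, deletions and insertions turning `reference` into `hypothesis`.
function editDistance<T> (reference: T[], hypothesis: T[]): number {
  let previous = Array.from({ length: hypothesis.length + 1 }, (_, j) => j)

  for (let i = 1; i <= reference.length; i++) {
    const current = [ i ]

    for (let j = 1; j <= hypothesis.length; j++) {
      const cost = reference[i - 1] === hypothesis[j - 1] ? 0 : 1

      current[j] = Math.min(
        previous[j] + 1, // deletion
        current[j - 1] + 1, // insertion
        previous[j - 1] + cost // substitution or match
      )
    }

    previous = current
  }

  return previous[hypothesis.length]
}

// Word error rate: edits between word sequences over the reference word count.
function wordErrorRate (reference: string, hypothesis: string): number {
  const referenceWords = reference.trim().split(/\s+/)
  const hypothesisWords = hypothesis.trim().split(/\s+/)

  return editDistance(referenceWords, hypothesisWords) / referenceWords.length
}

// Character error rate: same ratio, computed over characters.
function characterErrorRate (reference: string, hypothesis: string): number {
  const referenceChars = Array.from(reference)

  return editDistance(referenceChars, Array.from(hypothesis)) / referenceChars.length
}

console.log(wordErrorRate('the cat sat on the mat', 'the cat sat on mat'))
console.log(characterErrorRate('abcd', 'abed'))
```

A ratio can exceed 1 when the hypothesis contains many insertions, since the denominator only counts reference tokens.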
### Usage

```typescript
const jiwerCLI = new JiwerClI('./reference.txt', './hypothesis.txt')

// WER as a percentage, e.g. 0.03 -> 3%
console.log(await jiwerCLI.wer())

// CER as a percentage, e.g. 0.01 -> 1%
console.log(await jiwerCLI.cer())

// Detailed comparison report
console.log(await jiwerCLI.alignment())
```

## Resources

- https://jitsi.github.io/jiwer/
- https://github.com/rapidfuzz/RapidFuzz