License:
CC-BY-4.0
Steward:
MDC Curators
Task: ASR
Release Date: 4/15/2026
Format: WEBM, TSV
Size: 23.76 MB
This dataset consists of 1,000 UK bank sort codes read aloud by a single male speaker of UK English from the Midlands. Sort codes identify bank branches for money transfers in the United Kingdom. They consist of six digits, written as three groups of two, for example `40-30-24`.
Licensing
Creative Commons Attribution 4.0 International (CC-BY-4.0)
https://spdx.org/licenses/CC-BY-4.0.html
Restrictions/Special Constraints
n/a
Forbidden Usage
Any attempt to clone the voice or train models that imitate the speakers in this dataset is forbidden.
Intended Use
Testing number recognition systems.
All samples were recorded on a Xiaomi mobile device in different acoustic environments:
Indoors (different room sizes) and outdoors
Background noise (road noise, background music, children's noise)
The numbers are read out naturally, so the same digits are spoken in more than one way: 0 is read as both "zero" and "oh", and 11 as both "one one" and "eleven". For example:
40-30-24: Forty thirty twenty four
09-00-44: Oh nine oh oh forty four
01-02-14: Zero one zero two one four
09-00-96: Zero nine zero zero ninety six
60-06-39: Sixty zero six three nine
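Because the same sort code can be spoken in several ways, scoring a recogniser against this dataset usually means mapping its text output back to the `NN-NN-NN` form. The following is a minimal sketch of one way to do that, not part of the dataset itself. The function name `spoken_to_sort_code` and the word tables are illustrative assumptions, and the greedy rule (a tens word absorbs a following non-zero digit word) is a heuristic that can misread genuinely ambiguous utterances.

```python
import re

# Word tables (assumed vocabulary; extend as needed for a given recogniser).
DIGITS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3",
    "four": "4", "five": "5", "six": "6", "seven": "7", "eight": "8",
    "nine": "9",
}
TEENS = {
    "ten": "10", "eleven": "11", "twelve": "12", "thirteen": "13",
    "fourteen": "14", "fifteen": "15", "sixteen": "16",
    "seventeen": "17", "eighteen": "18", "nineteen": "19",
}
TENS = {
    "twenty": "2", "thirty": "3", "forty": "4", "fifty": "5",
    "sixty": "6", "seventy": "7", "eighty": "8", "ninety": "9",
}


def spoken_to_sort_code(text):
    """Convert a spoken reading such as 'oh nine oh oh forty four'
    into the written form '09-00-44'. Raises ValueError on unknown
    words or if the result is not exactly six digits."""
    tokens = re.findall(r"[a-z]+", text.lower())
    digits = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in TENS:
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            # Greedy heuristic: 'twenty four' becomes 24, but
            # 'sixty zero' becomes 60 followed by a separate 0.
            if nxt in DIGITS and DIGITS[nxt] != "0":
                digits.append(TENS[tok] + DIGITS[nxt])
                i += 2
            else:
                digits.append(TENS[tok] + "0")
                i += 1
        elif tok in TEENS:
            digits.append(TEENS[tok])
            i += 1
        elif tok in DIGITS:
            digits.append(DIGITS[tok])
            i += 1
        else:
            raise ValueError(f"unrecognised word: {tok!r}")
    joined = "".join(digits)
    if len(joined) != 6:
        raise ValueError(f"expected 6 digits, got {len(joined)}: {joined!r}")
    return f"{joined[0:2]}-{joined[2:4]}-{joined[4:6]}"
```

Applied to the five readings listed above, this sketch returns the written sort code shown next to each one. Readings such as "forty six" remain ambiguous in principle (46, or 40 followed by 6); the six-digit length check catches some of those cases but not all.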
The sort codes were selected randomly from the sort codes of the top five UK banks: HSBC, Lloyds, Barclays, National Westminster (NatWest) and Santander. The number of codes per registered bank name is:
332 HSBC BANK PLC
284 NATIONAL WESTMINSTER BANK PLC
231 LLOYDS BANK PLC
104 BARCLAYS BANK PLC
32 SANTANDER
5 NATWEST OFFSHORE LTD
3 LLOYDS BANK (JERSEY) LTD
3 BARCLAYS PRIVATE CLIENTS INTERNATIONAL
2 LLOYDS OFFSHORE
2 C&G (LLOYDS BANK)
1 LLOYDS PRIVATE BANKING LTD
1 HSBC PRIVATE BANK (UK) LTD
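As a quick arithmetic check, the counts above can be summed and rolled up by banking group. This is a small sketch only: the counts are copied from the list above, while the `GROUP_KEYWORDS` table and the assignment of subsidiaries to a parent group by name matching are assumptions made for illustration.

```python
# Counts copied from the list above.
COUNTS = {
    "HSBC BANK PLC": 332,
    "NATIONAL WESTMINSTER BANK PLC": 284,
    "LLOYDS BANK PLC": 231,
    "BARCLAYS BANK PLC": 104,
    "SANTANDER": 32,
    "NATWEST OFFSHORE LTD": 5,
    "LLOYDS BANK (JERSEY) LTD": 3,
    "BARCLAYS PRIVATE CLIENTS INTERNATIONAL": 3,
    "LLOYDS OFFSHORE": 2,
    "C&G (LLOYDS BANK)": 2,
    "LLOYDS PRIVATE BANKING LTD": 1,
    "HSBC PRIVATE BANK (UK) LTD": 1,
}

# Assumed grouping: a name belongs to a group if it contains a keyword.
GROUP_KEYWORDS = {
    "HSBC": ("HSBC",),
    "NatWest": ("NATIONAL WESTMINSTER", "NATWEST"),
    "Lloyds": ("LLOYDS",),
    "Barclays": ("BARCLAYS",),
    "Santander": ("SANTANDER",),
}


def group_of(bank_name):
    """Return the assumed parent group for a registered bank name."""
    for group, keywords in GROUP_KEYWORDS.items():
        if any(keyword in bank_name for keyword in keywords):
            return group
    raise ValueError(f"no group for {bank_name!r}")


total = sum(COUNTS.values())

by_group = {}
for name, count in COUNTS.items():
    group = group_of(name)
    by_group[group] = by_group.get(group, 0) + count

print(total)     # 1000
print(by_group)  # {'HSBC': 333, 'NatWest': 289, 'Lloyds': 239, 'Barclays': 107, 'Santander': 32}
```

The total of 1,000 matches the stated size of the dataset.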
The dataset contains one directory, audio/, holding the audio files in WebM format, and two TSV files:
mapping.tsv: Maps each audio recording to the sort code it contains.
sortcodes.tsv: Maps each sort code to its bank name.
mapping.tsv
audio_filename key sentence
f702763094e195baffbe3388f880ffbf.webm f702763094e195baffbe3388f880ffbf 77-16-07
cbf80ac96db87e6d0452c852c2912ffb.webm cbf80ac96db87e6d0452c852c2912ffb 30-11-12
1ee0a79a7449da9ce4675ba278f22904.webm 1ee0a79a7449da9ce4675ba278f22904 60-40-09
sortcodes.tsv
id sort_code bank_name
1 77-16-07 LLOYDS BANK PLC
2 30-11-12 LLOYDS BANK PLC
3 60-40-09 NATIONAL WESTMINSTER BANK PLC
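The two files can be joined on the sort code to attach a bank name to each recording. The sketch below assumes that both files are tab-separated with the header rows shown above, and that the `sentence` column of mapping.tsv holds the same sort code string as the `sort_code` column of sortcodes.tsv. The helper names `read_tsv` and `join_audio_to_bank` are hypothetical, and the sample rows are the three rows shown above, inlined so the example runs without the dataset.

```python
import csv
import io
from collections import Counter


def read_tsv(stream):
    """Read a tab-separated text stream with a header row into dicts."""
    return list(csv.DictReader(stream, delimiter="\t"))


def join_audio_to_bank(mapping_rows, sortcode_rows):
    """Attach a bank name to each recording by matching on sort code."""
    bank_by_code = {row["sort_code"]: row["bank_name"] for row in sortcode_rows}
    return [
        {
            "audio_filename": row["audio_filename"],
            "sort_code": row["sentence"],
            # None if a sort code is missing from sortcodes.tsv.
            "bank_name": bank_by_code.get(row["sentence"]),
        }
        for row in mapping_rows
    ]


# Sample rows copied from the excerpts above (tabs assumed as separators).
MAPPING_SAMPLE = (
    "audio_filename\tkey\tsentence\n"
    "f702763094e195baffbe3388f880ffbf.webm\tf702763094e195baffbe3388f880ffbf\t77-16-07\n"
    "cbf80ac96db87e6d0452c852c2912ffb.webm\tcbf80ac96db87e6d0452c852c2912ffb\t30-11-12\n"
    "1ee0a79a7449da9ce4675ba278f22904.webm\t1ee0a79a7449da9ce4675ba278f22904\t60-40-09\n"
)
SORTCODES_SAMPLE = (
    "id\tsort_code\tbank_name\n"
    "1\t77-16-07\tLLOYDS BANK PLC\n"
    "2\t30-11-12\tLLOYDS BANK PLC\n"
    "3\t60-40-09\tNATIONAL WESTMINSTER BANK PLC\n"
)

joined = join_audio_to_bank(
    read_tsv(io.StringIO(MAPPING_SAMPLE)),
    read_tsv(io.StringIO(SORTCODES_SAMPLE)),
)
counts = Counter(row["bank_name"] for row in joined)

for row in joined:
    print(row["audio_filename"], row["sort_code"], row["bank_name"])
```

With the real files, the same helpers can be given file objects instead, for example `open("mapping.tsv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption.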