RMIS Benchmark | Download

Download Code

First clone the repository and install necessary dependencies.

git clone https://github.com/jianganbai/RMIS.git
cd RMIS/
pip install -r requirements.txt

Download Data

The RMIS benchmark is sourced from 13 public datasets: six DCASE datasets (20, 21, 22, 23, 24, 25), IICA, IIEE, WTPG, MaFaulDa, SDUST, UMGED and PU. For convenience, we provide two ready-to-use ways to access the preprocessed benchmark data:

Tsinghua Cloud: download all packaged benchmark files from Tsinghua Cloud (password: RMIS_dataset). This is recommended if you want to mirror the full benchmark data locally for the RMIS codebase.
Hugging Face Datasets: download the benchmark data from jiangab/RMIS through the provided extraction script. Here Hugging Face mainly serves as a cloud storage backend: the script materializes the data into the same local wav-style directory layout used by the other download paths.

If you believe this repository or the hosted benchmark data infringes upon your rights, please contact us for prompt removal or correction.

To download data from Tsinghua Cloud, first download all zip files and organize them in the following structure:

top_dir/
├── check_sums.md5
├── iiee/
│   └── iiee.zip
├── mafaulda_sound/
│   └── mafaulda_sound.zip
└── dcase24/
    └── dev_data/
        └── dev_ToyCar.zip

Then run the following script to verify the MD5 checksums and unzip the files.

bash utils/scripts/extract_tsinghua_cloud_data.sh path_to_top_dir
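
If you want to inspect the checksum results yourself before extracting, the verification step can be sketched in Python. This is only an illustrative sketch, not the code of extract_tsinghua_cloud_data.sh: it assumes that check_sums.md5 uses the common md5sum layout (one "<hex digest>  <relative path>" entry per line, with paths relative to top_dir), and the function names md5_of_file and verify_checksums are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(top_dir, manifest_name="check_sums.md5"):
    """Compare every manifest entry against the file on disk.

    Assumption: md5sum-style lines, "<hex digest>  <relative path>".
    Returns a dict mapping relative path -> "ok", "mismatch" or "missing".
    """
    top_dir = Path(top_dir)
    results = {}
    for line in (top_dir / manifest_name).read_text().splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, rel_path = line.split(maxsplit=1)
        rel_path = rel_path.strip().lstrip("*")  # md5sum marks binary mode with '*'
        target = top_dir / rel_path
        if not target.is_file():
            results[rel_path] = "missing"
        elif md5_of_file(target) == expected.lower():
            results[rel_path] = "ok"
        else:
            results[rel_path] = "mismatch"
    return results
```

Calling verify_checksums("path_to_top_dir") and looking for any value other than "ok" shows which archives need to be downloaded again.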

If you prefer Hugging Face, use the following script to download and extract the benchmark data into RMIS-compatible local wav folders:

[HF_ENDPOINT=https://hf-mirror.com] python -m utils.scripts.download_and_extract_hf_data \
    --output_dir OUTPUT_DIR \
    [--subset SUBSET [SUBSET ...]] \
    [--remove_parquet_after_extract] \
    [--force_reextract]

For example:

HF_ENDPOINT=https://hf-mirror.com python -m utils.scripts.download_and_extract_hf_data \
    --output_dir datasets_hf \
    --subset iiee mafaulda_sound \
    --remove_parquet_after_extract

Here --output_dir is required, while HF_ENDPOINT, --subset, --remove_parquet_after_extract, and --force_reextract are optional. If --subset is omitted, the script processes all RMIS subsets.
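
When the download is driven from another Python program, the same invocation can be assembled programmatically. The sketch below only rearranges the flags documented above. The helper name build_hf_download_command and its parameter names are hypothetical and not part of the RMIS codebase, and it assumes the command is launched from the repository root so that the utils.scripts module can be found.

```python
import os
import sys


def build_hf_download_command(output_dir, subsets=None, remove_parquet=False,
                              force_reextract=False, hf_endpoint=None):
    """Return (argv, env) for the documented download_and_extract_hf_data call."""
    argv = [sys.executable, "-m", "utils.scripts.download_and_extract_hf_data",
            "--output_dir", str(output_dir)]  # the only required argument
    if subsets:
        argv += ["--subset", *subsets]  # omitted means all RMIS subsets
    if remove_parquet:
        argv.append("--remove_parquet_after_extract")
    if force_reextract:
        argv.append("--force_reextract")
    env = dict(os.environ)  # copy, so the caller's environment is not modified
    if hf_endpoint:
        env["HF_ENDPOINT"] = hf_endpoint  # optional mirror, e.g. https://hf-mirror.com
    return argv, env
```

The returned pair can be passed to subprocess.run(argv, env=env, check=True); with output_dir="datasets_hf", subsets=["iiee", "mafaulda_sound"], remove_parquet=True and hf_endpoint="https://hf-mirror.com" it reproduces the example above.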

In this repository, the Hugging Face route is provided primarily as a convenient file hosting and download channel. For more customized workflows, you can also use the metadata and parquet assets attached to the Hugging Face repository to build your own data loading or preprocessing utilities.

We also provide setup guidelines for downloading from the original public sources of each dataset. Please first follow the instructions in utils/download/README.md for downloading the raw data from the official dataset websites, and then refer to utils/preprocess/README.md for preprocessing instructions.

Download Model

The RMIS benchmark currently integrates 6 models for evaluation: AudioMAE, BEATs, EAT, CED, DaSheng and FISHER. More models will be added soon. All models can be run independently. To set up the evaluation config for a model, please refer to the README file in the corresponding model config folder rmis/model_conf/your-interested-model.