publications
publications in reverse chronological order
2024
- GrooveTransformer: A Generative Drum Sequencer Eurorack Module. Nicholas Evans, Behzad Haki, and Sergi Jorda. Sep 2024
This paper presents the GrooveTransformer, a Eurorack module designed for generative drum sequencing. Central to its design is a Variational Auto-Encoder (VAE), around which we have designed a deployment context enabling performance through accompaniment and/or user interaction. This module allows the user to use the system as an accompaniment generator while interacting with the generative processes in real time. In this paper, we review the design principles and technical architecture of the module, while also discussing the potential and shortcomings of our work.
2023
- TapTamDrum: A Dataset for Dualized Drum Patterns. Behzad Haki, Błażej Kotowski, Cheuk Lee, and 1 more author. Nov 2023
Drummers spend extensive time practicing rudiments to develop technique, speed, coordination, and phrasing. These rudiments are often practiced on "silent" practice pads using only the hands. Additionally, many percussive instruments across cultures are played exclusively with the hands. Building on these concepts and inspired by Einstein’s probably apocryphal quote, "Make everything as simple as possible, but not simpler," we hypothesize that a dual-voice reduction could serve as a natural and meaningful compressed representation of multi-voiced drum patterns. This representation would retain more information than its corresponding monotonic representation while maintaining relative simplicity for tasks such as rhythm analysis and generation. To validate this potential representation, we investigate whether experienced drummers can consistently represent and reproduce the rhythmic essence of a given drum pattern using only their two hands. We present TapTamDrum: a novel dataset of repeated dualizations from four experienced drummers, along with preliminary analysis and tools for further exploration of the data.
- NeuralMidiFx: A Wrapper Template for Deploying Neural Networks as VST3 Plugins. Behzad Haki, Julian Lenz, and Sergi Jorda. Sep 2023
Proper research, development and evaluation of AI-based generative systems of music that focus on performance or composition require active user-system interactions. To include a diverse group of users that can properly engage with a given system, researchers should provide easy access to their developed systems. Given that many users (i.e. musicians) are unfamiliar with the field of AI and the development frameworks involved, researchers should aim to make their systems accessible within the environments commonly used in production/composition workflows (e.g. in the form of plugins hosted in digital audio workstations). Unfortunately, deploying generative systems in this manner is highly expensive. As such, researchers with limited resources are often unable to provide easy access to their work and, subsequently, are not able to properly evaluate and encourage active engagement with their systems. Facing these limitations, we have been working on a solution that allows for easy, effective and accessible deployment of generative systems. To this end, we propose a wrapper/template called NeuralMidiFx, which streamlines the deployment of neural-network-based symbolic music generation systems as VST3 plugins. The proposed wrapper is intended to allow researchers to develop plugins with ease while requiring minimal familiarity with plugin development.
- Completing Audio Drum Loops with Symbolic Drum Suggestions. Behzad Haki, Teresa Pelinski, Marina Nieto, and 1 more author. Apr 2023
Sampled drums can be used as an affordable way of creating human-like drum tracks, or perhaps more interestingly, can be used as a means of experimentation with rhythm and groove. Similarly, AI-based drum generation tools can focus on creating human-like drum patterns, or alternatively, focus on providing producers/musicians with means of experimentation with rhythm. In this work, we aimed to explore the latter approach. To this end, we present a suite of Transformer-based models aimed at completing audio drum loops with stylistically consistent symbolic drum events. Our proposed models rely on a reduced spectral representation of the drum loop, striking a balance between a raw audio recording and an exact symbolic transcription. Using a number of objective evaluations, we explore the validity of our approach and identify several challenges that need to be further studied in future iterations of this work. Lastly, we provide a real-time VST plugin that allows musicians/producers to utilize the models in real-time production settings.
2022
- Real-Time Drum Accompaniment Using Transformer Architecture. Behzad Haki, Marina Nieto, Teresa Pelinski, and 1 more author. Sep 2022
This paper presents a real-time drum generation system capable of accompanying a human instrumentalist. The drum generation model is a transformer encoder trained to predict a short drum pattern given a reduced rhythmic representation. We demonstrate that with certain design considerations, the short drum pattern generator can be used as a real-time accompaniment in musical sessions lasting much longer than the duration of the training samples. A discussion of the potential, limitations and possible future continuations of this work is provided.
2021
- Transformer Neural Networks for Automated Rhythm Generation. Thomas Nuttall, Behzad Haki, and Sergi Jorda. Jun 2021
Recent applications of Transformer neural networks in the field of music have demonstrated their ability to effectively capture and emulate long-term dependencies characteristic of human notions of musicality and creative merit. We propose a novel approach to automated symbolic rhythm generation, where a Transformer-XL model trained on the Magenta Groove MIDI Dataset is used for the tasks of sequence generation and continuation. Hundreds of generations are evaluated using blind-listening tests to determine the extent to which the aspects of rhythm we understand to be valuable are learnt and reproduced. Our model is able to achieve a standard of rhythmic production comparable to human playing across arbitrarily long time periods and multiple playing styles.
2019
- A Bassline Generation System Based on Sequence-to-Sequence Learning. Behzad Haki and Sergi Jorda. Jun 2019
This paper presents a detailed explanation of a system generating basslines that are stylistically and rhythmically interlocked with a provided audio drum loop. The proposed system is based on a natural language processing technique: word-based sequence-to-sequence learning using LSTM units. The novelty of the proposed method lies in the fact that the system is not reliant on a voice-by-voice transcription of drums; instead, in this method, a drum representation is used as an input sequence from which a translated bassline is obtained at the output. The drum representation consists of fixed-size sequences of onsets detected from a 2-bar audio drum loop in eight different frequency bands. The basslines generated by this method consist of pitched notes with different durations. The proposed system was trained on two distinct datasets compiled for this project by the authors. Each dataset contains a variety of 2-bar drum loops with annotated basslines from two different styles of dance music: House and Soca. A listening experiment designed based on the system revealed that the proposed system is capable of generating basslines that are interesting and rhythmically well interlocked with the drum loops from which they were generated.