Thursday, June 20, 2024

ARBOREAL - AI contest process document

“IF AI is to create new music, THEN you have to train AI with new music, ELSE no new music”

ARBOREAL - a Pure Data generated, LLM-predicted MIDI score in the key of C
instrumental music, predicted by LLMs trained on synthetic datasets

Could AI compose its own music if given the resources to do so? Is it possible for artificial intelligence to create its own unique music? Could AI compose original music that humans will enjoy? If AI is to create its own music, it needs a large amount of intelligent, original, generated data for input training. This project idea initiated the construction of two musical algorithms in the Pure Data [pd] environment to produce MIDI "jam band" music: KicKRaTT, an algorithm generating drum patterns, and KaOzBrD, generating notes, chords, and harmonies. Together they performed as a four-piece band, producing enough MIDI data to fill a machine learning dataset. A lot of it! The result is a unique collection of MIDI input containing every conceivable note, chord, and pattern produced by the pair of algorithms. The two separately developed pd algorithms were brought together to perform under one clock. In sync, the two generated MIDI data driven by random calls, arithmetic expressions, and conditional responses, growing a TREE of data filled with music that was new, unique, and never before heard. Hanging from every branch were probability-determined patterns and intuitive mathematical melodies, and somewhere “in the tree” we found style!

The process began by developing algorithms in Pure Data [pd] to produce the musical performance of a trio band consisting of drums, bass, and keyboard. Over time, two algorithms evolved: KicKRaTT, which generates all the drums, and KaOzBrD, which generates notes and chords in harmony. Together or independently, the algorithms will generate their parts of this “trio band” MIDI performance indefinitely. They generated extensive hours of MIDI data, creating all conceivable note, chord, and theme combinations: instrument tracks, endless grooves, distinctive patterns, and jam sessions that can be perceived as complete songs. The algorithms generated a surplus of unique MIDI data to build a synthetic dataset for training LLMs (GPT and LSTM models), which rely on extensive input data for composition prediction.
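The pd patches themselves are graphical, but the "one clock" idea driving both algorithms can be sketched in Python. This is only an illustration of a shared tick source; the bpm and pulse-resolution values are hypothetical, not the project's actual settings:

```python
def clock_ticks(bpm, ppq, n):
    """Return the first n tick times (in milliseconds) of a shared clock
    running at `bpm` beats per minute with `ppq` pulses per quarter note.
    Both generators fire on the same ticks, keeping them in sync."""
    tick_ms = 60000.0 / (bpm * ppq)
    return [i * tick_ms for i in range(n)]
```

At 120 bpm with four pulses per quarter note, ticks land every 125 ms, and any algorithm scheduled from this list stays locked to every other one.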

The random generation of musical notes produces a highly percussive, bell-like arrangement. If the dataset were constructed solely from randomly generated music, the AI-predicted composition could be expected to sound the same. For the composition to have character, there had to be diversity within the dataset. We needed to develop a source of original input data that encompassed a wide range of variation while preserving the same pitch and key. We configured the KaOzBrD algorithm with two different procedures for choosing the next note in a sequence: random determination and arithmetic expression. To develop the expressions, we drew on the understanding that when playing an instrument you don’t choose notes randomly. Rather, in our moods (pitch), we gravitate (+) or regress (-) to the next note that best complements the current note in the developing composition. We took the algorithm's expressions one step further for some human-machine “co-creativity” integration: in predicted jam sessions with the PIanoAI model, we accumulated the improvisational musicality probabilities of our pianist and programmed those percentages back into the conditional statements of the KaOzBrD algorithm. Using random and arithmetic configurations to generate input data, with the option of integrating human musicality, resulted in various setups for producing diverse MIDI performances. Humanizing the MIDI input data increased diversity and themes for the evolving datasets.
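The two note-selection procedures can be sketched in Python. This is a minimal illustration, not the KaOzBrD patch itself; the note range and the step weights standing in for the pianist-derived percentages are invented for the example:

```python
import random

# C major pitch classes (MIDI note number mod 12): C D E F G A B
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]

def random_note(low=48, high=84):
    """Random determination: pick any in-key note in the range."""
    candidates = [n for n in range(low, high + 1) if n % 12 in C_MAJOR]
    return random.choice(candidates)

def stepwise_note(current, low=48, high=84):
    """Arithmetic-expression style: gravitate (+) or regress (-) by a
    small scale step from the current note, with near moves weighted
    heavier than leaps (illustrative weights, not the real percentages)."""
    candidates = sorted(n for n in range(low, high + 1) if n % 12 in C_MAJOR)
    i = candidates.index(current)
    step = random.choices([-2, -1, 1, 2], weights=[1, 3, 3, 1])[0]
    j = min(max(i + step, 0), len(candidates) - 1)  # clamp to the range
    return candidates[j]
```

The first procedure yields the percussive, bell-like scatter described above; the second tends to walk melodically from note to note.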

The YouTube "pd pure data BAND in C" video above offers a good comparison between the Pure Data [pd] MIDI performance and the AI-predicted score heard in the ARBOREAL KaOzBrD SoundCloud track or the KicKRaTT YouTube and Vimeo videos.

ARBOREAL is composed in the key of C. We generated using all of the notes and 20 predetermined chords (triads) in this scale across five octaves. Maintaining harmony throughout the process greatly simplified the task of dividing and merging all the tracks, done at the onset and conclusion of the procedure as well as during the transfer of tracks between AI models. For generating MIDI data in key, KaOzBrD is configured to generate notes in two separate modes: random determinations and arithmetic expressions. Random note configurations produce a more percussive performance, arithmetic expressions a more melodic one. These different configurations produced a dichotomy between the files, expanding the input data variation that can be had from a single key. We leave the results of the algorithms' differences to the listener. If ARBOREAL exhibits style in the performance, is that artificial style a reflection of the algorithm’s design, of the differences between generating through random means or expression, or of both? Having AI generate MIDI is one thing; enjoying AI-generated MIDI that performs with style is another. It will be a measure of ARBOREAL if the predicted score performs with style for the audience.
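Staying in one key makes chord construction mechanical. A minimal Python sketch of building diatonic triads from the C-major scale; the project's actual 20 predetermined chords are its own selection, and this only shows the in-scale stacking across octaves:

```python
# C major scale degrees as semitone offsets from C
SCALE = [0, 2, 4, 5, 7, 9, 11]

def diatonic_triad(degree, octave=0, base=60):
    """Stack root, third, and fifth within the scale, starting on a scale
    degree (0 = C). `base` 60 is middle C; `octave` shifts the chord up."""
    idx = [degree, degree + 2, degree + 4]
    return [base + 12 * (octave + i // 7) + SCALE[i % 7] for i in idx]
```

`diatonic_triad(0)` gives the C major triad [60, 64, 67]; iterating `degree` over 0-6 and `octave` over a five-octave span enumerates every in-key triad, all guaranteed to merge harmonically with any other track in C.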

The training and epoch processing times of the GPT and LSTM models were not problematic given the scale of our datasets and the capabilities of our learning workstation. Even when we connected our datasets into a tokenized string and concatenated them, neither the size nor the processing times were remarkable. The size and file contents of our generated datasets set them apart from the available online datasets. Commercial .mid files are 2-3 minute edits of structured songs and contain redundancy; our Pure Data .mid files are 5-10 minute edits from hour-long sessions, filled with constant variation, the notes, chords, and drum patterns always in a state of change, like a garage band that can’t stop improvising. We pondered how to get AI to predict changes in our song from these densely varying MIDI productions. We didn’t want to control the prediction through the model's configuration or by employing a set scale map, so we turned our attention to the datasets from which the models were predicting and developed these sets into themes. Themed datasets were created by evaluating MIDI tracks and organizing them to enhance the overall theme of the song. There is no right answer or set procedure to dataset theme development that we are following; we only found that doing so influenced the outcome. The decision to develop a themed dataset was made in favor of developing a song of a particular type, so you could still say we had a hand in controlling the prediction: retraining the model on a select portion of the composition's dataset and re-composing new MIDI scores from there. In the process, we also found the results more desirable when we developed a single theme and then reran the process for another theme, rather than colliding two themes. By organizing the MIDI datasets into categories (themes), you are able to generate predictions that enhance the melody and chords while maintaining the original concept.
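Feeding MIDI to text-style GPT/LSTM models means flattening note events into token strings before concatenating the themed sets. A hedged sketch of the idea; the token scheme and separator token here are illustrative inventions, not the project's actual encoding:

```python
def tokenize(events):
    """Turn (pitch, duration) note events into a whitespace-separated
    token string that a text model can train on."""
    return " ".join(f"n{p}_d{d}" for p, d in events)

def concat_themes(theme_strings, sep=" <trk> "):
    """Concatenate themed track strings into one training corpus, with a
    boundary token so the model can learn where one track ends."""
    return sep.join(theme_strings)
```

Because each theme is tokenized separately, swapping which themed strings go into `concat_themes` is exactly the dataset-level steering described above: the model's configuration never changes, only what it reads.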
The bulk of the song represents the predicted melody, drums, bass lines, and accompaniment, but it is at the conclusion of ARBOREAL that the AI process takes a bow: a complete, predicted thematic change in the music. This thematic change found at the conclusion was left undeveloped. What was important is that we achieved it in the prediction process, where the prediction broke from the original structure and headed off in a new direction.

To predict the drum track, we aligned the Pure Data instrument MIDI-out note numbers to the scale of C3. From generating to predicting, the template is pretty seamless, and the resulting prediction produces an alternative drum performance of the original. There was a timing issue between the original generated MIDI file and the predicted one; after loads of head banging, we found the fix was delaying the whole performance by 504.3 milliseconds. All the notes were slid / delayed by 504 ms in the predicted drum file. The final predicted drum track is a rather progressive performance for a pop song. Teaching AI that "less is more" will be another exercise someday.
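The 504 ms fix amounts to sliding every event in the predicted file later by a fixed offset. Sketched in Python over a simple (time_ms, message) event list; the actual files are standard MIDI, so this only illustrates the shift itself:

```python
def delay_events(events, offset_ms=504):
    """Shift every timestamped event later by a fixed offset in
    milliseconds, aligning the predicted drum file with the original."""
    return [(t + offset_ms, msg) for t, msg in events]
```

Applying the same offset to every event preserves all relative timing inside the track, which is why the groove survives the shift intact.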

The GPT and LSTM models worked in harmony; tracks moved between models, and predicted results were auditioned as audio to advance the cyclic songwriting process. The final predicted MIDI files were evaluated following conversions to ensure the absence of rogue notes, the preservation of the key, and the adherence of sequences to the intended direction of the song. Occasionally, the production led to unfinished melodies and sections of the composition that were incomplete, terminated suddenly, or lacked smooth transitions. The process also produced scores with omitted note durations. These issues exist. Some of the predictions were ready for immediate use, others were fixed, and the rest were discarded.
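The rogue-note and key-preservation checks lend themselves to automation. A minimal sketch, assuming predicted pitches are available as plain MIDI note numbers:

```python
# pitch classes belonging to C major (MIDI note number mod 12)
C_MAJOR_PCS = {0, 2, 4, 5, 7, 9, 11}

def rogue_notes(pitches):
    """Return every pitch whose pitch class falls outside C major."""
    return [p for p in pitches if p % 12 not in C_MAJOR_PCS]
```

An empty return means the prediction preserved the key; anything else lists the exact notes to fix or grounds for discarding the take.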

ARBOREAL was predicted as distinct instruments, as multiple tracks, and subsequently reassembled in the Reaper DAW for the performance. The procedure was generating unique MIDI data, making predictions based on that data, listening to the results, and adjusting our datasets, ensuring a consistent structure and that the end result was achieved through a systematic approach. The idea of computers composing their own unique music was the initial driver; the focus is delving into a process aimed at achieving this goal. The Pure Data algorithms conceived ARBOREAL in the initial generations; AI technologies developed the outcome from that conception. The AI-predicted MIDI score is a unique composition in that no part of it was derived from outside, commercial, or historical MIDI data. No audio samples were used or frequency-analyzed. We had no predetermined idea how the composition would perform or sound at the start; we simply ran the described process to the point where the final predicted instrument tracks defined a recognizable composition. We worked the process to define about five collective instrument parts, which created the changes throughout the composition. These five parts were finally assembled together in the Reaper DAW, and the Reaper MIDI composition was performed on commercial MIDI audio components. The ARBOREAL.mp3 submitted to the International AI Song Contest of 2024 is a live studio recording of the predicted instrument MIDI score, performed in May 2024.


The KicKRaTT ARBOREAL MIDI score can be purchased here!

* KicKRaTT; MUSIC, ALGORITHMS, DOCUMENTS, GRAPHICS & LOGOS: ISNI# 0000000514284662 *
