Friday, October 30, 2020

How to create minimal music with code in any programming language

Étude in C minor

Let me start with a picture.

        int
       main()
      {float f
    ;char c;int
  d,o;while(scanf(
 "%d%c%d ",&d,&c,&o)
>0){c&=31;for(f=!(c>>4)*55,c=(c*8/5+8)%12+o*
                     12-24;c--;f*=1.0595);
                        for(d=16e3/d;d--;
                           putchar(d*f
                            *.032))
                              ;}}

I don’t really draw well, so I used formatted C code instead. That was supposed to be a triangular sound wave written in C. In fact, this very C code plays two-octave melodies written in text notation from stdin.

It’s only 160 bytes, fits into a modern-day tweet, and the reason it’s there is to show how simple it is to create minimal music with code in any programming language, not just the special languages like CSound, ChucK, or SonicPi.

Playing digital sound

Now let me actually draw something:

[Figure: digital sound]

This is sound - something that oscillates and moves the air in time, and the air reaches your ears and you hear it. The wave that goes up and down in the picture illustrates how the air vibrates. To describe this wave in digital terms people came up with the idea of measuring the amplitude of the wave at fixed time intervals and using these sampled data points as “digital sound”.

Two questions arise - how often to sample the sound wave, and how to digitally represent the units of the amplitude range? To answer the first question we should recall that the human ear cannot hear anything higher than roughly 20000 Hz, and that a sampled signal can only represent frequencies up to half of its sample rate. That’s why CD audio uses a sample rate of 44100 Hz, a little more than twice the hearing limit. Modern sound cards tend to use 44100 Hz or 48000 Hz or even 96000 Hz. Lo-fi audio devices like Arduino or NES simply could not produce sound at such a high speed, so they used a reasonably low sampling rate, like 8000 Hz, and this is what we will be using in this article. It limits us to frequencies below 4000 Hz, which is still enough for simple melodies.

The amplitude quantization is also a matter of compromise. In theory, people can recognize a very subtle change of the amplitude, but computers can’t use infinitely precise numbers for each sample. Instead, they perform quantization - they map the amplitude value into a fixed-width number, like a float or an integer. In the picture above I’ve used -1 and +1 as the minimum and maximum value of the amplitude, assuming the float data type, but other popular formats are signed int16, where the amplitude ranges from -32768 to +32767, or uint8, where the amplitude ranges from 0 to 255. The last one is what we will be using in this article because it’s very easy to understand and brings a few nice tricks.
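
To make the three sample formats a bit more concrete, here is a minimal sketch of converting float and int16 samples into uint8. The helper names `float_to_u8` and `s16_to_u8` are made up for this illustration, and rounding to the nearest step is only one reasonable choice.

```c
#include <assert.h>
#include <stdint.h>

/* Map a float sample in [-1, +1] to an unsigned byte in [0, 255].
 * Hypothetical helper: out-of-range input is clamped first. */
unsigned char float_to_u8(float x) {
  if (x < -1.0f) x = -1.0f;
  if (x > 1.0f) x = 1.0f;
  return (unsigned char)((x + 1.0f) * 127.5f + 0.5f);
}

/* Map a signed 16-bit sample to an unsigned byte by dropping the low 8 bits. */
unsigned char s16_to_u8(int16_t s) {
  int shifted = s + 32768; /* now 0..65535 */
  assert(shifted >= 0 && shifted <= 65535);
  return (unsigned char)(shifted / 256);
}
```

In both cases the “zero” of the wave ends up at 128, the middle of the uint8 range.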

Now, with a sample rate of 8000 Hz and the uint8 data format, digital sound is nothing more and nothing less than an array of bytes consumed at the speed of 8000 bytes per second.

If our array is all zeros - there will be silence. If our array is all 255 - there will be silence as well, because a constant value means nothing is moving. However, if our array contains different numbers - that is a sound wave.

How can we hear it? One route would be to use native OS APIs, and that would be a route to cross-platform programming hell, because each OS has a different set of APIs, equally complex and unpleasant to use. And this is where the UNIX way comes to the rescue. There are small utilities - if you are lucky, they might even come with your OS - that allow you to play sound streamed from stdin. On Linux that would be aplay or pacat, while macOS and Windows users would have to install SoX and use the play command.

Here are a few commands that should allow you to play raw unsigned bytes from stdin at the rate of 8000 samples per second:

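# Pick the one line that matches a player you have installed
alias PLAY='aplay' # ALSA; its defaults are already unsigned 8-bit, 8000 Hz, mono
alias PLAY='pacat --rate 8000 --channels 1 --format u8' # PulseAudio
alias PLAY='play -c1 -b8 -eunsigned -traw -r8k -' # SoX
alias PLAY='mplayer -cache 1024 -quiet -rawaudio samplesize=1:channels=1:rate=8000 -demuxer rawaudio -' # MPlayer
alias PLAY='ffplay -ar 8000 -ac 1 -f u8 -nodisp -' # FFmpeg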

# Play some white noise
cat /dev/urandom | PLAY

Oscillators

Digital sound is a very simple programming concept, and producing it can be as simple as writing a for-loop. For example, this tiny app should make infinite white noise:

// cc noise.c -o noise && ./noise | PLAY
#include <stdio.h>
#include <stdlib.h>
int main() { for (;;) putchar(rand()); }

To play a note rather than noise we should make our sound periodic: the waveform should repeat itself at a certain frequency, and that frequency defines the pitch of the note. Here are the most common oscillator waveforms:

[Figure: oscillator waveforms]

The simplest oscillator in C would be the sawtooth wave. putchar converts its int argument to unsigned char, so if we simply write for (int t = 0;; t++) putchar(t) - we get a sawtooth wave. It won’t be a note yet, only a low buzzing sound, because the wave repeats just 8000/256 ≈ 31 times per second. To play a note we need to know its frequency. For example, the most common “reference” note is A from octave 4, which has the frequency of exactly 440 Hz, and this is what most tuning forks (kamertons) resonate to.

440 Hz means the oscillator should go from 0 to 255 exactly 440 times per second. We also know that during one second our loop must produce 8000 values because that is our sample rate. So every time the loop iterates we should increase the oscillator counter by 256*440/8000=14.08. Roughly, 14.
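
Rounding 14.08 down to 14 makes the note slightly flat: 14*8000/256 = 437.5 Hz. One common workaround, sketched here as an assumption and not something the code in this article does, is to keep the phase in 16 bits, so that the low byte carries the fraction and the high byte is the sample. The names `phase_step` and `fill_sawtooth` are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Phase increment per sample in 8.8 fixed point (hypothetical helper).
 * The high byte of the phase is the output sample, the low byte is the fraction. */
uint16_t phase_step(double freq_hz, unsigned sample_rate) {
  assert(sample_rate > 0 && freq_hz >= 0 && freq_hz < sample_rate / 2.0);
  return (uint16_t)(65536.0 * freq_hz / sample_rate + 0.5);
}

/* Fill buf with n samples of a sawtooth wave at freq_hz. */
void fill_sawtooth(unsigned char *buf, size_t n, double freq_hz, unsigned sample_rate) {
  uint16_t phase = 0;
  uint16_t step = phase_step(freq_hz, sample_rate);
  for (size_t i = 0; i < n; i++) {
    buf[i] = (unsigned char)(phase >> 8); /* integer part of the phase */
    phase = (uint16_t)(phase + step);     /* wraps around at 65536 */
  }
}
```

For 440 Hz at 8000 samples per second the step is 3604, which is 14.08 in 8.8 fixed point. A buffer filled this way can be written to stdout with fwrite and piped into PLAY like the other examples.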

To create a square wave we can simply take the 8th bit of the oscillator counter: it will be 0 for the values 0..127 and 0x80 for the values 128..255. This results in a square wave of the same frequency as the sawtooth. It will be a bit quieter, since the amplitude range is half as wide, but still loud enough to hear.

The sine wave requires the sin() function: we take the low 8 bits of the oscillator counter (the phase), divide by 256 to get a fraction of one period, and multiply by 2π. Because sin() returns values in the range [-1..1], the result is scaled by 127 and shifted by 128 so that it fills the same 0..255 range as the other oscillators.

Here is the code that plays these three oscillators, one second each:

// cc osc.c -o osc -lm && ./osc | PLAY
#include <math.h>
#include <stdio.h>
int main() {
  /* Sawtooth: putchar keeps only the low 8 bits of the counter */
  for (int t = 0, osc = 0; t < 8000; t++, osc = osc + 14) {
    putchar(osc);
  }
  /* Square: the top bit of the 8-bit phase, 0 or 0x80 */
  for (int t = 0, osc = 0; t < 8000; t++, osc = osc + 14) {
    putchar(osc & 0x80);
  }
  /* Sine wave: map the 8-bit phase 0..255 to one full turn */
  for (int t = 0, osc = 0; t < 8000; t++, osc = osc + 14) {
    putchar(127 * sin((osc & 255) / 256.0 * 2 * 3.14159265) + 128);
  }
}
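
The same phase counter can drive other waveforms too. As a small sketch, here are the sawtooth and square from the text written as functions of the phase, plus a triangle wave, which the code in this article does not include. The function names are chosen for this example only.

```c
#include <assert.h>

/* Each function maps an oscillator phase to one unsigned 8-bit sample.
 * Only the low 8 bits of the phase are used. */
unsigned char saw(int phase) { return (unsigned char)(phase & 255); }

unsigned char square(int phase) { return (unsigned char)(phase & 0x80); }

/* Triangle: ramp up during the first half of the period, down during the second. */
unsigned char triangle(int phase) {
  int p = phase & 255;
  int v = p < 128 ? p * 2 : (255 - p) * 2;
  assert(v >= 0 && v <= 254);
  return (unsigned char)v;
}
```

Replacing putchar(osc) with putchar(triangle(osc)) in a loop like the ones above should give a softer tone at the same pitch.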

There is one more approach to producing an oscillating sound, and it’s a clever one, known as Karplus-Strong synthesis. It is often used to simulate bass or guitar strings. The idea is to fill an array with random data. The length of the array should be equal to the period of the oscillator, which in our case for 440 Hz would be 8000/440 ≈ 18 samples. Then we “play” bytes from that array, going back to the first item when we reach the end of the buffer. Despite being filled with random bytes, the repeating pattern of that noise sounds like an oscillator and we hear a distinct pitch. But to make it sound like a string we have to smooth the data every time we loop over the array: we replace each element with the average of itself and the next one. With each pass the random numbers become smoother and smoother, until they are all equal and the oscillator fades out into silence:

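// cc pluck.c -o pluck && ./pluck | PLAY
#include <stdio.h>
#include <stdlib.h>
#define N 18 /* period in samples: 8000 / 440 is about 18 */
int main() {
  unsigned char a[N];
  for (int i = 0; i < N; i++) a[i] = rand(); /* start with noise */
  for (int t = 0; t < 8000; t++) {
    int i = t % N;
    int j = (t + 1) % N;
    /* play the sample, replacing it with the average of itself and the next one */
    putchar(a[i] = (a[i] + a[j]) / 2);
  }
}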

Try using larger arrays and see how the pitch becomes lower and the duration of the sound gets longer. Doesn’t it resemble the sound of a string or kalimba tines?
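
To experiment with different pitches, the array length can be derived from the frequency instead of being hard-coded. The following is a rough sketch under that assumption. `period_samples` and `pluck` are hypothetical helpers, and the caller is expected to fill the ring buffer with noise first.

```c
#include <assert.h>
#include <stddef.h>

/* Number of samples in one period of freq_hz, rounded to the nearest integer. */
size_t period_samples(double freq_hz, unsigned sample_rate) {
  assert(freq_hz > 0);
  return (size_t)(sample_rate / freq_hz + 0.5);
}

/* Play n samples from a ring buffer of `period` bytes, smoothing it in place.
 * The caller fills `ring` with noise beforehand, for example with rand(). */
void pluck(unsigned char *out, size_t n, unsigned char *ring, size_t period) {
  assert(period > 0);
  for (size_t t = 0; t < n; t++) {
    size_t i = t % period;
    size_t j = (t + 1) % period;
    ring[i] = (unsigned char)((ring[i] + ring[j]) / 2);
    out[t] = ring[i];
  }
}
```

For example, 110 Hz needs a ring of 73 samples, four times longer than the 18 samples used for 440 Hz.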

Sequencers

Now that we are able to play a single note, how can we play a melody? We need to change the pitch of the notes in time, and that is what step sequencers do. We may have a fixed number of steps, each with a fixed duration, and we can iterate over them in a loop and change the pitch accordingly. Each step may contain the increment of the oscillator phase counter, and zero would denote a pause. For example, here’s a familiar riff “E B D E D B A B”. It uses only 4 notes - D and E from one octave and A and B from the other, lower octave. If we look at the note frequency table, the pitches of those notes in octaves 5 and 4 would be 659.3 Hz (E), 587.3 Hz (D), 440 Hz (A) and 493.9 Hz (B).

The oscillator phase increments would then be approximately 21 (E), 19 (D), 14 (A) and 16 (B). Each step may take 2000 samples (1/4 of a second). Then the playback loop could look like:

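// cc riff.c -o riff && ./riff | PLAY
#include <stdio.h>
int main() {
  int osc = 0;
  /* phase increments for E5 B4 D5 E5 D5 B4 A4 B4 */
  int melody[8] = {21, 16, 19, 21, 19, 16, 14, 16};
  for (int step = 0;; step = (step + 1) % 8) {
    int increment = melody[step];
    for (int t = 0; t < 2000; t++) { /* 2000 samples = 1/4 of a second */
      osc = (osc + increment) & 0xff; /* keep the phase in 8 bits so it never overflows */
      putchar(osc);
    }
  }
}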
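
The increments in the melody array were taken from a frequency table, but they can also be computed. Here is a small sketch that assumes equal temperament with A4 = 440 Hz. `note_freq` and `phase_increment` are names invented for this example.

```c
#include <assert.h>

/* Frequency of the note `semitones` away from A4 (440 Hz) in equal temperament.
 * Repeated multiplication avoids pow() and the math library. */
double note_freq(int semitones) {
  const double ratio = 1.0594630943592953; /* 2^(1/12) */
  double f = 440.0;
  for (; semitones > 0; semitones--) f *= ratio;
  for (; semitones < 0; semitones++) f /= ratio;
  return f;
}

/* Oscillator phase increment for an 8-bit phase, rounded to the nearest integer. */
int phase_increment(double freq_hz, int sample_rate) {
  assert(sample_rate > 0);
  return (int)(256.0 * freq_hz / sample_rate + 0.5);
}
```

B4, D5 and E5 are 2, 5 and 7 semitones above A4, so phase_increment(note_freq(7), 8000) gives the increment used for E in the riff.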

I guess, now it’s time to deobfuscate the melody player from the very beginning of this post:

// play.c
#include <stdio.h>
int main() {
  float f; /* note frequency */
  char c; /* "cdefgab" for notes or "pr" for pause */
  int d, o; /* d = duration, o = octave */
  while (scanf("%d%c%d ", &d, &c, &o) > 0) {
    /* keep the low 5 bits: 'a'/'A' -> 1, 'b'/'B' -> 2, ..., so case is ignored */
    c &= 31;
    /* c>>4 is 0 for CDEFGAB and 1 for "PR" */
    /* so, for pauses f would be zero, for notes - 55 */
    f = !(c >> 4) * 55;
    /* a trick we used in the Nokia Composer post to convert note letter to note index */
    c = (c * 8 / 5 + 8) % 12 + o * 12 - 24;
    /* Raising a note by `x` semitones multiplies its frequency by 2^(x/12), or (2^(1/12))^x */
    while (c--) {
      f *= 1.0595; /* 1.0595 is 2^(1/12) */
    }
    /* Play sawtooth wave for the given duration with given pitch */
    for (d = 16e3 / d; d--; putchar(d * f * .032));
  }
}

It plays music in something similar to MML, RTTTL or ABC notation. It expects a sequence of notes coming from stdin, separated by whitespace, which scanf skips. Each note has 3 parts - duration, pitch and octave. For example, our loop from above can be written as “8e5 8b4 8d5 8e5 8d5 8b4 8a4 8b4”. The note frequency calculation is taken from Nokia Composer, and note playback is done with the sawtooth oscillator, as described above.

As a side effect of the integer arithmetic in the note formula, there are ASCII letters beyond CDEFGAB that result in sharp notes:

  • C# - k
  • D# - l
  • F# - n
  • G# - o or h
  • A# - i
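
The table above can be checked by pulling the letter-to-index formula out of play.c into a function of its own. This is only an illustration, and `note_index` and `is_pause` are not names from the original program.

```c
#include <assert.h>

/* Semitone index within the octave (C = 0 ... B = 11) for a note letter,
 * using the same integer formula as play.c. Case is ignored. */
int note_index(char letter) {
  int c = letter & 31; /* 'a'/'A' -> 1, 'b'/'B' -> 2, ... */
  assert(c >= 1 && c <= 26);
  return (c * 8 / 5 + 8) % 12;
}

/* Non-zero for letters whose index is 16 or more, such as 'p' and 'r' (pauses). */
int is_pause(char letter) { return (letter & 31) >> 4; }
```

The natural notes land on 0, 2, 4, 5, 7, 9 and 11, and the extra letters fill the gaps: k = 1, l = 3, n = 6, o and h = 8, i = 10.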

Bytebeat

There is a niche music genre, known as ByteBeat, where music is written as terse C expressions. I hope to cover it in more detail in further articles, because it combines the cleverness of tiny code with the creativity of music composing. It heavily uses bit shifts and bitmasks to juggle notes. Some of the melodies are created by accident, some are carefully composed with some end goal in mind. They tend to sound a bit harsh and maybe slightly off-tune, but their beauty is in their code. A typical example of bytebeat would be:

t*(t+(t>>9|t>>13))%40&120

It produces a repetitive melody of uncertain pitch that sounds like multiple instruments and has a certain rhythm. You can find a lot more bytebeat examples on the web, if you are interested.
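
To try the expression outside of a one-liner, it can be wrapped into a function of the sample counter t. The sketch below uses unsigned arithmetic, which is my assumption rather than part of the original formula, so that the multiplication wraps around instead of overflowing a signed int. `bytebeat` and `render_bytebeat` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One sample of the bytebeat formula above. Unsigned arithmetic is used so that
 * the multiplication wraps around instead of overflowing a signed int. */
unsigned char bytebeat(unsigned t) {
  return (unsigned char)(t * (t + (t >> 9 | t >> 13)) % 40 & 120);
}

/* Render n samples starting at time t0 into buf. */
void render_bytebeat(unsigned char *buf, size_t n, unsigned t0) {
  assert(buf != NULL || n == 0);
  for (size_t i = 0; i < n; i++) buf[i] = bytebeat(t0 + (unsigned)i);
}
```

Because of the %40 and &120, every sample is one of 0, 8, 16, 24 or 32, so this particular tune is rather quiet.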

Effects

The pure oscillating sound is boring. But fortunately, there are a few sound effects that we can apply to it without much of a hassle.

For example, the bytebeat tune above can be passed through a low-pass filter that smooths out the high frequencies and keeps the low frequencies. The simplest form of a low-pass filter blends each new output value with the previous filtered value stored in an accumulator:

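// cc lowpass.c -o lowpass && ./lowpass | PLAY
#include <stdio.h>
int main() {
  int prev = 0; /* accumulator: the previous filtered sample */
  for (int t = 0;; t++) {
    int output = t*(t+(t>>9|t>>13))%40&120;
    prev = prev * 0.8 + output * 0.2; /* 80% of the old value, 20% of the new one */
    putchar(prev);
  }
}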

The sound should become more muffled and less high-pitched. If you change 0.8/0.2 to 0.9/0.1 the effect should become even stronger. Try adjusting the coefficients and see how it affects the sound.

If you want to reduce the low frequencies and leave the high ones - just subtract the filtered low-pass signal from the original signal.
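
As a rough sketch of that subtraction, here is a high-pass filter over a buffer of unsigned bytes. The function name `high_pass`, the `alpha` parameter and the re-centring around 128 are choices made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* High-pass filter: subtract a running low-pass average from the input.
 * Samples are unsigned bytes, so the result is re-centred around 128.
 * alpha is the weight of the new sample, 0.2 in the low-pass example above. */
void high_pass(const unsigned char *in, unsigned char *out, size_t n, double alpha) {
  assert(alpha > 0.0 && alpha <= 1.0);
  double low = n > 0 ? in[0] : 0.0; /* start the average at the first sample */
  for (size_t i = 0; i < n; i++) {
    low += alpha * (in[i] - low); /* same as low * (1 - alpha) + in[i] * alpha */
    double high = in[i] - low + 128.0;
    if (high < 0.0) high = 0.0;
    if (high > 255.0) high = 255.0;
    out[i] = (unsigned char)(high + 0.5);
  }
}
```

A constant input comes out as a flat line at 128, which is silence, while sudden jumps pass through almost unchanged.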

Another simple effect would be a delay line, which is just another array, storing a few recent signal values. For example, we want to replay our sound with a 0.1 second delay. At the sampling rate of 8000 Hz we need to store the 800 most recent samples and add them to the current output signal with an 800 sample offset:

// cc delay.c -o delay && ./delay | PLAY
#include <stdio.h>
#define N 800
int main() {
  int delay[N] = {0};
  /* unsigned counter: wraps around cleanly and keeps t%N non-negative */
  for (unsigned t = 0;; t++) {
    int output = ((t*(42&t>>10))&0xff)/2;
    delay[t%N] = output; /* put current sample into delay line */
    putchar(output + delay[(t+1)%N]); /* mix current sample with the oldest sample from the delay line */
  }
}

This should bring a bit of polyphony to the sound, and there will be some echo. Delay lines are very easy to implement, and combined with filters they can produce a reverberation effect.
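
One small step from a single echo toward something more reverb-like is to feed the mixed signal back into the delay line, so that each repeat is followed by a quieter one. This is a sketch of that idea under my own assumptions (feedback of one half, clipping at 255), not code from the original post. `feedback_echo` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

/* Feedback echo: each output sample is the input plus half of the output
 * from `delay` samples ago, so every repeat is quieter than the previous one.
 * `line` is a caller-provided buffer of `delay` ints, zeroed before first use. */
void feedback_echo(const unsigned char *in, unsigned char *out, size_t n,
                   int *line, size_t delay) {
  assert(delay > 0);
  for (size_t t = 0; t < n; t++) {
    int mixed = in[t] + line[t % delay] / 2;
    if (mixed > 255) mixed = 255; /* clip instead of wrapping around */
    line[t % delay] = mixed;      /* feed the mix back into the delay line */
    out[t] = (unsigned char)mixed;
  }
}
```

With a delay of 800 samples, a single click in the input is repeated every 0.1 seconds at half the previous volume.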

And much more

There are many more possible effects one could code in C, but the post is long enough and I guess I should stop. Similarly, playing with an oscillator may inspire you to create a sample player, or a granular synthesizer, or a frequency-modulating synthesizer. Of course, sequencers are also an endless area of experimentation - from random music generation and self-evolving melodies to compact sequencers like old mod trackers, that could be used in short demos and games.

If you are into music - feel free to share your sound experiments! In the meantime, I’m preparing a post about the elegance of 1-bit sound with a simple tool to create 1-bit music. If there are any other music+programming topics you would like to hear about - just drop me a line.

I hope you’ve enjoyed this article. You can follow – and contribute to – on Github, Twitter or subscribe via rss.

Oct 15, 2020

See also: Nokia Composer in 512 bytes and more.


