6 experiments in creative AI

The above YouTube video was my first attempt to put an AI tool – in this case Bing’s image creator – to creative use. I asked it to illustrate Loge and Wotan’s “descent to Nibelheim” from Wagner’s opera Das Rheingold, in the style of Jack Kirby – the artist who co-created Thor and the other Asgardian superheroes for Marvel comics (including Loki and Odin, who roughly correspond to Wagner’s Loge and Wotan).

The result isn’t great art, or a great animated comic, but it does demonstrate two things that impress me most about this latest generation of AI. First, it can understand what I’m looking for based on fairly esoteric prompts that might confuse many humans. This really does strike me as a genuine kind of “intelligence”, despite what AI’s detractors say. Secondly, the AI’s output is far better than anything I could produce myself, which opens up a whole range of possibilities. I’ll describe a few more of the experiments I’ve tried so far.

I’ve always wanted to create a comic, because it’s one of my favourite media, but I lack any kind of artistic skill. So that was my second experiment. I won’t describe it or show the result here, because Brian Clegg has already done that in an article on his own blog. Here’s the link to it: Is commercial art more at threat from AI than writing?

The other, better known, Bing AI tool is its chatbot. An important thing about this is that (if I understand correctly) no one has ever programmed it with pre-scripted phrases to parrot in reply to a user. Instead, it’s just been trained to understand and use language in a similar way to a human. So when it says, for example “I’m glad you enjoyed them. I had fun creating them.” (as it did when I said I liked some snippets of dialogue it created for me) it’s not because someone has programmed it to say that, but because it’s worked out for itself that it’s the kind of thing people say in those circumstances. It’s a subtle difference, but a very important one, I think.

One of the first things I tried with the chatbot (along with many other people, I imagine) was to get it to write a song. In my case I said “Write a song called Zen Matrix from the point of view of someone who has discovered through meditation that they are living in a simulation”. I was so impressed with the result that I set it to music, and illustrated it with suitable artwork courtesy of Bing’s image creator. Again, I won’t reproduce the lyrics here – you can see them in the Zen Matrix video I uploaded to YouTube.

As I understand it, the original intention with chatbots like Bing was to present factual information in a conversational manner, and the fact that they’re so good at “creative” tasks came as a surprise even to their designers. It conflicts with the deeply ingrained prejudice (among an earlier generation of science fiction writers, anyway) that human brains are the only intelligent system in the universe capable of taking a statement at anything other than literal face value. One of the most devout proponents of this view was the writer Eric Frank Russell, who used it in numerous stories, including the brilliantly titled “Diabologic” (1955). An oft-quoted line from this story is “Why a mouse when it spins?”

Okay, I thought, let’s see what happens when I ask Bing to write a poem using that as a first line. Contrary to what I’m sure Russell would have expected, the AI didn’t promptly overheat or explode, but came back with five verses of perfectly passable doggerel. Here is the first:

Why a mouse when it spins?
Does it think of its sins?
Or the cheese in the trap
that it barely could snap
before running away with a grin?

There’s a much more profound use of a “nonsense phrase” in old-school SF, as explained in the Encyclopedia of Science Fiction in the context of A. E. van Vogt’s 1946 novel The Weapon Makers:

The word “sevagram” only appears once in The Weapon Makers, as its last word. This resonantly mysterious slingshot ending, which seems to open universes to the reader’s gaze, may well stand as the best working demonstration in the whole of genre SF of how to impart a sense of wonder.

So I said to Bing “The novel The Weapon Makers by A. E. van Vogt ends with the words ‘Here is the race that shall rule the sevagram’ – but he doesn’t explain what the sevagram is. Can you tell me? If you don’t know the answer, please make something up.” I assumed it would either say it wasn’t in the business of speculating, or give a randomly sci-fi-sounding single-sentence answer. Instead, it produced a 500-word essay that made a serious attempt to extrapolate a whole new ending to Van Vogt’s novel. Here is the key part of it:

The Sevagram is described as “a vast network of energy that linked every atom of every weapon shop on every planet”. It is also said to be “the ultimate weapon” and “the source of immortality”. The Sevagram is controlled by a supercomputer called Fara, which is the leader of the Weapon Shops. The novel ends with a twist: Fara reveals that he is not a computer, but a human being who has lived for thousands of years by transferring his mind into different bodies. He also reveals that he is the father of Robert Hedrock, the main protagonist of the novel and a secret agent of the Weapon Shops. Fara then declares that he and his son are part of a superior race that has evolved beyond ordinary humans, and that they are destined to rule the Sevagram. So, based on this information, I can speculate that “the sevagram” is not just a network of energy, but also a term for the domain or realm of Fara and his descendants. It could also imply a vision or plan for the future of humanity under their guidance. Perhaps Fara wants to use the Sevagram to create more immortals like himself and his son, or to spread his influence across the galaxy. Of course, this is just my imagination.

That’s a really good effort at capturing Van Vogt’s style, which is based around soaring concepts and a constant stream of unexpected plot twists. It does contain a couple of flaws which give it away as a fabrication (“Fara” is the name of a character in an earlier Van Vogt story, and Bing’s explanation of sevagram is inconsistent with the way it’s used in the real novel), but it’s not bad going for a machine.

Actually, it’s pretty unsettling that an AI can create this kind of fabrication, because there’s a danger it could end up being passed off as the real thing. The same is true in the visual world too – and, for me personally, that’s even more impressive. After all, I can fabricate convincing words myself, but I can’t fabricate a convincing image. So when I show you this engraving that William Hogarth produced of a UFO hovering over a London street in the 1730s, you know it’s got to be the real thing:

Hogarth engraving of a UFO

Musical Symmetry Revisited

Symmetric 8-note scale

In a blog post last year I talked about symmetric musical sets – such as the tritone, the augmented triad, the diminished 7th, the whole-tone scale and the chromatic scale – which divide the octave into 2, 3, 4, 6 and 12 equal parts respectively. For various reasons musicians dislike these groupings, so they’re used very sparingly in classical music and virtually never in pop music. But as someone who’s always been more into maths than music, I’m fascinated by any kind of symmetry.

Traditionally the octave is divided into 12 semitones, so the symmetric sets I just mentioned are the only possible ones. But what if you wanted to divide the octave into 8 equal parts? That seems an obvious choice, because it’s what the word octave implies. But to do it we need to invoke quarter tones. There are 24 of these in an octave, and 24 divided by 8 is 3, so we’re looking for notes 3 quarter tones (or one and a half semitones) apart.

Writing music in quarter tones isn’t easy, because the MIDI format defines pitch as an integer number of semitones. But it does allow something called “pitch bending” (presumably to simulate bending the string of a guitar), and with a bit of patience you can use that feature to raise the necessary notes by a quarter tone.
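As a rough sketch of the arithmetic involved, the Python below works out the eight equally spaced notes and the pitch-bend value each one would need. It assumes a bend range of ±2 semitones (a common default, but a synthesiser can be set up differently) and the usual 14-bit bend scale centred on 8192. The function name and the starting note are my own choices for illustration, not anything taken from the piece itself.

```python
# Sketch: 8 equal divisions of the octave, expressed as MIDI note + pitch bend.
# Assumptions (not from the original piece): a bend range of +/-2 semitones,
# 14-bit bend values with 8192 meaning "no bend", and an arbitrary start note.

BEND_CENTRE = 8192         # 14-bit pitch-bend value for "no bend"
BEND_RANGE_SEMITONES = 2   # assumed synthesiser setting


def eight_note_scale(start_note=60):
    """Return (midi_note, bend_value) pairs for 8 notes, 3 quarter tones apart."""
    scale = []
    for step in range(8):
        quarter_tones = 3 * step  # 24 quarter tones / 8 notes = 3 per step
        semitones, extra = divmod(quarter_tones, 2)
        # extra is 1 when the note sits a quarter tone above an ordinary semitone
        bend = BEND_CENTRE + round(extra * 0.5 / BEND_RANGE_SEMITONES * BEND_CENTRE)
        scale.append((start_note + semitones, bend))
    return scale


for note, bend in eight_note_scale():
    print(note, bend)
```

Every second note comes out with a bend above the centre value, which is the "raise the necessary notes by a quarter tone" step described above.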

Here’s a short (1 minute) piece I wrote to see what it would sound like. It’s basically a random composition using the 8 equally spaced notes shown in the diagram above.

Apollo Nostalgia

Apollo 11 souvenirs

This time 50 years ago I was getting very excited about the forthcoming Moon landing. As I mentioned in a previous post, my serious interest in space travel started with the Apollo 8 mission, which took place soon after my 11th birthday. So with the 50th anniversary of the Apollo 11 landing fast approaching (I’m posting this 50 years to the day after the launch of the previous mission, Apollo 10), I thought it would be fun to look back through some of the souvenirs I collected at the time.

This is my first attempt at a video of this type, and I know it isn’t very professional-looking – but here it is anyway:

Algorithmic Beatles

Markov music matrix

When Eric Morecambe mangled Grieg’s Piano Concerto on a TV special in 1971, he insisted he was “playing all the right notes, but not necessarily in the right order”. That’s a valid point, because there aren’t that many different notes on a piano and the only thing that distinguishes one tune from another is the order in which you play them.

To a mathematician or computer programmer the situation is crying out for quantitative analysis. The diagram above shows the “transition matrix” for one specific Beatles tune (using the MIDI note-naming convention in which middle C is C5). It’s clear there’s a lot of order here. One thing that jumps out is that there’s only one “black” note, G#5, and it’s always followed by A5. In fact A5 is a very popular note, cropping up after no fewer than 8 different pitches. On the other hand, G#5 itself is very rare, only ever coming after D6, and then only 6% of the time.

As well as analysing the original tune, this allows us to write a new tune of our own using the same transition matrix. The result (as the aforementioned mathematicians and computer programmers will recognize) is a first-order Markov chain. Producing an algorithm of this type from scratch would be rather tedious (as indeed the initial analysis would be), but fortunately there’s some free software called OpenMusic which includes built-in Markov functions that make the process much simpler.
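For anyone curious what the idea looks like in code, here is a minimal Python sketch of a first-order Markov chain over note names. It is not how OpenMusic does it, and the melody is a made-up sequence rather than the Beatles tune; the function names are hypothetical too.

```python
import random
from collections import Counter, defaultdict


def transition_matrix(notes):
    """Map each note to the probabilities of the notes that follow it."""
    counts = defaultdict(Counter)
    for current, following in zip(notes, notes[1:]):
        counts[current][following] += 1
    return {
        note: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for note, followers in counts.items()
    }


def generate(matrix, start, length, rng=random):
    """Walk the matrix to produce a new tune of at most `length` notes."""
    tune = [start]
    # Stop early if we reach a note that was never followed by anything.
    while len(tune) < length and tune[-1] in matrix:
        options = matrix[tune[-1]]
        next_note = rng.choices(list(options), weights=list(options.values()))[0]
        tune.append(next_note)
    return tune


# Made-up melody for illustration only (NOT the tune analysed above).
melody = ["A5", "B5", "D6", "G#5", "A5", "D6", "A5", "B5", "A5", "D6", "G#5", "A5"]
matrix = transition_matrix(melody)
print(matrix["G#5"])
print(generate(matrix, "A5", 16))
```

Each row of the matrix adds up to 1, and the generated tune only ever uses note-to-note steps that occurred in the original sequence. Durations could be handled in the same way by feeding in a list of note lengths instead of pitches.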

Of course, there’s more to a tune than the pitch of the notes – there’s the duration of a note too. But that can be analysed and reproduced by exactly the same method. I experimented with an algorithmic composition of my own, based on the Beatles song analysed above. As a first step, I used the OpenMusic Markov functions to generate a series of tune-fragments for both the “right hand” and “left hand” of the piano. Then, to give the composition some structure, I arranged the fragments in a rough approximation to classical sonata form.

I won’t say what the original song was, because I want to see if anyone can guess it. As a hint, I’ve inserted a brief quotation from the original at the mid-point of the piece. Here it is on YouTube:

The joy of (musical) sets

Music set-theory

I mentioned musical set theory in a previous post, and now that I understand it better I’m getting very enthusiastic about it. It’s a really powerful technique for analysing and composing music. The mathematical connection may give the impression that it “dehumanizes” music by imposing mechanistic constraints and artificial rules – but the exact opposite is true. It’s traditional music theory that forces arbitrary rules and constraints on you – set theory liberates you from them. It’s a framework for organizing your own creativity – with no rules whatsoever.

I’ll explain how it works in a moment, but first a few words about my sources. The bible of the subject is Allen Forte’s The Structure of Atonal Music, which is divided into two roughly equal parts. The first is packed with useful stuff, although the second part was much too advanced for me. But Forte’s book is really about musical analysis, and what I was interested in was composition. On that front, I found a great little book by Stanley Funicelli called Basic Atonal Counterpoint (which is a CreateSpace book, but very professionally done). I also found a lot of practical tips on Frans Absil’s YouTube channel – he also produced the Pitch-Class Set Graphical Toolkit you can see on my iPad in the photograph above.

Musical set theory starts from a few basic observations:

  • The notes of the chromatic scale can be represented by integer “pitch-classes”: C = 0, C# = 1, D = 2 etc. After B = 11 you get back to C = 0, so additions and subtractions have to be done with mod-12 arithmetic.
  • Intervals between pitch-classes are much more important than absolute pitches. So C major [0, 4, 7] and E flat major [3, 7, 10] are just different transpositions of the same set (it’s called 3-11).
  • Inverting an interval (i.e. subtracting it from 12) doesn’t change its basic nature. So interval 7 (perfect fifth) can be grouped with 5 (perfect fourth), interval 8 (minor sixth) with 4 (major third) etc. This leaves us with just six “interval classes”: 1, 2, 3, 4, 5, 6.
  • The characteristic sound of a set is mainly determined by its interval vector. For example, the major chord 3-11 = [0, 4, 7] has an interval vector 001110 (one minor third, one major third, one perfect fifth and nothing else).
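The observations above translate quite directly into a few lines of Python. This is only a sketch, with function names of my own choosing, but it shows the mod-12 transposition and the way an interval vector is counted.

```python
from itertools import combinations


def transpose(pcs, n):
    """Transpose a pitch-class set by n semitones using mod-12 arithmetic."""
    return sorted((p + n) % 12 for p in pcs)


def interval_class(a, b):
    """Fold the interval between two different pitch-classes into the range 1..6."""
    d = (a - b) % 12
    return min(d, 12 - d)


def interval_vector(pcs):
    """Count how often each interval class 1..6 occurs between pairs of notes.

    The six counts are joined into a string, as in the text; this reads
    cleanly as long as every count is a single digit.
    """
    vector = [0] * 6
    for a, b in combinations(sorted(set(pcs)), 2):
        vector[interval_class(a, b) - 1] += 1
    return "".join(str(n) for n in vector)


c_major = [0, 4, 7]
print(transpose(c_major, 3))     # E flat major: [3, 7, 10]
print(interval_vector(c_major))  # 001110
```

Transposing a set never changes its interval vector, which is why C major and E flat major count as the same set.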

Traditional Western music depends heavily on set 7-35 [0, 2, 4, 5, 7, 9, 11] – the white notes on a piano, aka the major or minor scale (remember you can transpose these notes up by any integer between 1 and 11 to get all the other major and minor scales). Within that 7-element set, there are a number of strongly favoured subsets – most notably the aforementioned 3-11 (the major triad and its inversion, the minor triad).

The purpose of set theory should be obvious now. It gives you access to dozens of other sets, all with their own unique sound. You might think “but they’re going to sound terrible”, and in some cases they do. Set theory helps you to avoid the terrible-sounding ones! But there are some great-sounding sets that simply don’t exist in traditional music theory, such as 4-Z29 = [0, 1, 3, 7], with an eye-catching interval vector of 111111.
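One way to go hunting for sets like this is simply to try them all. The sketch below (my own illustration, not something from Forte’s book) checks every four-note set containing pitch-class 0 and keeps the ones where each of the six interval classes turns up exactly once.

```python
from itertools import combinations


def interval_vector(pcs):
    """Count occurrences of interval classes 1..6 between all pairs of notes."""
    vector = [0] * 6
    for a, b in combinations(pcs, 2):
        d = (a - b) % 12
        vector[min(d, 12 - d) - 1] += 1
    return vector


# Try every four-note set that contains pitch-class 0 and keep those in
# which each of the six interval classes occurs exactly once.
all_interval = [
    (0,) + rest
    for rest in combinations(range(1, 12), 3)
    if interval_vector((0,) + rest) == [1] * 6
]
print(all_interval)
```

The list includes (0, 1, 3, 7) itself, various transposed and mirrored forms of it, and forms of a second all-interval tetrachord, (0, 1, 4, 6).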

To teach myself how the system works, I wrote a short “symphony” using the above ideas. It’s my first ever musical composition, and the result sounds a lot more interesting than if I’d struggled with all that traditional stuff about sharps and flats, majors and minors, dominants and subdominants etc. That wouldn’t have told me how to get close to the kind of spooky, spacey, quirky music I wanted to write.

Here is a link to the YouTube video:

Symmetry in Music

Symmetric and asymmetric music chords

I recently came across the idea of applying set theory to musical analysis (which apparently has been around for some time, although I’d never heard of it before). For most people, who have a stronger intuitive grasp of music than mathematics, this must seem a pointless exercise, but for anyone like me who’s the other way around it’s really very illuminating.

Take symmetry, for example. In most areas of the arts and sciences, symmetry is seen as a good thing – but in music, that’s not the case. All the most popular chords are asymmetric in terms of interval content. You can see that in the left-hand image above, which shows the three notes of the C major chord on the chromatic circle. They’re separated by intervals of 3, 4, and 5 semitones.

In contrast, an augmented C chord, shown on the right, is perfectly symmetric, with all three intervals equal to 4 semitones. The problem (as far as musicians are concerned) is that it’s not very firmly tied to C. The same three notes could equally well be read as an A flat or E augmented chord. In the same way, the four-note symmetric chord C – E♭ – F♯ – A can be interpreted in four different ways: as Cdim7, E♭dim7, F♯dim7 or Adim7.

There’s even a completely symmetric two-note interval, in the form of the tritone, consisting of two notes 6 semitones apart (or 3 whole tones, which is how it gets its name). That’s exactly half an octave, for example from C to F sharp. But it’s also the distance from F sharp to C, so you really don’t know which key you’re in. That’s why composers spent centuries trying to avoid it. They called it diabolus in musica, or “the devil in music”.
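The chords discussed above can be checked with a short Python sketch. It treats a chord as “symmetric” in the sense used here, meaning that its notes divide the chromatic circle into equal parts, and it also lists the transpositions that leave the chord’s notes unchanged. The function names are mine, chosen just for this illustration.

```python
def gaps(pcs):
    """Semitone gaps between neighbouring notes, going once round the circle."""
    notes = sorted({p % 12 for p in pcs})
    return [
        (notes[(i + 1) % len(notes)] - notes[i]) % 12 or 12
        for i in range(len(notes))
    ]


def is_symmetric(pcs):
    """True when the notes divide the octave into equal parts."""
    return len(set(gaps(pcs))) == 1


def self_transpositions(pcs):
    """Transpositions (1..11 semitones) that map the chord onto itself."""
    notes = {p % 12 for p in pcs}
    return [n for n in range(1, 12) if {(p + n) % 12 for p in notes} == notes]


chords = {
    "C major": [0, 4, 7],
    "C augmented": [0, 4, 8],
    "Cdim7": [0, 3, 6, 9],
    "tritone C-F#": [0, 6],
}
for name, pcs in chords.items():
    print(name, gaps(pcs), is_symmetric(pcs), self_transpositions(pcs))
```

The major chord has no transposition that maps it onto itself, which is what ties it to a single key. The augmented chord reappears when shifted by 4 or 8 semitones (E and A flat), the diminished 7th by 3, 6 or 9, and the tritone by 6.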

Being a symmetry-loving scientist rather than a musician, I decided to try writing something that consisted only of symmetric chords. It’s a sort of canon, in the key of everything.

Here’s a link to a YouTube video, with added graphics depicting the various chords on the chromatic circle. Hopefully you’ll enjoy the graphics even if you don’t like the music!

Telescopic Tourist video

I’ve just belatedly produced a promotional video for my book The Telescopic Tourist’s Guide to the Moon, which came out last summer. Here it is:

The background “music” (actually just a sequence of spacey sounding chords) is my own composition!

Needless to say, The Telescopic Tourist’s Guide to the Moon is available from all good bookshops, as well as online retailers such as Amazon.com and Amazon UK.

Dirac on Einstein

I was going through some old audio cassettes I recorded from the radio when I was a student, and came across a really interesting little snippet. It’s the physicist Paul Dirac reminiscing about Einstein on a BBC programme, though I’m afraid I’ve no idea which one. The note I made at the time says “recorded in March 1979” – when Dirac would have been 76 (he lived to 82).

Although the quote is very short, it’s really fascinating – and a Google search didn’t turn up any other references to it. So I made a little YouTube video of it, which hopefully the following link will take you to:

Here is my transcript of what Dirac has to say about Einstein:

He wasn’t merely trying to construct theories to agree with observation. So many people do that; Einstein worked quite differently. He tried to imagine “If I were God, would I have made the world like this?” – and according to the answer to that question, he would decide on whether he liked a particular theory or not.

And I can’t resist adding a couple of Amazon links for my own book about Einstein:

Einstein book covers