Brief Summary
The success of ChatGPT has focused many people’s minds on the benefits of this human-like interface with technology. In this article Julian speculates on the effect this might have on the tools we use and suggests some production tasks he’d like to delegate to a chatbot.
Going Deeper
Any discussion of software in 2023 is inevitably going to include AI. ChatGPT is, I’m sure, a watershed moment in the development of AI, and while it will inevitably be superseded by something even more impressive (or unnerving), it does indicate the power and genuine usefulness of a ‘chat’ interface.
Putting the onus on the technology to do the work and to fulfil the role of an assistant, or even a colleague, has obvious applications in all areas, including audio production.
AI In Plugins
We're more than impressed with the performance of products such as Acon Digital’s de-noise and de-reverb plugins, as well as the online tools from the likes of Descript, but when will we see chat-driven tools rather than the traditional plugin-based approach?
For a task like de-reverb the advantages of such an approach aren’t obvious to me, but taking the premise that we’d like assistance with things we either don’t enjoy or don’t think we’re very good at, I’d welcome chat-driven vocal tuning software. I imagine something I can upload vocals to and then direct using a chat interface, in the same way as you can with ChatGPT: ‘Tune this transparently’, followed by ‘I can hear artefacts where she sings “falling down”, can you fix that?’ The software would make a first pass and then fine-tune it in response to typed or spoken requests. That would be awesome!
Another area where I’d like to be able to interact with software through chat is creating harmonies. I’ve used various techniques to experiment with harmonies and arrangements over the years: busking them into the DAW (something I’m not very good at), figuring out alternate lines at the piano, using the piano roll and MIDI and, more recently, using Melodyne (ARA in Pro Tools really helps). Unless you’re arranging for a barbershop quartet, for me the best results always come from working with a singer who is good at harmonies and offering feedback and direction: ‘don’t go down there’, ‘try a low harmony here’ and so on. ChatGPT is great for generating ideas to which you can respond, either positively or negatively. In the spirit of there being no bad ideas in the studio, because bad ideas can lead to good ideas, some of the intelligence and fluidity of a chat interface might work for me when it comes to harmonies and arrangements.
Virtual Players
Another thing I’d like to see driven by chat is virtual instruments. We’ve seen the impressive results ‘virtual player’ instruments can give, and the likes of Toontrack and UJAM have worked hard in this area. When working with a real musician we make verbal requests of them, directing them but leaving the detail up to them and their experience. For example, a virtual drummer (I hate programming drums…) might be instructed: ‘The kick is too busy. Sit back during the third verse and go easy on the cymbals in the chorus’. The thing machine learning is good at is identifying patterns in large data sets. Give it enough examples of good drumming and it will reproduce something which sounds like good drumming as convincingly as it can reproduce written language. As with written language, it might need some further direction, and if you are looking for something genuinely original then you’re probably better off getting a real musician, and thank goodness for that!
Originality Vs Familiarity
Laborious tasks and tasks which fall outside our experience are well suited to this treatment, and while originality is important, so much of what we do is trying to make things sound appropriate for the genre and ‘like a record’. That’s effectively trying to make it sound similar to things which have gone before. With a big enough data set and sophisticated enough software which can respond to direction from a human, I can’t see why AI couldn’t produce results which we would accept without question as regular music.
In our recent podcast with Stian from Acon Digital, his find of the week was a paper on AI generated music. The results ranged from alarmingly good through to comically awful. But the takeaway for me was that this was music generated by AI from text descriptions which hadn’t been through the revisions necessary to get the most from it. When using ChatGPT, the first answer is usually fairly uninspiring, but with some direction you can get much closer to what you might want. This is the same as much human to human work: a draft is submitted, it receives feedback, it is resubmitted and so on. Just because the first draft needs work doesn’t mean the concept is flawed.
The ‘GPT’ In ChatGPT
The GPT part of ChatGPT refers to the Generative Pre-trained Transformer family of language models, and this is what is behind its uncanny capabilities. This human-like interface and the very large data set from which ChatGPT can draw combine to produce its impressive results. Another important aspect of the tech behind ChatGPT is Reinforcement Learning from Human Feedback. Given the huge success and uptake of ChatGPT, this might present an issue for a small developer wishing to do something similar: the bigger the user base, the more feedback you can access, so maybe this is an area where the biggest gets bigger because it has the best data. ChatGPT plugins exist, though these are browser plugins, not the sort you might expect to be discussed here, and I’m sure someone somewhere is considering whether ChatGPT will be playing a role in audio plugins soon. Time will tell, but this is an area which I’m sure will grow explosively, and the specific uses to which it will be put are very much yet to be defined.
What about you? Would you welcome this tech into your studio? Share your thoughts.
Photo by Gabriel Barletta on Unsplash
Photo by Andrew Neel