Monday, September 28, 2009

Zurich Institute for Computer Music and Sound Technology

There was a 'pop colloquium' today at CCRMA.  The announcement was made just last night, so the audience was small and the duration short.  A handful of folks from the Zurich University of the Arts came by to talk about some research projects happening there now.  The one that especially caught my attention was Daniel Bisig's presentation on Interactive Swarm Orchestra and Immersive Swarm Spaces.  In brief (if I understand correctly), the studies map the results of lifelike swarm simulations (flocking birds, schooling fish, etc.) to musical and graphical parameters.  The result is a transposition of the emergent properties of swarms (a single cloud-like entity composed of many individuals) into an artistic framework.  This kind of thing has interested me for a long time, in particular as a way to perform granular synthesis using something other than rand().  They have made C++ libraries available, which I think I'll be checking out some time soon.
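To make that concrete, here's a toy sketch of the kind of thing I have in mind (pure speculation on my part, not the ISO libraries): a crude cohesion-only flock where each agent drives one grain, so the grain cloud inherits the swarm's emergent motion instead of rand()'s uniformity.

```cpp
// Toy sketch of swarm-driven grain parameters (my own speculation, not the
// Interactive Swarm Orchestra code).  Each agent steers toward the flock's
// center of mass; its position is then mapped to grain pitch and pan.
#include <vector>

struct Agent { float x, y, vx, vy; };
struct Grain { float pitch, pan; };  // hypothetical grain parameters

void step(std::vector<Agent>& swarm, float cohesion, float dt) {
    if (swarm.empty()) return;
    // crude cohesion-only flocking: pull each agent toward the centroid
    float cx = 0.0f, cy = 0.0f;
    for (const Agent& a : swarm) { cx += a.x; cy += a.y; }
    cx /= swarm.size();  cy /= swarm.size();
    for (Agent& a : swarm) {
        a.vx += cohesion * (cx - a.x) * dt;
        a.vy += cohesion * (cy - a.y) * dt;
        a.x += a.vx * dt;
        a.y += a.vy * dt;
    }
}

Grain toGrain(const Agent& a) {
    // map x in [-1,1] to pan, y in [-1,1] to a pitch ratio around unity
    return Grain{ 1.0f + 0.5f * a.y, 0.5f * (a.x + 1.0f) };
}
```

A real boids model adds alignment and separation forces, but even this stripped-down version moves in a more organic way than independent random draws.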

Object-oriented audio systems

The first assignment for 256a (Music, Computing and Design I) was a doozie.  It involved creating a signal generator application with a handful of features like waveform selection and pulse width modulation.  My submission is here: http://ccrma.stanford.edu/~adam/courses/256a/hw1/

It didn't have to be a doozie, but once you get used to generic programming there's no going back, so I kind of went overboard.  The code can speak for itself; this post is about some issues that left me scratching my head a little.
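For anyone who doesn't want to dig through the submission, the heart of such a generator is tiny.  Here's a minimal sketch of a pulse oscillator with width modulation, using toy names of my own rather than anything from my actual code:

```cpp
// A minimal pulse oscillator with width modulation (toy names, not the code
// from my submission).  The comparison of phase against `width` is the part
// that pulse width modulation actually modulates.
class Pulse {
public:
    Pulse(float freq, float srate) : phaseInc_(freq / srate) {}

    // width in (0,1); 0.5 gives a square wave
    float tick(float width) {
        float out = (phase_ < width) ? 1.0f : -1.0f;
        phase_ += phaseInc_;
        if (phase_ >= 1.0f) phase_ -= 1.0f;  // wrap the normalized phase
        return out;
    }

private:
    float phase_ = 0.0f;
    float phaseInc_;
};
```

Sweeping `width` with a slow LFO is what produces the classic PWM sound.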

The Flow of Callbacks

In my system I created a singleton entity that acts as a "Server," mediating the connections between audio processing objects (clients) and the DAC.  Clients can be registered with the server, and registered clients will get callbacks when the server gets its callback from the driver (the "driver" is a wrapper around some nice API like RtAudio, PortAudio, Juce, etc).  Clients connected to the DAC can be processed and their resulting outputs summed.  But what if clients are not connected to the DAC, but connected to each other?  This is an obvious feature of modular audio environments (both analog and digital), but there is a question of how best to represent these "connections" in software.  The approach I decided to try was to give any class that requires input a pointer to an abstract client.  Source objects can be registered with these, and their callbacks will be called by the client taking the input.  In this way, the callbacks "flow" in a depth-first traversal of loosely coupled clients in a graph, with the Server/DAC as the origin.  The question that remains is: is this optimal?  I've considered another way of dealing with processing, such as having the server process all objects, relying on a connection graph to handle dependencies (inputs to other clients).  I haven't really thought that one out, so maybe it's senseless.  But I'm not convinced that the obvious way I came up with is optimal.  And so I scratch my head...
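In code, the shape of the whole thing is roughly this (a stripped-down sketch with illustrative names, not my actual classes):

```cpp
// Stripped-down sketch of the pull-model client graph described above
// (illustrative names, not my actual submission).  The DAC's callback pulls
// the graph depth-first: each client pulls its registered sources first.
#include <cstddef>
#include <vector>

class Client {
public:
    virtual ~Client() = default;
    // called once per buffer, by whatever consumes this client downstream
    virtual void render(float* buf, std::size_t frames) = 0;
};

class Gain : public Client {  // example of a client that takes input
public:
    void setSource(Client* src) { src_ = src; }
    void render(float* buf, std::size_t frames) override {
        if (src_) src_->render(buf, frames);  // pull input first (depth-first)
        for (std::size_t i = 0; i < frames; ++i) buf[i] *= gain_;
    }
private:
    Client* src_ = nullptr;
    float gain_ = 0.5f;
};

class Server {  // a singleton in the real thing; simplified here
public:
    void connectToDac(Client* c) { clients_.push_back(c); }
    // invoked by the driver wrapper (RtAudio, PortAudio, ...) once per buffer
    void driverCallback(float* out, std::size_t frames) {
        std::vector<float> scratch(frames);
        for (std::size_t i = 0; i < frames; ++i) out[i] = 0.0f;
        for (Client* c : clients_) {
            c->render(scratch.data(), frames);  // traversal starts here
            for (std::size_t i = 0; i < frames; ++i) out[i] += scratch[i];
        }
    }
private:
    std::vector<Client*> clients_;
};
```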

Time


I ran into a bug when I connected a client to the DAC more than once (for multi-channel output).  The bug was that the client would render new material more than once per buffer.  Ge explained to me that the notion of time is useful in determining whether a client should produce new audio or just return a copy of old material.  This hint was hugely insightful, and the problem went away.  However, now I'm scratching my head over the usage of the word time.  When a client is asked for audio some time after it has already rendered, it needs to know whether time has advanced for the server as well.  If not, then no new audio needs to be generated.  So in this sense, time is a perfectly good description of what needs to be considered.  But the word 'time' will inevitably come up again soon when I implement some kind of event scheduling system, and I know this has to be tied in to the server in much the same way.  The question now is: should I create some kind of "time keeper" that manages time?  I need to draft some specs for the next phase of this project before I can know anything else...
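Concretely, the fix looks something like this (again a sketch with made-up names): each client remembers the server time at which it last rendered, and computes fresh samples only when the server's sample clock has advanced.

```cpp
// Sketch of the time-stamp fix (illustrative, not my exact code).  The server
// advances a sample counter once per buffer; a client asked to render twice
// at the same time just hands back its cached buffer.
#include <cstddef>
#include <vector>

using Time = unsigned long long;  // server time, in samples

class CachedClient {
public:
    void render(float* buf, std::size_t frames, Time now) {
        if (now != lastRendered_) {          // has server time advanced?
            cache_.assign(frames, 0.0f);
            compute(cache_.data(), frames);  // generate fresh audio
            lastRendered_ = now;
        }
        // copy either way, so a second connection gets identical material
        for (std::size_t i = 0; i < frames; ++i) buf[i] = cache_[i];
    }

private:
    void compute(float*, std::size_t) { /* actual synthesis goes here */ }
    std::vector<float> cache_;
    Time lastRendered_ = ~0ull;  // "never"
};
```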

Wednesday, September 23, 2009

Day 2


Music 220a
Fundamentals of Computer Generated Sound - Chris Chafe


According to the professor, 220a is a kind of programming course.  It is apparently designed to convey the basic techniques of digital sound synthesis and computer music composition, using ChucK as a pedagogical tool (a task for which I believe it is optimally suited).  The content of this course seems pretty straightforward for me, but it's been three years since I've covered these topics in a classroom.  I'm looking forward to spending long nights curled up with a MIDI controller and my laptop, making bizarre bleeps and bloops into the night for academic credit.  One thing strikes me as odd, though.  The text for this course is Perry Cook's Real Sound Synthesis.  I actually read it this summer, and it is rather in-depth, mostly focused on physical modeling.  Maybe there is more to this class than meets the eye.  We shall see.


The second half of the lecture took place in the Knoll Concert Hall, a room that is more reminiscent of a small chapel than a concert hall.  It sports 16 channels of ADAM speakers, eight around the walls and eight hung from the ceiling.  The presentation was a ~15 minute live computer music performance by Fernando Lopez-Lezcano.  The piece, entitled "A Very Fractal Cat," was played on a MIDI keyboard with foot pedals and switches.  The sound was like being inside a piano when someone sits down to play some atonal music, and all of a sudden the strings emit sonic smoke that wafts up as the extremely high partials decay.  Then the whole damn thing catches fire.  It was a good piece, but I wrote a note to myself during the performance: NO COMPUTER MUSIC BEFORE NOON!


Music 320
Introduction to Digital Audio Signal Processing - Jonathan Abel and David Berners


I guess this is the class I've been waiting for.  The signal processing series, taught by Abel/Berners or Julius O. Smith, represents my reason for being at CCRMA in the first place.  Today was really just a bunch of hand-waving over the field of signal processing.  Jon gave a very basic introduction to perception (cochlea, basilar membrane), digital audio signals, complex exponentials, sinusoids, and resynthesis.  Notwithstanding the elemental nature of today's lecture, during which I found myself installing Octave packages on my laptop, I learned something really important about dB conversion.


I had been confused by linear-to-dB conversions that sometimes involve 20*log something and sometimes 10*log something.  I guess I just wasn't looking closely enough.  Jon pointed out today that given a signal x(t), the dB representation of that signal is 10*log10( abs( x(t) )^2 ), but it is sometimes written as 20*log10( abs( x(t) ) ).  This follows from one of the logarithmic identities, log(a^b) = b*log(a), but I still wondered why one would choose one form over the other.  First of all, we need to look at why the dB system is the way it is: a decibel fundamentally measures a power ratio, and power is proportional to amplitude squared, which is why the input signal is squared in the first version of the calculation.  The power of two can be pulled out front by the identity, and I suspect the 20*log form is desirable in computer systems because it removes the squaring operation from the computation.
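As a sanity check, here's a throwaway snippet (my own, nothing from the lecture) showing the two forms agree:

```cpp
// Quick check that 10*log10(|x|^2) == 20*log10(|x|), per the identity
// log(a^b) = b*log(a).  Throwaway code of my own, not from the lecture.
#include <cmath>
#include <cstdio>

double dbFromPower(double x)     { return 10.0 * std::log10(x * x); }
double dbFromAmplitude(double x) { return 20.0 * std::log10(std::fabs(x)); }

int main() {
    double x = 0.5;  // a half-scale signal value
    // both print -6.0206 dB
    std::printf("%f dB vs %f dB\n", dbFromPower(x), dbFromAmplitude(x));
    return 0;
}
```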

Matriculation / Day 1



I'm officially a Stanford student and CCRMAlite.  You can check out the details of my coursework over at http://ccrma.stanford.edu/~adam.  Today was my second day of classes, and some first impressions are due.


First off, the Knoll (CCRMA headquarters) is a mansion dating to the 1910s that was once designated as the president's residence.  It feels rather like a castle, complete with stone construction, vaulted ceilings, and a spiral staircase.  The top floor offers views of San Francisco Bay to the north and the Stanford Dish / western foothills to the south.  It seems like the perfect place to watch the sun rise with a cup of strong coffee after an all-night coding jam.


All six of my courses take place in the same classroom, room 217.  To get there you go past the music tech museum, up the spiral stairs, through a computer lab, and through another computer lab.  The room is furnished with long shared desks with AC outlets for all, and a projector shines onto the bare white wall at the front.  The walls are still adorned with old lamps and long drapes, and when chairs become scarce students sit in the windowsills or on the old radiator.  The room, perhaps the whole building, is reminiscent of J.F. Sebastian/J.R. Isidore's crib.  In short, it's a geek haven.


Music 250a
HCI Theory and Practice (AKA Physical Interaction Design for Music) - Edgar Berdahl & Wendy Ju


HCI Theory and Practice is a funny name for this course, as it is really about hardware hacking and home-made alternative musical interfaces.  The students in this course have to buy a kit which contains an Arduino and some sensors.  The thrust of the course is essentially to get us to learn a thing or two about electronics, embedded software, and using the two to control real-time audio processes running on a computer.  More specifically, the project involves the integration of sensors, Arduino, firmware, and Max/MSP or PD.  I'm glad it's not a formal survey of HCI topics as presented in Computer Science; I worked a little bit with an Arduino at CalArts, and I've been meaning to get back into that.
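To give a flavor of that integration, here's my guess at the firmware half of the loop (hypothetical pin and rate, not actual course code): read a sensor on an analog pin and stream it over serial, where a Max/MSP or PD patch can pick it up.

```cpp
// Minimal sketch of the sensor-to-computer path (my guess at the shape of
// the course project, not actual course code): read one analog sensor and
// stream readings over serial for a Max/MSP or PD patch to parse.
const int kSensorPin = 0;  // hypothetical: any analog sensor (FSR, pot, etc.)

void setup() {
    Serial.begin(9600);
}

void loop() {
    int raw = analogRead(kSensorPin);  // 0..1023 from the 10-bit ADC
    Serial.println(raw);               // one reading per line for easy parsing
    delay(10);                         // ~100 Hz, plenty for a control signal
}
```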


The course is taught by Ed Berdahl and Wendy Ju.  I had met Ed back in November 2008 when Miriam Kolar introduced me around.  He's very articulate and clearly enthusiastic about HCI for music.  He and Wendy presented some videos to get the gears turning, which can be found at http://ccrma.stanford.edu/courses/250a/videos.html.




Music 256a
Music, Computing, and Design I:
Software Design and Implementation for Computer Music - Ge Wang



Just to put it out there, Ge Wang is a bit of a celebrity in the computer music world.  Before I got to CCRMA, many people I talked with had heard of him, or at least knew of his projects.  It's no surprise, since he is the founder of the successful iPhone app company Smule, and he authored the ChucK programming language for real-time audio noodling/performing.  No doubt, his celebrity status is earned from his charismatic personality and open enthusiasm about new ideas, ranging from deeply intellectual issues to silly novelties.  Clearly, there are many fascinating sides to this man, and I'm thrilled to have someone this respectable as a professor (he's also my MST program advisor).


As for the course content, the first day was a lot of hand-waving about some really hefty issues.  For example, he was trying to briefly mention the follow-up course, 256b - Mobile Music, but ended up on a lengthy tangent about how handheld devices combine intimacy, communication, and creativity in a way that can change how people think about making music, much as Beethoven changed how people think about music.  He also touched on software design principles like polymorphism, a term which, combined with the requisite C/C++ experience, had to have scared away some newbies (hopefully).
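For the uninitiated, polymorphism in an audio setting looks roughly like this (a toy sketch of my own, not anything from the lecture): different generators share one interface, so downstream code doesn't care which concrete type it's talking to.

```cpp
// Toy illustration of polymorphism (my own sketch, not from the lecture):
// a mixer can sum any generators without knowing their concrete types.
#include <cmath>
#include <cstdlib>
#include <memory>
#include <vector>

struct Generator {
    virtual ~Generator() = default;
    virtual float tick() = 0;  // produce the next sample
};

struct Sine : Generator {
    float phase = 0.0f;
    float tick() override {
        float out = std::sin(phase);
        phase += 2.0f * 3.14159265f * 440.0f / 44100.0f;  // 440 Hz at 44.1 kHz
        return out;
    }
};

struct Noise : Generator {
    float tick() override { return 2.0f * std::rand() / RAND_MAX - 1.0f; }
};

float mixOne(std::vector<std::unique_ptr<Generator>>& gens) {
    float sum = 0.0f;
    for (auto& g : gens) sum += g->tick();  // dispatch picks the right tick()
    return sum;
}
```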


Thursday, September 17, 2009

Don't Come Back

I'm in La Quinta, CA, where it's 93°F at 9:45 PM.  There is a well-known historic resort here where Art & Logic is having their annual conference.  This is my last day of work.  Tonight at the final dinner by the waterfall, the president of the company approached me by the dessert buffet and said, "Don't come back."

Working for Art & Logic is pretty much a dream job.  Developers work from home, share administrative and managerial duties, and are more or less treated like gods.  The client list is long and attractive, we have a great reputation, and the presidents behave like doting fathers.  I have never once experienced a moment when I felt that my common sense was being undermined.  At times, clients have a tendency to stretch their expectations too far, but the management team at A&L are experts at handling these issues, and the development teams are thus insulated from stupidity (usually).

Many have asked why I am leaving such a cool job to return to school and incur an enormous debt when I clearly possess the wherewithal to have a successful career as a software engineer.  The answer is simple: passion.  I have a passion for audio technology and I want to make it the focus of my career.  I have great ideas and I want to turn them into products.  I am mystified by the rigorous mathematics in DSP, and I need to understand it.  That's exactly how Paul, the president, felt back in the dark ages.  He told me he is jealous because I have the opportunity to start at the beginning, to pursue my passion, and to never lose sight of why I got into engineering in the first place.  If he were to live vicariously through me, it would be his chance to do it all over.

So that's why he gave me some of the best advice I've ever received: "Don't come back."

Tuesday, September 15, 2009

Becoming a music technologist

There are a lot of people out there who use music production tools for their art and dream of new ways to work, or have ideas for new sounds that they can only hear in their mind, or have thought of a revolutionary way to approach the process of sound synthesis, or sampling, or effects, or composition...

I am one of those people, and I have come to a point where I feel comfortable firing up a software project and putting pen to paper with a new idea.  But I also know which areas of my knowledge need work, which is of course the most important thing to know.  I remember when I had no idea where to start.  I remember how overwhelmed I felt when the idea of creating tools seemed unattainable.  But by taking careful steps with the help of some fantastic people and a share of lucky breaks, I've acquired a set of skills that will help me as I continue the journey.  So for anyone with similar interests, I think I should explain a bit about what I do and how I got to this point.

What I do:  I've been working as a programmer for just over three years.  The first two were for Audio Impressions, where I developed some VST plug-ins and a big Windows app called DVZ RT.  It was a really cool experience and I learned a lot about how software is made.  For the past year I've been working for a really cool software outsourcing company called Art & Logic.  I found them when I first started looking for programming jobs, but decided to save that application for when I had more experience.  Serendipitously, just after I left Audio Impressions to intensify my studies in Computer Science (more on that in a bit), Art & Logic found me!  I primarily work on the Beat Thang drum machine, but have done some consulting on other projects related to music, audio, and other multimedia applications.  Now I'm about to start the MST program at Stanford, and I'll probably be leaving Art & Logic to focus on that.  I'd like to stay on board to do a dash of consulting, but that is up to some higher-ups.

How I got to this point:  It all started with a love of electronic music and a really bad attitude towards school.  I was a crappy student in high school, especially in math, and my collegiate options were pretty limited.  I only applied to schools with music tech programs (CalArts, U of Oregon, Berklee, Oberlin, U of Michigan, U of Miami).  I was accepted to Michigan, Oregon, and Berklee.  I went to Michigan.  Looking back I think I should have gone to Oregon, because I hated Michigan.  I hated the stuck up university attitudes, the arrogant professors, the cold weather, and the damned math requirement.  So I reapplied to CalArts, which was close to my home town (Los Angeles), and they let me in.  Problem solved, yay!  Except for one thing....

CalArts, wonderful place.  Full of diverse, interesting, and passionate artists who want nothing but to be able to make art.  It's sorely lacking on the general education side, which shouldn't be a problem for most, in particular older undergraduates with prior education and of course grad students.  For the average-age college kid, however, it's a curse.  Oh well.  The problem for me was that CalArts was where I got into engineering.  I'm not exaggerating when I say that CalArts is the absolute worst place to take up engineering, except for the fact that you have plenty of free time to figure stuff out for yourself.  And that's exactly what I did.  I took a C programming class and then taught myself how to code in C++, create VST and Audio Unit plug-ins, make user interfaces, and a bunch of other stuff I became obsessed with.  My teachers, in particular Miriam Kolar and Mark Trayle, my friends, in particular Cooper Baker and Tony Cantor, and the fine folks at the KvR Audio DSP and Plug-in Development forum helped me immensely along the way.  Constant hacking from late 2004 till I graduated in spring 2006 produced a portfolio that landed me the job at Audio Impressions.

Audio Impressions (AI) definitely took a risk in hiring me.  They wanted someone with hands-on experience coding VST plug-ins, rather than a computer science graduate.  And clearly they wanted someone cheap, but don't get me wrong, I was fine with that.  Scoring that job was pure elation.  I owe it all to Stan Bartilson, the "chief software architect," who had decades of experience but nonetheless offered me the chance to prove that a dedicated hacker can learn to do anything, as long as the focus and passion are there.

As it turns out, I'm not so sure that wanting it is really all it takes to be a great software engineer.  There's a reason why there is a branch of academia called Computer Science, separate and distinct from computer programming.  The latter relies heavily on knowledge of the former, and, as with any science, the academic environment is best suited to learning it.  Not to say that it can't all be picked up in the field, but as soon as I had to have conversations about optimal data structures, search and sort algorithms, and assembly language programming, I pretty much immediately enrolled in computer science and math courses and split my time between work and school.

And so, the crappy student who avoided math like the plague got into engineering at an art school and wound up voluntarily enrolling himself in math courses.  LOTS of math courses.  I picked ones that seemed to be essential to DSP and computer science:
  • Pre-Calculus
  • Differential Calculus
  • Multivariable Calculus
  • Linear Algebra
  • Discrete Math
And as for computer science, I figured I should take all the courses needed to fulfill the breadth requirement for application to UCLA's C.S. Master's degree:
  • Computer Architecture
  • Data Structures and Algorithms
  • Computer Organization
  • Programming Languages
  • Operating Systems
  • Formal Languages and Automata Theory
  • Software Engineering
As you can see, I loved this stuff, at least enough to split my income in half for 2.5 years to pursue it.  I found my calling.  I was going to get a master's in Computer Science.  Then I looked into the fields of study within computer science: Human Computer Interaction, Computer Vision, Artificial Intelligence, Security, Databases, Networking, Theory.  And you know what?  None of that had anything to do with music or audio.  I could have made my way through one of those areas and applied it to music technology, but in the end I would have ended up going down a much more travelled path.  And if you ask me, that's boring!


As far as I know, the MST program at Stanford is the only music technology program that provides the theoretical rigor I am craving, as I already have the fundamental background and engineering prowess, and I'm not looking to make art in an academic setting.  So it was the only program I applied to.  I'm in, I'm thrilled, and classes start in 6 days!

1st post hahaha!!11!!one!!1!eleven

I've decided to create a web log to chronicle my studies as an MST student at Stanford's Center for Computer Research in Music and Acoustics (CCRMA [pronounced 'karma']).  This is mostly for my own use.  It's a way to enhance my learning by trying to explain some of the mind-boggling stuff I'm about to study.  I'll also add other interesting tidbits relating to programming, digital signal processing (DSP), and other music technology related findings.