✦ Completed
2026

Browser
Synth.

A three-oscillator subtractive synthesizer built entirely on the Web Audio API — no frameworks, no libraries, no server. Describe a sound in plain English and the AI builds it for you.

JavaScript · Web Audio API · OpenAI GPT-4o-mini · MIDI · AI / NLP · Sound Design
3 Oscillators · 50+ Parameters · 44.1 kHz Sample Rate · 0 Dependencies

Sound design shouldn't require a degree.

Most synthesizers are built for people who already know synthesis. The interface is a wall of knobs labeled with words like "resonance", "attack", and "LFO rate" — terms that mean nothing to someone who just wants to make a sound they hear in their head. The learning curve isn't steep; it's vertical.

This project started from a different question: what if you could describe a sound the way you'd describe it to a friend? Say "warm bass pad" or "icy bell that shimmers" or "punchy 808 with a long tail" — and the synth figures out the rest. That's the core idea. The AI doesn't replace the knobs; it gives you a starting point you can hear immediately, then you tweak from there.

Under the hood, the synth uses GPT-4o-mini with a heavily engineered system prompt — a 3,000-word sound design reference guide that teaches the model the physics of waveforms, the relationships between filter types and harmonic content, and 20+ real preset recipes it can use as building blocks. When you type "dreamy keys," the AI isn't guessing — it's blending the envelope of a vibraphone with the detuned oscillators of a lush pad and the reverb profile of a concert hall.

The entire synth runs client-side in the browser with zero server infrastructure. The Web Audio API handles real-time DSP — oscillators, filters, envelopes, effects chains, a brick-wall limiter — all at 44.1 kHz with low latency. MIDI keyboards plug in via the Web MIDI API with hot-plug detection and sustain pedal support. You can export anything you make as a WAV file or save it as a .syn patch file to share with others.

The goal is a synth that a professional producer would use alongside their DAW, but that a fourteen-year-old with no music theory could pick up and start creating with in under thirty seconds.

The Problem
Synthesizer UIs assume expert knowledge. New music creators face months of learning before they can translate ideas into sound.
The Approach
Natural language → AI sound designer → real-time audio. Describe what you hear in your head, the AI configures 50+ parameters, and you hear the result instantly.
Key Insight
The AI system prompt contains 20 reference presets as "training targets" — not to copy, but to blend. "Icy bell pad" = glass_bell oscillators + cold_pad envelope + reverb. Novel sounds come from novel combinations.
Stack
Vanilla JavaScript · Web Audio API · Web MIDI API · OpenAI GPT-4o-mini · Zero dependencies · Zero frameworks
[Embedded interactive demo: an AI Sound Designer prompt box ("Type a sound description and hit Generate"), a three-oscillator stack (A/B/C) with unison and spread, pitch glide and bend range, a master filter with cutoff and resonance, velocity routing, amplitude envelope, pan, an arpeggiator and step sequencer, effects (convolution reverb, feedback delay, chorus, waveshaper distortion), and a playable C3–B4 keyboard with computer-key and MIDI input, BPM control, master volume, and patch save/load.]

Under the Hood

How the synth actually works.

Every sound you hear from this synth is built in real-time by your browser. There are no audio samples, no pre-recorded clips — just math. Here's a plain-language walkthrough of the pieces that make it work and why each one matters for creating sounds that actually sound good.

Three Oscillators (A, B, C)
An oscillator is a repeating wave that produces a tone. This synth gives you three of them, each independently configurable — different wave shapes (sine, saw, square, triangle), different octaves, different tuning. Why three? Because real instruments are never a single pure tone. A piano string vibrates at its fundamental pitch plus dozens of overtones. Layering oscillators — a sawtooth for brightness, a sub-sine for weight, a detuned saw for width — is how you build sounds with the richness and complexity of real instruments from simple building blocks.
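The layering described above can be sketched with plain Web Audio nodes. This is a minimal illustration, not the synth's actual code; the wave choices, octave offsets, and detune amounts are assumptions:

```javascript
// Pure helper: shift a base frequency by whole octaves.
const octaveShift = (freq, octaves) => freq * 2 ** octaves;

// Sketch: one voice made of three layered OscillatorNodes.
// `ctx` is an AudioContext; the layer config below is illustrative.
function createVoice(ctx, freq) {
  const mix = ctx.createGain(); // sums the three layers
  const layers = [
    { type: 'sawtooth', octaves: 0,  detune: +7 }, // brightness
    { type: 'sine',     octaves: -1, detune: 0  }, // sub-octave weight
    { type: 'sawtooth', octaves: 0,  detune: -7 }, // detuned width
  ];
  const oscs = layers.map(l => {
    const osc = ctx.createOscillator();
    osc.type = l.type;
    osc.frequency.value = octaveShift(freq, l.octaves);
    osc.detune.value = l.detune; // fine tune, in cents
    osc.connect(mix);
    return osc;
  });
  return { oscs, mix }; // caller starts oscs and routes mix onward
}
```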
Subtractive Filter
A lowpass filter removes high-frequency overtones from a harmonically rich wave. It's called "subtractive synthesis" because you start bright and carve away frequencies to shape the tone. The cutoff knob controls where the filter starts cutting, and resonance boosts the frequencies right at the cutoff point, creating that classic synth "squelch." The filter envelope amount lets the filter sweep open on each note attack and close as it decays — this is the single most important parameter for making a synth sound alive rather than static.
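In Web Audio terms the filter sweep might look like the sketch below. The exponential knob-to-frequency mapping and the 20 Hz to 20 kHz range are assumptions, not the synth's exact curve:

```javascript
// Pure helper: map a 0–1 cutoff knob to Hz on an exponential curve
// (assumed 20 Hz .. 20 kHz range; ears hear pitch logarithmically).
const knobToCutoff = k => 20 * (20000 / 20) ** k;

// Sketch: lowpass filter whose cutoff sweeps open on note attack.
function applyFilter(ctx, input, { cutoff, resonance, envAmount, attack, decay }) {
  const filter = ctx.createBiquadFilter();
  filter.type = 'lowpass';
  filter.Q.value = resonance;                 // boost at the cutoff point
  const base = knobToCutoff(cutoff);
  const peak = Math.min(20000, base + envAmount * (20000 - base));
  const t = ctx.currentTime;
  filter.frequency.setValueAtTime(base, t);
  filter.frequency.linearRampToValueAtTime(peak, t + attack); // sweep open
  filter.frequency.setTargetAtTime(base, t + attack, decay);  // fall back
  input.connect(filter);
  return filter;
}
```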
ADSR Envelope
Attack, Decay, Sustain, Release — these four values define the volume shape of every note. Attack is how fast the sound reaches full volume (instant for a pluck, slow for a pad). Decay is how quickly it drops from peak to sustain level. Sustain is the held volume while your finger is on the key. Release is how long the sound lingers after you let go. This simple four-stage curve is what separates a piano hit from an organ drone from a bell ring — same pitch, completely different feel.
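A sketch of how the four stages could be scheduled on a GainNode using standard AudioParam automation (the release shaping via `setTargetAtTime` is one common choice, not necessarily the synth's):

```javascript
// Pure helper: ideal ADSR level at time t after note-on (e.g. for meters).
function adsrLevel(t, { attack, decay, sustain }) {
  if (t < attack) return t / attack;                              // rising
  if (t < attack + decay) return 1 - (1 - sustain) * ((t - attack) / decay);
  return sustain;                                                 // held
}

// Sketch: apply the envelope to a GainNode. Times in seconds, sustain 0–1.
function noteOn(gain, t, { attack, decay, sustain }) {
  gain.gain.cancelScheduledValues(t);
  gain.gain.setValueAtTime(0, t);
  gain.gain.linearRampToValueAtTime(1, t + attack);               // attack
  gain.gain.linearRampToValueAtTime(sustain, t + attack + decay); // decay
}
function noteOff(gain, t, { release }) {
  gain.gain.cancelScheduledValues(t);
  gain.gain.setTargetAtTime(0, t, release / 4); // exponential release tail
}
```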
Effects Chain (Reverb, Delay, Chorus, Distortion)
Dry synth sounds exist in a vacuum. Effects place them in a space. Reverb simulates a room — a convolution-based impulse response built from randomized decay curves. Delay creates echoes with a feedback loop and a lowpass filter to simulate distance. Chorus duplicates the signal with a modulated delay, creating the lush doubling effect of an ensemble. Distortion clips the waveform to add harmonic saturation. Each effect has a wet/dry mix so you control how much processing hits the signal.
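The delay described above — a feedback loop with a lowpass to darken each repeat — can be sketched from standard nodes; the parameter names here are illustrative:

```javascript
// Pure helper: level of the nth echo given a feedback gain < 1.
const echoLevel = (feedback, n) => feedback ** n;

// Sketch: feedback delay with in-loop lowpass damping.
function makeDelay(ctx, { time, feedback, tone }) {
  const delay = ctx.createDelay(2);   // max 2 s of delay line
  delay.delayTime.value = time;
  const fb = ctx.createGain();
  fb.gain.value = feedback;           // < 1 so echoes decay
  const damp = ctx.createBiquadFilter();
  damp.type = 'lowpass';
  damp.frequency.value = tone;        // each repeat gets darker: "distance"
  delay.connect(damp).connect(fb).connect(delay); // the feedback loop
  return delay;                       // mix with the dry path via wet/dry gains
}
```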
Brick-Wall Limiter
When you stack three oscillators, add distortion, and crank the resonance, the output signal can spike well above safe levels. A DynamicsCompressor node configured as a limiter sits at the end of the signal chain with a 20:1 ratio and 0.5ms attack — fast enough to catch any transient before it clips your speakers. This is the same technique used in professional mastering chains. Without it, certain patches would produce painful digital distortion instead of clean, loud output.
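The limiter configuration in the text maps directly onto a `DynamicsCompressorNode`. The ratio and attack below come from the description; the threshold, knee, and release values are assumptions:

```javascript
// Sketch: DynamicsCompressorNode configured as a brick-wall limiter.
const LIMITER_SETTINGS = {
  threshold: -1,   // dB: catch peaks just below full scale (assumed)
  knee: 0,         // hard knee for brick-wall behaviour (assumed)
  ratio: 20,       // 20:1, effectively limiting (from the text)
  attack: 0.0005,  // 0.5 ms: fast enough for transients (from the text)
  release: 0.1,    // seconds (assumed)
};

function makeLimiter(ctx) {
  const limiter = ctx.createDynamicsCompressor();
  for (const [param, value] of Object.entries(LIMITER_SETTINGS)) {
    limiter[param].setValueAtTime(value, ctx.currentTime);
  }
  return limiter; // last node before ctx.destination
}
```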
Step Sequencer / Arpeggiator
Hold a chord and the arpeggiator breaks it into a rhythmic pattern — cycling through the notes up, down, up-down, randomly, or as a full chord. The 32-step sequencer adds per-step velocity (how hard each note hits), transpose (pitch shift in semitones), and chord voicings (stacking intervals on each step). Swing, humanity (random timing jitter), and life (velocity randomization) make mechanical sequences feel human. 18 built-in presets cover everything from classic arpeggios to complex polyrhythmic patterns.
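The pattern expansion is pure logic and can be sketched independently of the audio graph. The pattern names below are illustrative, not the synth's exact preset names:

```javascript
// Sketch: expand held MIDI notes into one arpeggio cycle by pattern.
function arpeggiate(notes, pattern) {
  const up = [...notes].sort((a, b) => a - b); // ascending pitch order
  switch (pattern) {
    case 'up':     return up;
    case 'down':   return [...up].reverse();
    // up-down drops the repeated endpoints so the cycle loops smoothly
    case 'updown': return up.concat([...up].reverse().slice(1, -1));
    case 'random': return up.map(() => up[Math.floor(Math.random() * up.length)]);
    default:       return up;
  }
}
```

A clock (driven by the rate and swing settings) would then step through this array, applying per-step velocity and transpose on top.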
AI Sound Designer
The AI prompt system sends your description plus the current synth state to GPT-4o-mini, which returns a JSON object mapping to all 50+ synth parameters. The system prompt is a 3,000-word sound design manual containing waveform physics, envelope dictionaries, 20 complete voice recipes, and 18 arp preset references. The AI doesn't randomly generate values — it identifies the 2–3 closest reference presets to your description and blends their parameters. "Icy bell pad" takes oscillator config from glass_bell, envelope from cold_pad, and reverb from lush_pad.
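The request shape might look like the sketch below. The endpoint and field names follow the standard OpenAI Chat Completions API; the message wording and state shape are illustrative, not the project's actual prompt:

```javascript
// Sketch: build the sound-design request sent to the model.
function buildRequest(description, synthState, systemPrompt) {
  return {
    model: 'gpt-4o-mini',
    response_format: { type: 'json_object' }, // force a parseable JSON reply
    messages: [
      { role: 'system', content: systemPrompt }, // the sound design manual
      { role: 'user', content:
          `Current state: ${JSON.stringify(synthState)}\nDescribe: ${description}` },
    ],
  };
}
// Sent via: fetch('https://api.openai.com/v1/chat/completions',
//   { method: 'POST', headers: { Authorization: `Bearer ${key}`,
//     'Content-Type': 'application/json' }, body: JSON.stringify(request) })
```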
MIDI & Keyboard Input
The Web MIDI API provides hot-plug detection — connect a MIDI keyboard at any time and the synth auto-binds it. Sustain pedal (CC 64) holds notes until the pedal lifts, matching the behavior of a real piano. Computer keyboard input maps a chromatic octave to the home row (A–K keys). Velocity sensitivity from MIDI controllers scales both volume and filter envelope depth, so playing harder produces brighter, louder notes — just like an acoustic instrument.
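Decoding the incoming bytes is pure logic; status bytes follow the MIDI 1.0 specification (0x90 note-on, 0x80 note-off, 0xB0 control change). The hot-plug wiring in the comment uses the standard Web MIDI API; the `bind` handler is hypothetical:

```javascript
// Sketch: decode the MIDI messages the synth cares about.
function parseMidi([status, data1, data2]) {
  const type = status & 0xf0;                  // strip the channel nibble
  if (type === 0x90 && data2 > 0)              // note-on with velocity
    return { kind: 'noteOn', note: data1, velocity: data2 / 127 };
  if (type === 0x80 || (type === 0x90 && data2 === 0)) // velocity-0 = off
    return { kind: 'noteOff', note: data1 };
  if (type === 0xb0 && data1 === 64)           // CC 64: sustain pedal
    return { kind: 'sustain', down: data2 >= 64 };
  return null;                                 // ignore everything else
}

// Hot-plug detection (browser only; `bind` is a hypothetical handler):
// navigator.requestMIDIAccess().then(access => {
//   access.onstatechange = e => {
//     if (e.port.type === 'input' && e.port.state === 'connected') bind(e.port);
//   };
// });
```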

The AI Pipeline

From English to audio in under two seconds.

01
User types a description
"Warm detuned pad with slow attack and big reverb." The prompt is free-form natural language — no syntax, no parameters, no constraints.
02
Current state is captured
The synth serializes its full parameter state — oscillators, filter, envelope, FX, arp config — into JSON and sends it alongside the prompt. This gives the AI context to make relative adjustments, not just blind presets.
03
GPT-4o-mini processes the request
The model receives a 3,000-word system prompt containing waveform physics, envelope dictionaries, 20 voice recipes, 18 arp patterns, and strict output format rules. It identifies the closest preset references and blends their values to match the description.
04
JSON response is parsed and applied
The response — a single JSON object with 50+ parameter values — is validated, parsed, and applied to the live audio graph. Knobs rotate, sliders move, oscillators reconfigure, effects toggle. The sound changes in real-time with no page reload.
05
User plays and tweaks
The AI gets you 80% of the way there. The remaining 20% is yours — adjust the filter cutoff, change the reverb decay, swap an oscillator waveform. Or describe another sound. The cycle is seconds, not hours.
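Step 04 above, validating the model's JSON before it touches the live audio graph, could be sketched like this. The parameter ranges and the `setParam` callback are illustrative, not the real schema:

```javascript
// Sketch: clamp and apply an AI-generated patch. Unknown or malformed
// parameters are skipped so a bad reply can't break the audio graph.
const RANGES = { cutoff: [20, 20000], resonance: [0.1, 30], reverbWet: [0, 1] };

function applyPatch(raw, setParam) {
  const patch = JSON.parse(raw);
  const applied = {};
  for (const [name, [min, max]] of Object.entries(RANGES)) {
    if (typeof patch[name] !== 'number') continue;        // missing/invalid
    const value = Math.min(max, Math.max(min, patch[name])); // clamp to range
    setParam(name, value);  // e.g. rotate the knob and set the AudioParam
    applied[name] = value;
  }
  return applied;
}
```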