{"id":36146,"date":"2026-07-02T08:22:15","date_gmt":"2026-07-02T15:22:15","guid":{"rendered":"https:\/\/www.podfeet.com\/blog\/?p=36146"},"modified":"2026-07-02T08:22:15","modified_gmt":"2026-07-02T15:22:15","slug":"audiobook-audio-v3-part-4-choosing-the-book-voice","status":"publish","type":"post","link":"https:\/\/www.podfeet.com\/blog\/2026\/07\/audiobook-audio-v3-part-4-choosing-the-book-voice\/","title":{"rendered":"Audiobook Audio (V3) Part 4: Choosing the Book Voice"},"content":{"rendered":"<h3>From repeatable sound to chosen sound<\/h3>\n<p>The previous part of this series treated mic technique as processing. Not metaphorically. Literally. Distance changes tone. Angle changes sharpness. Height changes the balance of the voice. And once I accepted that, I had to make microphone position boring and repeatable.<\/p>\n<p>Same stool. Same mic. Same place in the room. Same distance reference. That solved one problem, and then, annoyingly, created a better one, because once I could repeat the sound, I had to choose the sound.<\/p>\n<p>That is what this part is really about. Not &#8220;how do I get a good sound?&#8221; in some abstract, internet-forum sense. The more useful question is much more awkward: what voice do I want the listener to live with for the whole book?<\/p>\n<h3>The uncomfortable bit is that there is no correct tone<\/h3>\n<p>This was more uncomfortable than I expected, because I wanted there to be a correct answer. I wanted the test to be something like: record five samples, listen carefully, pick the one with the most clarity, and have the answer. But audiobooks are not that simple.<\/p>\n<p>A voice can be clear and still tiring. It can be warm and still muddy. It can sound intimate in ten seconds and start leaning on the listener after ten minutes. And the really annoying part is that &#8220;better&#8221; depends on the book.<\/p>\n<p>Some books want a little distance and calm. Some want intimacy. 
Some need a narrator who feels as if they are sitting beside you. Others need a bit more air around them. None of those choices is automatically more professional.<\/p>\n<p>So when I say &#8220;finding the sound&#8221;, I do not mean discovering the objectively best version of my voice. I mean choosing the narrator I am going to be for this book. That sounds a bit grand, but it is also very practical. If I do not choose that voice at capture, I will be tempted to invent it later with EQ, de-essing, compression, and all the other little nudges that feel harmless until they have accumulated across hours of audio. V3 is about not doing that.<\/p>\n<h3>So the mic became my tone control<\/h3>\n<p>In the old version of this problem, if the voice felt slightly wrong, I would start thinking in plugins. A little EQ. A little de-essing. Maybe something to make it sound more finished.<\/p>\n<p>And sometimes that works, in the same way that adding more seasoning can rescue dinner. But if every dinner needs rescuing, the seasoning is not the real problem. For audiobooks, I now want the recorded voice to be boringly close to finished before I touch it.<\/p>\n<p>That pushed me back to the simplest tone control I had: mic position. Not because plugins are bad. They are useful. But mic position changes the sound before it becomes a file. It changes the way the voice exists, rather than asking software to reshape it afterwards.<\/p>\n<p>And because I had already done the boring consistency work, I could finally experiment without lying to myself. If a take sounded different, it was because I changed something on purpose, not because I sat slightly differently, leaned in without noticing, or aimed my mouth at a new bit of microphone by accident.<\/p>\n<h3>I kept the test to three versions<\/h3>\n<p>The test was deliberately small. I did not want too many options, because that would just give me too many ways to second-guess myself. 
I wanted three useful choices:<\/p>\n<ul>\n<li>my baseline, balanced sound;<\/li>\n<li>a slightly closer, warmer version;<\/li>\n<li>a slightly more off-axis version, a bit softer around the edges.<\/li>\n<\/ul>\n<p>Same paragraph. Same energy. Same room. Same everything except the thing I was testing. And this is where I think an audio demo is worth hearing, but you&#8217;d need the podcast version for that.<\/p>\n<p>The result, though, is that the baseline is balanced, familiar, and not doing anything dramatic. The closer recording has more warmth and intimacy. The voice comes forward. It feels more present. And the off-axis version softens the top a little. It can make the voice feel smoother, less pointy, and less eager to poke the listener in the ear.<\/p>\n<p>None of these is &#8220;right&#8221;. They are flavours. And that was the point. Once I had repeatability, mic position stopped being a source of drift and became a small menu of deliberate tonal choices.<\/p>\n<h3>Then there is the trap of the impressive sample<\/h3>\n<p>Here is where I nearly made the wrong decision. The close version was the one that made me smile first. It had that nice immediate warmth. It sounded more intimate. It sounded, in the dangerous ten-second sense, better.<\/p>\n<p>And this is exactly how I get myself into trouble, because audiobooks are not listened to in ten-second comparisons. Nobody buys an audiobook and thinks, &#8220;Excellent, I hope this wins a quick A\/B test.&#8221; They live with it. They cook with it. They walk with it. They fall asleep to it. They listen when they are tired, distracted, or wearing headphones that do not flatter anything.<\/p>\n<p>So I stopped deciding while I was excited. I recorded the samples, left them alone, and came back later. Sometimes the next day. Sometimes in a different mood. Then I listened like a listener rather than like someone trying to prove a point. Longer stretches. No scrubbing. No staring at the waveform. 
No trying to hear the cleverness. Just: can I forget about the sound and follow the words? That became the test.<\/p>\n<h3>So, what did I actually choose?<\/h3>\n<p>In my case, the answer was not the most dramatic option. The version I kept coming back to was slightly closer than my old baseline, but not the closest, richest, most &#8220;produced&#8221; version.<\/p>\n<p>I wanted some of that warmth and intimacy, because Jern&#8217;s stories are character-led and emotionally close. I do not want the narration to feel distant or detached. But I also did not want the voice to become thick, or too present, or too impressed with itself.<\/p>\n<p>So the sound I chose was: close enough to feel human, angled enough to keep the consonants comfortable, and plain enough that after a while I stopped noticing it. That last part matters. The winning sound was not the most obviously impressive one. It was the one that got out of the story&#8217;s way.<\/p>\n<h3>This is where fatigue beats fidelity<\/h3>\n<p>This is the part that changed my judgement. I used to think I was looking for fidelity: maximum detail, maximum clarity, maximum voice. But long-form narration does not reward maximum anything. The real constraint is fatigue.<\/p>\n<p>Too much detail becomes attention. Too much presence becomes pressure. Too much brightness makes every S and mouth noise feel important. Too much warmth can make the voice feel heavy.<\/p>\n<p>The best audiobook sound, at least for me, is not the most impressive sound. It is the sound that still feels easy after an hour.<\/p>\n<p>Clear enough that the listener does not work. Stable enough that nothing keeps changing. Human enough that it feels like a person telling the story. Modest enough that the voice does not become the subject.<\/p>\n<h3>Choosing also means refusing<\/h3>\n<p>Once I chose the book voice, I wrote it down. Not in mystical language. 
In boring physical terms:<\/p>\n<ul>\n<li>stool position;<\/li>\n<li>mic height;<\/li>\n<li>distance reference;<\/li>\n<li>angle;<\/li>\n<li>what I am listening for;<\/li>\n<li>and what counts as drift.<\/li>\n<\/ul>\n<p>Because the choice is only useful if I can return to it tomorrow. This is where choosing becomes refusing.<\/p>\n<p>I am refusing to freestyle the tone chapter by chapter. I am refusing to chase a slightly better sound every time I sit down. I am refusing to let a tired Tuesday version of me accidentally create a different narrator from the cheerful Saturday version.<\/p>\n<p>That sounds rigid, but it is actually freeing. Once the book voice is chosen, I can stop auditioning microphones in my head and get back to the sentence in front of me.<\/p>\n<h3>So, where does that leave us?<\/h3>\n<p>So that is the shift in Part 4. Mic position is not just a way to avoid bad sound. Once it is repeatable, it becomes a creative control. But the creative decision is not &#8220;what sounds best in isolation?&#8221; It is: what sound can carry this book for hours without asking to be admired?<\/p>\n<p>For me, that meant choosing a slightly closer, warm-but-not-too-warm placement, keeping the angle comfortable, and treating that as part of the production plan. Now the next problem appears: if the microphone is hearing me consistently, it will hear my habits consistently too: the rushed breaths, the mouth clicks, the slightly panicked retakes, all the little performance problems I used to assume I could fix later.<\/p>\n<p>So next time, I will talk about removing that safety net, and why the next upgrade is not more processing. It is better behaviour at the microphone.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>From repeatable sound to chosen sound The previous part of this series treated mic technique as processing. Not metaphorically. Literally. Distance changes tone. Angle changes sharpness. Height changes the balance of the voice. 
And once I accepted that, I had to make microphone position boring and repeatable. Same stool. Same mic. Same place in the [&hellip;]<\/p>\n","protected":false},"author":34,"featured_media":36211,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_feature_clip_id":0,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_post_was_ever_published":false},"categories":[1],"tags":[7881,8287,8290,8288,7891,8291,7896,7892,7897,8289],"class_list":["post-36146","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-podcasts","tag-audiobook-audio-v3","tag-book-voice","tag-listener-fatigue","tag-mic-position","tag-microphone-placement","tag-narration-style","tag-off-axis","tag-proximity-effect","tag-repeatable-setup","tag-tone-selection"],"jetpack_featured_media_url":"https:\/\/www.podfeet.com\/blog\/wp-content\/uploads\/2026\/06\/04-choosing-the-book-voice-hero.png","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/posts\/36146","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/users\/34"}],"replies":[{"embeddable":true,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/comments?post=36146"}],"version-history":[{"count":4,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/posts\/36146\/revisions"}],"predecessor-version":[{"id":36212,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/posts\/36146\/revisions\/36212"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.podfeet.com\/blo
g\/wp-json\/wp\/v2\/media\/36211"}],"wp:attachment":[{"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/media?parent=36146"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/categories?post=36146"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.podfeet.com\/blog\/wp-json\/wp\/v2\/tags?post=36146"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}