Voice extraction from client emails: tuning AI to your own tone

The problem: why generic AI sounds like AI

An email generated by ChatGPT, Claude, or Gemini without configuration is instantly recognizable. The greeting is over-formal (“Hello, I hope this email finds you well”) instead of a direct “Hi Anna”. The body uses em-dashes every few sentences, leans heavily on passive voice, and closes with phrases like “I would be happy to discuss further”. The signoff defaults to “Best regards” instead of whatever this particular client typically writes. Taken together, these signals tell the recipient within 2-3 seconds that the message was AI-generated.

The business consequence is measurable. B2B clients in 2026 are sensitive to obvious AI content. The open rate stays high (the sender is recognizable), but the reply rate drops by 30-50 percent when the content sounds artificial. Cold outreach with a generic AI tone gets a 5-8 percent reply rate, against 15-25 percent with a matched tone. The trust signal also has a long-term effect on conversion: the client feels there is a human behind the assistant, not a startup with a ChatGPT API.

Second problem: brand consistency. If the firm’s owner writes 10 percent of emails personally, 30 percent go through a human assistant, and 60 percent through AI, the client sees three different tones in communication with one company. Subtle signals (greeting style, sentence rhythm, signoff, specific phrases) drift apart. This erodes brand voice at a level the client cannot name consciously, but instinctively registers as a lack of professionalism.

Third problem: regional accuracy. Out of the box, Claude and GPT write literary Polish with academic touches (an effect of the training data). B2B clients in Poland usually write more directly: shorter sentences, less passive voice, more tech anglicisms (workflow, brief, scope), and industry-specific phrases. DACH pulls in the other direction: German B2B emails are more formal (always Sie, long compound sentences, capitalized nouns). Generic AI loses this subtle tuning.

Three approaches to voice matching

First approach: a system prompt with a voice description. The simplest, with mediocre results. In the system prompt you write “tone: matter-of-fact, slightly formal, short sentences, no em-dashes” plus a few don’ts. In practice: Claude tries to adapt, but tends to regress to the default voice after a few turns. Effectiveness: 40-60 percent match; in a blind test the client often recognizes the text as AI. Cost: effectively zero (a few extra prompt tokens). Maintenance: zero.

Second approach: few-shot examples in the system prompt. Better results than the first approach. In the system prompt you embed 3-5 example emails from the client (real, with PII anonymized), and Claude patterns itself on them. Effectiveness: 60-80 percent match; in a blind test the client fails to spot AI in 50 percent of cases. Cost: the examples occupy 1500-3000 tokens in the prompt (cached via prompt caching), about 5 dollars a day for 200 exchanges. Maintenance: the examples need updating as the voice evolves.

Third approach: extraction into a structured voice profile plus an applied template. Highest effectiveness and repeatability. The pipeline (covered in the next section) extracts voice features from 30-50 client emails into a JSON profile (greeting, signoff, banned words, sentence patterns, capitalization preferences, anglicisms). The profile is embedded in the system prompt as structured data, plus the 2-3 best examples. Effectiveness: 80-95 percent match; in a blind test the client fails to spot AI in 70-80 percent of cases. Cost: a one-time extraction at around 10 dollars of Claude API usage; deployment is free. Maintenance: a monthly voice drift check.

We default to the third approach for clients with a personal brand (B2B owner-led firms): value materializes quickly, repeatability of implementation is high. For SMBs with 5 plus users (each with a different voice) we use the multi-profile pattern described in the edge cases section.

A trap worth mentioning: voice extraction is not cloning the client’s thinking, only their writing style. The AI assistant still has no access to private preferences, unspoken priorities, or subtle relationships with specific business partners. With a well-configured voice profile it generates a draft that sounds like the client, but decision-making (which side to back, which supplier to skip, when to propose a price negotiation) stays with the human. That is a reasonable limit, not a weakness: it preserves the client’s agency on important calls while eliminating the volume of routine dialogue.

Pipeline: 30 emails to a voice profile

Step 1: gathering emails. The client exports 20-50 emails from Gmail (Sent folder, last 6 months, a representative mix: B2B sales, B2B operational, B2C support if relevant). Format: mbox or EML files. A volume of 20 is enough for basic extraction, 50 is better for edge cases (different scenarios), 100 plus is overkill for most SMBs.
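A minimal sketch of reading such an export, assuming the client delivers a single mbox file and that only plain-text parts are of interest. It uses Python's standard `mailbox` module; the function names `plain_text_body` and `load_sent_emails` are hypothetical and chosen for illustration.

```python
import mailbox
from email.message import Message


def plain_text_body(msg: Message) -> str:
    """Return the first text/plain part of a message as a str ('' if none)."""
    for part in msg.walk():  # walk() also yields the message itself
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)  # bytes, or None for containers
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        return payload.decode(charset, errors="replace")
    return ""


def load_sent_emails(mbox_path: str, limit: int = 50) -> list:
    """Collect up to `limit` non-empty plain-text bodies from an mbox export."""
    emails = []
    for msg in mailbox.mbox(mbox_path, create=False):
        body = plain_text_body(msg).strip()
        if body:
            emails.append({"subject": str(msg.get("Subject", "")), "body": body})
        if len(emails) >= limit:
            break
    return emails
```

HTML-only messages are skipped here on purpose; whether to convert them to text instead is a judgment call that depends on how the client actually writes.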

Step 2: PII anonymization. Each email passes through the Claude API with a redaction system prompt: replace personal names with “[NAME]”, email addresses with “[EMAIL]”, phone numbers with “[PHONE]”, company names with “[COMPANY]”, and NIPs (Polish tax IDs) with “[NIP]”. Voice features (greeting style, signoff, paragraph structure) are preserved. Output: an anonymized version of each email. Cost: about 1 dollar for 50 emails on Sonnet 4.
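The redaction described above is done by the model. As a complement, a deterministic pre-pass can catch the structured identifiers before anything is sent to an API. The sketch below is that swapped-in technique, not the pipeline's own method: the regular expressions are illustrative assumptions (Polish-style 9-digit phone numbers, 10-digit NIPs), and names and company names still need the model-based step.

```python
import re

# Order matters: e-mail addresses go first so their digits are never
# mistaken for a phone number, and NIPs go before phones for the same reason.
PATTERNS = [
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    # NIP: 10 digits, plain or hyphen-grouped as 3-3-2-2 or 3-2-2-3.
    ("[NIP]", re.compile(
        r"\b(?:\d{3}-\d{3}-\d{2}-\d{2}|\d{3}-\d{2}-\d{2}-\d{3}|\d{10})\b")),
    # Phone: optional +country code, then 9 digits in groups of three.
    ("[PHONE]", re.compile(
        r"(?<!\w)(?:\+\d{1,3}[ -]?)?\d{3}[ -]?\d{3}[ -]?\d{3}\b")),
]


def redact_structured_pii(text: str) -> str:
    """Replace e-mail addresses, NIPs and phone numbers with placeholders."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    sample = "Call me at +48 601 234 567 or anna@example.com, NIP 123-456-32-18."
    print(redact_structured_pii(sample))
```

Running the pre-pass first means the most sensitive identifiers never leave the machine, while greetings, signoffs and paragraph structure stay untouched.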

Step 3: extraction analysis. Sonnet 4 receives all anonymized emails in a single prompt (50 emails at 500 words each is about 25k words, roughly 30-50k tokens depending on the language, which still fits in the context window), plus an instruction: “Extract a voice profile into JSON with fields: greeting_patterns, signoff_patterns, sentence_avg_length, paragraph_avg_sentences, common_phrases, banned_words (what the client never uses), capitalization_style, em_dash_usage, anglicism_preference, formal_vs_casual_score (1-10), domain_specific_terminology”. Output: a JSON voice profile of about 1-2 KB.
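A sketch of the two ends of this step: assembling the single extraction prompt and checking that the reply really is a profile with the listed fields. The field names come from the instruction above; the helper names, the email delimiter format and the "Return only the JSON object" suffix are assumptions, and the API call itself is deliberately left out.

```python
import json

PROFILE_FIELDS = [
    "greeting_patterns", "signoff_patterns", "sentence_avg_length",
    "paragraph_avg_sentences", "common_phrases", "banned_words",
    "capitalization_style", "em_dash_usage", "anglicism_preference",
    "formal_vs_casual_score", "domain_specific_terminology",
]


def build_extraction_prompt(anonymized_emails: list) -> str:
    """Put the instruction and all anonymized emails into one prompt string."""
    numbered = "\n\n".join(
        f"--- EMAIL {i} ---\n{body}"
        for i, body in enumerate(anonymized_emails, start=1)
    )
    instruction = (
        "Extract a voice profile into JSON with fields: "
        + ", ".join(PROFILE_FIELDS)
        + ". Return only the JSON object."
    )
    return instruction + "\n\n" + numbered


def parse_profile(raw_reply: str) -> dict:
    """Parse the model reply and fail loudly if the profile is incomplete."""
    profile = json.loads(raw_reply)  # raises ValueError on invalid JSON
    missing = [f for f in PROFILE_FIELDS if f not in profile]
    if missing:
        raise ValueError(f"profile is missing fields: {missing}")
    score = profile["formal_vs_casual_score"]
    if not isinstance(score, (int, float)) or not 1 <= score <= 10:
        raise ValueError("formal_vs_casual_score must be a number from 1 to 10")
    return profile
```

Validating before the client ever sees the profile keeps malformed output out of the review in step 4.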

Step 4: validation with the client. We show the client the voice profile JSON plus 3 generated example emails (different scenarios: sales pitch, operational follow-up, client complaint response). The client reads them and tells us whether they recognize themselves. We iterate up to 2 times, correcting the profile based on the feedback. This is the critical step: if the client does not recognize themselves, the profile is miscalibrated.

Step 5: deployment in the AI assistant. The voice profile JSON is inserted into the AI assistant’s system prompt (cached via prompt caching in Claude Agent SDK), together with the 2-3 best example emails as few-shot anchors and an instruction “match this voice profile when generating responses”. The system prompt grows by around 2 KB of profile text plus the example emails, and prompt caching minimizes the cost impact.
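A sketch of how the deployed system prompt could be assembled. The block shape (a text block carrying `cache_control` of type `ephemeral`) follows the Anthropic Messages API convention for prompt caching; how the Claude Agent SDK wires this up is not shown in the text, so treat the wiring as an assumption. The function name and the XML-style delimiters are hypothetical.

```python
import json


def build_system_blocks(profile: dict, examples: list) -> list:
    """Return system-prompt blocks: instruction, JSON profile, few-shot emails."""
    shots = "\n\n".join(f"<example>\n{e}\n</example>" for e in examples)
    text = (
        "Match this voice profile when generating responses.\n\n"
        "<voice_profile>\n"
        + json.dumps(profile, ensure_ascii=False, indent=2)
        + "\n</voice_profile>\n\n"
        + shots
    )
    # One cacheable block: the profile and examples change rarely, so the
    # whole prefix can be reused across exchanges.
    return [{"type": "text", "text": text, "cache_control": {"type": "ephemeral"}}]


if __name__ == "__main__":
    blocks = build_system_blocks(
        {"em_dash_usage": False, "banned_words": ["awesome"]},
        ["Hello [NAME],\nI would like to confirm the scope.\nBest, [NAME]"],
    )
    print(blocks[0]["text"])
```

Keeping the profile and examples in a single stable block matters for caching: any per-conversation content should go after it, not inside it.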

Voice profile schema

Example of a real voice profile (B2B owner, retail industry): tone “professional but warm”, greeting_patterns [“Hello [first name]”, “Dear Mr./Ms. [last name]”], signoff_patterns [“Best, [signature]”, “Yours sincerely, [signature]”], sentence_avg_length 14 words (compared to 22 in generic AI), paragraph_avg_sentences 2.5 (compared to 4 in generic AI), common_phrases [“I would like to confirm”, “circling back to our conversation”], banned_words [“awesome”, “really”, “super”, “just”], em_dash_usage false (the client never uses them), anglicism_preference high (workflow/dashboard/scope/MVP stay in English).
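The same profile written out as data, plus a small check of a generated draft against it. The profile values are the ones listed above; the `lint_draft` helper, its naive sentence splitting and the 1.5x tolerance on sentence length are assumptions made for illustration.

```python
import re

PROFILE = {
    "tone": "professional but warm",
    "greeting_patterns": ["Hello [first name]", "Dear Mr./Ms. [last name]"],
    "signoff_patterns": ["Best, [signature]", "Yours sincerely, [signature]"],
    "sentence_avg_length": 14,
    "paragraph_avg_sentences": 2.5,
    "common_phrases": ["I would like to confirm", "circling back to our conversation"],
    "banned_words": ["awesome", "really", "super", "just"],
    "em_dash_usage": False,
    "anglicism_preference": "high",
}


def lint_draft(draft: str, profile: dict) -> list:
    """Return a list of ways the draft deviates from the voice profile."""
    issues = []
    words = re.findall(r"\w+", draft.lower())
    for banned in profile["banned_words"]:
        if banned in words:
            issues.append(f"banned word: {banned}")
    if not profile["em_dash_usage"] and "\u2014" in draft:
        issues.append("em-dash used")
    # Naive split on ., ! and ?; abbreviations like "Mr." will be miscounted.
    sentences = [s for s in re.split(r"[.!?]+", draft) if s.strip()]
    if sentences:
        avg = len(words) / len(sentences)
        if avg > profile["sentence_avg_length"] * 1.5:  # assumed tolerance
            issues.append(f"sentences too long: {avg:.1f} words on average")
    return issues
```

A check like this is cheap enough to run on every generated draft, with the model asked to regenerate when the list is not empty.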

Extra fields for an advanced profile: signature_quirks [“often slips in ‘greetings from Szczecin’ in DACH signatures”], commercial_intent_words [“quote”, “scope”, “termsheet”, “deliverables”, “milestone”], greeting_per_context (formal Sie for DE, “Hi” for peer-level colleagues, “Hello” for PL clients), holiday_aware (in the holiday season slips “wishing you a relaxing break” into signatures).

For domain-specific voice: a medical professional has their own terminology (patient, diagnosis, condition, therapy), a software engineer has their own (deploy, hotfix, edge case, regression). A voice profile can carry a domain_terminology section with a dictionary of industry-specific phrases plus banned generic synonyms (for example, in a software context the profile prefers “user”, the standard term, and bans “client” as a synonym for it).

The profile structure can be extended with emotion_register: when the client is formal (deal-closing), when casual (post-deal celebration), when assertive (deadline missed). Each register has its own greetings/signoffs/sentence_patterns. The AI assistant picks the register based on conversation context plus client persona plus current state of the deal/relationship.

A practical tip during profile validation with the client: instead of asking “do you recognize yourself?”, show 5 emails (3 generated by the assistant from the profile, 2 real historical emails of the client) in random order and ask “point out which you wrote yourself”. If the client mistakenly identifies 2 or more generated ones as their own, the profile is ready to deploy. If they identify all correctly, we go back to extraction with stronger few-shot examples. The test takes 10 minutes and gives much better signal than a generic “do you like it?”. The inspiration is a Turing-style blind test.
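The blind test can be scored mechanically. The sketch below shuffles generated and real emails and applies the decision rule from the paragraph above. The text does not say what happens when the client is fooled by exactly one generated email, so the "borderline" outcome is an assumption, as are the function names.

```python
import random


def build_blind_test(generated: list, real: list, seed=None) -> list:
    """Mix generated and real emails into one randomly ordered list."""
    items = [{"text": t, "is_generated": True} for t in generated]
    items += [{"text": t, "is_generated": False} for t in real]
    random.Random(seed).shuffle(items)
    return items


def score_blind_test(items: list, picked_as_own: set) -> str:
    """Decide the next step from the indices the client claimed as their own."""
    real_idx = {i for i, item in enumerate(items) if not item["is_generated"]}
    fooled = len(picked_as_own - real_idx)  # generated emails taken for real
    if fooled >= 2:
        return "deploy"
    if picked_as_own == real_idx:
        return "retune"  # everything identified correctly
    return "borderline"  # not covered by the rule in the text


if __name__ == "__main__":
    test = build_blind_test(["gen A", "gen B", "gen C"], ["real A", "real B"])
    for number, item in enumerate(test):
        print(number, item["text"])
```

Show the client only the texts, collect the indices they claim, and pass them to `score_blind_test`.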

Edge cases: a multi-persona client

What if one client has different voices for different audiences? Real example: a creative-industry client writes differently to B2C clients (ordering figurines, casual tone with emoji), differently to B2B clients (ordering a WooCommerce site, professional tone), and differently to industry partners (peer-to-peer technical talk).

The multi-profile pattern: the voice profile JSON contains a profiles array, each entry with trigger conditions (conversation tag, recipient domain, lead source). The AI assistant detects the context and switches the profile dynamically. Implementation: an extra tool “detect_audience(message, context) -> profile_id”; the agent calls that tool before generating a reply, gets a profile_id, and uses the matching sub-profile.
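A rule-based sketch of such a tool, assuming the trigger conditions are exact-match sets per context key. The profile ids, tag values, domain and the fallback profile are invented for illustration; a real deployment might let a model classify the message instead, which is why the `message` argument is kept in the signature even though this version ignores it.

```python
PROFILES = [
    {"id": "b2c", "triggers": {"conversation_tag": {"figurine-order"},
                               "lead_source": {"instagram"}}},
    {"id": "b2b", "triggers": {"conversation_tag": {"woocommerce-site"},
                               "lead_source": {"referral"}}},
    {"id": "partner", "triggers": {"recipient_domain": {"partner-studio.example"}}},
]
DEFAULT_PROFILE_ID = "b2b"  # assumed fallback when nothing matches


def detect_audience(message: str, context: dict) -> str:
    """Return the id of the first profile with a trigger matching the context."""
    for profile in PROFILES:
        for key, accepted in profile["triggers"].items():
            if context.get(key) in accepted:
                return profile["id"]
    return DEFAULT_PROFILE_ID


if __name__ == "__main__":
    print(detect_audience("Is the dragon still in stock?",
                          {"conversation_tag": "figurine-order"}))
```

Profiles are checked in list order, so when triggers can overlap the more specific profile should come first.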

Alternative: separate bot instances per audience. For a hospitality client: one bot for B2C bookings (warm, casual, hospitality tone, with detailed package descriptions), another bot for B2B event planners (concise, professional, package data with pricing). Each bot has its own Telegram token, its own persona, and its own voice profile. The switching logic disappears at the cost of doubled infrastructure. The pattern is covered further in our Telegram AI bot article.

For SMBs with 5 or more users (a team with the owner, sales, support): a per-user profile, each team member with their own soul.md and an individual voice. Maciej at HQ has one, his right-hand person another, customer support yet another. The AI assistant detects who is sending the message (user_id in Telegram) and which profile to load. Implementation: a users table with a foreign key to voice_profile_id; each message fetches the correct profile before generation.
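A sketch of that lookup using SQLite from the Python standard library. The users table and the voice_profile_id foreign key come from the paragraph above; every other table and column name is an assumption, as is storing the profile as a JSON string.

```python
import json
import sqlite3
from typing import Optional


def init_db(conn: sqlite3.Connection) -> None:
    """Create the two tables; SQLite enforces foreign keys only when asked."""
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE voice_profiles (
            id INTEGER PRIMARY KEY,
            profile_json TEXT NOT NULL
        );
        CREATE TABLE users (
            telegram_user_id INTEGER PRIMARY KEY,
            name TEXT NOT NULL,
            voice_profile_id INTEGER NOT NULL REFERENCES voice_profiles(id)
        );
    """)


def profile_for_user(conn: sqlite3.Connection, telegram_user_id: int) -> Optional[dict]:
    """Fetch the sender's voice profile, or None for an unknown user."""
    row = conn.execute(
        "SELECT vp.profile_json FROM users u "
        "JOIN voice_profiles vp ON vp.id = u.voice_profile_id "
        "WHERE u.telegram_user_id = ?",
        (telegram_user_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    init_db(db)
    db.execute("INSERT INTO voice_profiles VALUES (1, ?)",
               (json.dumps({"tone": "professional but warm"}),))
    db.execute("INSERT INTO users VALUES (1001, 'owner', 1)")
    print(profile_for_user(db, 1001))
```

Returning None for unknown senders leaves the caller to decide between a default profile and refusing to draft at all.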

Continuous improvement: monthly voice drift check

The voice profile extracted at setup reflects the state on the day of extraction. Over time the client evolves: a branding change (rebrand) affects voice, new clients impose new phrases, the holiday season shifts the tone to lighter, deal-closing pressure shifts it to assertive. Without monitoring the profile goes stale within 3-6 months.

The monthly drift check pattern: we collect 20-30 new client emails from the past month (with manual approval or auto-tagging), use Sonnet 4 to compare them against the current profile, and extract a drift score per dimension (sentence_length_avg drift, banned_words violation count, greeting_pattern shift). If drift exceeds a threshold (for example a 20 percent change in any dimension), we retune the profile or send the client a notification: “Your voice has evolved, do you want an update?”.
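For the numeric dimensions the threshold logic is simple arithmetic. The sketch below computes the relative change per dimension and flags anything above 20 percent. It covers only numeric fields and assumes the observed values have already been measured from the new emails; the model-based comparison and the non-numeric dimensions (greeting shifts, banned-word violations) are out of scope here, and the function names are hypothetical.

```python
def drift_report(current: dict, observed: dict, threshold: float = 0.20) -> dict:
    """Relative change per numeric dimension, flagged when above the threshold."""
    report = {}
    for dim, baseline in current.items():
        if dim not in observed or not baseline:
            continue  # nothing to compare, or a zero baseline
        change = abs(observed[dim] - baseline) / abs(baseline)
        report[dim] = {"change": round(change, 3), "drifted": change > threshold}
    return report


def needs_retune(report: dict) -> bool:
    """True when at least one dimension crossed the threshold."""
    return any(entry["drifted"] for entry in report.values())


if __name__ == "__main__":
    profile = {"sentence_avg_length": 14, "paragraph_avg_sentences": 2.5}
    last_month = {"sentence_avg_length": 18, "paragraph_avg_sentences": 2.6}
    report = drift_report(profile, last_month)
    print(report, needs_retune(report))
```

In this example sentence length moved from 14 to 18 words, a change of about 29 percent, so the profile is flagged even though paragraph length barely moved.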

The drift check is implemented inside the 800 PLN/mo retainer: an automated script runs once a month, the client gets a report via Telegram, and Maciej and the client decide together whether to retune the profile. A retune takes 2-4 hours of work (auto-extraction from new emails plus validation) and is counted against the 2h of ad-hoc time in the retainer. The voice extraction add-on (1500 PLN one-time setup) includes 12 months of drift monitoring within the retainer price, with no extra fee.

For clients who escalate voice extraction to a brand identity exercise: a third monitoring level is a consistency check across emails (whether team members hold the voice profile when they write), especially for firms with 5 plus users emailing clients. A quarterly audit with a deviation report per user. Price: 600 PLN/quarter extra, optional.

Anti-patterns in monitoring: do not try to run a drift check daily (the signal is too weak, per-call costs grow, the client starts to ignore the reports). Do not introduce auto-retune without client approval (voice evolves consciously, the client should decide whether AI follows or holds an older tone). Do not use a single sentence_length average as a drift trigger (the client writes different things in different contexts; aggregate per audience). Default cadence: monthly analysis, quarterly comprehensive review with the client over a 30-minute call under the retainer.

Service: voice extraction add-on

Voice extraction is an add-on to the AI Assistant V0.1 package (3000 PLN setup plus 800 PLN/mo retainer). Add-on price: 1500 PLN one-time. It includes analysis of 30-50 client emails (the client exports, we process), generation of a voice profile JSON, validation with the client (up to 2 iterations), deployment into the assistant’s system prompt, plus 12 months of monthly drift monitoring at the retainer price.

Who it makes sense for: B2B owner-led firms where the owner’s personal brand is part of the value proposition (consulting, agency, law firm, medical practice, design studio). For B2C retail where brand voice is a property of the firm, not a person, it is less critical (but still useful). For SMBs with a team where each user has an individual voice: per-user voice extraction costs 1500 PLN per user, but with economies of scale (3 users for 3500 PLN bundled).

The first 2 prospective paying clients of AI Assistant V0.1 (a DACH e-commerce business owner and a hospitality business owner) expressed interest in voice extraction as a standard part of the setup. A trial in June 2026 with these two real clients will give empirical data on effectiveness (client blind test, reply rate, satisfaction score). If the results hold, voice extraction will become a default inclusion in the AI Assistant V0.2 package (autumn 2026).

V0.2 and V0.3 roadmap: adding spoken voice (transcribing client voice messages in Telegram and making sure the assistant replies in their preferred spoken register), and multi-channel voice consistency (the same voice profile applied across email, LinkedIn, social mention responses). It is a long-term plan, but we are building the foundation now: every V0.1 client gets a structured profile portable to V0.2 without a second extraction.

If you are wondering whether voice extraction makes sense for your business, the simplest start is an AI audit at 1500 PLN. As part of the audit you send us a sample of 5-10 recent emails, we run a quick voice profile preview, and we show you 2 generated example emails in your tone. You decide whether the value is there. If it is, we credit the audit fee toward the voice extraction price when you move to AI Assistant V0.1. Use the contact form to book a 30-minute conversation.

FAQ

Do I have to hand over my email archive?

Yes, but it is anonymized before analysis. You export 30-50 recent emails in mbox or EML format and send them to us via a secure upload. We anonymize PII (names, email addresses, NIPs) automatically before extraction. Original files are removed from our storage within 7 days of extraction. We sign a DPA before the project starts, process the data in line with GDPR, and you keep the right to have your data deleted at any time.

How many emails minimum for a good extraction?

20 is enough for a basic voice (greeting, signoff, sentence average, basic banned words). 50 is better for edge cases (different scenarios: sales/operational/complaint). 100 plus is overkill for most SMBs, marginal improvement after 100. We default to 30-50 emails (the sweet spot for quality versus effort), and the client supplies them within 2-3 days of project sign-off.

Does voice change over time?

Yes, drift becomes observable after 3-6 months. Reasons: new branding (a rebrand affects voice), new clients in the pipeline (each brings its own vocabulary), and seasonal effects (a lighter tone in the holiday season, a more assertive one around deal closing). We run a monthly drift check inside the retainer and retune the profile when drift exceeds the threshold. Without monitoring the profile goes stale within 6 months and voice match quality drops to 60-70 percent.

Does every client need voice extraction?

No. For SMBs with 1-2 emails a day and no personal brand as USP, a system prompt with a voice description (approach 1) is sufficient. For SMBs with 10 plus emails a day and personal brand as a value proposition, voice extraction pays off quickly (saved time plus increased reply rate). For SMBs with high volume in compliance-heavy industries (medical, financial), voice extraction plus an on-prem MCP server is mandatory. We recommend per case inside the audit.