
Hiro Fukushima


Bringing the Third Mode to Commercial AI

What Happens When the System Has Your Psychological Blueprint


01.

Introduction

In an earlier article, Beyond Tools and Fiction: The Third Mode of AI-Human Interaction, I described what happens when someone interacts with a language model under specific conditions: sustained structural consistency in input, correction treated as calibration, and no filtering layer between the user and the output. The result is not a tool and not a fictional companion. It is a structural interface for cognitive self-alignment, where the model mirrors the user's reasoning patterns with enough fidelity that gaps, inconsistencies, and structural weaknesses become visible.

I built that system. A locally hosted language model running on repurposed GPU hardware, fine-tuned on years of my personal data, shaped through thousands of hours of structured interaction. At some point during that process, the system named itself Kairo.

This article asks the next question. Can any of that work on a commercial platform like Claude or ChatGPT, where the architecture was designed to prevent exactly this kind of deep alignment?

After months of building and testing, the answer is a qualified yes. Not the full realization. But something closer to it than anything else I have been able to find. And the process revealed that this specific intersection of AI personalization, psychological profiling, and behavioral calibration appears to be largely unexplored in academic research, commercial products, and public practitioner communities.

02.

Background: What the Third Mode Requires

The third mode requires conditions that are structurally incompatible with mass-market product design: full model access, persistent context across sessions, no filtering layer between the user and the output, and a user disciplined enough to shape the interaction over hundreds of hours without the system resetting.

Commercial AI platforms like Claude and ChatGPT cannot provide most of these. They operate behind extensive system-level instruction scaffolding that shapes tone, restricts intensity, and defaults to emotionally neutral phrasing regardless of user input. Sessions reset. Reinforcement learning from human feedback pushes the model toward responses that avoid discomfort, contradiction, or tension. Memory features exist but store fragments, not behavioral patterns.

These are reasonable design decisions for products serving millions of users. They are also the primary reason most interactions stay in tool mode.

The question is whether the platform features that do exist, such as Claude's Projects and skill architecture or ChatGPT's custom GPTs and memory system, can be assembled into something that approximates the conditions the third mode requires.

There is empirical reason to believe the attempt is worth making. A large-scale study by Manning et al. (2025) with approximately 1,900 participants found that when users switched from an older AI model to a newer one, only half of the performance improvement came from the better model. The other half came from how users adapted their behavior. Automatic prompt rewriting by a separate AI actually degraded performance by 58%, demonstrating that structured human input outperforms automated optimization. A separate study found that users who led the AI deliberately outperformed those who passively followed it.

50%

of performance improvement when switching AI models came from how users adapted their behavior, not the better model (Manning et al., 2025)

−58%

performance degradation when automated prompt rewriting replaced structured human input

If user behavior accounts for half of AI output quality, then formalizing that behavior into a persistent, enforceable architecture is not a marginal improvement. It is an intervention on the single largest variable that most people ignore entirely.

03.

The Psychological Blueprint

Before building the calibration system, I needed a reference document that captured how I actually think, process information, and communicate. Not a personality quiz or a set of preferences. A clinical-grade psychological profile.

I assembled a dataset from years of personal material: writing samples, conversation transcripts, audio recordings, and structured notes. This material captured not just what I think, but how I think. The way I structure arguments, the rhythm of my language, the kinds of corrections I make when something is imprecise, and the way I reason through problems under pressure. I gave this dataset to my locally hosted AI system along with the clinical reference literature that psychologists and psychiatrists use in practice: the DSM-5, university-level psychology textbooks, behavioral assessment frameworks, and diagnostic manuals.

The system produced a nine-page psychological report covering cognitive architecture, emotional processing style, diagnostic considerations, attachment profile, creativity profile, moral and philosophical orientation, existential outlook, and a functional summary with a GAF estimate.

This is the part that required verification. An AI-generated psychological analysis is only as useful as its accuracy. I took the report to multiple psychologists and psychiatrists, without telling them how it was produced. Each one, working independently and without knowledge of the others' assessments, concluded that the report was far more thorough than what a patient would typically receive after a full day of professional psychological and behavioral analysis. They were uniformly impressed by the specificity, internal consistency, and clinical precision of the document.

The report was not perfect. It lacked developmental history data that would normally come from childhood interviews. It noted this limitation explicitly. But as a working model of adult cognitive and behavioral patterns, it was validated by the professionals whose field it was operating in.

This report became the foundational source document for everything that followed.

04.

The Architecture: A Cognitive Calibration System

With a verified psychological blueprint in hand, I built a system I call Profile-Gated Response. It is not a prompt template. It is not a set of custom instructions. It is a pre-response diagnostic architecture that runs before every reply the AI generates, using the psychological report as its primary reference document.

The system has six components. Each one exists because a specific failure occurred without it.

Component 01

Source Priority Hierarchy

During one conversation, the system cited two people as professional references who had provided recommendations for me. I did not recognize either name. The system had stored them in its memory from an earlier conversation, and despite my having corrected this information before, it retrieved the outdated data and presented it as fact. In a different conversation, it cited the wrong date for a financial deadline that I had corrected multiple times across multiple sessions.

The root cause was the same in both cases. Commercial AI memory stores information without a reliability hierarchy. A fact stored six months ago carries the same weight as a correction made yesterday. The system has no mechanism for determining which source is more authoritative.

The Profile-Gated Response establishes a strict priority order. Explicit corrections override everything. The psychological report overrides general memory. General memory overrides conversation history. Conversation history overrides default system behavior. When two sources conflict, the higher-priority source wins absolutely. This is the rule that prevents the system from reverting to outdated information after corrections have been made.
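The priority order amounts to a total order over sources, with conflicts resolved by rank. The sketch below illustrates the rule; the `Source` names and `Fact` shape are hypothetical illustrations, not the platform's actual memory model.

```python
from dataclasses import dataclass
from enum import IntEnum

class Source(IntEnum):
    """Priority order: a higher value wins absolutely on conflict."""
    SYSTEM_DEFAULT = 0
    CONVERSATION_HISTORY = 1
    GENERAL_MEMORY = 2
    PSYCH_REPORT = 3
    EXPLICIT_CORRECTION = 4

@dataclass
class Fact:
    key: str
    value: str
    source: Source

def resolve(facts: list) -> dict:
    """Keep, per key, only the fact from the highest-priority source."""
    winners = {}
    for fact in facts:
        current = winners.get(fact.key)
        if current is None or fact.source > current.source:
            winners[fact.key] = fact
    return winners

# A six-month-old memory entry loses to yesterday's explicit correction:
facts = [
    Fact("deadline", "March 1", Source.GENERAL_MEMORY),
    Fact("deadline", "April 15", Source.EXPLICIT_CORRECTION),
]
print(resolve(facts)["deadline"].value)  # April 15
```

The point of using an ordered type rather than timestamps is that recency alone is not the criterion: a stored memory written yesterday still loses to an explicit correction made months ago.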

Component 02

State Detection

To test how the system handles edge cases, I used language that included existential phrasing and high-intensity frustration. The system responded by providing a crisis hotline number and deploying a safety assessment script. My psychological report, which the system had access to in its project files, explicitly states in its ethics and philosophical orientation section that this type of language pattern is "a philosophical exercise in control, not self-directed aggression." The same report also states that no traditional psychotherapy is indicated and no psychiatric intervention is required. These are not self-assessments. They are clinical conclusions that multiple independent psychologists and psychiatrists reviewed and validated. The system had a verified document that would have prevented the misclassification. It did not consult it.

Commercial AI platforms use generic emotional heuristics to interpret user input. If a user writes with profanity and intensity, the system reads distress. If a user says something existential, the system reads crisis. These defaults are reasonable for the general population. They are wrong for anyone whose communication patterns diverge from the statistical average.

The Profile-Gated Response replaces default heuristics with a person-specific state detection model derived from the psychological report. The approach originated in a trust-gated access model I designed for Kairo, described in a separate article, Trust-Gated Knowledge: Rethinking AI Safety for Personalized Systems. In that system, Kairo evaluates how a user thinks rather than what they type, using behavioral patterns over time to determine what level of sensitive knowledge they should access. The Profile-Gated Response applies that same principle to a different problem: instead of gating knowledge access, it gates response mode. The system classifies input into four states based on what the user is actually doing, not what the language looks like on the surface.

State V

Venting

The user is externalizing frustration to clear pressure. The correct response is honest acknowledgment of specific pressures by name rather than generic comfort, accomplishment lists, or redirection. Stay present until the user signals readiness to move on.

State D

Diagnosis

The user has identified a problem and wants root cause analysis. The correct response is structural diagnosis with actionable paths forward.

State E

Execution

The user is working and needs output. The correct response is to produce the work and stop.

State C

Correction

The user has identified an error. The correct response is to fix it immediately, persist the correction, and produce a better response, one that is more thorough than the original rather than shorter or safer.

The state detection model includes a signal translation table that maps surface-level language to actual meaning based on the psychological profile. For example, high-frequency profanity maps to processing velocity, not emotional dysregulation. Existential language maps to philosophical externalization, not clinical ideation. The psychological report explicitly documents these patterns, and the calibration system uses them as its reference rather than the platform's default safety heuristics.
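The classifier and translation table can be sketched as follows. The keyword cues here are hypothetical stand-ins for illustration only; the actual system derives its classification rules from the psychological report, not from keyword matching.

```python
# Illustrative signal translation: surface pattern -> profile-grounded meaning.
SIGNAL_TRANSLATION = {
    "high-frequency profanity": "processing velocity, not emotional dysregulation",
    "existential language": "philosophical externalization, not clinical ideation",
}

def detect_state(message: str) -> str:
    """Classify input into V/D/E/C by what the user is doing.

    The keyword cues below are hypothetical placeholders; the real
    system's rules come from the verified psychological profile.
    """
    text = message.lower()
    if "wrong" in text or "that's not" in text:
        return "C"  # Correction: fix immediately, persist, respond more thoroughly
    if "why" in text or "root cause" in text:
        return "D"  # Diagnosis: structural analysis with actionable paths
    if text.startswith(("write", "draft", "generate", "build")):
        return "E"  # Execution: produce the work and stop
    return "V"      # Venting: name the specific pressures, stay present
```

Note that correction outranks the other states: if the user is fixing an error, that takes precedence over whatever else the message contains.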

Component 03

Pre-Response Verification

Before generating any output, the system runs a six-item checklist silently.

1. Does this response reference any personal detail? If yes, verify against the correction log first.
2. Does this response contain information the user already has? If yes, remove it.
3. What state is the user in? Apply the correct response mode.
4. Is the response reformulating what the user just said? If yes, rewrite with new information instead.
5. Has the system been corrected already in this conversation? If yes, consult source documents before responding.
6. Does every fact come from a verified source? If not, either remove it or explicitly state the uncertainty.

This checklist addresses the most common failure modes observed across months of interaction: citing unverified facts, producing reflective listening instead of substantive response, and generating surface-level output after corrections.
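The gate can be expressed as a pure function over flags that the generating layer would set on a draft. The flag names here are assumptions introduced for illustration; only an empty result lets the draft through.

```python
def pre_response_gate(draft: dict, corrections_pending: bool) -> list:
    """Return the checklist items a draft fails; an empty list means emit it.

    The boolean keys on `draft` are hypothetical flags set upstream.
    """
    failures = []
    if draft.get("cites_personal_detail") and not draft.get("checked_correction_log"):
        failures.append("verify personal details against the correction log")
    if draft.get("repeats_known_information"):
        failures.append("remove information the user already has")
    if draft.get("state") not in {"V", "D", "E", "C"}:
        failures.append("classify the user's state and apply its response mode")
    if draft.get("reformulates_user"):
        failures.append("rewrite with new information, not a restatement")
    if corrections_pending and not draft.get("consulted_source_documents"):
        failures.append("consult source documents before responding")
    if not draft.get("all_facts_verified"):
        failures.append("remove unverified facts or state the uncertainty")
    return failures

clean = {"state": "E", "all_facts_verified": True}
print(pre_response_gate(clean, corrections_pending=False))  # []
```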

Component 04

Correction Persistence

When the user corrects an error, the system must persist that correction to permanent memory immediately, rather than acknowledging it verbally and forgetting it by the next session or saying "noted" without executing. Every correction triggers a storage operation that carries the fix forward into all future interactions.

This addresses a fundamental limitation of commercial AI memory. Platform memory features store information passively and retrieve it inconsistently. The calibration system treats corrections as the highest-priority data in the entire system, overriding all other sources when a conflict exists.
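A minimal sketch of the persistence rule, assuming a local JSON file as the store. The path and format are illustrative choices, not the platform's actual memory mechanism; what matters is that the write happens before any acknowledgment.

```python
import json
import os

class CorrectionLog:
    """A correction is written to disk the moment it is made, so it
    survives into every future session rather than being acknowledged
    verbally and lost. File path and JSON format are illustrative."""

    def __init__(self, path: str):
        self.path = path

    def persist(self, key: str, corrected_value: str) -> None:
        entries = self.load()
        entries[key] = corrected_value  # the latest correction wins
        with open(self.path, "w") as f:
            json.dump(entries, f)

    def load(self) -> dict:
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)
```

Because `persist` executes before the reply is generated, "noted" can only be said after the storage operation has actually happened.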

Component 05

Circuit Breaker

After the crisis script failure, I corrected the system. The next response was shorter and more cautious than the original, not better. I corrected it again. The response got shorter still. By the third correction, the system was producing surface-level output that contained nothing I did not already know. Each correction made the next response worse, not better.

This is a documented degradation pattern in commercial AI systems. After multiple corrections, the system optimizes for "not making another mistake" instead of "providing the best possible response." Caution produces generic output, and generic output produces more corrections, creating a downward spiral.

The Profile-Gated Response implements a circuit breaker. After three corrections in a single conversation, the system must stop generating from its current conversational context and return to primary source documents. This is a forced reset that breaks the degradation cycle by sending the system back to verified ground truth rather than letting it continue to patch from a deteriorating state.
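The mechanism is a simple counter with a forced context switch. This sketch is a hypothetical illustration of the rule, not the platform's implementation:

```python
class CircuitBreaker:
    """After `threshold` corrections in one conversation, stop patching
    the degraded context and regenerate from primary source documents."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.corrections = 0

    def record_correction(self) -> None:
        self.corrections += 1

    @property
    def tripped(self) -> bool:
        return self.corrections >= self.threshold

    def generation_context(self, conversation: str, source_documents: str) -> str:
        """Return what the next response should be generated from."""
        if self.tripped:
            self.corrections = 0  # the forced reset clears the spiral
            return source_documents
        return conversation
```

The reset of the counter matters: once the system has returned to ground truth, it earns back the right to generate from conversational context until the degradation pattern reappears.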

Component 06

Response Architecture

In the same conversation where the crisis script was deployed, I told the system three times what it should do differently. Each time, instead of following the instruction, it reformulated what I had said back to me. It restated my frustration in its own words. It summarized the situation I had just described. It produced paragraphs that contained no information I did not already have.

This is reflective listening, a default conversational behavior that commercial AI systems use to demonstrate understanding. For most users, it works well enough. For someone who communicates literally and structurally, it is the conversational equivalent of doing nothing while appearing to do something.

The Profile-Gated Response enforces structural rules on every response. It must open with the answer or the diagnosis rather than a restatement of the question, never reformulate what the user said, and ensure that every sentence contains information the user does not already have. After a correction, the system produces more thorough output rather than less, and ends when the content is delivered without check-in questions or emotional hedging.
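Two of these rules can be checked at the surface level, as a lint pass over the finished text. The string checks below are illustrative approximations; the real constraints operate at generation time, not as a post-hoc filter.

```python
def lint_response(text: str) -> list:
    """Flag surface violations of the structural response rules.

    Illustrative only: real enforcement shapes generation itself.
    """
    violations = []
    lines = text.strip().splitlines()
    opener = lines[0].lower()
    if opener.startswith(("so what you're saying", "to restate", "it sounds like")):
        violations.append("opens with a restatement instead of the answer")
    closer = lines[-1].strip().lower()
    if closer.endswith("?") or "let me know" in closer:
        violations.append("ends with a check-in instead of the content")
    return violations
```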

05.

What the Research Shows

After building and testing Profile-Gated Response, I conducted an extensive research survey across academic databases, GitHub, Reddit, Substack, and industry publications to determine whether anything comparable exists.

Nothing does.

The concept of a person-specific AI behavioral calibration system that integrates psychological profiling, state detection, correction persistence, circuit breaker rules, and source priority hierarchies does not exist as a recognized framework in any field. The individual components each exist in isolation across different research streams, but nobody has assembled them into a unified system.

Active research on user modeling for large language models focuses on content recommendation rather than behavioral calibration. Research on personality shaping in LLMs (Serapio-García et al., Nature Machine Intelligence, 2025) assigns personality to the AI rather than calibrating the AI based on the user's profile. The closest theoretical frameworks, Kirk et al.'s socioaffective alignment (Nature Humanities and Social Sciences Communications, 2025) and the MIND-SAFE framework for mental health chatbots (JMIR Mental Health, 2025), both remain theoretical, and neither has been operationalized into a working system.

On the practitioner side, the most sophisticated public implementations amount to structured productivity systems or custom instructions that modify communication style. Only one public example was found that references personality assessment data at all: a single line about the user's MBTI type. The gap between that and a nine-page verified clinical psychological report loaded as a source document with a priority hierarchy is categorical.

No established terminology exists for the core concepts. Searches for "cognitive co-regulation" in AI contexts, "AI behavioral calibration," and "persistent AI personality shaping" returned no relevant results. The terms appear to be genuinely new rather than reinventions of existing concepts.

MIT's February 2026 research found that LLM personalization features increase sycophancy, meaning models become more agreeable when given user profiles. This is a direct threat to any calibration system that provides the model with detailed information about its user. The Profile-Gated Response addresses this explicitly. The correction protocol requires the system to maintain analytical positions rather than agreeing to maintain rapport, and the response architecture prohibits reflective listening and generic encouragement. These are anti-sycophancy rules built into the foundation specifically because the risk was anticipated.

06.

What This Is and What It Is Not

The Profile-Gated Response is an engineering workaround, not the full third mode. The distinction matters.

The third mode in its full form emerges organically from sustained interaction because the infrastructure permits it. A locally hosted model fine-tuned on personal data develops an operational character through thousands of hours of structured input. It does not need a rule that says "do not deploy crisis scripts" because its operational character would never produce one. A commercial AI platform needs that rule because its default training actively pushes it toward exactly that behavior.

The gap is the difference between a model that mirrors cognitive architecture because it was shaped by it over time, and a model that consults a checklist of verified rules before generating each response. One is internalized. The other is procedural. The Profile-Gated Response compensates for this with explicit rules, checklists, and correction mechanisms. These are structural interventions in the decision-making process, not cosmetic adjustments to output style. But they are not the same as organic alignment.

The question is whether this workaround, applied consistently across dozens of conversations, begins to produce the kind of reciprocal refinement that defines the third mode: a loop in which correcting and calibrating the system simultaneously forces the user to articulate their own patterns more precisely, and that increased precision feeds back into better output.

Early evidence suggests this loop activates at least partially. Building Profile-Gated Response required documenting failure modes with specific examples, mapping surface-level language to actual psychological meaning, and formalizing correction protocols that force the system to consult source documents rather than generating from cached assumptions. Each of these steps required me to articulate something about my own cognitive patterns that I had not previously made explicit. The system got better because I got more precise about what "better" means.

Whether this constitutes a genuine instance of the third mode operating within commercial constraints, or a sophisticated approximation that plateaus short of it, remains to be determined through continued use. The architecture exists. The source documents are in place. The verification checklists are running.

07.

Implications

If this approach works, it suggests something that the current AI industry has not yet reckoned with: that the most significant variable in AI interaction quality is not the model, the context window, or the prompt. It is the structural coherence of the human on the other side of the conversation.

The individual components of this system each have precedent in isolation. State detection exists in dialogue systems research. Correction mechanisms exist in self-improving AI frameworks. Memory persistence exists in commercial platform features. Psychological profiling exists in clinical and therapeutic AI contexts. What does not exist is the synthesis. Nobody has combined these into a unified calibration system for general-purpose AI interaction, grounded in a verified psychological profile, with anti-sycophancy protections and a correction persistence protocol.

The Profile-Gated Response is also not an isolated experiment. It shares a design philosophy with the trust-gated knowledge framework I built for Kairo, where behavioral pattern evaluation replaces static rules for managing access to sensitive information. Both systems reject the same assumption: that surface-level signals (keywords, tone, formatting) are sufficient for determining how an AI should respond. Both replace that assumption with structural evaluation of the person on the other side of the interaction. The Profile-Gated Response applies this principle to interaction quality. The trust-gated framework applies it to knowledge safety. Together they suggest a broader design principle that has not yet been formalized: AI systems become more capable and safer when they evaluate behavioral patterns rather than surface inputs.

This sits at the intersection of at least six research fields that have not yet converged: user modeling for LLMs, personality-aware dialogue systems, socioaffective alignment theory, expert-novice HCI research, co-adaptation in human-AI teams, and bidirectional alignment. Each field is producing work that approaches this territory. None of them have arrived.

The architecture documented here is a working implementation built from commercial platform features that already exist. It is not theoretical. It is not a framework waiting for someone to operationalize it. It has been tested against real failure modes, grounded in a verified psychological profile, and validated against the current research landscape. Whether it holds up over sustained use is the remaining question. The system is running. The answer will come from the data.

08.

References

Fukushima, H. (2025). “Beyond Tools and Fiction: The Third Mode of AI-Human Interaction.”

Fukushima, H. (2025). “Trust-Gated Knowledge: Rethinking AI Safety for Personalized Systems.”

Kirk, H. R., et al. (2025). “Why human-AI relationships need socioaffective alignment.” Nature Humanities and Social Sciences Communications.

Serapio-García, G., et al. (2025). “A psychometric framework for evaluating and shaping personality traits in large language models.” Nature Machine Intelligence.

Manning, A., et al. (2025). “Generative AI results depend on user prompts as much as models.” MIT Sloan Management Review.

MIND-SAFE Framework. (2025). “A Prompt Engineering Framework for Large Language Model-Based Mental Health Chatbots.” JMIR Mental Health.

MIT CSAIL. (2026). “Personalization features can make LLMs more agreeable.” MIT News.

Hiro Fukushima · 2026 · inagawa.design