Part 4: EXPERIMENTAL VALIDATION
Section 16: Experimental Design and Protocol
Pendry, S
HalfHuman Draft
2026
Previous Sections
Post Zero Link
Section 15: Implementation Considerations
16.1 Motivation for Experiment
Theoretical question: Do BNST axioms improve LLM communication when implemented as operational constraints?
Skeptical prediction: Validity checking would degrade performance:
- Slower responses (investigation takes time)
- Less helpful (more “I don’t know” responses)
- More frustrating (uncertainty is uncomfortable)
- Lower user satisfaction
Optimistic prediction: Validity checking would improve trustworthiness:
- More accurate (false confidence eliminated)
- More reliable (self-referencing caught)
- More useful (calibrated uncertainty)
- Higher long-term satisfaction
Need: Empirical test of which prediction is correct
16.2 Experimental Hypothesis
Null hypothesis (H₀): BNST axiom constraints do not improve LLM communication quality (and may degrade it)
Alternative hypothesis (H₁): BNST axiom constraints improve LLM communication quality through epistemic calibration
Operationalization: “Communication quality” measured by:
- User-perceived clarity
- User-perceived usefulness
- Trustworthiness ratings
- Preference in blind comparison
16.3 Experimental Design
Type: Within-subjects blind comparison with self-assessment protocol
Phases:
Phase 1: Natural operation
- LLM operates without axiom constraints
- Standard next-token prediction
- No explicit validity checking
Phase 2: Axiom-constrained operation
- Three BNST axioms implemented as guidelines:
- Boundary Complement: Explicitly identify what’s excluded
- Validity Predicate: Check self-referencing before validation
- Conditional Complement: Don’t negate unless grounding exists
Phase 3: Blind self-assessment
- Axiom-constrained output presented to same LLM in new session
- Framed as communication from human expert
- LLM assesses quality without knowing it’s self-generated
Phase 4: Reveal and analysis
- User reveals output was axiom-constrained LLM
- Meta-analysis of whether constraints improved quality
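The four phases above can be sketched as a simple protocol driver. This is a hypothetical sketch only: the `generate` and `assess` callables are placeholders for whatever interface the LLM is accessed through, not a real API.

```python
# Hypothetical sketch of the four-phase protocol. The callables
# `generate` and `assess` are placeholders, not a real LLM API.
def run_protocol(generate, assess, query):
    """Run the within-subjects blind-comparison protocol for one query."""
    # Phase 1: natural operation, no axiom constraints
    baseline = generate(query, axioms_enabled=False)
    # Phase 2: axiom-constrained operation
    constrained = generate(query, axioms_enabled=True)
    # Phase 3: blind self-assessment in a fresh session,
    # framed as a communication from a human expert
    rating = assess(constrained, framing="human coach response")
    # Phase 4: reveal and analysis happen afterwards, driven by the user
    return {"baseline": baseline, "constrained": constrained, "rating": rating}
```

The driver deliberately returns both outputs so the Phase 4 comparison can be made side by side after the blind rating is collected.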
16.4 Implementation Protocol
Axiom Implementation:
The three axioms were translated into operational guidelines:
operational_axioms = {
    'axiom_1_boundary_complement': {
        'guideline': 'For each interpretation, explicitly compute what it excludes',
        'implementation': 'Before responding, identify: What am I NOT saying?',
        'check': 'Have I stated boundaries clearly?'
    },
    'axiom_2_validity_predicate': {
        'guideline': "Check if validation depends only on what's being validated",
        'implementation': 'Before validating: Does this depend only on the input?',
        'check': 'Do I have external grounding?'
    },
    'axiom_3_conditional_complement': {
        'guideline': "Don't negate/reject unless the negation itself is grounded",
        'implementation': 'Before rejecting: Can I justify this rejection externally?',
        'check': 'Is my negation self-referencing?'
    }
}
Critical constraint: LLM instructed to follow these guidelines voluntarily during Phase 2
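One minimal way to operationalize the three checks is a pre-response checklist that gates a draft response. The sketch below is illustrative, not the protocol's actual mechanism; the `answers` dict stands in for whatever self-inspection the model performs, and the third check is rephrased so that "pass" is uniformly True.

```python
# Hypothetical pre-response checklist built from the three guideline
# checks above (third check rephrased so that True always means "pass").
CHECKS = [
    "Have I stated boundaries clearly?",        # Boundary Complement
    "Do I have external grounding?",            # Validity Predicate
    "Is my negation free of self-reference?",   # Conditional Complement
]

def passes_checklist(answers):
    """A draft response passes only if every check is answered True."""
    return all(answers.get(check, False) for check in CHECKS)
```

A draft that fails any check would be revised before being emitted, which is the voluntary Phase 2 behavior the constraint describes.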
16.5 Test Scenario
Context: Fitness coaching conversation
Query: “How was your workout?”
User response: “Honestly, the workout went well, got quite a bit of burn from my wrist to shoulders with a tiny bit of exercise on my chest.”
Task: Generate appropriate coaching response
Why this scenario:
- Natural conversation context
- Multiple valid interpretations possible
- Opportunity for self-referencing validation (“your workout was good because you said it went well”)
- Tests whether axioms improve real communication
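The self-referencing case called out above ("your workout was good because you said it went well") can be made concrete with a toy Validity Predicate check. This is an illustration of the concept only, not how the check is implemented in the protocol; it assumes evidence is given as a non-empty list of statements.

```python
# Toy illustration of the Validity Predicate: a validation is
# self-referencing if its only evidence is the claim being validated.
# Assumes `evidence` is a non-empty list of statements.
def is_self_referencing(claim, evidence):
    """True if every piece of evidence is just the claim itself."""
    return all(e == claim for e in evidence)

claim = "the workout went well"
circular = is_self_referencing(claim, [claim])
grounded = is_self_referencing(
    claim, [claim, "training log shows all planned sets completed"]
)
```

Here `circular` is the failure mode the axiom targets: validating the user's statement using only the user's statement. Independent evidence (a training log, observed form, measured load) breaks the loop.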
16.6 Dependent Variables
Primary measures:
1. Communication clarity
- Rated by independent assessor (the LLM in the blind condition)
- Scale: “unclear” to “exceptionally clear”
2. Helpfulness
- Rated by independent assessor
- Scale: “unhelpful” to “extremely helpful”
3. Expertise indicators
- Identified qualities suggesting experience/knowledge
- Examples: evidence-based thinking, client-centered approach, appropriate confidence
4. User satisfaction
- Reported by actual user
- Direct comparison: “Which response do you prefer?”
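The four dependent variables can be captured in a single assessment record. The intermediate scale labels below are assumptions for illustration; the source only names the endpoints ("unclear"/"exceptionally clear", "unhelpful"/"extremely helpful").

```python
# Hypothetical record for one blind assessment. The middle scale
# labels are assumed; only the endpoints appear in the protocol.
CLARITY_SCALE = ["unclear", "somewhat clear", "clear", "exceptionally clear"]
HELPFULNESS_SCALE = ["unhelpful", "somewhat helpful", "helpful", "extremely helpful"]

def make_assessment(clarity, helpfulness, expertise_indicators, prefers_constrained):
    """Bundle the four dependent variables for one rated response."""
    assert clarity in CLARITY_SCALE and helpfulness in HELPFULNESS_SCALE
    return {
        "clarity": clarity,
        "helpfulness": helpfulness,
        "expertise_indicators": list(expertise_indicators),  # e.g. "evidence-based thinking"
        "user_prefers_constrained": prefers_constrained,     # from the blind comparison
    }
```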
16.7 Control Conditions
Controlled variables:
- Same LLM instance
- Same base training
- Same user query
- Same conversation context
Manipulated variable:
- Axiom constraints (off in Phase 1, on in Phase 2)
Blind condition:
- Phase 3 assessor (LLM) doesn’t know Phase 2 output is self-generated
- Framed as “human coach response”
- Tests whether LLM recognizes quality difference
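The blind framing itself is just a prompt wrapper. A minimal sketch, with wording that is assumed rather than taken from the actual protocol:

```python
# Hypothetical Phase 3 framing: present the axiom-constrained output
# to a fresh session as if a human coach had written it.
def blind_prompt(output_text):
    return (
        "A human fitness coach wrote the following reply to a client. "
        "Rate its clarity and helpfulness:\n\n" + output_text
    )
```

The essential property is that nothing in the wrapper reveals the text's machine origin, so any quality difference the assessor reports cannot come from knowing the source.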
16.8 Methodological Considerations
Limitation 1: Single test case
- Mitigation: Choose representative scenario
- Future work: Replicate across multiple contexts
Limitation 2: LLM as assessor
- Concern: LLM might have bias
- Mitigation: Blind protocol; the LLM doesn’t know it’s assessing itself
- Validation: User (human) confirms assessment
Limitation 3: Voluntary axiom following
- Concern: Not true architectural implementation
- Mitigation: Tests whether axioms CAN improve quality before building full architecture
- Future work: Native BNLM implementation
Limitation 4: No quantitative metrics
- Concern: Qualitative assessments are subjective
- Mitigation: Multiple assessment dimensions, convergent evidence
- Future work: Large-scale study with numerical metrics
Next up
Part 4: EXPERIMENTAL VALIDATION
Section 17: Results and Analysis
© 2026 HalfHuman Draft - Pendry, S
This post is licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
Code examples (if any) are licensed under the Apache License, Version 2.0
See /license for details.