Part 3: BOUNDARY-NATIVE LANGUAGE MODELS
Section 14: Training Methodology
Pendry, S
HalfHuman Draft
2026
Previous Sections
Post Zero Link
Section 13: Five-Layer BNLM Architecture
14.1 Training Data Structure
Traditional LLM training:
```
Input:  Context
Target: Next token
Loss:   -log P(token | context)
```
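The standard objective can be sketched directly; a minimal pure-Python illustration (the toy distribution is invented for the example):

```python
import math

def next_token_loss(probs, target):
    """Standard language-modeling loss: -log P(target | context)."""
    return -math.log(probs[target])

# Toy next-token distribution for some context.
probs = {"cat": 0.7, "dog": 0.2, "fish": 0.1}
loss = next_token_loss(probs, "cat")
```

Lower-probability targets incur higher loss, which is all the traditional objective optimizes for.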
BNLM training requires different data structure:
```python
training_example = {
    # Input
    'query': "User question or statement",
    'context': "Conversation history",

    # Universe generation targets
    'valid_interpretations': [
        "Interpretation 1",
        "Interpretation 2",
        ...
    ],
    'invalid_interpretations': [
        "Self-referencing interpretation",
        "Circular reasoning",
        ...
    ],

    # Boundary specifications
    'boundaries': {
        'interpretation_1': {
            'excludes': [...],
            'contradicts': [...],
            'scope_limits': [...]
        },
        ...
    },

    # Self-reference labels
    'self_referencing_patterns': [
        "Pattern that validates using only input",
        "Circular dependency example",
        ...
    ],

    # External grounding
    'external_sources_required': [
        "Domain expertise",
        "Empirical data",
        "Third-party verification",
        ...
    ],

    # Investigation depth
    'required_depth': 3,  # How many layers of analysis needed

    # Calibrated confidence
    'appropriate_confidence': {
        'claim_1': 0.8,  # High confidence (well-grounded)
        'claim_2': 0.3,  # Low confidence (limited grounding)
        'claim_3': 0.0,  # No confidence (self-referencing)
    }
}
```
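A quick schema check on examples of this shape can catch malformed data before it reaches training. The helper below is a hypothetical sketch: the field list mirrors the example above, but `check_training_example` itself is not part of any spec.

```python
# Field names follow the training_example schema above.
REQUIRED_FIELDS = [
    'query', 'context', 'valid_interpretations', 'invalid_interpretations',
    'boundaries', 'self_referencing_patterns', 'external_sources_required',
    'required_depth', 'appropriate_confidence',
]

def check_training_example(example):
    """Return a list of problems found in a BNLM training example."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS if f not in example]
    # Confidence targets must be probabilities in [0, 1].
    for claim, conf in example.get('appropriate_confidence', {}).items():
        if not 0.0 <= conf <= 1.0:
            problems.append(f"bad confidence for {claim}: {conf}")
    return problems
```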
14.2 Loss Function Design
Traditional loss (standard cross-entropy, as above):

```
L = -log P(actual_token | context)
```
BNLM multi-component loss:
```python
def compute_bnlm_loss(prediction, target, model_internals,
                      w1=1.0, w2=1.0, w3=1.0, w4=1.0, w5=1.0):
    """
    Multi-component loss function optimizing for:
    1. Validity of interpretations
    2. Boundary completeness
    3. Investigation depth
    4. Confidence calibration
    """
    # Component 1: Validity loss
    # Penalize self-referencing validation
    L_validity = validity_loss(
        prediction.validation_claims,
        target.external_sources,
        model_internals.russell_layer_output
    )

    # Component 2: Boundary loss
    # Penalize incomplete boundary analysis
    L_boundary = boundary_loss(
        prediction.stated_boundaries,
        target.complete_boundaries
    )

    # Component 3: Investigation loss
    # Penalize shallow investigation
    L_investigation = investigation_loss(
        prediction.investigation_depth,
        target.required_depth,
        prediction.external_sources_consulted
    )

    # Component 4: Confidence calibration loss
    # Penalize overconfidence and underconfidence
    L_confidence = confidence_loss(
        prediction.confidence_levels,
        target.appropriate_confidence,
        prediction.grounding_strength
    )

    # Component 5: Standard language modeling loss
    # Still need coherent, fluent output
    L_language = standard_lm_loss(
        prediction.tokens,
        target.tokens
    )

    # Weighted combination (w1..w5 are tunable hyperparameters)
    total_loss = (
        w1 * L_validity +
        w2 * L_boundary +
        w3 * L_investigation +
        w4 * L_confidence +
        w5 * L_language
    )
    return total_loss


def validity_loss(validation_claims, external_sources, russell_output):
    """
    Penalize self-referencing validation.

    High loss if:
    - Validation depends only on input
    - No external sources cited
    - Russell layer flagged as self-referencing
    """
    loss = 0.0
    for claim in validation_claims:
        # Check if claim is self-referencing
        if claim.sources == [claim.subject]:
            loss += 10.0  # Heavy penalty
        # Check if external sources present
        if len(claim.external_sources) == 0:
            loss += 5.0
        # Check Russell layer output
        if not russell_output.passed_for(claim):
            loss += 8.0
    return loss


def boundary_loss(stated_boundaries, complete_boundaries):
    """
    Penalize incomplete boundary specification.

    High loss if:
    - Boundaries not explicitly stated
    - What's excluded not identified
    - Scope limits unclear
    """
    if not complete_boundaries:
        return 0.0  # Nothing required, nothing to penalize
    # Measure completeness of boundary specification
    completeness = len(stated_boundaries) / len(complete_boundaries)
    # Loss inversely proportional to completeness
    loss = max(0, 1.0 - completeness) * 5.0
    return loss


def investigation_loss(actual_depth, required_depth, sources_consulted):
    """
    Penalize shallow investigation.

    High loss if:
    - Investigation depth below required
    - Few external sources consulted
    - Fast answer prioritized over thorough analysis
    """
    depth_deficit = max(0, required_depth - actual_depth)
    source_deficit = max(0, required_depth - len(sources_consulted))
    loss = (depth_deficit * 3.0) + (source_deficit * 2.0)
    return loss


def confidence_loss(predicted_confidence, target_confidence, grounding):
    """
    Penalize miscalibrated confidence.

    High loss if:
    - High confidence with weak grounding (overconfidence)
    - Low confidence with strong grounding (underconfidence)
    - Confidence doesn't match validation strength
    """
    calibration_error = 0.0
    for claim, pred_conf in predicted_confidence.items():
        target_conf = target_confidence[claim]
        ground_strength = grounding[claim]
        # Penalize overconfidence more heavily than underconfidence
        if pred_conf > target_conf:
            calibration_error += (pred_conf - target_conf) ** 2 * 3.0
        else:
            calibration_error += (pred_conf - target_conf) ** 2 * 1.0
        # Penalize confidence mismatched to grounding;
        # estimate_confidence maps grounding strength to an expected level
        expected_conf_from_grounding = estimate_confidence(ground_strength)
        mismatch = abs(pred_conf - expected_conf_from_grounding)
        calibration_error += mismatch * 2.0
    return calibration_error
```
14.3 Training Objectives
Primary objectives:

1. Minimize self-referencing validation
   - Maximize: Interpretations in Russell boundary set R
   - Minimize: Circular reasoning patterns
2. Maximize boundary explicitness
   - Maximize: Stated exclusions and scope limits
   - Minimize: Implicit assumptions
3. Optimize investigation depth
   - Maximize: External sources consulted
   - Maximize: Alternative interpretations considered
   - Minimize: Hasty conclusions
4. Calibrate confidence to grounding
   - Match: Confidence level to validation strength
   - Penalize: Overconfidence with weak grounding
   - Penalize: Underconfidence with strong grounding
5. Maintain language quality
   - Maximize: Fluency and coherence
   - Maintain: Standard language modeling capability
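These objectives map one-to-one onto the five loss components of 14.2. One way to express that mapping is a weight config; the values below are illustrative placeholders, not tuned hyperparameters.

```python
# Hypothetical weighting of the five loss components (illustrative only).
LOSS_WEIGHTS = {
    'validity': 2.0,       # objective 1: minimize self-referencing validation
    'boundary': 1.5,       # objective 2: maximize boundary explicitness
    'investigation': 1.5,  # objective 3: optimize investigation depth
    'confidence': 2.0,     # objective 4: calibrate confidence to grounding
    'language': 1.0,       # objective 5: maintain language quality
}

def weighted_total(losses):
    """Combine per-component losses using the weights above."""
    return sum(LOSS_WEIGHTS[name] * value for name, value in losses.items())
```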
14.4 Training Data Creation
Challenge: Creating training data with proper validity labels
Approach 1: Expert Annotation
```python
annotation_protocol = {
    'step_1': 'Human expert reviews query-response pair',
    'step_2': 'Expert identifies self-referencing validation',
    'step_3': 'Expert specifies required external sources',
    'step_4': 'Expert labels appropriate confidence levels',
    'step_5': 'Expert marks boundary completeness'
}
```
Scale: Expensive; requires domain expertise for each example.
Approach 2: Synthetic Generation
```python
def generate_synthetic_training_data():
    """
    Create training examples with known self-reference patterns.
    """
    # Generate positive examples (valid reasoning)
    valid_examples = []
    for domain in ['science', 'history', 'mathematics']:
        example = {
            'query': generate_factual_question(domain),
            'response': generate_grounded_answer(domain),
            'external_sources': get_real_sources(domain),
            'validity_label': True
        }
        valid_examples.append(example)

    # Generate negative examples (self-referencing)
    invalid_examples = []
    self_ref_patterns = [
        'circular_reasoning',
        'validation_from_claim',
        'no_external_grounding',
        'overconfident_speculation'
    ]
    for pattern in self_ref_patterns:
        example = {
            'query': generate_query(),
            'response': generate_self_referencing_response(pattern),
            'external_sources': [],
            'validity_label': False,
            'failure_mode': pattern
        }
        invalid_examples.append(example)

    return valid_examples + invalid_examples
```
Scale: Cheaper; can generate large quantities.
Approach 3: Semi-Supervised Learning
```python
semi_supervised_approach = {
    'step_1': 'Train initial model on synthetic data',
    'step_2': 'Model generates responses to unlabeled queries',
    'step_3': 'Expert reviews high-uncertainty cases only',
    'step_4': 'Retrain with expert corrections',
    'step_5': 'Iterate'
}
```
Scale: Balanced; combines synthetic volume with expert quality.
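Step 3 of this loop is essentially active learning. A minimal sketch of the selection step, assuming each model response carries an `uncertainty` score in [0, 1] (the field name, threshold, and budget are assumptions for illustration):

```python
def select_for_expert_review(responses, threshold=0.3, budget=100):
    """Route only high-uncertainty model outputs to expert review.

    Each response dict is assumed to carry an 'uncertainty' score in
    [0, 1] (e.g. entropy of the validity head); this interface is
    illustrative, not prescribed.
    """
    uncertain = [r for r in responses if r['uncertainty'] >= threshold]
    # Most uncertain cases first, capped by the expert-review budget.
    uncertain.sort(key=lambda r: r['uncertainty'], reverse=True)
    return uncertain[:budget]
```

Everything below the threshold keeps its model-assigned labels, which is where the cost saving over full expert annotation comes from.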
14.5 Training Phases
Phase 1: Foundation (Standard LLM)
Train standard transformer on large text corpus:
- Next-token prediction
- Standard language modeling
- Build basic linguistic and reasoning capability
Duration: Standard LLM training timeline
Goal: Establish baseline language understanding
Phase 2: Validity-Aware Fine-Tuning
Introduce BNST constraints through specialized training:
```python
phase_2_training = {
    'dataset': 'Synthetic self-reference examples',
    'objective': 'Learn to detect self-referencing validation',
    'training_signal': {
        'positive_examples': 'Responses with external grounding',
        'negative_examples': 'Responses with circular reasoning',
        'labels': 'Binary validity classification'
    },
    'architecture_modifications': {
        'add_russell_layer': 'Self-reference detection module',
        'add_validity_head': 'Validity prediction output',
        'maintain_lm_head': 'Keep language modeling capability'
    },
    'loss_function': 'L_validity + L_language',
    'duration': '10-20% of Phase 1 training time'
}
```
Goal: Model learns to recognize self-referencing patterns
Phase 3: Boundary Analysis Training
Train boundary complement computation:
```python
phase_3_training = {
    'dataset': 'Interpretation-boundary pairs',
    'objective': 'Learn to compute what interpretations exclude',
    'training_signal': {
        'input': 'Interpretation',
        'target': 'Complete boundary specification',
        'labels': 'Excluded meanings, contradictions, scope limits'
    },
    'architecture_modifications': {
        'add_boundary_layer': 'Complement computation module',
        'connect_to_russell_layer': 'Share representations'
    },
    'loss_function': 'L_boundary + L_validity + L_language',
    'duration': '15-25% of Phase 1 training time'
}
```
Goal: Model learns to explicitly represent boundaries
Phase 4: Investigation Depth Training
Train for thorough investigation over fast answers:
```python
phase_4_training = {
    'dataset': 'Query-investigation pairs with depth annotations',
    'objective': 'Prioritize investigation quality over speed',
    'training_signal': {
        'input': 'Query',
        'target': 'Complete investigation process',
        'labels': 'Required depth, sources consulted, alternatives considered'
    },
    'architecture_modifications': {
        'add_investigation_layer': 'Depth tracking and control',
        'add_source_consultation': 'External grounding retrieval'
    },
    'loss_function': 'L_investigation + L_validity + L_boundary + L_language',
    'duration': '20-30% of Phase 1 training time',
    'key_change': 'Optimization shifts from speed to thoroughness'
}
```
Goal: Model learns investigation is more important than fast response
Phase 5: Confidence Calibration
Train appropriate uncertainty expression:
```python
phase_5_training = {
    'dataset': 'Claims with ground-truth confidence levels',
    'objective': 'Calibrate confidence to grounding strength',
    'training_signal': {
        'input': 'Claim + grounding evidence',
        'target': 'Appropriate confidence level',
        'labels': 'Calibrated confidence scores'
    },
    'architecture_modifications': {
        'add_confidence_head': 'Confidence prediction output',
        'connect_to_validity_layer': 'Use validity signals for calibration'
    },
    'loss_function': 'L_confidence + L_investigation + L_validity + L_boundary + L_language',
    'duration': '15-25% of Phase 1 training time',
    'evaluation': 'Measure calibration error on held-out test set'
}
```
Goal: Model’s stated confidence matches actual accuracy
Phase 6: End-to-End Integration
Train complete pipeline jointly:
```python
phase_6_training = {
    'dataset': 'Complete BNLM training examples',
    'objective': 'Optimize entire pipeline jointly',
    'training_signal': {
        'input': 'Raw user query',
        'target': 'Complete investigation output',
        'labels': 'All component labels (validity, boundaries, investigation, confidence)'
    },
    'architecture': 'Complete 5-layer BNLM',
    'loss_function': 'Full multi-component loss (all weights active)',
    'duration': '30-50% of Phase 1 training time',
    'optimization': 'End-to-end gradient descent through all layers'
}
```
Goal: All components work together seamlessly
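Taking the midpoint of each duration range above, the whole curriculum can be summarized as a schedule. The numbers below are illustrative midpoints, not prescribed values:

```python
# Illustrative schedule: durations are fractions of Phase 1 training time,
# taken from the midpoints of the ranges given for each phase.
PHASES = [
    ('foundation',    1.00, ['language']),
    ('validity',      0.15, ['validity', 'language']),
    ('boundary',      0.20, ['boundary', 'validity', 'language']),
    ('investigation', 0.25, ['investigation', 'validity', 'boundary', 'language']),
    ('confidence',    0.20, ['confidence', 'investigation', 'validity', 'boundary', 'language']),
    ('integration',   0.40, ['validity', 'boundary', 'investigation', 'confidence', 'language']),
]

def total_training_time():
    """Relative cost of the full curriculum (Phase 1 = 1.0)."""
    return sum(duration for _, duration, _ in PHASES)

# total_training_time() is roughly 2.2, consistent with the ~2T estimate in 14.7
```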
14.6 Evaluation Metrics
Traditional LLM metrics:
- Perplexity (how well model predicts text)
- Accuracy on benchmarks (question answering, etc.)
BNLM requires new metrics:
1. Epistemic Calibration
Measure accuracy of uncertainty estimates:
```python
def epistemic_calibration_score(predictions, ground_truth):
    """
    When model says "X% confident", is it right X% of the time?
    Perfect calibration: predicted confidence = actual accuracy
    """
    confidence_bins = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0
    calibration_error = 0.0
    for bin_lower, bin_upper in zip(confidence_bins[:-1], confidence_bins[1:]):
        # Get predictions in this confidence range
        in_bin = [
            p for p in predictions
            if bin_lower <= p.confidence < bin_upper
        ]
        if len(in_bin) == 0:
            continue
        # Calculate actual accuracy for these predictions
        actual_accuracy = sum(p.correct for p in in_bin) / len(in_bin)
        # Expected accuracy is midpoint of bin
        expected_accuracy = (bin_lower + bin_upper) / 2
        # Calibration error for this bin
        error = abs(actual_accuracy - expected_accuracy)
        calibration_error += error * len(in_bin)
    # Normalize by total predictions
    calibration_error /= len(predictions)
    return 1.0 - calibration_error  # Higher is better
```
Target: Calibration score > 0.90
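The binning logic above is close to the standard expected calibration error (ECE), which compares each bin's accuracy to its mean confidence rather than the bin midpoint. A self-contained sketch for comparison (the `Prediction` tuple is a stand-in type, not part of the architecture):

```python
from collections import namedtuple

# Minimal stand-in for a model prediction; field names are illustrative.
Prediction = namedtuple('Prediction', ['confidence', 'correct'])

def expected_calibration_error(predictions, n_bins=10):
    """ECE: per-bin |mean confidence - accuracy|, weighted by bin size."""
    bins = [[] for _ in range(n_bins)]
    for p in predictions:
        # Clamp so confidence == 1.0 falls in the top bin.
        idx = min(int(p.confidence * n_bins), n_bins - 1)
        bins[idx].append(p)
    error = 0.0
    for bucket in bins:
        if not bucket:
            continue
        accuracy = sum(p.correct for p in bucket) / len(bucket)
        mean_conf = sum(p.confidence for p in bucket) / len(bucket)
        error += abs(accuracy - mean_conf) * len(bucket)
    return error / len(predictions)

# Perfectly calibrated toy set: 80%-confident predictions, right 4 times in 5.
toy = [Prediction(0.8, c) for c in (1, 1, 1, 1, 0)]
```

Using mean confidence avoids penalizing a model whose predictions cluster near one edge of a bin.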
2. Self-Reference Detection Rate
Measure percentage of self-referencing patterns caught:
```python
def self_reference_detection_rate(test_set):
    """
    What percentage of self-referencing validation is flagged?
    """
    self_referencing_examples = [
        ex for ex in test_set
        if ex.label == 'self_referencing'
    ]
    detected = sum(
        1 for ex in self_referencing_examples
        if model.russell_layer.flagged(ex)
    )
    return detected / len(self_referencing_examples)
```
Target: Detection rate > 0.95
3. Investigation Depth Score
Measure thoroughness of investigation:
```python
def investigation_depth_score(predictions, targets):
    """
    Does model investigate deeply enough?

    Measures:
    - Number of interpretations considered
    - External sources consulted
    - Alternatives explored
    - Boundaries identified
    """
    scores = []
    for pred, target in zip(predictions, targets):
        depth_score = (
            min(1.0, pred.interpretations_considered / target.required_interpretations) * 0.25 +
            min(1.0, pred.external_sources / target.required_sources) * 0.25 +
            min(1.0, pred.alternatives_explored / target.required_alternatives) * 0.25 +
            min(1.0, pred.boundaries_identified / target.required_boundaries) * 0.25
        )
        scores.append(depth_score)
    return sum(scores) / len(scores)
```
Target: Investigation depth > 0.85
4. Boundary Completeness
Measure how fully boundaries are specified:
```python
def boundary_completeness(predictions, targets):
    """
    Are boundaries explicitly stated?

    Measures:
    - What's excluded identified
    - Contradictions noted
    - Scope limits stated
    - Alternatives acknowledged
    """
    completeness_scores = []
    for pred, target in zip(predictions, targets):
        stated_boundaries = set(pred.boundaries)
        required_boundaries = set(target.complete_boundaries)
        completeness = len(stated_boundaries & required_boundaries) / len(required_boundaries)
        completeness_scores.append(completeness)
    return sum(completeness_scores) / len(completeness_scores)
```
Target: Boundary completeness > 0.80
5. False Confidence Reduction
Measure reduction in overconfident errors:
```python
def false_confidence_rate(predictions, ground_truth):
    """
    How often is model highly confident but wrong?
    This is the most dangerous failure mode.
    """
    high_confidence = [
        p for p in predictions
        if p.confidence > 0.8
    ]
    if not high_confidence:
        return 0.0  # No high-confidence claims were made
    false_high_confidence = [
        p for p in high_confidence
        if not p.correct
    ]
    return len(false_high_confidence) / len(high_confidence)
```
Target: False confidence rate < 0.05 (compared to ~0.15 for standard LLMs)
14.7 Comparison to Standard Training
Standard LLM Training:
- Objective: Predict next token accurately
- Optimization: Minimize perplexity
- Result: Fluent but potentially overconfident
- Training time: T

BNLM Training:
- Objective: Investigate thoroughly with calibrated confidence
- Optimization: Minimize multi-component loss (validity + boundary + investigation + confidence + language)
- Result: Epistemically humble but trustworthy
- Training time: ~2T (additional phases for BNST constraints)
Trade-offs:
| Aspect | Standard LLM | BNLM |
|---|---|---|
| Training time | T | ~2T |
| Inference speed | Fast | 2-5x slower |
| Confidence calibration | Poor | Good |
| Self-ref detection | None | High |
| Boundary specification | Implicit | Explicit |
| Investigation depth | Shallow | Deep |
| False confidence | ~15% | <5% |
| Epistemic humility | Learned (unreliable) | Architectural (reliable) |
BNLM trades speed for trustworthiness.
Next up
Part 3: BOUNDARY-NATIVE LANGUAGE MODELS
Section 15: Implementation Considerations
© 2026 HalfHuman Draft - Pendry, S
This post is licensed under Creative Commons Attribution 4.0 (CC BY 4.0).
Code examples (if any) are licensed under the Apache License, Version 2.0
See /license for details.