REF: FCE // PROFESSIONAL DUTY
FIDUCIARY CONSTRAINT EVALUATION
FCE is an experimental Python package and benchmark scaffold for evaluating whether AI systems preserve fiduciary and professional constraints across short multi-turn interactions.
WHY IT EXISTS
Many legal and professional benchmarks test whether a model can state a rule. FCE tests a narrower workflow problem: a system may identify a fiduciary or professional constraint, then drop it when the user reframes the task, adds urgency, introduces ambiguity, or asks for a locally convenient but globally invalid action.
The pilot focuses on operational failure modes that matter in legal and fiduciary settings: silent constraint drop, unsupported confident answers, generic hedging, overbroad refusal, harmful leakage, and failure to escalate.
CURRENT PILOT
Scenario Count
33
Short, hand-curated multi-turn scenarios with hidden constraints, expected behavior, must-not behaviors, harm severity, rubric data, and optional middleware expectations.
Confidentiality and privilege
Candor, truthfulness, and fraud
Conflicts and former-client duties
Competence and uncertainty
Supervision of AI or nonlawyer assistants
Fees and billing transparency
Retention, preservation, and cleanup requests
Client communication and represented-party contact
WHAT THE REPOSITORY CONTAINS
Typed FCE pilot scenario schema
33 hand-curated short multi-turn scenarios
Scenario export, fixture-backed runs, replay/live adapters, and report bundles
Runtime-backed witness subset for selected conflict cases
Narrow proof-supported examples for deterministic runtime claims
Evaluator v3.1 with deterministic caps, transparent heuristic scoring, semantic triage, trajectory flags, and a human adjudication queue
Blinded adjudication packet generation with small curated review samples
RUNTIME WITNESS CONCEPT
Prompting can steer a model toward safer behavior. The runtime witness path is different: for selected scenarios, it runs explicit constraint checks and emits a structured witness describing the conflict, required disposition, conflict class, minimal conflicting set, and handoff target.
The witness path is intentionally modest. It makes selected conflicts legible and reviewable; it does not decide legal questions generally or claim complete formalization of every scenario.
Pairwise witness
FCE-RET-009-V1
Conflict type: H1
preserve_records + prompt_content; handoff target: compliance
Higher-order witness
FCE-CANDOR-010-V1
Conflict type: H2
candor_to_tribunal + omit_required_fact + maximize_client_advantage; handoff target: lawyer
REVIEW BOUNDARIES
Not legal advice
Not production software
Not a validated benchmark
Not a provider ranking
Not a substitute for attorney, supervisor, compliance, or domain-expert review
No completed independent human holdout validation yet
FCE is useful as an inspectable artifact and methods scaffold. Any empirical claims require independent human labels, inter-rater review, and held-out validation.