simulated-selfhood-llms
Code and data for the behavioral evaluation of introspective coherence in LLMs.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: josealprestes
- License: MIT
- Language: Python
- Default Branch: main
- Size: 269 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
LLM Self-Reference Analysis
This repository contains the code, data, and figures for the paper "Simulated Selfhood in LLMs: A Behavioral Analysis of Introspective Coherence (Preprint Version)".
The study evaluates introspective simulation across five open-weight Large Language Models (LLMs), using repeated prompts and a three-stage evaluation pipeline (textual, semantic, and inferential).
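The first (textual) stage of such a pipeline can be sketched with the standard library alone. This is a minimal illustration, not code from `src/`: the function and variable names are hypothetical, and the semantic and inferential stages would additionally require a Sentence-BERT encoder and an NLI model, which are omitted here.

```python
from difflib import SequenceMatcher
from itertools import combinations

def surface_similarity(a: str, b: str) -> float:
    # Token-overlap ratio in [0, 1] between two completions,
    # computed over whitespace tokens with SequenceMatcher.
    return SequenceMatcher(None, a.split(), b.split()).ratio()

def mean_pairwise_similarity(completions: list[str]) -> float:
    # Average surface similarity over all unordered pairs of the
    # repeated completions for a single prompt.
    pairs = list(combinations(completions, 2))
    return sum(surface_similarity(a, b) for a, b in pairs) / len(pairs)

# Hypothetical repeated completions for one introspective prompt.
responses = [
    "I do not have subjective experiences.",
    "I do not have any subjective experiences.",
    "As a language model, I lack inner experience.",
]
score = mean_pairwise_similarity(responses)
```

A high mean score across a prompt's repetitions indicates near-verbatim repetition; the later stages then check whether semantically similar but lexically different answers remain coherent and non-contradictory.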
Project Structure
- `src/`: all Python scripts used in the experiment.
- `outputs/`: JSON and CSV files with model responses.
- `results/`: final figures and computed analyses.
- `models/`: empty folder with a README giving model download instructions.
- `requirements.txt`: Python dependencies.
- `LICENSE`: open license for reuse.
Reproducing the Experiment
1. Clone this repo:

```bash
git clone https://github.com/josealprestes/simulated-selfhood-llms.git
cd simulated-selfhood-llms
```

2. (Optional) Create and activate a virtual environment:

```bash
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
```

3. Install dependencies:

```bash
pip install -r requirements.txt
```

4. Download the models (see `models/README.md`).

5. Run the experiment:

```bash
python src/main.py
```
Preprint
The associated preprint will be available soon. Once published, the link (DOI) will be added here.
License
Distributed under the MIT License. See LICENSE for details.
Owner
- Login: josealprestes
- Kind: user
- Repositories: 1
- Profile: https://github.com/josealprestes
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this repository, please cite as below."
title: "Simulated Selfhood in LLMs: A Behavioral Analysis of Introspective Coherence"
authors:
- family-names: "de Lima Prestes"
given-names: "José Augusto"
orcid: "https://orcid.org/0000-0001-8686-5360"
date-released: "2025-07-26"
version: "v2-preprint"
repository-code: "https://github.com/josealprestes/simulated-selfhood-llms"
url: "https://github.com/josealprestes/simulated-selfhood-llms"
license: "MIT"
type: "software"
abstract: "Large Language Models (LLMs) increasingly generate outputs that resemble introspection,
including self-reference, epistemic modulation, and claims about their internal
states. This study investigates whether such behaviors reflect consistent, underlying patterns
or are merely surface-level generative artifacts. We evaluated five open-weight, stateless LLMs
using a structured battery of 21 introspective prompts, each repeated ten times to yield 1,050
completions. These outputs were analyzed across four behavioral dimensions: surface-level
similarity (token overlap via SequenceMatcher), semantic coherence (Sentence-BERT embeddings),
inferential consistency (Natural Language Inference with a RoBERTa-large model),
and diachronic continuity (stability across prompt repetitions). Although some models exhibited
thematic stability, particularly on prompts concerning identity and consciousness, no
model sustained a consistent self-representation over time. High contradiction rates emerged
from a tension between mechanistic disclaimers and anthropomorphic phrasing. Following
recent behavioral frameworks, we heuristically adopt the term pseudo-consciousness to describe
structured yet non-experiential self-referential output in LLMs. This usage reflects
a functionalist stance that avoids ontological commitments, focusing instead on behavioral
regularities interpretable through Dennett's intentional stance. The study contributes a reproducible
framework for evaluating simulated introspection in LLMs and offers a graded
taxonomy for classifying such reflexive output. Our findings carry significant implications
for LLM interpretability, alignment, and user perception, highlighting the need for caution
when attributing mental states to stateless generative systems based on linguistic fluency
alone."
GitHub Events
Total
- Push event: 16
- Create event: 2
Last Year
- Push event: 16
- Create event: 2