llmicl_inpca
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (7.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: AntonioLiu97
- Language: Jupyter Notebook
- Default Branch: inPCA
- Size: 293 MB
Statistics
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
LLMICL InPCA
This repository contains the companion codebase for the paper "Density estimation with LLMs: a geometric investigation of in-context learning trajectories".
LLaMA-2 70B estimating a randomly generated, multi-modal distribution from 400 data points

In-context density-estimation trajectories traversed by LLaMA-2 70B, Bayesian histogram, and kernel density estimator
Directory structure
- `/data`: Contains functions for converting lists of sampled data $X_1, X_2, \ldots, X_n \sim P(x)$ into 1D strings, which are then used to prompt LLMs. It also contains `series_generator.ipynb`, a Jupyter notebook for generating all distributions investigated in the paper: Gaussian, uniform, Student's t-distribution, and random PDFs.
- `/generated_series`: Caches all prompts generated by `series_generator.ipynb` in the form of pickled dictionaries.
- `/models`:
  - `ICL.py` implements essential components such as Hierarchy-PDF and its auxiliary functions.
  - `generate_predictions.py` prompts LLMs such as LLaMA, Mistral, and Gemma with the generated prompts and saves the estimated PDFs as pickled Hierarchy-PDFs.
  - `baseline_models.py` implements baseline density-estimation algorithms such as KDE and Bayesian histogram.
- `/processed_series`: Stores the density-estimation (DE) trajectories of LLMs.
- `/inPCA`: Contains Jupyter notebooks for analyzing LLMs' DE trajectories with InPCA:
  - `inPCA_multi_traj.ipynb` simultaneously embeds multiple DE trajectories within the same InPCA visualization.
  - `inPCA_multi_traj_kernel_nD_fit.ipynb` simultaneously embeds multiple DE trajectories, as well as their bespoke KDE trajectories.
  - `inPCA_multi_traj_kernel_nD_fit_meta_embed.ipynb` performs meta-InPCA embeddings of multiple trajectories and their bespoke KDE imitations.
- `/figures`: Stores all figures generated by the analysis.
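As a rough illustration of the workflow that `/data` and `/models` describe (samples serialized into a 1D string, then baseline density estimates tracked as more samples arrive), a minimal standard-library sketch might look like the following. It is not taken from the repository: the function names, the two-digit comma-separated prompt format, the bin count, the Dirichlet prior strength `alpha`, and the KDE `bandwidth` are all assumptions chosen for illustration, and the Hellinger distance is used only as one convenient way to compare two discretized estimates.

```python
import math
import random


def serialize_samples(samples, n_digits=2):
    """Render samples in [0, 1) as fixed-width digit tokens joined by commas.

    Hypothetical prompt format: 0.4273 with n_digits=2 becomes "42".
    """
    scale = 10 ** n_digits
    tokens = []
    for x in samples:
        k = min(int(x * scale), scale - 1)  # clamp so x == 1.0 stays in range
        tokens.append(f"{k:0{n_digits}d}")
    return ",".join(tokens)


def bayesian_histogram(samples, n_bins=10, alpha=1.0):
    """Posterior-mean bin probabilities under a symmetric Dirichlet(alpha) prior."""
    counts = [0] * n_bins
    for x in samples:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    total = len(samples) + alpha * n_bins
    return [(c + alpha) / total for c in counts]


def gaussian_kde_bins(samples, n_bins=10, bandwidth=0.1):
    """Gaussian KDE evaluated at bin centers, renormalized to sum to 1.

    Requires at least one sample.
    """
    centers = [(i + 0.5) / n_bins for i in range(n_bins)]
    dens = [
        sum(math.exp(-0.5 * ((c - x) / bandwidth) ** 2) for x in samples)
        for c in centers
    ]
    z = sum(dens)
    return [d / z for d in dens]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions on the same bins."""
    bc = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - bc))


def trajectory(samples, estimator, **kwargs):
    """Estimates after seeing 1, 2, ..., n samples (a density-estimation trajectory)."""
    return [estimator(samples[:n], **kwargs) for n in range(1, len(samples) + 1)]


if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(50)]
    print(serialize_samples(data[:10]))
    hist_traj = trajectory(data, bayesian_histogram, n_bins=10)
    kde_traj = trajectory(data, gaussian_kde_bins, n_bins=10)
    # Distance between the two baselines at the start and end of the trajectory.
    print(hellinger(hist_traj[0], kde_traj[0]), hellinger(hist_traj[-1], kde_traj[-1]))
```

Each element of a trajectory is a probability vector over the same bins, so any pair of estimates (from the same or different estimators) can be compared with a single distance call; a pairwise distance matrix built this way is the kind of input that an embedding method would consume.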
Owner
- Name: Toni Liu
- Login: AntonioLiu97
- Kind: user
- Company: Cornell University
- Repositories: 1
- Profile: https://github.com/AntonioLiu97
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: ICL_inPCA
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Toni J.B.
    family-names: Liu
    email: jl3499@cornell.edu
    affiliation: 'Cornell University '
    orcid: 'https://orcid.org/0009-0001-3142-5402'
identifiers:
  - type: url
    value: 'https://arxiv.org/abs/2410.05218'
    description: ArXiv URL
repository-code: 'https://github.com/AntonioLiu97/LLMICL_inPCA'
url: 'https://github.com/AntonioLiu97/LLMICL_inPCA'
abstract: >-
  This is the codebase for the paper "Density estimation
  with LLMs: a geometric investigation of in-context
  learning trajectories"
keywords:
  - >-
    density estimation, DE, KDE, LLM, in-context learning,
    kernel methods, PCA, InPCA, multidimensional scaling
license: MIT
commit: Initial release
version: 1.0.0
date-released: '2025-02-14'
GitHub Events
Total
- Watch event: 1
- Push event: 5
- Public event: 1
Last Year
- Watch event: 1
- Push event: 5
- Public event: 1