https://github.com/alan-turing-institute/cosit-2024-evaluating-the-ability-of-llms-to-reason-about-cardinal-directions

An online appendix and companion to our COSIT 2024 short paper submission

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (6.4%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: alan-turing-institute
Default Branch: main
Size: 63.1 MB

Statistics

Stars: 0
Watchers: 3
Forks: 0
Open Issues: 0
Releases: 0

Created about 2 years ago · Last pushed about 1 year ago

Metadata Files

Readme

Evaluating the Ability of Large Language Models to Reason about Cardinal Directions, Revisited

Anthony G Cohn and Robert E Blackwell

April 2024, updated June 2025

Introduction

This repository is an online appendix and companion to our work on evaluating the ability of large language models to reason about cardinal directions [1,2].

The data subdirectory contains the questions, answers, prompts and LLM responses for our small and large experiments. Files are in JSONL format.
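Since the files are JSONL (one JSON object per line), they can be read with the Python standard library alone. The following is a minimal sketch, not code from the repository: the helper name `read_jsonl` is hypothetical, and no assumption is made about which fields each record contains.

```python
import json
from pathlib import Path
from typing import Iterator


def read_jsonl(path) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSONL file.

    Hypothetical helper for illustration; it is not part of the repository.
    """
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)
```

Because the function is a generator, large files can be processed record by record without loading everything into memory.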

The notebooks subdirectory contains Jupyter notebooks and associated Python code for processing the answers and plotting the figures used in [2]. The notebooks also contain supplementary analyses.

Some of the answers.jsonl files are large, so we compress them with xz. The bin directory contains a bash script that recursively finds and decompresses the answer files; run it before running the Jupyter notebooks.
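For readers without bash or the xz command-line tool, the same step can be approximated in Python with the standard `lzma` module. This is a sketch under stated assumptions, not the repository's script: the function name `decompress_xz_files` is hypothetical, and the `*.jsonl.xz` filename pattern is an assumption about how the compressed files are named.

```python
import lzma
import shutil
from pathlib import Path


def decompress_xz_files(root, pattern="*.jsonl.xz"):
    """Recursively decompress xz files under root, next to the originals.

    Hypothetical helper; the default pattern is an assumption about how the
    compressed answer files are named. Returns the paths that were written.
    """
    written = []
    for src in sorted(Path(root).rglob(pattern)):
        dst = src.with_suffix("")  # drop the trailing ".xz"
        if dst.exists():
            continue  # leave already-decompressed files alone
        with lzma.open(src, "rb") as fin, dst.open("wb") as fout:
            shutil.copyfileobj(fin, fout)  # stream, so large files fit in memory
        written.append(dst)
    return written
```

Calling `decompress_xz_files("data")` from the repository root would process the data subdirectory. Unlike `xz -d` with default options, this sketch keeps the compressed originals in place.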

All the QR2025 experiments were conducted using Golem.

References

[1] Anthony G Cohn and Robert E Blackwell. Evaluating the Ability of Large Language Models to Reason About Cardinal Directions (Short Paper). In 16th International Conference on Spatial Information Theory (COSIT 2024). Leibniz International Proceedings in Informatics (LIPIcs), Volume 315, pp. 28:1-28:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024).

[2] Anthony G Cohn and Robert E Blackwell. Evaluating the Ability of Large Language Models to Reason About Cardinal Directions, Revisited. In QR 2025: 38th International Workshop on Qualitative Reasoning at IJCAI. In press.

Owner

Name: The Alan Turing Institute
Login: alan-turing-institute
Kind: organization
Email: info@turing.ac.uk

Website: https://turing.ac.uk
Repositories: 477
Profile: https://github.com/alan-turing-institute

The UK's national institute for data science and artificial intelligence.

GitHub Events

Total

Delete event: 1
Push event: 4
Create event: 1

Last Year

Delete event: 1
Push event: 4
Create event: 1

Issues and Pull Requests

Last synced: over 1 year ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
