obfuscation-semantics
Repository
Basic Info
- Host: GitHub
- Owner: lodetomasi1995
- License: mit
- Language: Python
- Default Branch: main
- Size: 66.4 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Obfuscation Semantics
A framework for evaluating LLM-driven code obfuscation using novel metrics like Semantic Elasticity.
Overview
This repository contains the code and resources for the paper "Simplicity by Obfuscation: Evaluating LLM-Driven Code Transformation with Semantic Elasticity". It provides a comprehensive framework for:
- Evaluating code obfuscation capabilities of different LLMs
- Comparing standard and few-shot prompting approaches
- Measuring obfuscation effectiveness using novel metrics
- Benchmarking across diverse algorithmic patterns
Features
- Multi-Model Support: Test Claude-3.5-Sonnet, Gemini-1.5, and GPT-4-Turbo
- Diverse Algorithmic Patterns: 30 functions across 5 categories
- Comprehensive Metrics: Includes pass rate, complexity, entropy, timing, and Semantic Elasticity
- Configurable Experiments: Easily customize which models, functions, and approaches to test
- Detailed Analysis: Generate visualizations and statistical comparisons
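Two of the metrics listed above, complexity and entropy, can be approximated with the standard library alone. The sketch below is illustrative and is not taken from the repository: the function names, the set of AST nodes counted as decision points, and the choice of character-level entropy are all assumptions.

```python
import ast
import math
from collections import Counter

# Node types treated as decision points (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp,
                 ast.ExceptHandler, ast.comprehension)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` is one BoolOp with three values: two extra paths.
            complexity += len(node.values) - 1
    return complexity


def shannon_entropy(source: str) -> float:
    """Character-level Shannon entropy in bits per character."""
    if not source:
        return 0.0
    total = len(source)
    counts = Counter(source)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Comparing these values before and after obfuscation gives a rough picture of how much a transformation changed the control flow and the character distribution of a function.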
Requirements
- Python 3.9+
- Required packages (install via pip install -r requirements.txt):
  - anthropic
  - google-generativeai
  - openai
  - pandas
  - numpy
  - matplotlib
  - seaborn
  - scipy
Setup
Clone the repository:
bash
git clone https://github.com/yourusername/obfuscation-semantics.git
cd obfuscation-semantics
Install dependencies:
bash
pip install -r requirements.txt
Configure API keys:
- Create a copy of config/example_config.json as config/config.json
- Add your API keys for Claude, Gemini, and GPT-4
Usage
Running Experiments
bash
python src/experiment_runner.py --config config/config.json
You can also pass API keys directly:
bash
python src/experiment_runner.py --anthropic-key YOUR_ANTHROPIC_KEY --gemini-key YOUR_GEMINI_KEY --openai-key YOUR_OPENAI_KEY
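One way the two invocation styles above could be reconciled is sketched below. The flag names come from the commands shown; everything else is an assumption made for illustration, including the JSON field names (anthropic_key, gemini_key, openai_key), the helper names, and the rule that command-line keys override the config file. The actual experiment_runner.py may behave differently.

```python
import argparse
import json
from pathlib import Path

KEY_NAMES = ("anthropic_key", "gemini_key", "openai_key")  # hypothetical field names


def build_parser() -> argparse.ArgumentParser:
    """Accept the flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Run obfuscation experiments")
    parser.add_argument("--config", type=Path, default=None)
    parser.add_argument("--anthropic-key")
    parser.add_argument("--gemini-key")
    parser.add_argument("--openai-key")
    return parser


def resolve_keys(args: argparse.Namespace) -> dict:
    """Read keys from the config file, then let CLI flags override them."""
    keys = {}
    if args.config is not None and args.config.exists():
        # Assumed layout: a flat JSON object holding the three key fields.
        keys.update(json.loads(args.config.read_text()))
    for name in KEY_NAMES:
        value = getattr(args, name)
        if value:
            keys[name] = value
    return keys
```

For example, `resolve_keys(build_parser().parse_args(["--openai-key", "abc"]))` yields a dictionary containing only the OpenAI key.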
Analyzing Results
bash
python analysis/analysis.py --results-dir results/data
Generate visualizations:
bash
python analysis/visualize.py --results-dir results/data
Project Structure
obfuscation-semantics/
├── README.md
├── CITATION.cff
├── LICENSE
├── requirements.txt
├── config/
│ └── example_config.json
├── src/
│ ├── code_obfuscator.py
│ ├── semantic_elasticity.py
│ ├── experiment_runner.py
│ ├── test_cases.py
│ └── sample_functions.py
├── analysis/
│ ├── analysis.py
│ └── visualize.py
├── examples/
│ ├── standard_prompt_examples/
│ └── few_shot_examples/
└── results/
├── figures/
└── data/
Core Components
- code_obfuscator.py: Handles interaction with LLMs for code obfuscation
- semantic_elasticity.py: Implements the novel Semantic Elasticity metric
- experiment_runner.py: Coordinates execution of trials and collects results
- test_cases.py: Provides comprehensive test cases for function validation
- sample_functions.py: Contains the 30 functions used for benchmarking
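The validation step that test_cases.py supports can be pictured as running an obfuscated function against input/expected pairs and recording the fraction that pass. The following is a minimal sketch under assumptions: the helper names, the test-case layout, and the decision to count exceptions as failures are hypothetical and not confirmed by the repository.

```python
from typing import Any, Callable, Iterable, Tuple

TestCase = Tuple[tuple, Any]  # (positional arguments, expected result)


def load_function(source: str, name: str) -> Callable[..., Any]:
    """Compile source text and return the function called `name`."""
    namespace: dict = {}
    exec(source, namespace)  # only run trusted or sandboxed code this way
    return namespace[name]


def pass_rate(func: Callable[..., Any], cases: Iterable[TestCase]) -> float:
    """Fraction of test cases for which func returns the expected value."""
    cases = list(cases)
    if not cases:
        return 0.0
    passed = 0
    for args, expected in cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            # A crash in the transformed code counts as a failed case.
            pass
    return passed / len(cases)
```

Because model output is untrusted code, a real harness would likely run it in a separate process with a timeout rather than calling exec in the main interpreter.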
Semantic Elasticity Metric
Our Semantic Elasticity (SE) metric quantifies a model's ability to radically transform code structure while maintaining functionality:
SE = |ΔCC| × P² / E
Where:
- |ΔCC|: Absolute change in cyclomatic complexity between the original and the obfuscated code
- P²: Square of the pass rate, which emphasizes functional correctness
- E: Code expansion ratio; it appears in the denominator, so larger expansion lowers SE
Higher values indicate more effective transformations that maintain functionality while significantly changing code structure.
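The formula translates directly into a few lines of Python. This sketch is not the repository's semantic_elasticity.py: the function signature and the input validation are assumptions, and the expansion ratio E is taken as a given input because how it is measured is not stated here.

```python
def semantic_elasticity(cc_original: int, cc_obfuscated: int,
                        pass_rate: float, expansion_ratio: float) -> float:
    """Compute SE = |ΔCC| × P² / E for one original/obfuscated pair."""
    if not 0.0 <= pass_rate <= 1.0:
        raise ValueError("pass_rate must lie in [0, 1]")
    if expansion_ratio <= 0:
        raise ValueError("expansion_ratio must be positive")
    delta_cc = abs(cc_obfuscated - cc_original)
    return delta_cc * pass_rate ** 2 / expansion_ratio


# Complexity goes from 3 to 11 (|ΔCC| = 8) and the code doubles in size (E = 2).
print(semantic_elasticity(3, 11, pass_rate=1.0, expansion_ratio=2.0))  # 4.0
print(semantic_elasticity(3, 11, pass_rate=0.5, expansion_ratio=2.0))  # 1.0
```

Halving the pass rate cuts SE by a factor of four, which shows how the squared term penalizes transformations that break functionality.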
Dataset
| Function Name | Category | Description | Algorithmic Pattern | Complexity | Uses Recursion | Parameters |
|---------------|----------|-------------|---------------------|------------|----------------|------------|
| factorial | Mathematical | Calculate factorial recursively | Recursive | O(n) | Yes | 1 |
| fibonacci | Mathematical | Calculate Fibonacci number | Recursive with overlapping subproblems | O(2ⁿ) | Yes | 1 |
| isprime | Mathematical | Check if number is prime | Conditional logic | O(√n) | No | 1 |
| gcd | Mathematical | Find greatest common divisor | Euclidean algorithm | O(log(min(a,b))) | Yes | 2 |
| lcm | Mathematical | Find least common multiple | Mathematical calculation | O(log(min(a,b))) | Yes (via gcd) | 2 |
| power | Mathematical | Calculate power recursively | Recursive exponentiation | O(log n) | Yes | 2 |
| sqrtnewton | Mathematical | Calculate square root | Newton's method | O(log n) | No | 2 |
| bubblesort | Sorting/Searching | Sort array using bubble sort | Nested iterations | O(n²) | No | 1 |
| binarysearch | Sorting/Searching | Search in sorted array | Divide-and-conquer | O(log n) | No | 2 |
| mergesort | Sorting/Searching | Sort using merge sort | Divide-and-conquer with recursion | O(n log n) | Yes | 1 |
| quicksort | Sorting/Searching | Sort using quick sort | Partition-based sorting | O(n log n) avg, O(n²) worst | Yes | 1 |
| insertionsort | Sorting/Searching | Sort using insertion | Iterative insertion | O(n²) | No | 1 |
| linearsearch | Sorting/Searching | Search in unsorted array | Simple iteration | O(n) | No | 2 |
| strreverse | String Manipulation | Reverse a string | Simple string manipulation | O(n) | No | 1 |
| ispalindrome | String Manipulation | Check if string is palindrome | String testing | O(n) | No | 1 |
| wordcount | String Manipulation | Count words in text | Basic text processing | O(n) | No | 1 |
| longestcommonsubstring | String Manipulation | Find common substring | Dynamic programming | O(m×n) | No | 2 |
| levenshteindistance | String Manipulation | Calculate edit distance | Edit distance algorithm | O(m×n) | No | 2 |
| countvowels | String Manipulation | Count vowels in string | Character filtering | O(n) | No | 1 |
| flattenlist | Data Structure | Flatten nested list | Recursive list transformation | O(n) | Yes | 1 |
| listpermutations | Data Structure | Generate all permutations | Combinatorial algorithm | O(n!) | Yes | 1 |
| dictmerge | Data Structure | Merge dictionaries recursively | Nested structure merging | O(n+m) | Yes | 2 |
| removeduplicates | Data Structure | Remove duplicates from list | Set operations | O(n) | No | 1 |
| rotatearray | Data Structure | Rotate array elements | Array manipulation | O(n) | No | 2 |
| towerofhanoi | Recursive | Solve Tower of Hanoi puzzle | Classic recursion problem | O(2ⁿ) | Yes | 1 |
| binarytreedepth | Recursive | Find max depth of binary tree | Tree traversal | O(n) | Yes | 1 |
| floodfill | Recursive | Perform flood fill on image | Graph traversal | O(n) | Yes | 4 |
| knapsack | Recursive | Solve knapsack problem | Optimization problem | O(n×W) | Yes | 3 |
| editdistance | Recursive | Calculate edit distance | String comparison | O(m×n) | No | 2 |
| coin_change | Recursive | Find minimum coins for amount | Dynamic programming | O(n×amount) | No | 2 |
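To make the columns concrete, here is how the gcd and lcm rows might look as code. This is a hypothetical sketch written from the table's description (recursive Euclidean algorithm, lcm computed via gcd, two parameters each); the versions in sample_functions.py may differ.

```python
def gcd(a: int, b: int) -> int:
    """Euclidean algorithm in recursive form: O(log(min(a, b))) calls."""
    return a if b == 0 else gcd(b, a % b)


def lcm(a: int, b: int) -> int:
    """Least common multiple, computed through gcd (hence 'Yes (via gcd)')."""
    if a == 0 or b == 0:
        return 0
    return abs(a * b) // gcd(a, b)


print(gcd(12, 18))  # 6
print(lcm(4, 6))    # 12
```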
License
This project is licensed under the MIT License - see the LICENSE file for details.
Owner
- Login: lodetomasi1995
- Kind: user
- Repositories: 1
- Profile: https://github.com/lodetomasi1995
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: De Tomasi
    given-names: Lorenzo
    affiliation: "University of L'Aquila"
    email: lorenzo.detomasi@graduate.univaq.it
  - family-names: Di Sipio
    given-names: Claudio
    affiliation: "University of L'Aquila"
    email: claudio.disipio@univaq.it
  - family-names: Di Marco
    given-names: Antinisca
    affiliation: "University of L'Aquila"
    email: antinisca.dimarco@univaq.it
  - family-names: Nguyen
    given-names: Phuong T.
    affiliation: "University of L'Aquila"
    email: phuong.nguyen@univaq.it
title: "Obfuscation Semantics: Evaluating LLM-Driven Code Transformation with Semantic Elasticity"
version: 1.0.0
date-released: 2025-01-15
url: "https://github.com/lorenzodetomasi/obfuscation-semantics"
preferred-citation:
  type: conference-paper
  authors:
    - family-names: De Tomasi
      given-names: Lorenzo
      affiliation: "University of L'Aquila"
      email: lorenzo.detomasi@graduate.univaq.it
    - family-names: Di Sipio
      given-names: Claudio
      affiliation: "University of L'Aquila"
      email: claudio.disipio@univaq.it
    - family-names: Di Marco
      given-names: Antinisca
      affiliation: "University of L'Aquila"
      email: antinisca.dimarco@univaq.it
    - family-names: Nguyen
      given-names: Phuong T.
      affiliation: "University of L'Aquila"
      email: phuong.nguyen@univaq.it
  title: "Simplicity by Obfuscation: Evaluating LLM-Driven Code Transformation with Semantic Elasticity"
  collection-title: "Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering"
  year: 2025
  month: 6
  publisher:
    name: "ACM"
  conference:
    name: "EASE 2025"
    city: "Istanbul"
    country: "Turkey"
    date-start: "2025-06-17"
    date-end: "2025-06-20"
GitHub Events
Total
- Push event: 11
- Create event: 2
Last Year
- Push event: 11
- Create event: 2
Dependencies
- anthropic ==0.2.10
- black ==23.7.0
- google-generativeai ==0.4.0
- matplotlib ==3.8.0
- numpy ==1.25.0
- openai ==1.6.0
- openpyxl ==3.1.2
- pandas ==2.1.0
- pytest ==7.4.0
- scipy ==1.11.3
- seaborn ==0.13.0