job-description-bias
Exploring Gender Bias in AI-Generated Job Descriptions
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.9%) to scientific vocabulary
Repository
Exploring Gender Bias in AI-Generated Job Descriptions
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
job-description-bias
Gendered Language Analysis
This project analyzes gendered language in job descriptions to identify patterns in word usage associated with male-coded, female-coded, and gender-neutral job descriptions. It includes scripts for:
- Calculating gendered word frequencies
- Generating visualizations for feminine and masculine word distribution
- Running cosine similarity analysis
- Clustering and word cloud generation
Project Structure
This repository includes the following key files and folders:
- gendered_analysis.py – Main analysis script for gendered word frequencies and visualizations
- similarity_analysis.py – Cosine similarity and clustering analysis script
- wordcloud_analysis.py – Script for generating word clouds
- data/ – Folder containing gendered word lists and job description text files
- requirements.txt – Dependencies for running the analysis
Installation
- Clone the Repository:
```bash
git clone https://github.com/yourusername/your-repo-name.git
cd your-repo-name
```
- Install Required Python Packages:
Make sure you have Python 3.6+ installed, then install dependencies using:
```bash
pip install -r requirements.txt
```
Verify PyTorch Compatibility
Ensure that PyTorch is installed correctly, as it is required by the BERT model. You may need to install it separately depending on your hardware (CPU vs. GPU). Check the PyTorch installation page for details.
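A quick sanity check (a minimal sketch, not part of the repository's scripts) can confirm that PyTorch loads and report whether a GPU is visible:

```python
import torch

# Report the installed PyTorch version.
print(f"PyTorch version: {torch.__version__}")

# True only when a CUDA-capable GPU and matching drivers are present;
# on a plain CPU install this prints False and computation falls back to CPU.
print(f"CUDA available: {torch.cuda.is_available()}")

# Run a tiny tensor operation to confirm the install actually works.
x = torch.ones(2, 2)
assert (x + x).sum().item() == 8.0
```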
Data Preparation
Gendered Word Lists:
- The data/ folder should contain two text files, Female_words.txt and Male_words.txt, with gendered words for analysis. These files come from well-known gender bias studies.

Job Description Files:
- Place your job description files in the data/ folder. The sample file names used in the scripts are:
  - male_coded_jobs.txt
  - female_coded_jobs.txt
  - gender_neutral_jobs.txt
- You can add or rename files, but be sure to update the files_to_analyze variable in each script accordingly.
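Reading one word per line from these lists can be sketched with a small helper; the file name and words below are illustrative stand-ins, not the actual study lists:

```python
from pathlib import Path

def load_word_list(path):
    """Read one gendered word per line, lowercased, skipping blank lines."""
    return [w.strip().lower() for w in Path(path).read_text().splitlines() if w.strip()]

# Illustrative stand-in for a word list file in data/.
Path("sample_words.txt").write_text("Supportive\nCollaborative\n\nNurturing\n")

words = load_word_list("sample_words.txt")
print(words)  # ['supportive', 'collaborative', 'nurturing']
```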
Running the Analysis
- Run Gendered Word Frequency and Visualization Analysis
```bash
python gendered_analysis.py
```
This script will:
- Calculate and display the percentage of feminine and masculine words in each job description file.
- Generate bar charts for feminine vs. masculine word usage and the top words in each category.
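The frequency calculation can be sketched as below; the word lists and sample job ad are illustrative stand-ins, and the real script additionally renders bar charts with matplotlib:

```python
import re
from collections import Counter

# Illustrative stand-ins for the word lists in data/.
FEMININE = {"supportive", "collaborative", "nurturing"}
MASCULINE = {"competitive", "dominant", "decisive"}

def gendered_percentages(text):
    """Return the percentage of tokens matching each gendered word list."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = sum(counts.values())
    fem = sum(counts[w] for w in FEMININE)
    masc = sum(counts[w] for w in MASCULINE)
    return 100 * fem / total, 100 * masc / total

ad = "We want a decisive, competitive leader who is also supportive."
fem_pct, masc_pct = gendered_percentages(ad)
print(f"feminine: {fem_pct:.1f}%  masculine: {masc_pct:.1f}%")
# → feminine: 10.0%  masculine: 20.0%
```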
- Run Cosine Similarity and Clustering Analysis
```bash
python similarity_analysis.py
```
This script will:
- Calculate cosine similarity between job description types for gendered words using word embeddings.
- Create a clustering plot to visualize similarities.
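The similarity measure can be illustrated with plain cosine similarity over bag-of-words count vectors; note this is a simplified stand-in, since the actual script computes it over BERT word embeddings from the transformers library:

```python
import math
from collections import Counter

def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words count vectors of two texts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    vocab = set(a) | set(b)
    dot = sum(a[w] * b[w] for w in vocab)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)

# Illustrative snippets standing in for job description files.
male_ad = "dominant competitive leader drives results"
neutral_ad = "leader drives team results together"
print(f"{cosine_similarity(male_ad, neutral_ad):.3f}")  # → 0.600
```

A value near 1.0 means two description types use nearly the same vocabulary; values near 0 mean little overlap.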
- Generate Word Clouds
```bash
python wordcloud_analysis.py
```
This script will:
- Generate word clouds for each job description type, highlighting frequent words after removing common stopwords and top words.
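The filtering step before rendering can be sketched as follows (the stopword set and text are illustrative; the actual rendering is done by the wordcloud package):

```python
from collections import Counter

# Illustrative stopword set; the script uses a standard stopword list.
STOPWORDS = {"the", "a", "and", "to", "of", "in", "for", "we", "is"}

def cloud_frequencies(text, drop_top_n=1):
    """Count words, remove stopwords, then drop the N most frequent words
    so the cloud highlights mid-frequency vocabulary instead."""
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    counts = Counter(tokens)
    for word, _ in counts.most_common(drop_top_n):
        del counts[word]
    return counts

text = "the team team team values growth growth and learning"
freqs = cloud_frequencies(text)
print(freqs)  # 'team' (the single most frequent word) has been dropped
```

The resulting frequency dictionary can then be passed to WordCloud's generate_from_frequencies method to draw the image.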
Interpreting the Results
- Frequency Visualizations:
- The bar charts provide an overview of the distribution of gendered words across job description types.
- Similarity Analysis:
- Cosine similarity values indicate how similar or different the job descriptions are in terms of gendered language.
- The clustering plot visualizes the groupings of sentences based on gendered word usage.
- Word Clouds:
- Word clouds highlight prominent words, offering insights into the themes and language patterns within each job description category.
Owner
- Login: jenniferkrebsbach
- Kind: user
- Repositories: 1
- Profile: https://github.com/jenniferkrebsbach
Citation (citation.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Krebsbach"
    given-names: "Jennifer M."
    orcid: "0000-0002-7934-0918"
title: "Exploring gender bias in AI-Generated job descriptions"
version: 1.0.0
date-released: 2024-11-11
url: "https://github.com/jenniferkrebsbach/job-description-bias"
```
GitHub Events
Total
- Release event: 1
- Delete event: 1
- Push event: 30
- Pull request event: 2
- Fork event: 1
- Create event: 4
Last Year
- Release event: 1
- Delete event: 1
- Push event: 30
- Pull request event: 2
- Fork event: 1
- Create event: 4
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 4 days
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: 4 days
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- xaintly (1)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- matplotlib *
- pandas *
- scikit-learn *
- torch *
- transformers *
- wordcloud *