job-description-bias

Exploring Gender Bias in AI-Generated Job Descriptions

https://github.com/jenniferkrebsbach/job-description-bias

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Exploring Gender Bias in AI-Generated Job Descriptions

Basic Info
  • Host: GitHub
  • Owner: jenniferkrebsbach
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 257 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 1
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
  • Readme
  • Citation

README.md

job-description-bias

Gendered Language Analysis

This project analyzes gendered language in job descriptions to identify patterns in word usage associated with male-coded, female-coded, and gender-neutral job descriptions. It includes scripts for:

  • Calculating gendered word frequencies
  • Generating visualizations for feminine and masculine word distribution
  • Running cosine similarity analysis
  • Clustering and word cloud generation

Project Structure

This repository includes the following key files and folders:

  • gendered_analysis.py – Main analysis script for gendered word frequencies and visualizations
  • similarity_analysis.py – Cosine similarity and clustering analysis script
  • wordcloud_analysis.py – Script for generating word clouds
  • data/ – Folder containing gendered word lists and job description text files
  • requirements.txt – Dependencies for running the analysis

Installation

  1. Clone the Repository:

```bash
git clone https://github.com/yourusername/your-repo-name.git
cd your-repo-name
```

  2. Install Required Python Packages:

Make sure you have Python 3.6+ installed, then install dependencies using:

```bash
pip install -r requirements.txt
```

Verify PyTorch Compatibility

Ensure that PyTorch is installed correctly, as it is required by the BERT model. You may need to install it separately depending on your hardware (CPU vs. GPU). Check the PyTorch installation page for details.
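A quick way to confirm the install before running the BERT-based scripts is a short import check. This is a minimal sketch, not part of the repository:

```python
# Sanity-check that PyTorch is importable and report whether a GPU is visible.
try:
    import torch
    status = f"PyTorch {torch.__version__} found (CUDA available: {torch.cuda.is_available()})"
except ImportError:
    status = "PyTorch not found - see the PyTorch installation page"
print(status)
```

If the import fails, follow the platform-specific command from the PyTorch installation page rather than a bare `pip install torch`, since the correct wheel depends on your CUDA setup.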

Data Preparation

Gendered Word Lists:

  • The folder data/ should contain two text files, Femalewords.txt and Malewords.txt, with gendered words for analysis. These files come from well-known gender bias studies.

Job Description Files:

  • Place your job description files in the data/ folder. The sample file names used in the scripts are:
    • malecodedjobs.txt
    • femalecodedjobs.txt
    • genderneutraljobs.txt
  • You can add or rename files, but be sure to update the files_to_analyze variable in each script accordingly.
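For illustration, the per-script file list might look like the sketch below; the exact variable layout is an assumption, not copied from the repository's source:

```python
# Illustrative only: point the files_to_analyze variable at your own
# job-description files if you add or rename any in data/.
files_to_analyze = [
    "data/malecodedjobs.txt",
    "data/femalecodedjobs.txt",
    "data/genderneutraljobs.txt",
]
for path in files_to_analyze:
    print(path)
```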

Running the Analysis

  1. Run Gendered Word Frequency and Visualization Analysis

```bash
python gendered_analysis.py
```

This script will:

  • Calculate and display the percentage of feminine and masculine words in each job description file.
  • Generate bar charts for feminine vs. masculine word usage and the top words in each category.
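The core of the frequency step can be sketched as follows. This is a simplified, dependency-free illustration of the idea, not the repository's exact code; the word list and helper name are made up:

```python
# Count how often words from a gendered word list appear in a job
# description, reported as a percentage of all tokens.
import re

def gendered_word_percentage(text, gendered_words):
    """Return the share (in %) of tokens in `text` found in `gendered_words`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t in gendered_words)
    return 100.0 * hits / len(tokens)

feminine = {"collaborative", "supportive", "nurturing"}  # illustrative list
sample = "We seek a collaborative and supportive team player."
print(f"{gendered_word_percentage(sample, feminine):.1f}% feminine-coded")
```

The real script applies the same per-file percentage to each entry in its file list and feeds the results to matplotlib bar charts.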
  2. Run Cosine Similarity and Clustering Analysis

```bash
python similarity_analysis.py
```

This script will:

  • Calculate cosine similarity between job description types for gendered words using word embeddings.
  • Create a clustering plot to visualize similarities.
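The similarity measure itself works as below. The actual script compares BERT word embeddings via `transformers`; plain bag-of-words vectors stand in here so the sketch stays dependency-free:

```python
# Cosine similarity between two texts represented as bag-of-words Counters:
# dot product of the word-count vectors divided by the product of their norms.
import math
from collections import Counter

def cosine_similarity(a, b):
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

male_coded = Counter("competitive driven leader dominant".split())
neutral = Counter("team member leader organized".split())
print(f"similarity: {cosine_similarity(male_coded, neutral):.2f}")  # one shared word of four -> 0.25
```

With embeddings the vectors are dense and the similarity captures meaning rather than exact word overlap, but the formula is the same.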
  3. Generate Word Clouds

```bash
python wordcloud_analysis.py
```

This script will:

  • Generate word clouds for each job description type, highlighting frequent words after removing common stopwords and top words.
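The preprocessing behind this step can be sketched without the `wordcloud` dependency: strip stopwords, then rank the remaining words by frequency. The stopword list here is illustrative; the real script passes the resulting frequencies to the `wordcloud` package for rendering:

```python
# Tokenize, drop stopwords, and keep the most frequent remaining words.
import re
from collections import Counter

STOPWORDS = {"the", "a", "and", "to", "of", "in", "we", "for"}  # illustrative subset

def top_words(text, n=5):
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)

sample = "We build software and we ship software to customers in the cloud."
print(top_words(sample, 3))
```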

Interpreting the Results

  • Frequency Visualizations:
    • The bar charts provide an overview of the distribution of gendered words across job description types.
  • Similarity Analysis:
    • Cosine similarity values indicate how similar or different the job descriptions are in terms of gendered language.
    • The clustering plot visualizes the groupings of sentences based on gendered word usage.
  • Word Clouds:
    • Word clouds highlight prominent words, offering insights into the themes and language patterns within each job description category.

Owner

  • Login: jenniferkrebsbach
  • Kind: user

Citation (citation.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Krebsbach"
  given-names: "Jennifer M."
  orcid: "0000-0002-7934-0918"
title: "Exploring gender bias in AI-Generated job descriptions"
version: 1.0.0
date-released: 2024-11-11
url: "https://github.com/jenniferkrebsbach/job-description-bias"

GitHub Events

Total
  • Release event: 1
  • Delete event: 1
  • Push event: 30
  • Pull request event: 2
  • Fork event: 1
  • Create event: 4
Last Year
  • Release event: 1
  • Delete event: 1
  • Push event: 30
  • Pull request event: 2
  • Fork event: 1
  • Create event: 4

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 days
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: 4 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • xaintly (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements.txt pypi
  • matplotlib *
  • pandas *
  • scikit-learn *
  • torch *
  • transformers *
  • wordcloud *