pandas-ai

Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.

https://github.com/sinaptik-ai/pandas-ai

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 106 committers (0.9%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.3%) to scientific vocabulary

Keywords

ai csv data data-analysis data-science data-visualization database datalake gpt-4 llm pandas sql text-to-sql

Keywords from Contributors

agents application fine-tuning llamaindex multi-agents rag vector-database
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: sinaptik-ai
  • License: other
  • Language: Python
  • Default Branch: main
  • Homepage: https://pandas-ai.com
  • Size: 54.8 MB
Statistics
  • Stars: 21,955
  • Watchers: 168
  • Forks: 2,135
  • Open Issues: 23
  • Releases: 196
Topics
ai csv data data-analysis data-science data-visualization database datalake gpt-4 llm pandas sql text-to-sql
Created almost 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Citation

README.md

PandasAI

PandasAI is a Python platform that makes it easy to ask questions of your data in natural language. It helps non-technical users interact with their data more naturally, and it helps technical users save time and effort when working with data.

🔧 Getting started

You can find the full documentation for PandasAI here.

You can use PandasAI in your Jupyter notebooks or Streamlit apps, or use the client and server architecture from the repo.

📚 Using the library

Python Requirements

Python version 3.8+ <3.12
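
The range above can be checked before installing. The following is a minimal sketch, assuming the bounds mean 3.8 inclusive and 3.12 exclusive as written; `is_supported` is a hypothetical helper name and not part of PandasAI.

```python
import sys

# Bounds taken from the requirement above: 3.8 <= version < 3.12.
MIN_VERSION = (3, 8)
MAX_VERSION_EXCLUSIVE = (3, 12)


def is_supported(version=None):
    """Return True if (major, minor) falls inside the supported range."""
    if version is None:
        version = sys.version_info[:2]
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION_EXCLUSIVE


current = ".".join(map(str, sys.version_info[:3]))
status = "supported" if is_supported() else "outside the supported range"
print(f"Python {current} is {status}")
```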

📦 Installation

You can install the PandasAI library using pip or poetry.

With pip:

    pip install "pandasai>=3.0.0b2"

With poetry:

    poetry add "pandasai>=3.0.0b2"
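
After installing, it can be useful to confirm which version actually landed in the environment, since the requirement above targets a beta release. A small sketch using only the standard library; `installed_version` is a hypothetical helper name, and the distribution name `pandasai` is taken from the install commands above.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("pandasai:", installed_version("pandasai") or "not installed")
```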

💻 Usage

Ask questions

```python
import pandasai as pai
from pandasai_openai.openai import OpenAI

# Replace the placeholder with your own OpenAI API key
llm = OpenAI("OPENAI_API_KEY")

pai.config.set({
    "llm": llm
})

# Sample DataFrame
df = pai.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "revenue": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})

df.chat('Which are the top 5 countries by sales?')
```

China, United States, Japan, Germany, United Kingdom


Or you can ask more complex questions:

    df.chat(
        "What is the total sales for the top 3 countries by sales?"
    )

The total sales for the top 3 countries by sales is 16500.
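
Because the answer is produced by an LLM, it can be worth cross-checking against a direct computation. The following is a minimal sketch in plain Python (no PandasAI or LLM call involved), using the sample data above and assuming that "sales" refers to the `revenue` column.

```python
# Sample data from the DataFrame above.
countries = ["United States", "United Kingdom", "France", "Germany", "Italy",
             "Spain", "Canada", "Australia", "Japan", "China"]
revenue = [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]

# Pair each country with its revenue and sort from highest to lowest.
ranked = sorted(zip(countries, revenue), key=lambda pair: pair[1], reverse=True)

top3 = ranked[:3]
total = sum(value for _, value in top3)

print([name for name, _ in top3])  # ['China', 'United States', 'Japan']
print(total)                       # 16500
```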

Visualize charts

You can also ask PandasAI to generate charts for you:

    df.chat(
        "Plot the histogram of countries showing for each one the revenue. Use different colors for each bar"
    )

Chart

Multiple DataFrames

You can also pass in multiple dataframes to PandasAI and ask questions relating them.

```python
import pandasai as pai
from pandasai_openai.openai import OpenAI

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

# Replace the placeholder with your own OpenAI API key
llm = OpenAI("OPENAI_API_KEY")

pai.config.set({
    "llm": llm
})

employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)

# Pass both DataFrames so the question can relate them
pai.chat("Who gets paid the most?", employees_df, salaries_df)
```

Olivia gets paid the most.
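
Answering this question requires relating the two tables through their shared `EmployeeID` column. As a cross-check that does not involve PandasAI or an LLM, the same lookup can be sketched in plain Python with the sample data above.

```python
employees_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Name": ["John", "Emma", "Liam", "Olivia", "William"],
    "Department": ["HR", "Sales", "IT", "Marketing", "Finance"],
}
salaries_data = {
    "EmployeeID": [1, 2, 3, 4, 5],
    "Salary": [5000, 6000, 4500, 7000, 5500],
}

# Build lookup tables keyed by EmployeeID, the column shared by both tables.
name_by_id = dict(zip(employees_data["EmployeeID"], employees_data["Name"]))
salary_by_id = dict(zip(salaries_data["EmployeeID"], salaries_data["Salary"]))

# Find the ID with the highest salary, then resolve it to a name.
top_id = max(salary_by_id, key=salary_by_id.get)
print(name_by_id[top_id], salary_by_id[top_id])  # Olivia 7000
```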

Docker Sandbox

You can run PandasAI in a Docker sandbox, providing a secure, isolated environment to execute code safely and mitigate the risk of malicious attacks.

Python Requirements

    pip install "pandasai-docker"

Usage

```python
import pandasai as pai
from pandasai_docker import DockerSandbox
from pandasai_openai.openai import OpenAI

# Initialize the sandbox
sandbox = DockerSandbox()
sandbox.start()

employees_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Name': ['John', 'Emma', 'Liam', 'Olivia', 'William'],
    'Department': ['HR', 'Sales', 'IT', 'Marketing', 'Finance']
}

salaries_data = {
    'EmployeeID': [1, 2, 3, 4, 5],
    'Salary': [5000, 6000, 4500, 7000, 5500]
}

# Replace the placeholder with your own OpenAI API key
llm = OpenAI("OPENAI_API_KEY")

pai.config.set({
    "llm": llm
})

employees_df = pai.DataFrame(employees_data)
salaries_df = pai.DataFrame(salaries_data)

# Run the generated code inside the sandbox
pai.chat("Who gets paid the most?", employees_df, salaries_df, sandbox=sandbox)

# Don't forget to stop the sandbox when done
sandbox.stop()
```

Olivia gets paid the most.
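
In the example above, the final `stop()` call is skipped if an exception is raised before it is reached. One way to guard against that is a small context manager around the `start()` and `stop()` methods used in the example. This is only a sketch: `running` and `FakeSandbox` are hypothetical names, and `FakeSandbox` is a stand-in so that the code runs without Docker; it does not reproduce `DockerSandbox` behaviour.

```python
from contextlib import contextmanager


@contextmanager
def running(sandbox):
    """Start `sandbox`, yield it, and always stop it afterwards."""
    sandbox.start()
    try:
        yield sandbox
    finally:
        sandbox.stop()


class FakeSandbox:
    """Stand-in used only for this illustration; it is not DockerSandbox."""

    def __init__(self):
        self.events = []

    def start(self):
        self.events.append("start")

    def stop(self):
        self.events.append("stop")


fake = FakeSandbox()
try:
    with running(fake) as sb:
        raise ValueError("simulated failure while chatting")
except ValueError:
    pass

print(fake.events)  # ['start', 'stop']
```

With the real class, the same pattern would wrap `DockerSandbox()` and pass the yielded object as the `sandbox` argument of `pai.chat`.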

You can find more examples in the examples directory.

📜 License

PandasAI is available under the MIT expat license, except for the pandasai/ee directory of this repository, which has its license here.

If you are interested in the managed PandasAI Cloud or the self-hosted Enterprise offering, contact us.

Resources

Beta Notice
Release v3 is currently in beta. The following documentation and examples reflect the features and functionality in progress and may change before the final release.

  • Docs for comprehensive documentation
  • Examples for example notebooks
  • Discord for discussion with the community and PandasAI team

🤝 Contributing

Contributions are welcome! Please check the outstanding issues and feel free to open a pull request. For more information, please check out the contributing guidelines.

Thank you!

Contributors

Owner

  • Name: PandasAI
  • Login: Sinaptik-AI
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
date-released: 2023-04-29
message: "If you use this software, please cite it as below."
title: "PandasAI: the conversational data analysis framework"
abstract: "PandasAI is a python library that makes it easy to ask questions to your data in natural language."
url: "https://github.com/sinaptik-ai/pandas-ai"
authors:
- family-names: "Venturi"
  given-names: "Gabriele"
  affiliation: "Sinaptik"
license: MIT

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 1,234
  • Total Committers: 106
  • Avg Commits per committer: 11.642
  • Development Distribution Score (DDS): 0.46
Past Year
  • Commits: 345
  • Committers: 29
  • Avg Commits per committer: 11.897
  • Development Distribution Score (DDS): 0.51
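
The averages above follow directly from the commit and committer counts. The all-time DDS can also be reproduced under the assumption that DDS is defined as 1 minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported figure when combined with the 666 commits listed for Gabriele Venturi under Top Committers. The past-year DDS cannot be rechecked this way because per-committer counts for the past year are not listed.

```python
# Figures reported in the Committers section.
total_commits = 1234
total_committers = 106
top_committer_commits = 666  # Gabriele Venturi, from Top Committers

avg_all_time = total_commits / total_committers
avg_past_year = 345 / 29

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds_all_time = 1 - top_committer_commits / total_commits

print(round(avg_all_time, 3))   # 11.642
print(round(avg_past_year, 3))  # 11.897
print(round(dds_all_time, 2))   # 0.46
```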
Top Committers
Name Email Commits
Gabriele Venturi l****i@g****m 666
Arslan Saleem k****8@g****m 185
Raoul Scalise 3****l 50
Massimiliano Pronesti m****i@g****m 47
Ihor 3****9 34
Henrique Branco h****o@g****m 25
Jonathan Biemond 1****d 14
Sanchit Bhavsar 3****n 13
Shahrukh Khan s****u@g****m 12
Cheng Wai 1****o 10
josephsinaptik g****e@s****i 10
Lorenzobattistela l****h@g****m 9
Milind Lalwani m****i@M****l 9
Johnson A 5****7 5
Oral Ersoy Dokumacı o****i@g****m 5
Tanmay patil 7****3 5
Avelino a****n@g****m 4
Victor Del Carpio v****g@g****m 4
ma-raza a****4@g****m 4
Hanchung Lee l****g@g****m 4
milind-sinaptik 1****k 3
goriri g****i@1****m 3
dohertychristopher4 4****4 3
Gaurang Pawar 5****1 3
yzaparto y****3@g****m 2
sourcery-ai[bot] 5****] 2
rekwet 3****t 2
mintlify[bot] 1****] 2
kukushking k****n@g****m 2
dSupertramp s****o@g****m 2
and 76 more...

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 525
  • Total pull requests: 567
  • Average time to close issues: 3 months
  • Average time to close pull requests: 14 days
  • Total issue authors: 366
  • Total pull request authors: 107
  • Average comments per issue: 2.47
  • Average comments per pull request: 0.85
  • Merged pull requests: 370
  • Bot issues: 0
  • Bot pull requests: 8
Past Year
  • Issues: 204
  • Pull requests: 353
  • Average time to close issues: 20 days
  • Average time to close pull requests: 4 days
  • Issue authors: 151
  • Pull request authors: 44
  • Average comments per issue: 1.36
  • Average comments per pull request: 0.49
  • Merged pull requests: 234
  • Bot issues: 0
  • Bot pull requests: 8
Top Authors
Issue Authors
  • gventuri (10)
  • anilmadishetty2498 (10)
  • Alan-zhong (9)
  • metalshanked (7)
  • gDanzel (7)
  • PavelAgurov (7)
  • ssling0817 (6)
  • johnfelipe (5)
  • lwdnxu (5)
  • HenriqueAJNB (5)
  • via007 (4)
  • prasum (4)
  • Yekai97 (4)
  • tytung2020 (4)
  • vpurandara (4)
Pull Request Authors
  • ArslanSaleem (163)
  • scaliseraoul (105)
  • gventuri (24)
  • gdcsinaptik (21)
  • shahrukh802 (18)
  • matteocacciola (16)
  • chengwaikoo (10)
  • HenriqueAJNB (8)
  • mspronesti (8)
  • nehcneb (6)
  • codebeaver-ai[bot] (6)
  • christophfroeschl (6)
  • oedokumaci (5)
  • prasum (4)
  • Muhammad-Adam1 (4)
Top Labels
Issue Labels
bug (128) stale (68) enhancement (67) duplicate (7) good first issue (7) documentation (5) test (4) security (3) help wanted (3) regression (2) dev (2) v1 (1) size:S (1) lgtm (1) size:XS (1)
Pull Request Labels
size:L (49) size:XS (44) size:M (33) size:S (23) size:XL (19) size:XXL (18) lgtm (12) enhancement (2) stale (1)

Packages

  • Total packages: 20
  • Total downloads:
    • pypi 18,946 last-month
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 83
  • Total maintainers: 2
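
The totals above can be reproduced by summing the per-package figures listed below. A short sketch of that arithmetic, with each package's last-month downloads and version count copied from its entry:

```python
# (downloads last month, versions) per package, in the order listed below.
packages = {
    "jcloudai-litellm": (318, 3),
    "jcloudai": (476, 4),
    "pandasai-litellm": (5890, 1),
    "pandasai-docker": (656, 4),
    "pandasai-huggingface": (34, 4),
    "pandasai-oracle": (37, 4),
    "pandasai-snowflake": (46, 4),
    "pandasai-databricks": (30, 4),
    "pandasai-bigquery": (20, 4),
    "pandasai-local": (80, 5),
    "pandasai-langchain": (59, 5),
    "pandasai-bedrock": (152, 5),
    "pandasai-ibm": (26, 5),
    "pandasai-openai": (10939, 6),
    "pandasai-google": (64, 5),
    "pandasai-lancedb": (28, 4),
    "pandasai-pinecone": (20, 4),
    "pandasai-chromadb": (17, 4),
    "pandasai-qdrant": (24, 4),
    "pandasai-milvus": (30, 4),
}

total_downloads = sum(downloads for downloads, _ in packages.values())
total_versions = sum(versions for _, versions in packages.values())

print(len(packages), total_downloads, total_versions)  # 20 18946 83
```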
pypi.org: jcloudai-litellm

LiteLLM integration for PandasAI

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 318 Last month
Rankings
Stargazers count: 0.4%
Forks count: 0.7%
Dependent packages count: 8.7%
Average: 14.7%
Dependent repos count: 49.0%
Maintainers (1)
Last synced: 6 months ago
pypi.org: jcloudai

Chat with your database (SQL, CSV, pandas, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 476 Last month
Rankings
Dependent packages count: 8.7%
Average: 28.9%
Dependent repos count: 49.1%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-litellm

LiteLLM integration for PandaAI

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 5,890 Last month
Rankings
Dependent packages count: 9.5%
Average: 31.5%
Dependent repos count: 53.6%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-docker
  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 656 Last month
Rankings
Dependent packages count: 9.7%
Average: 32.2%
Dependent repos count: 54.6%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-huggingface

Hugging Face integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 34 Last month
Rankings
Dependent packages count: 9.7%
Average: 32.3%
Dependent repos count: 54.8%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-oracle

Oracle connector integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 37 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-snowflake

Snowflake connector integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 46 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-databricks

Databricks connector integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 30 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-bigquery

Google BigQuery connector integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 20 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.2%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-local

Local LLM integration for PandaAI

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 80 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-langchain

Langchain integration for PandaAI

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 59 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-bedrock

AWS Bedrock integration for PandaAI

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 152 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-ibm

IBM integration for PandaAI

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 26 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-openai

OpenAI integration for PandasAI

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 10,939 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-google

Google integration for PandaAI

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 64 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.7%
Dependent repos count: 57.3%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-lancedb

LanceDB integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 28 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-pinecone

Pinecone integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 20 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-chromadb

ChromaDB integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 17 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-qdrant

Qdrant integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 24 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 6 months ago
pypi.org: pandasai-milvus

Milvus integration for PandaAI

  • Versions: 4
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 30 Last month
Rankings
Dependent packages count: 10.2%
Average: 33.8%
Dependent repos count: 57.4%
Maintainers (1)
Last synced: 6 months ago