ech0js

https://github.com/ech0ai/ech0js

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.1%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: ech0ai
License: apache-2.0
Language: TypeScript
Default Branch: main
Size: 209 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed over 1 year ago

Metadata Files

Readme License Citation

ech0js

ech0 is a framework for easily creating LLM-powered bots over any dataset. ech0js is the JavaScript version of ech0. If you want a Python version, check out ech0-python.

How it works

It abstracts the entire process of loading a dataset, chunking it, creating embeddings, and storing them in a vector database.

You can add one or more datasets using the .add and .addLocal functions, and then use the .query function to find an answer from the added datasets.

If you want to create a Naval Ravikant bot which has 2 of his blog posts, as well as a question and answer pair you supply, all you need to do is add the links to the blog posts and the QnA pair and ech0 will create a bot for you.

```javascript
const dotenv = require("dotenv");
dotenv.config();
const { App } = require("ech0");

// Run the app commands inside an async function only
async function testApp() {
  const navalChatBot = await App();

  // Embed Online Resources
  await navalChatBot.add("web_page", "https://nav.al/feedback");
  await navalChatBot.add("web_page", "https://nav.al/agi");
  await navalChatBot.add(
    "pdf_file",
    "https://navalmanack.s3.amazonaws.com/Eric-Jorgenson_The-Almanack-of-Naval-Ravikant_Final.pdf"
  );

  // Embed Local Resources
  await navalChatBot.addLocal("qna_pair", [
    "Who is Naval Ravikant?",
    "Naval Ravikant is an Indian-American entrepreneur and investor.",
  ]);

  const result = await navalChatBot.query(
    "What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?"
  );
  console.log(result);
  // answer: Naval argues that humans possess the unique capacity to understand explanations or concepts to the maximum extent possible in this physical reality.
}

testApp();
```

Getting Started

Installation

First, make sure that you have the package installed. If not, install it using npm:

    npm install ech0 && npm install -S openai@^3.3.0

Currently, it is only compatible with openai 3.X, not the latest version 4.X. Please make sure to use the right version, otherwise you will see the ChromaDB error `TypeError: OpenAIApi.Configuration is not a constructor`.
Make sure that the dotenv package is installed and that your OPENAI_API_KEY is set in a file called .env in the root folder. You can install dotenv with:

    npm install dotenv

Download and install Docker on your device by visiting this link. You will need it to run the Chroma vector database on your machine.
Run the following commands to set up the Chroma container in Docker:

    git clone https://github.com/chroma-core/chroma.git
    cd chroma
    docker-compose up -d --build

Once the Chroma container has been set up, make sure it is running inside Docker.

Usage

We use OpenAI's embedding model to create embeddings for chunks, and the ChatGPT API as the LLM to get an answer from the relevant docs. Make sure that you have an OpenAI account and an API key. If you don't have an API key, you can create one by visiting this link.
Once you have the API key, set it in an environment variable called OPENAI_API_KEY:

    # Set this inside your .env file
    OPENAI_API_KEY="sk-xxxx"

Load the environment variables inside your .js file using the following commands

    const dotenv = require("dotenv");
    dotenv.config();
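A missing or empty key is an easy mistake to make at this step. The following is a minimal sketch of a fail-fast check, assuming it runs after `dotenv.config()`. The `requireEnv` helper is hypothetical and is not part of ech0 or dotenv.

```javascript
// Hypothetical helper (not part of ech0): fail fast when a required
// environment variable is missing or blank.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== "string" || value.trim() === "") {
    throw new Error(
      `Missing environment variable ${name}. Add it to your .env file.`
    );
  }
  // Trim stray whitespace that can sneak in when editing .env by hand.
  return value.trim();
}

// Usage, after dotenv.config() has run:
// const apiKey = requireEnv("OPENAI_API_KEY");
```

Calling this before creating the app surfaces a configuration problem immediately instead of as a later API error.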

Next, import the App class from ech0 and use the .add function to add any dataset.
Your app is now created. You can use the .query function to get the answer for any query.

```js
const dotenv = require("dotenv");
dotenv.config();
const { App } = require("ech0");

async function testApp() {
  const navalChatBot = await App();

  // Embed Local Resources
  await navalChatBot.addLocal("qna_pair", [
    "Who is Naval Ravikant?",
    "Naval Ravikant is an Indian-American entrepreneur and investor.",
  ]);

  // Query the app
  const result = await navalChatBot.query("Who is Naval Ravikant?");
  console.log(result);
}

testApp();
```

If there is another App instance in your script or app, you can alias the import:

```javascript
const { App: EmbedChainApp } = require("ech0");

// or

const { App: ECApp } = require("ech0");
```

Format supported

We support the following formats:

PDF File

To add any PDF file, use the data type `pdf_file`. Eg:

    await app.add("pdf_file", "a_valid_url_where_pdf_file_can_be_accessed");

Web Page

To add any web page, use the data type `web_page`. Eg:

    await app.add("web_page", "a_valid_web_page_url");

QnA Pair

To supply your own QnA pair, use the data type `qna_pair` and pass a two-element array of question and answer. Eg:

    await app.addLocal("qna_pair", ["Question", "Answer"]);
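The three formats above go through two different methods: `pdf_file` and `web_page` use .add, while `qna_pair` uses .addLocal. As a rough sketch, a small wrapper can pick the right method and reject unknown types early. The `addSource` function and the two sets below are hypothetical names introduced for illustration, and `app` is assumed to be any object exposing `add` and `addLocal`.

```javascript
// Data types as described in this README, grouped by the method they use.
const REMOTE_TYPES = new Set(["pdf_file", "web_page"]);
const LOCAL_TYPES = new Set(["qna_pair"]);

// Hypothetical wrapper (not part of ech0): validates the input and
// returns the promise from the matching app method.
function addSource(app, dataType, content) {
  if (REMOTE_TYPES.has(dataType)) {
    if (typeof content !== "string") {
      throw new TypeError(`${dataType} expects a URL string`);
    }
    return app.add(dataType, content);
  }
  if (LOCAL_TYPES.has(dataType)) {
    if (!Array.isArray(content) || content.length !== 2) {
      throw new TypeError(`${dataType} expects [question, answer]`);
    }
    return app.addLocal(dataType, content);
  }
  throw new Error(`Unsupported data type: ${dataType}`);
}

// Usage inside an async function:
// await addSource(app, "web_page", "a_valid_web_page_url");
// await addSource(app, "qna_pair", ["Question", "Answer"]);
```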

More Formats coming soon

If you want to add any other format, please create an issue and we will add it to the list of supported formats.

Testing

Before you consume valuable tokens, you should make sure that the embedding works and that the query retrieves the correct document from the database.

For this you can use the dryRun method.

Following the example above, add this to your script:

```js
let result = await navalChatBot.dryRun(
  "What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?"
);
console.log(result);

/*
Use the following pieces of context to answer the query at the end. If you don't know the answer, just say that you don't know, don't try to make up an answer.
terms of the unseen. And I think that’s critical. That is what humans do uniquely that no other creature, no other computer, no other intelligence—biological or artificial—that we have ever encountered does. And not only do we do it uniquely, but if we were to meet an alien species that also had the power to generate these good explanations, there is no explanation that they could generate that we could not understand. We are maximally capable of understanding. There is no concept out there that is possible in this physical reality that a human being, given sufficient time and resources and
Query: What unique capacity does Naval argue humans possess when it comes to understanding explanations or concepts?
Helpful Answer:
*/
```

The embedding is confirmed to work as expected. It returns the right document, even if the question is phrased slightly differently. No prompt tokens have been consumed.

The dry run will still consume tokens to embed your query, but it is only ~1/15 of the prompt.
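The dry-run output above follows a simple template: an instruction, the retrieved context, the query, and a trailing "Helpful Answer:" cue. The sketch below rebuilds that shape so you can inspect or compare prompts offline. The `buildPrompt` function is a hypothetical name, and the exact line breaks and the way chunks are joined are assumptions rather than ech0's actual implementation.

```javascript
// Hypothetical helper: assembles a prompt in the same shape as the
// dry-run output shown above. Joining chunks with a space is an assumption.
function buildPrompt(contextChunks, query) {
  const context = contextChunks.join(" ");
  return [
    "Use the following pieces of context to answer the query at the end. " +
      "If you don't know the answer, just say that you don't know, " +
      "don't try to make up an answer.",
    context,
    `Query: ${query}`,
    "Helpful Answer:",
  ].join("\n");
}

const prompt = buildPrompt(
  ["We are maximally capable of understanding."],
  "What unique capacity does Naval argue humans possess?"
);
console.log(prompt);
```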

How does it work?

Creating a chat bot over any dataset needs the following steps to happen

load the data
create meaningful chunks
create embeddings for each chunk
store the chunks in vector database
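To make the chunking step concrete, here is a minimal sketch of fixed-size chunking with overlap. It is an illustration only: ech0 relies on Langchain for loading and chunking, and the `chunkText` name, the character-based splitting, and the default sizes below are all assumptions.

```javascript
// Hypothetical helper: split text into fixed-size character chunks that
// overlap, so content cut at a boundary still appears whole in one chunk.
function chunkText(text, chunkSize = 500, overlap = 50) {
  if (chunkSize <= 0 || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("need chunkSize > 0 and 0 <= overlap < chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once this chunk has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const chunks = chunkText("abcdefghij", 4, 1);
console.log(chunks); // [ 'abcd', 'defg', 'ghij' ]
```

Each chunk would then be embedded and stored alongside its text, which is the part the vector database handles.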

Whenever a user asks any query, following process happens to find the answer for the query

create the embedding for query
find similar documents for this query from vector database
pass similar documents as context to LLM to get the final answer.
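The retrieval step can be sketched with a tiny in-memory store. In ech0 the similarity search is done by Chroma and the vectors come from OpenAI's embedding model. The hand-made three-dimensional vectors, the `topK` function, and the choice of cosine similarity below are assumptions made for illustration.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i += 1) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical helper: score every stored chunk against the query
// embedding and return the k best matches, highest score first.
function topK(queryEmbedding, store, k = 2) {
  return store
    .map((entry) => ({
      text: entry.text,
      score: cosineSimilarity(queryEmbedding, entry.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Made-up embeddings standing in for real model output.
const store = [
  { text: "chunk about explanations", embedding: [0.9, 0.1, 0.0] },
  { text: "chunk about investing", embedding: [0.1, 0.9, 0.0] },
  { text: "chunk about reading", embedding: [0.0, 0.2, 0.9] },
];

const results = topK([1, 0, 0], store, 2);
console.log(results.map((r) => r.text));
```

The texts of the returned chunks are what would be passed to the LLM as context.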

The process of loading the dataset and then querying it involves multiple steps, and each step has its own nuances.

How should I chunk the data? What is a meaningful chunk size?
How should I create embeddings for each chunk? Which embedding model should I use?
How should I store the chunks in vector database? Which vector database should I use?
Should I store meta data along with the embeddings?
How should I find similar documents for a query? Which ranking model should I use?

These questions may be trivial for some but for a lot of us, it needs research, experimentation and time to find out the accurate answers.

ech0 is a framework which takes care of all these nuances and provides a simple interface to create bots over any dataset.

In the first release, we are making it easier for anyone to get a chatbot over any dataset up and running in less than a minute. All you need to do is create an app instance, add the data sets using .add function and then use .query function to get the relevant answer.

Tech Stack

ech0 is built on the following stack:

Langchain as an LLM framework to load, chunk and index data
OpenAI's Ada embedding model to create embeddings
OpenAI's ChatGPT API as LLM to get answers given the context
Chroma as the vector database to store embeddings


Owner

Login: ech0ai
Kind: user

Repositories: 1
Profile: https://github.com/ech0ai

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Singh"
  given-names: "Taranjeet"
title: "Embedchain"
date-released: 2023-06-25
url: "https://github.com/ech0/ech0js"

GitHub Events

Total

Watch event: 1
Push event: 2
Create event: 2

Last Year

Watch event: 1
Push event: 2
Create event: 2

Dependencies

package-lock.json npm

816 dependencies

package.json npm

@commitlint/cli ^17.1.2 development
@commitlint/config-conventional ^17.1.0 development
@commitlint/cz-commitlint ^17.1.2 development
@types/jest ^29.5.1 development
@types/jsdom ^21.1.1 development
@typescript-eslint/eslint-plugin ^5.41.0 development
@typescript-eslint/parser ^5.41.0 development
eslint ^8.34.0 development
eslint-config-airbnb-base ^15.0.0 development
eslint-config-airbnb-typescript ^17.0.0 development
eslint-config-prettier ^8.5.0 development
eslint-plugin-import ^2.27.5 development
eslint-plugin-prettier ^4.2.1 development
eslint-plugin-simple-import-sort ^8.0.0 development
eslint-plugin-testing-library ^5.9.1 development
eslint-plugin-unused-imports ^2.0.0 development
husky ^8.0.1 development
jest ^29.5.0 development
lint-staged ^13.0.3 development
prettier ^2.7.1 development
ts-jest ^29.1.0 development
ts-loader ^9.4.2 development
typescript ^5.2.2 development
axios ^1.4.0
chromadb ^1.5.6
jsdom ^22.1.0
langchain ^0.0.136
openai ^4.3.1
pdfjs-dist ^3.8.162
uuid ^9.0.0
