Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: EdmundNegan
  • Language: Python
  • Default Branch: main
  • Size: 22.8 MB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme Citation

README.md

JarVision: Intelligent Interface for Robotic Arm Control

(Image: UR3e robotic arm)

JarVision is an intelligent interface developed for the UR3e robotic arm, integrating advanced computer vision and natural language processing (NLP) technologies. This project enables seamless human-robot interaction through natural language commands and real-time visual perception of the robot's environment.

Project Overview

JarVision bridges the gap between humans and robots by combining natural language processing with computer vision to empower the UR3e robotic arm to interpret complex user commands, interact dynamically with its environment, and execute tasks with precision.

Features

  • Face and Object Detection: Utilizes OpenCV and YOLO algorithms for real-time face/object tracking and interaction.
  • Environment Description: Generates detailed, context-aware descriptions of surroundings using OpenAI’s Vision API.
  • Natural Language Command Processing: Leverages LangChain and OpenAI’s ChatGPT API to process nuanced user queries.
  • Modular Design: Tools like track_face and track_object are integrated via LangChain for scalable and maintainable functionality.
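As a rough illustration of this modular design (not the project's actual code), tools such as track_face and track_object can be thought of as named functions that an agent looks up and invokes. The sketch below uses a plain-Python registry rather than LangChain itself, and the tool signatures are assumed:

```python
# Minimal sketch of tool dispatch. The track_face / track_object
# signatures here are hypothetical; LangChain's agent performs a
# conceptually similar name -> callable lookup.
TOOLS = {}

def tool(fn):
    """Register a function under its name so the agent can call it."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def track_face(camera: str = "default") -> str:
    # In the real project this would drive OpenCV face tracking.
    return f"tracking nearest face on {camera} camera"

@tool
def track_object(label: str) -> str:
    # In the real project this would drive YOLO object tracking.
    return f"tracking object '{label}'"

def dispatch(name: str, **kwargs) -> str:
    """What the agent does conceptually: pick a tool by name and run it."""
    return TOOLS[name](**kwargs)
```

For example, `dispatch("track_object", label="cup")` routes the call to the registered `track_object` tool.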

System Architecture

The system architecture includes the following components:

  1. Video and Audio Input: Captures the video feed from a camera mounted on the UR3e robotic arm and voice commands from a microphone.
  2. LangChain Agent: Processes user commands and determines the appropriate action.
  3. Tools: Includes functionalities such as face tracking, object tracking, environment description, and robot control.
  4. Robot Control Output: Sends URScript commands to the robotic arm for task execution.
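The robot-control output stage could be sketched as follows. This is a hypothetical helper, not the project's code: it formats a URScript movej command and sends it to the controller over TCP port 30002, the port named in the project's .env template:

```python
import socket

def movej_command(joints, a=1.4, v=1.05):
    """Format a URScript movej command for six joint angles (radians).

    The acceleration/velocity defaults are illustrative, not the
    project's actual values.
    """
    q = ", ".join(f"{j:.4f}" for j in joints)
    return f"movej([{q}], a={a}, v={v})\n"

def send_urscript(host, script, port=30002):
    """Send a URScript line to the controller's interface on port 30002."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(script.encode("utf-8"))

# Example command (sending requires a reachable robot or URSim instance):
cmd = movej_command([0.0, -1.57, 1.57, -1.57, -1.57, 0.0])
# send_urscript("192.168.56.101", cmd)
```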

(Image: system overview diagram)

Technologies Used

  • Programming Languages: Python
  • Computer Vision: OpenCV, YOLO
  • Natural Language Processing: OpenAI’s ChatGPT API, LangChain
  • Simulations: Universal Robots Offline Simulator (URSim)
  • Hardware: UR3e robotic arm
  • Others: Tavily API

Setup Instructions

  1. Clone the repository:
     ```bash
     git clone https://github.com/EdmundNegan/JarVision.git
     ```
  2. Navigate to the project directory:
     ```bash
     cd JarVision
     ```
  3. Set up the Python environment:
     ```bash
     python -m venv jarvision_env
     source jarvision_env/bin/activate   # For Linux/Mac
     jarvision_env\Scripts\activate      # For Windows
     ```
  4. Install the required dependencies:
     ```bash
     pip install -r requirements.txt
     ```

For Virtual Environment

  5. Download the latest version of Oracle VirtualBox from https://www.virtualbox.org/ (or your preferred virtualization software).
  6. Download the offline simulator for the UR3e robotic arm from Universal Robots: https://www.universal-robots.com/download/software-e-series/simulator-non-linux/offline-simulator-e-series-ur-sim-for-non-linux-5126-lts/.
  7. Set the virtual machine network type to "host-only Ethernet adapter".
  8. Find the IP address of the robot in URSim by clicking URSim UR3 and selecting the "About" option.
  9. Create a .env file in the project root directory with the following format:
     ```plaintext
     OPENAI_API_KEY=
     TAVILY_API_KEY=
     HOST=''
     PORT='30002'
     VOSK_MODEL='PATH_TO_VOSK_MODEL'
     ```
  10. Set the payload to an arbitrary value such as 0.1 kg and start the robot.
  11. Run main.py to begin issuing commands to the robotic arm (select chatbot or voice mode and use the default camera settings).
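The .env values from the step above have to be read at startup; projects like this typically use a loader such as python-dotenv. A dependency-free sketch with the standard library (the key names are taken from the template and assumed):

```python
import os
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Parse simple KEY=value lines from a .env file into os.environ."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    os.environ.update(values)
    return values

# Typical use at the top of main.py (keys assumed from the template):
# load_env()
# host, port = os.environ["HOST"], int(os.environ["PORT"])
```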

For Physical Robot

  5. Connect the UR3e robotic arm to a power supply and start up the system.
  6. Connect the UR3e robotic arm to the host computer via an Ethernet cable, and connect the Azure Kinect camera to the host computer via a USB-C cable.
  7. Find the IP address of the robot by selecting the "About" option.
  8. Create a .env file in the project root directory with the following format:
     ```plaintext
     OPENAI_API_KEY=
     TAVILY_API_KEY=
     HOST=''
     PORT='30002'
     VOSK_MODEL='PATH_TO_VOSK_MODEL'
     ```
  9. Go to the network settings and manually configure the IP assignment to the settings shown.
  10. Run main.py to begin issuing commands to the physical robotic arm (select chatbot or voice mode and use the Kinect camera settings).

(Image: UR3e physical robot)

Future Work

  • Upload short videos to the Vision API for enhanced environmental understanding.
  • Integrate tracking functionalities (track_face, track_object) as modular LangChain tools.
  • Refine spatial awareness for precise distance measurements. (Uncomment depth configuration code in main.py and detection.py to edit)
  • Add gripper and mobility to robot

Acknowledgments

This project was developed as part of an academic project under the guidance of Dr. Yu Wu. It leverages the capabilities of cutting-edge technologies like OpenAI’s APIs and LangChain.

Special thanks to the Universal Robots community for providing simulation tools and the OpenCV community for their vision resources.

Owner

  • Name: Edmund Ngan
  • Login: EdmundNegan
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Ngan"
  given-names: "Edmund"
title: "JarVision Github Repository"
version: 2.0.4
date-released: 2024-12-15
url: "https://github.com/EdmundNegan/JarVision"

GitHub Events

Total
  • Watch event: 2
  • Member event: 1
  • Push event: 12
  • Create event: 5
Last Year
  • Watch event: 2
  • Member event: 1
  • Push event: 12
  • Create event: 5