jarvision
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.5%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: EdmundNegan
- Language: Python
- Default Branch: main
- Size: 22.8 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
JarVision: Intelligent Interface for Robotic Arm Control

JarVision is an intelligent interface developed for the UR3e robotic arm, integrating advanced computer vision and natural language processing (NLP) technologies. This project enables seamless human-robot interaction through natural language commands and real-time visual perception of the robot's environment.
Table of Contents
- Project Overview
- Features
- System Architecture
- Technologies Used
- Setup Instructions
- Usage
- Future Work
- Acknowledgments
Project Overview
JarVision bridges the gap between humans and robots by combining natural language processing with computer vision to empower the UR3e robotic arm to interpret complex user commands, interact dynamically with its environment, and execute tasks with precision.
Features
- Face and Object Detection: Utilizes OpenCV and YOLO algorithms for real-time face/object tracking and interaction.
- Environment Description: Generates detailed, context-aware descriptions of surroundings using OpenAI’s Vision API.
- Natural Language Command Processing: Leverages LangChain and OpenAI’s ChatGPT API to process nuanced user queries.
- Modular Design: Tools like `track_face` and `track_object` are integrated via LangChain for scalable and maintainable functionality.
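The modular tool design described above can be illustrated with a minimal sketch. This is plain Python showing the general pattern of named, dispatchable tools; the function bodies and registry here are hypothetical placeholders, not the repository's actual LangChain wiring.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Tool:
    """A named, self-contained capability the agent can invoke."""
    name: str
    description: str
    func: Callable[[str], str]

def track_face(target: str) -> str:
    # Placeholder body: the real tool would drive the camera/arm tracking loop.
    return f"tracking face: {target}"

def track_object(target: str) -> str:
    # Placeholder body: the real tool would run YOLO detection and follow the object.
    return f"tracking object: {target}"

# Registry mapping tool names to implementations, so new tools plug in
# without changing the dispatch logic.
TOOLS: Dict[str, Tool] = {
    t.name: t
    for t in [
        Tool("track_face", "Follow a detected face with the camera.", track_face),
        Tool("track_object", "Follow a detected object with the camera.", track_object),
    ]
}

def dispatch(tool_name: str, argument: str) -> str:
    """Route a parsed user command to the matching tool."""
    return TOOLS[tool_name].func(argument)
```

In the actual project, LangChain's agent would choose the tool from the user's natural-language request; the registry-plus-dispatch shape is what makes adding tools like environment description straightforward.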
System Architecture
The system architecture includes the following components:
1. Video and Audio Input: Captures the video feed from a camera mounted on the UR3e robotic arm and voice commands from a microphone.
2. LangChain Agent: Processes user commands and determines the appropriate action.
3. Tools: Includes functionalities such as face tracking, object tracking, environment description, and robot control.
4. Robot Control Output: Sends URScript commands to the robotic arm for task execution.
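The last step, sending URScript to the arm, can be sketched with the standard library. Universal Robots controllers accept URScript strings over a TCP socket (port 30002 is the secondary client interface, matching the PORT value used later in the setup); the `build_movej` helper and its default accelerations here are illustrative, not the project's actual control code.

```python
import socket

def build_movej(joints, a=1.0, v=0.5):
    """Format a URScript movej command for six joint angles (radians)."""
    joint_str = ", ".join(f"{q:.4f}" for q in joints)
    return f"movej([{joint_str}], a={a}, v={v})\n"

def send_urscript(host, command, port=30002):
    """Send one URScript line to the controller's client interface."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(command.encode("utf-8"))
```

Against URSim, `send_urscript` would target the IP address found in the simulator's "About" dialog, exactly as the setup steps below describe.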

Technologies Used
- Programming Languages: Python
- Computer Vision: OpenCV, YOLO
- Natural Language Processing: OpenAI’s ChatGPT API, LangChain
- Simulations: Universal Robots Offline Simulator (URSim)
- Hardware: UR3e robotic arm
- Others: Tavily API
Setup Instructions
1. Clone the repository:
```bash
git clone https://github.com/EdmundNegan/JarVision.git
```
2. Navigate to the project directory:
```bash
cd JarVision
```
3. Set up the Python environment:
```bash
python -m venv jarvision_env
source jarvision_env/bin/activate    # For Linux/Mac
jarvision_env\Scripts\activate       # For Windows
```
4. Install the required dependencies:
```bash
pip install -r requirements.txt
```

For Virtual Environment
5. Download the latest version of Oracle VirtualBox from https://www.virtualbox.org/ (or your preferred virtualization software).
6. Download the offline simulator for the UR3e robotic arm from Universal Robots: https://www.universal-robots.com/download/software-e-series/simulator-non-linux/offline-simulator-e-series-ur-sim-for-non-linux-5126-lts/.
7. Set the virtual machine network type to "Host-Only Ethernet Adapter".
8. Find the IP address of the robot in URSim by clicking URSim UR3 and selecting the "About" option.
9. Create a .env file in the project root directory with the following format:
```plaintext
OPENAI_API_KEY=
TAVILY_API_KEY=
HOST=' '
PORT='30002'
VOSK_MODEL='PATH_TO_VOSK_MODEL'
```
10. Set the payload to an arbitrary number like 0.1 kg and start the robot.
11. Run the main.py file to begin issuing commands to the robotic arm (select chatbot or voice mode and use the default camera settings).

For Physical Robot
5. Connect the UR3e robotic arm to a power supply and start up the system.
6. Connect the UR3e robotic arm to the host computer via an Ethernet cable, and connect the Azure Kinect camera to the host computer via a USB-C cable.
7. Find the IP address of the robot by selecting the "About" option.
8. Create a .env file in the project root directory with the following format:
```plaintext
OPENAI_API_KEY=
TAVILY_API_KEY=
HOST=' '
PORT='30002'
VOSK_MODEL='PATH_TO_VOSK_MODEL'
```
9. Go to the network settings and manually configure the IP assignment to the settings shown.
10. Run the main.py file to begin issuing commands to the physical robotic arm (select chatbot or voice mode and use the Kinect camera settings).
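The .env values above are read at startup. A minimal standard-library sketch of parsing such a file follows; the project most likely uses a library such as python-dotenv, so this parser is illustrative only.

```python
def load_env(path=".env"):
    """Parse simple KEY=value lines from a .env file into a dict.

    Blank lines and '#' comments are skipped; surrounding single or
    double quotes around values are stripped.
    """
    values = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip().strip("'\"")
    return values
```

With this in place, `load_env()["PORT"]` would yield the `30002` client-interface port that the robot control code connects to.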
Future Work
- Upload short videos to the Vision API for enhanced environmental understanding.
- Integrate tracking functionalities (`track_face`, `track_object`) as modular LangChain tools.
- Refine spatial awareness for precise distance measurements (uncomment the depth configuration code in main.py and detection.py to enable this).
- Add a gripper and mobility to the robot.
Acknowledgments
This project was developed as part of an academic project under the guidance of Dr. Yu Wu. It leverages the capabilities of cutting-edge technologies like OpenAI’s APIs and LangChain.
Special thanks to the Universal Robots community for providing simulation tools and the OpenCV community for their vision resources.
Owner
- Name: Edmund Ngan
- Login: EdmundNegan
- Kind: user
- Repositories: 1
- Profile: https://github.com/EdmundNegan
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Ngan"
    given-names: "Ngan"
title: "JarVision Github Repository"
version: 2.0.4
date-released: 2024-12-15
url: "https://github.com/EdmundNegan/JarVision"
GitHub Events
Total
- Watch event: 2
- Member event: 1
- Push event: 12
- Create event: 5
Last Year
- Watch event: 2
- Member event: 1
- Push event: 12
- Create event: 5