https://github.com/ansleybrown1337/sms-api-scrapers

Python tools developed to obtain sensor data from the Sentek Irrimax platform

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: ansleybrown1337
License: gpl-2.0
Language: Python
Default Branch: main
Size: 37.1 KB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License

Soil‑Moisture Sensor API Scrapers

Created by A.J. Brown
Agricultural Data Scientist — CSU Agricultural Water Quality Program
Ansley.Brown@colostate.edu

1 • Project Scope

This repository houses API-based data fetching tools for three commercial soil‑moisture platforms:

| Vendor | API Docs from Company | Script | Reference Notes |
|--------|-----------------------|--------|-----------------|
| Sentek IrriMAX | Irrimax Site | `irrimax_scraper.py` | View Notes |
| AquaSpy (Hold) | AquaSpy Site | `aquaspy_scraper.py` | View Notes |
| GroGuru InSites | GroGuru Site | `groguru_scraper.py` | View Notes |

All scrapers share a common goal: fetch logger metadata + time‑series moisture data → return a tidy pandas.DataFrame that can be streamed directly into WISE Pro (Water Irrigation Scheduling for Efficient Application).

> [!IMPORTANT]
> Before using a scraper, please review its corresponding markdown file in the `/docs/` folder, linked in the table above. These files contain important notes about API limitations, usage, and other relevant information.

2 • How This Supports WISE Pro

WISE Pro is an integrated decision‑support platform jointly developed by CSU, USDA‑ARS, and NMSU. It fuses real‑time sensing, machine‑learning data assimilation, and SWAT+/pyFAO56 modeling to generate actionable irrigation & nutrient recommendations.

Automated ingestion of Sentek, AquaSpy, and GroGuru data:

- improves water‑balance forecasts,
- reduces manual file wrangling, and
- enables comparative analytics across hardware vendors.

3 • Repository Layout

```bash
sms-api-scrapers/
│
├── README.md               # Project overview and usage instructions
├── requirements.txt        # Shared Python dependencies
├── logininfo.md            # Manual record of usernames/passwords (not committed)
├── LICENSE                 # GNU GPL v2.0 license
├── .gitignore              # Prevents secrets and compiled files from being tracked
│
├── code/                   # All executable scraping scripts and configuration
│   ├── irrimax_scraper.py  # Sentek IrriMAX API scraper
│   ├── aquaspy_scraper.py  # AquaSpy API scraper (partially functional)
│   ├── groguru_scraper.py  # GroGuru InSites API scraper
│   ├── config_template.py  # Safe starter config file to copy and edit
│   ├── config.py           # Local, untracked config with real credentials
│   └── __pycache__/        # Compiled Python bytecode (ignored)
│
└── docs/                   # API documentation and notes (not for execution)
    ├── irrimaxapiinfo.md   # IrriMAX v1.9 API documentation summary
    ├── agspyapiinfo.md     # AquaSpy AgSpy API limitations + usage notes
    ├── groguruinfo.md      # Usage notes and walkthrough for GroGuru API
    └── groguruinsites.apib # Original GroGuru API Blueprint (APIary format)
```

4 • Quick Start

```bash
# clone & create env
git clone https://github.com/ansleybrown1337/sms-api-scrapers.git
cd sms-api-scrapers
python -m venv venv && source venv/bin/activate   # Windows: venv\Scripts\activate

# install shared deps
pip install -r requirements.txt
```

4.1 Configure Secrets

1. Create your private config file (one place for every platform's credentials): `cp code/config_template.py code/config.py`
2. Add your credentials / API keys.
3. Do not commit `config.py`; it is already listed in `.gitignore`.
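
As a rough illustration of the pattern above, a runner can fail fast when the private config is missing a value or still holds template text. This is a minimal sketch, not code from the repository: `require_settings`, the placeholder strings, and the `VENDOR_*` attribute names are all hypothetical, and a `SimpleNamespace` stands in for `import config`.

```python
from types import SimpleNamespace

# Hypothetical placeholder values that a template file might ship with.
PLACEHOLDERS = {"", "your_username", "your_password", "your_api_key"}


def require_settings(cfg, names):
    """Return {name: value} from cfg, raising if an entry is missing or a placeholder."""
    values = {}
    for name in names:
        value = getattr(cfg, name, None)
        if value is None or str(value).strip() in PLACEHOLDERS:
            raise RuntimeError(f"{name} is not set; edit code/config.py first")
        values[name] = value
    return values


# Stand-in for `import config`; the attribute names here are illustrative only.
cfg = SimpleNamespace(VENDOR_USERNAME="alice", VENDOR_PASSWORD="your_password")

creds = require_settings(cfg, ["VENDOR_USERNAME"])
try:
    require_settings(cfg, ["VENDOR_USERNAME", "VENDOR_PASSWORD"])
    problem = None
except RuntimeError as exc:
    problem = str(exc)
```

Checking at start-up keeps an unedited template from reaching a vendor API as a failed login.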

4.2 Run a scrape via console command (IrriMAX example)

```bash
python irrimax_scraper.py
Choose an option:
1. List available loggers
2. Fetch readings for a specific logger
Enter 1 or 2: 2
Enter logger name (case-sensitive): 8A_8
Enter start date (YYYY-MM-DD): 2024-06-10
Enter end date (YYYY-MM-DD): 2024-10-15

Fetching data for logger: 8A_8 (2024-06-10 to 2024-10-15)...
         Date      Time      V1      V2     A1(5)     S1(5)  ...    S8(75)    T8(75)    A9(85)    S9(85)  T9(85)
0  2024/06/10  00:00:00  13.735  13.474  16.63106  379.6608  ...  844.3812  17.04999  40.48423  4438.727   16.72
1  2024/06/10  00:30:00  13.769  -1.000  16.62494  376.4792  ...  844.4537  17.04999  40.48423  4438.727   16.72
2  2024/06/10  01:00:00  13.787  -1.000  16.62494  378.4209  ...  844.3812  17.01999  40.49162  4439.438   16.72
3  2024/06/10  01:30:00  13.783  -1.000  16.58220  373.6646  ...  844.0801  16.95001  40.46944  4437.237   16.75
4  2024/06/10  02:00:00  13.707  -1.000  16.60052  373.4932  ...  844.3039  17.00000  40.49162  4422.122   16.72

[5 rows x 30 columns]
```

The script prints the first few rows of the returned DataFrame.
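
The column labels in the output above appear to follow a pattern of one letter, a sensor number, and a value in parentheses (for example `S8(75)`). Assuming, and this is not confirmed by the source, that the parenthesised number is the sensor depth, a label can be split into its parts as sketched below. `parse_label` is a hypothetical helper and not part of the scraper.

```python
import re

# One letter, a sensor number, then a number in parentheses, e.g. "A1(5)".
LABEL_RE = re.compile(r"^([A-Za-z])(\d+)\((\d+)\)$")


def parse_label(label):
    """Split 'S8(75)' into ('S', 8, 75); return None for other labels such as 'V1'."""
    match = LABEL_RE.match(label)
    if match is None:
        return None
    kind, sensor, depth = match.groups()
    return kind, int(sensor), int(depth)


columns = ["Date", "Time", "V1", "V2", "A1(5)", "S1(5)", "S8(75)", "T9(85)"]
parsed = {c: parse_label(c) for c in columns if parse_label(c) is not None}
```

Grouping columns this way makes it straightforward to select, say, every reading at one depth before passing the data downstream.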

4.3 • Example Usage on a Linux Server (Automated Cloud Deployment)

The `irrimax_scraper.py` script supports both interactive use and programmatic import. For cloud deployment (e.g., using cron), you can create a simple runner script to automate daily data pulls.

> [!IMPORTANT]
> Be sure your `config.py` file is properly populated and present in the same directory or Python path when the script runs.

4.3.1 • IrriMAX Live via `get_readings()`

To automate data ingestion:

1. **Create a runner script (e.g., `daily_pull.py`):**

   ```python
   from irrimax_scraper import get_readings
   import datetime
   import pandas as pd

   # Define logger and time window
   logger_name = "Soybeans 1"
   to_date = datetime.datetime.utcnow()
   from_date = to_date - datetime.timedelta(days=1)

   # Fetch and save data
   df = get_readings(logger_name, from_date, to_date)

   if not df.empty:
       out_path = f"/home/user/data/{logger_name.replace(' ', '')}{from_date:%Y%m%d}.csv"
       df.to_csv(out_path, index=False)
   ```

2. **Add the task to your crontab (`crontab -e`):**

   `0 3 * * * /usr/bin/python3 /home/user/scripts/daily_pull.py >> /home/user/logs/irrimax.log 2>&1`

This will:

- Run the script every day at 3:00 AM
- Save a CSV to `/home/user/data/`
- Log all output and errors to `/home/user/logs/irrimax.log`
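
Because cron records only what the script prints, it can help to wrap the pull so that a failure leaves a timestamped line in the log and a non-zero exit status. The following is a hedged sketch: `run_pull`, `fake_fetch`, and `broken_fetch` are hypothetical names, and the two fetch functions stand in for the real scraper call.

```python
import logging
import sys

logging.basicConfig(
    stream=sys.stdout,  # cron redirects stdout/stderr into the log file
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)


def run_pull(fetch):
    """Call fetch(); return 0 on success and 1 on any error, logging both cases."""
    try:
        rows = fetch()
    except Exception:
        logging.exception("pull failed")
        return 1
    logging.info("pull succeeded with %d rows", len(rows))
    return 0


def fake_fetch():
    # Stand-in for the real scraper call; returns two dummy rows.
    return [{"V1": 13.735}, {"V1": 13.769}]


def broken_fetch():
    # Simulates a network problem during the pull.
    raise ConnectionError("simulated network failure")


ok_status = run_pull(fake_fetch)
bad_status = run_pull(broken_fetch)
# In a real runner, finish with: sys.exit(status)
```

A non-zero exit status also lets external monitoring tell a failed run apart from a run that simply found no new data.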

4.3.2 • GroGuru InSites via `get_brute_force_readings()`

To automate GroGuru data collection from a known site:

1. **Create a runner script (e.g., `groguru_pull.py`):**

   ```python
   from groguru_scraper import (
       authenticate,
       get_organization_view,
       list_sites_from_org,
       get_brute_force_readings,
   )
   import datetime
   import config

   # Authenticate and fetch organization structure
   token, user_id = authenticate(config.GROGURU_USERNAME, config.GROGURU_PASSWORD)
   org_data = get_organization_view(token, user_id)
   sites = list_sites_from_org(org_data)

   # Choose a specific siteId and deviceId (twigId is auto-selected as first device)
   site_id = "11697"  # Replace with your actual GroGuru siteId
   device_id = sites[0]["devices"][0]  # Automatically selects the first twigId

   # Define date range
   to_date = datetime.datetime.utcnow()
   from_date = to_date - datetime.timedelta(days=1)

   # Fetch data using brute-force workaround
   df = get_brute_force_readings(token, site_id, device_id, from_date, to_date)

   # Save to CSV
   if not df.empty:
       out_path = f"/home/user/data/groguru{site_id}{from_date:%Y%m%d}.csv"
       df.to_csv(out_path, index=False)
   ```

2. **Add the task to your crontab (`crontab -e`):**

   `30 3 * * * /usr/bin/python3 /home/user/scripts/groguru_pull.py >> /home/user/logs/groguru.log 2>&1`

This will:

- Run daily at 3:30 AM
- Save a GroGuru CSV with the last 24 hours of data
- Log output/errors to `/home/user/logs/groguru.log`

> [!NOTE]
> The GroGuru API limits each request to 5 data points. `get_brute_force_readings()` uses a looping strategy with 2-hour windows to stitch together full time series.
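
The looping strategy described in the note can be sketched as follows. This illustrates the general idea rather than the repository's implementation: `iter_windows`, `stitch`, and `fake_fetch` are hypothetical, and `fake_fetch` replaces the real API call with one synthetic point every 30 minutes, capped at five points per request.

```python
from datetime import datetime, timedelta


def iter_windows(start, end, step=timedelta(hours=2)):
    """Yield consecutive (window_start, window_end) pairs covering [start, end)."""
    cursor = start
    while cursor < end:
        window_end = min(cursor + step, end)
        yield cursor, window_end
        cursor = window_end


def fake_fetch(window_start, window_end, limit=5):
    """Stand-in for one API request: at most `limit` points, 30 minutes apart."""
    points = []
    stamp = window_start
    while stamp < window_end and len(points) < limit:
        points.append({"timestamp": stamp, "value": 0.0})
        stamp += timedelta(minutes=30)
    return points


def stitch(start, end, fetch):
    """Fetch every window, drop duplicate timestamps, and return rows in time order."""
    by_stamp = {}
    for window_start, window_end in iter_windows(start, end):
        for row in fetch(window_start, window_end):
            by_stamp[row["timestamp"]] = row
    return [by_stamp[stamp] for stamp in sorted(by_stamp)]


start = datetime(2024, 6, 10)
windows = list(iter_windows(start, start + timedelta(days=1)))
rows = stitch(start, start + timedelta(days=1), fake_fetch)
```

Under the 30-minute assumption, a 2-hour window holds four points, which stays under the five-point cap; a shorter reporting interval would need a smaller `step`.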

4.3.3 • AquaSpy AgSpy API (Metadata Only)

The aquaspy_scraper.py script retrieves site metadata only. Seasonal data is unavailable unless AquaSpy probes are actively deployed and marked "InSeason" in the AquaSpy portal.

Example run:

```bash
python aquaspy_scraper.py
Site 33853: Farm 1 - 4D - Block II
InSeason: False
HasEquipment: False
Customer: Farm 1
```

Seasonal data endpoints (`GetSeasonApiData`, `GetSeasonDifferentialApiData`) will return empty or error if the site has no `CurrentFieldSeasonID`. This integration is paused until sensor deployment.
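
A caller can avoid the empty or error responses described above by checking the site metadata before requesting seasonal data. In this sketch the dictionary layout is an assumption: the keys `InSeason`, `HasEquipment`, and `CurrentFieldSeasonID` come from the text above, but `SiteID`, the sample values for the active site, and the helper `can_request_season_data` are hypothetical, and the real response structure may differ.

```python
def can_request_season_data(site):
    """Return True only when the site is in season and has a season ID to query."""
    return bool(site.get("InSeason")) and site.get("CurrentFieldSeasonID") is not None


# Metadata shaped like the example run above (layout assumed for illustration).
idle_site = {"SiteID": 33853, "InSeason": False, "HasEquipment": False}
# Invented values showing what an in-season site might look like.
active_site = {"SiteID": 1, "InSeason": True, "CurrentFieldSeasonID": 42}

skipped = [
    site["SiteID"]
    for site in (idle_site, active_site)
    if not can_request_season_data(site)
]
```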

5 • Common Features

- Secure authentication: credentials isolated in private config files.
- Logger discovery: enumerate available sites/probes before requesting data.
- Date‑range queries: RFC‑3339 / yyyymmddHHMMSS handled automatically.
- Returns `pandas.DataFrame`: ready for in‑memory analytics or other pipelines.
- Optional CSV export or direct write to Postgres/BigQuery (coming soon).

> [!NOTE]
> Data are returned in the same schema as the vendor provides; unification may be a feature added later.
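
For reference, the two timestamp styles named in the feature list above can be produced from one `datetime` using only the standard library. Which vendor expects which style is not stated here, so the helper names below are hypothetical and the sketch shows only the formatting itself.

```python
from datetime import datetime, timezone


def to_rfc3339(moment):
    """Format an aware datetime as RFC 3339 in UTC, e.g. 2024-06-10T00:00:00Z."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def to_compact(moment):
    """Format a datetime as yyyymmddHHMMSS, e.g. 20240610000000."""
    return moment.strftime("%Y%m%d%H%M%S")


start = datetime(2024, 6, 10, 0, 0, 0, tzinfo=timezone.utc)
rfc = to_rfc3339(start)
compact = to_compact(start)
```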

6 • Known API Limitations by Vendor

| Vendor | Limitation |
|--------|------------|
| AquaSpy | Requires hardcoded siteIDs; no endpoint to list all available sites. |
| | No seasonal data if `InSeason = False`. Only metadata retrieval possible. |
| GroGuru | 5-point limit per request; requires looping workaround for full time series. |
| IrriMAX | CSV parsing may fail silently for malformed timestamps. |
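
One way to make the IrriMAX timestamp issue visible rather than silent is to parse with an explicit format and count the rows that fail. This is a sketch on made-up rows, not the scraper's actual parsing code; the `Date` and `Time` column names and the `%Y/%m/%d` layout are taken from the sample output in section 4.2.

```python
import pandas as pd

# Three illustrative rows; the middle one has a deliberately malformed date.
df = pd.DataFrame(
    {
        "Date": ["2024/06/10", "2024/13/45", "2024/06/10"],
        "Time": ["00:00:00", "00:30:00", "01:00:00"],
        "V1": [13.735, 13.769, 13.787],
    }
)

# errors="coerce" turns unparseable values into NaT instead of raising.
df["timestamp"] = pd.to_datetime(
    df["Date"] + " " + df["Time"], format="%Y/%m/%d %H:%M:%S", errors="coerce"
)

bad_rows = df[df["timestamp"].isna()]
clean = df.dropna(subset=["timestamp"]).set_index("timestamp")
if not bad_rows.empty:
    print(f"Dropped {len(bad_rows)} row(s) with malformed timestamps")
```

Reporting the count of dropped rows turns a silent gap in the time series into something that shows up in the cron log.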

7 • Contributing

- Fork → feature branch → PR.
- Follow PEP 8; run `black` before committing.
- Unit tests live in `tests/`; please add coverage for new endpoints.

8 • License

Owner

Name: AJ Brown
Login: ansleybrown1337
Kind: user
Company: Colorado State University

Website: sites.google.com/view/ansleyjbrown
Repositories: 4
Profile: https://github.com/ansleybrown1337

Data Specialist & Agronomist | Ag Water Quality Program

GitHub Events

Total

Watch event: 1
Push event: 8

Last Year

Watch event: 1
Push event: 8

Science Score: 26.0%
