Recent Releases of observatory-platform

observatory-platform - 0.6.0

What's Changed

  • Fix seed_db service in docker-compose by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/573
  • Inf 418/add dag tags to workflows by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/572
  • jsonl comparison fix by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/575
  • Updated docker compose process to pipe its stderr to stdout by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/577
  • precommit rebase fix by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/579
  • INF-547: Random_id generator to use hostname of machine by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/581
  • Added prefix functionality to ObservatoryEnvironment class by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/578
  • INF-558: Added exception handling for deleting datasets and buckets produced by unit tests by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/585
  • Config default update by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/561
  • Increase BQ bytes budget for new DOI workflow by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/587
  • Log BulkIndexErrors by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/588
  • BAD-308 schema and table description updates by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/589
  • API core version by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/594
  • Updated github unit tests to only run on push by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/593
  • View creation function update by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/596
  • INF-588: HTTP Request Updates by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/597
  • Remove project and workflow generation by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/599
  • INF-597: Terraform config files not updating and Flower docker container not building successfully. by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/598
  • Feature/remove query api by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/600
  • Remove Elastic and Kibana from local platform by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/603
  • Upgrade to Docker Compose V2 by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/602
  • Update to response logging criteria by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/605
  • Fix: Only delete non hidden files in terraform directory. by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/604
  • Move helper functions from Thoth Telescope to common Observatory Platform utilities by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/607
  • Change table_id to table_name by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/608
  • Add constraint for alembic by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/609
  • Fix/dependencies by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/612
  • Add POSTGRES_USER environment variable by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/613
  • INF-465: Cloud Endpoints Portal Deprecation - move observatory-api container by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/610
  • Fix/Change apiserver container docker network by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/615
  • INF-609: Add limits to Bigquery for per user per day and project per day. by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/616
  • Fix api and workers network settings by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/617
  • Use config file to specify workflows by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/606
  • Added .env to gitignore by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/619
  • Terraform deploy may 2023 fixes by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/620
  • Fix issues with Docker and remove unneeded code by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/622
  • Feature/Add FTP Server by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/621
  • Changed default write disposition by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/623
  • Added function to list blobs in gcs bucket by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/624
  • Added multi-uri table load by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/626
  • Feature/Match on multiple keys for upserts and deletes in Bigquery. by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/625
  • Upgrade to Airflow 2.6.3 by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/627
  • Fix Google Cloud Storage based logging by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/629
  • Fix/Logs disappear when tasks are in "up for retry" state by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/630
  • Upgrade Terraform and Packer by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/634
  • Make dag_run_id nullable in openapi.yaml by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/631
  • hmac key by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/632
  • Added glob match functionality to list_blobs by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/633
  • Fix/Update VM Template File by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/636
  • Feature/Add functionality to bq_load_from_memory function by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/637
  • Feature/json custom datetime format by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/638
  • Add functions required to make use of Airflow TaskGroups and test them by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/642
  • Fix/Install Packer plugins when building the observatory image by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/641
  • Feature/Create buckets with roles for unit tests by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/639
  • Added readthedocs config file by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/643
  • python 3.10 by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/628
  • Fix SlackWebhookHook missing 1 required keyword-only argument: 'slack_webhook_conn_id' by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/644
  • Update unittest os to ubuntu-latest by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/645
  • Updates to contributing.md and readme by @keegansmith21 in https://github.com/The-Academic-Observatory/observatory-platform/pull/640
  • Moving logging for compare_lists_of_dicts into its own function by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/647
  • Feature: Add bq functions to the platform by @alexmassen-hane in https://github.com/The-Academic-Observatory/observatory-platform/pull/646

New Contributors

  • @alexmassen-hane made their first contribution in https://github.com/The-Academic-Observatory/observatory-platform/pull/581

Full Changelog: https://github.com/The-Academic-Observatory/observatory-platform/compare/0.5.0...0.6.0

- Python
Published by jdddog about 2 years ago

observatory-platform - 0.5.0

What's Changed

  • streamtelescope: add diff merge by lex order by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/557
  • Add functionality to use BigQuery snapshots by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/559
  • Add api server isolation and db seeding by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/562
  • Add dag tags to Workflow class by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/565
  • Parametrise host api port by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/564
  • Create directories that are mounted as volumes with Docker by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/563
  • Separate config loading from config use by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/560
  • Use find_free_port in more unit tests by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/569
  • Raise exception if dataset missing when adding releases by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/566
  • Add api_port to config generation by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/570
  • Propagate dag tag to telescopes by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/571
  • Update to Elastic & Kibana v8 by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/568

Full Changelog: https://github.com/The-Academic-Observatory/observatory-platform/compare/0.4.0...0.5.0

- Python
Published by jdddog over 3 years ago

observatory-platform - 0.4.0

What's Changed

  • Add a DOI badge to README.md by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/549
  • Create .zenodo.json by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/550
  • Add functionality to use a schema when creating table from query by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/551
  • MEL-798 added data trust zenodo community by @kathrynnapier in https://github.com/The-Academic-Observatory/observatory-platform/pull/552
  • api: add local api server by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/553
  • Add dataset release utils by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/540
  • Fix/api update by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/554
  • Fix dag-delete error by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/556
  • Build Observatory API Image & Push to Google Cloud Artifact Registry by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/558

New Contributors

  • @kathrynnapier made their first contribution in https://github.com/The-Academic-Observatory/observatory-platform/pull/552

Full Changelog: https://github.com/The-Academic-Observatory/observatory-platform/compare/0.3.0...0.4.0

- Python
Published by jdddog almost 4 years ago

observatory-platform - 0.3.0

What's Changed

  • Fix potential duplicates in table after merge using stream telescope by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/496
  • Fix double bucket delete race by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/510
  • Remove bq merge days functionality by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/512
  • Inf 32/installer script after repo separation by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/509
  • installer script doc fixes by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/514
  • Add global prefix_dir override by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/515
  • disable download timeout by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/517
  • INF-166/airflow 2.2 by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/516
  • add get_airflow_connection_login by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/519
  • Enable Airflow operators to be added directly as tasks by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/518
  • remove unused wos/scopus code by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/520
  • installer: add https/ssh clone option by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/521
  • Fix ModuleNotFoundError: No module named 'wtforms.compat' error by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/522
  • Upgrade apache-airflow to 2.2.1 by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/523
  • Enable a bucket path to be specified in azure_to_google_cloud_storage_transfer by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/524
  • Inf 65/mag update by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/525
  • Update requirements.txt by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/527
  • Inf 278/workflow xcom cleanup by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/528
  • Inf 279/port vm create destroy template by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/529
  • Fix xcom topic by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/530
  • Add Dataset, DatasetRelease, DatasetStorage API extension by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/526
  • Fix VM warning message on slack by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/532
  • Add ignore_unknown_values by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/531
  • on_failure_callback: handle exception value that is a string by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/535
  • Add bigquery bytes processed tripwire by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/533
  • Remove description fields from 401 error by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/536
  • BigQuery bytes processed by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/537
  • Add functionality to use multiple instances for Elastic Import workflow by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/538
  • Parameterise select table shard date limit by @tuanchien in https://github.com/The-Academic-Observatory/observatory-platform/pull/539
  • OpenAlex telescope changes by @aroelo in https://github.com/The-Academic-Observatory/observatory-platform/pull/542
  • Add check_blob_hash parameter by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/544
  • Ensure that the blob name is unique across tables with the same name … by @jdddog in https://github.com/The-Academic-Observatory/observatory-platform/pull/548

Full Changelog: https://github.com/The-Academic-Observatory/observatory-platform/compare/0.2.1...0.3.0

- Python
Published by jdddog almost 4 years ago

observatory-platform - 0.3.0-dev

- Python
Published by aroelo about 4 years ago

observatory-platform - 0.2.1

This release includes the following bugfix in the Dockerfile:

  • Install apache-airflow-providers-google==5.1.0 with --no-deps so that pip doesn't spend forever trying to resolve dependencies for the package, which we only use for remote logging and secret manager backend in the cloud deployment. The google-cloud-secret-manager Python package is added as a dependency in requirements.txt.

- Python
Published by jdddog over 4 years ago

observatory-platform - 0.2.0

This release includes the following changes / new features:

  • Upgrade to Airflow 2.1.4.
  • Stream Telescope: remove use of XComs so that it is easier to maintain.
  • Updated documentation.
  • download_files: uses DownloadInfo class and prefix_dir parameter to allow prefixing the filename paths.
  • Remove third party get_file and _hash_file functions as they are replaced by get_file_hash and download_file.
  • Command line interface: added generate workflow and project commands.
  • Added OrganisationTelescope.

And the following bugfixes:

  • Docker Compose file: rename deprecated Airflow config environment variables.
  • Docker Compose file: change AIRFLOW__SECRETS__BACKEND to use the class installed from the apache-airflow-providers-google package and remove the airflow subpackage as it is no longer required.
  • Fix typo in config.yaml.jinja2.
  • Fix on_failure_callback function.

- Python
Published by jdddog over 4 years ago

observatory-platform - 0.1.1

This release includes the following bugfixes:

  • Sdist building:
      • Added missing data_files in config.cfg.
  • Docker Compose / Airflow 2:
      • Tasks that use multiprocessing.Pool failed with the error "daemonic processes are not allowed to have children". To address this, added AIRFLOW__CORE__EXECUTE_TASKS_NEW_PYTHON_INTERPRETER to the Docker Compose file. The same error is described in a Stack Overflow post.
      • Set Docker Compose volume paths correctly for editable workflows packages when deployed to Terraform.
  • Terraform:
      • Fix Terraform config where Google Cloud Secrets had their value set to the secret key instead of the secret value.
      • Update TerraformBuilder so that it builds with the latest changes.
      • Observatory API: the postgres connection prefix is no longer accepted by SQLAlchemy 1.4, so changed it to postgresql in the Terraform file.
  • Address inconsistent use of dates:
      • Change type hints pendulum.datetime to pendulum.DateTime (the class, not function).
      • Change datetime.datetime calls to pendulum.datetime.
      • Make select_table_shard_dates return List[pendulum.Date].
      • Add a make_release_date function, which returns a pendulum.DateTime instance, which is required for some of the downstream functions that use it.
  • get_airflow_connection_url: call get_uri to get the uri.
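
The Observatory API connection prefix change listed above amounts to swapping the URI scheme from postgres to postgresql. A minimal sketch of that rewrite, assuming a hypothetical helper named normalise_postgres_uri (it is not part of observatory-platform, where the change was made directly in the Terraform file):

```python
def normalise_postgres_uri(uri: str) -> str:
    """Rewrite a legacy 'postgres://' connection URI to use 'postgresql://'.

    Hypothetical helper for illustration only; not an observatory-platform function.
    """
    legacy_prefix = "postgres://"
    if uri.startswith(legacy_prefix):
        return "postgresql://" + uri[len(legacy_prefix):]
    # Already using the current prefix, or a different scheme: leave untouched
    return uri


# Example URI is made up for the demonstration
print(normalise_postgres_uri("postgres://user:secret@localhost:5432/observatory"))
# prints: postgresql://user:secret@localhost:5432/observatory
```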

And the following new features:

  • Added black to precommit config.
  • load_dags.py:
      • When DagBag has import errors, raise an exception that has a message with all of the errors so that the Dag import errors are visible in the
  • Testing:
      • Add simple threaded httpserver for testing use.
  • Utilities:
      • Add get_observatory_http_header to create simple header dict using custom user agent.
      • Add get_filename_from_url to get a filename from a http url.
      • Add get_chunks function to split lists into constant size (unless last chunk) chunks.
      • Add get_airflow_connection_url to pull a url from an airflow connection, validate it, and add trailing "/" if necessary.
      • Add converter function for csv to jsonl files.
      • Add http get response functions for simple interfaces to standardise getting http raw text response, xml -> dict, json -> dict.
      • Add AsyncHttpFileDownloader with download_file and download_files interfaces for downloading files using http. Allows custom headers to be used in http connection.
          • download_files allows concurrent downloading through asyncio and aiohttp. Supports retry on failure with exponential backoff.
          • download_file piggybacks off download_files. No speed benefit from asyncio, but provides a simpler interface.
      • Add get_airflow_connection_password.
      • Add unzip_files function.
      • Add find_replace_file (sed cli replacement).
      • Add fn to wrap shell cmd calls. Treats non zero exit as error.
  • Snapshot telescope:
      • Add upload_downloaded as a snapshot telescope task. I noticed a lot of the upload_downloaded tasks in snapshot telescopes are identical in implementation. They all just upload the download_files list of files from the release object to the download_bucket in the cloud. Since this is a standard pattern we have adopted, it may as well just be part of the snapshot telescope implementation.
      • Add download, extract, transform tasks to template.
  • Stream telescope:
      • Add download, upload_downloaded, extract, transform tasks to template.
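
Two of the utilities described above, splitting a list into constant-size chunks and retrying on failure with exponential backoff, can be sketched in a few lines. This is an illustration under stated assumptions, not the platform's code: the names split_into_chunks and retry_with_backoff, their parameters, and the default delays are all hypothetical, and the retry helper is synchronous, whereas the release notes describe an asyncio and aiohttp based downloader.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def split_into_chunks(items: List[T], chunk_size: int) -> List[List[T]]:
    """Split a list into chunks of chunk_size; the last chunk may be shorter.

    Hypothetical stand-in for the chunking utility named in the notes.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i : i + chunk_size] for i in range(0, len(items), chunk_size)]


def retry_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the wait after each failure.

    Hypothetical helper; waits base_delay, 2 * base_delay, 4 * base_delay, ...
    and re-raises the last exception once max_attempts is exhausted.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2**attempt)
    raise RuntimeError("unreachable")  # keeps type checkers satisfied


print(split_into_chunks([1, 2, 3, 4, 5], 2))
# prints: [[1, 2], [3, 4], [5]]
```

Passing the sleep function as a parameter keeps the retry logic testable without real waiting; a caller can substitute a function that records the requested delays.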

- Python
Published by jdddog over 4 years ago

observatory-platform - 0.1.0

First observatory-platform release.

- Python
Published by jdddog over 4 years ago