data-safe-haven
Science Score: 62.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 16 of 43 committers (37.2%) from academic institutions
- ✓ Institutional organization owner: Organization alan-turing-institute has institutional domain (turing.ac.uk)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: alan-turing-institute
- License: bsd-3-clause
- Language: Python
- Default Branch: develop
- Homepage: https://data-safe-haven.readthedocs.io
- Size: 157 MB
Statistics
- Stars: 66
- Watchers: 17
- Forks: 16
- Open Issues: 80
- Releases: 35
Metadata Files
README.md

👀 What is the Turing Data Safe Haven?
The Turing Data Safe Haven is an open-source framework for creating secure environments to analyse sensitive data. It provides a set of scripts and templates that will allow you to deploy, administer and use your own secure environment. It was developed as part of the Alan Turing Institute's Data Safe Havens in the Cloud project.
🧑🧑🧒 Community & support
- Visit the Data Safe Haven website for full documentation and useful links.
- Join our Slack workspace to ask questions, discuss features, and chat about the API.
- Open a discussion on GitHub for general questions, feature suggestions, and help with our deployment scripts.
- Look through our issues on GitHub to see what we're working on and progress towards specific fixes.
- Send us an email.
👐 Contributing
We are keen to move our implementation from being a Turing project to being a community-owned platform. We have worked with the community to develop the policy, processes and design decisions for the Data Safe Haven.
We welcome contributions from anyone who is interested in the project. There are lots of ways to contribute, not just writing code!
See our Code of Conduct and our Contributor Guide to learn more about how we work together as a community and how you can contribute.
🍰 Releases
If you're new to the project, why not check out our latest release?
You can also browse all our releases. Follow the link from any release to view and clone this repository as at that release.
Read our versioning scheme for how we number and label releases, as well as details of releases that have been used in production and releases that have undergone formal security evaluation.
When making a new release, open an issue on GitHub and choose the Release checklist template, which can be used to track the completion of security checks for the release.
📬 Vulnerability disclosure
We value those who take the time and effort to report security vulnerabilities. If you believe you have found a security vulnerability, please report it as outlined in our Security and vulnerability disclosure policy.
🙇 Acknowledgements
We are grateful for the following support for this project:
- The Alan Turing Institute's core and additional EPSRC funding (EP/N510129/1, EP/W001381/1, EP/W037211/1, EP/X03870X/1).
- The UKRI Strategic Priorities Fund - AI for Science, Engineering, Health and Government programme (EP/T001569/1), particularly the "Tools, Practices and Systems" theme within that grant.
- Microsoft's generous donation of Azure credits to the Alan Turing Institute.
⚠️ Disclaimer
The Alan Turing Institute and its group companies ("we", "us", the "Turing") make no representations, warranties, or guarantees, express or implied, regarding the information contained in this repository, including but not limited to information about the use or deployment of the Data Safe Haven and/or related materials. We expressly exclude any implied warranties or representations whatsoever including without limitation regarding the use of the Data Safe Haven and related materials for any particular purpose. The Data Safe Haven and related materials are provided on an 'as is' and 'as available' basis and you use them at your own cost and risk. To the fullest extent permitted by law, the Turing excludes any liability arising from your use of or inability to use this repository, any of the information or materials contained on it, and/or the Data Safe Haven.
Deployments of the Data Safe Haven code and/or related materials depend on their specific implementation into different environments and we cannot account for all of these variations. Safe use of any Data Safe Haven code or materials also relies upon individuals' and their organisations' good and responsible data handling processes and protocols and we make no representations and give no guarantees regarding the safety, security or suitability of any instance(s) of the deployment of the Data Safe Haven. The Turing assumes no responsibility for updating any of the content in this repository; however, the underlying code and related materials may change from time to time with updates and it is the user's responsibility to keep abreast of these updates.
Owner
- Name: The Alan Turing Institute
- Login: alan-turing-institute
- Kind: organization
- Email: info@turing.ac.uk
- Website: https://turing.ac.uk
- Repositories: 477
- Profile: https://github.com/alan-turing-institute
The UK's national institute for data science and artificial intelligence.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "To acknowledge the data safe haven please use the citation and references below."
title: "Turing Data Safe Haven"
url: "https://data-safe-haven.readthedocs.io"
repository-code: "https://github.com/alan-turing-institute/data-safe-haven"
authors:
  - given-names: James
    family-names: Robinson
  - given-names: Martin T.
    family-names: O'Reilly
  - given-names: Jim
    family-names: Madge
  - given-names: Tom
    family-names: Doel
  - given-names: James
    family-names: Cunningham
  - given-names: Miguel
    family-names: Morin
  - given-names: Catherine
    family-names: Lawrence
  - given-names: Rob
    family-names: Clarke
  - given-names: Ben
    family-names: Walden
  - given-names: Warick
    family-names: Wood
  - given-names: Oliver
    family-names: Forrest
  - given-names: James
    family-names: Hetherington
  - given-names: Kirstie
    family-names: Whitaker
  - given-names: Tim
    family-names: Hobson
  - given-names: George
    family-names: Holmes
  - given-names: Federico
    family-names: Nanni
  - given-names: Ed
    family-names: Chalstrey
  - given-names: Tomas
    family-names: Lazauskas
  - given-names: Rachel
    family-names: Winstanley
  - given-names: Daniel
    family-names: Allen
  - given-names: Alvaro
    family-names: Cabrejas Egea
  - given-names: Ian
    family-names: Carter
  - given-names: Hari
    family-names: Sood
  - given-names: Brett
    family-names: Todd
  - given-names: Diego
    family-names: Arenas
  - given-names: Kevin
    family-names: Xu
  - given-names: Sebastian
    family-names: Vollmer
  - given-names: Jules
    family-names: Manser
  - given-names: Christopher
    family-names: Edsall
  - given-names: Jack
    family-names: Roberts
  - given-names: Guillaume
    family-names: Noell
  - given-names: David
    family-names: Beavan
license: BSD-3-Clause
keywords:
  - "trusted research environment"
  - "infrastructure as code"
  - "cloud computing"
references:
  - title: "Design choices for productive, secure, data-intensive research at scale in the cloud"
    doi: 10.48550/arXiv.1908.08737
    type: article
    authors:
      - given-names: Diego
        family-names: Arenas
      - given-names: Jon
        family-names: Atkins
      - given-names: Claire
        family-names: Austin
      - given-names: David
        family-names: Beavan
      - given-names: Alvaro Cabrejas
        family-names: Egea
      - given-names: Steven
        family-names: Carlysle-Davies
      - given-names: Ian
        family-names: Carter
      - given-names: Rob
        family-names: Clarke
      - given-names: James
        family-names: Cunningham
      - given-names: Tom
        family-names: Doel
      - given-names: Oliver
        family-names: Forrest
      - given-names: Evelina
        family-names: Gabasova
      - given-names: James
        family-names: Geddes
      - given-names: James
        family-names: Hetherington
      - given-names: Radka
        family-names: Jersakova
      - given-names: Franz
        family-names: Kiraly
      - given-names: Catherine
        family-names: Lawrence
      - given-names: Jules
        family-names: Manser
      - given-names: Martin T.
        family-names: O'Reilly
      - given-names: James
        family-names: Robinson
      - given-names: Helen
        family-names: Sherwood-Taylor
      - given-names: Serena
        family-names: Tierney
      - given-names: Catalina A.
        family-names: Vallejos
      - given-names: Sebastian
        family-names: Vollmer
      - given-names: Kirstie
        family-names: Whitaker
    abstract: >
      We present a policy and process framework for secure environments for
      productive data science research projects at scale, by combining
      prevailing data security threat and risk profiles into five sensitivity
      tiers, and, at each tier, specifying recommended policies for data
      classification, data ingress, software ingress, data egress, user access,
      user device control, and analysis environments. By presenting design
      patterns for security choices for each tier, and using software defined
      infrastructure so that a different, independent, secure research
      environment can be instantiated for each project appropriate to its
      classification, we hope to maximise researcher productivity and minimise
      risk, allowing research organisations to operate with confidence.
    date-published: "2019-08-23"
    url: "https://arxiv.org/abs/1908.08737"
  - title: "Data safe havens in the cloud"
    type: website
    url: "https://www.turing.ac.uk/research/research-projects/data-safe-havens-cloud"
    authors:
      - name: "The Alan Turing Institute"
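The citation file asks users to acknowledge the project using the listed references. As a rough illustration only, the sketch below turns the article reference into a short citation string. The `format_reference` helper and the output style are assumptions made for this example and are not part of the project or of the CFF tooling. The reference is hard-coded as a plain dictionary rather than parsed from the file, and the author list is truncated to the first four of the 25 authors given above.

```python
# Hypothetical helper: not part of data-safe-haven or any CFF library.
# The values are copied from the article entry in the citation file;
# the author list is shortened to the first four of the 25 listed authors.
reference = {
    "title": (
        "Design choices for productive, secure, data-intensive "
        "research at scale in the cloud"
    ),
    "doi": "10.48550/arXiv.1908.08737",
    "date-published": "2019-08-23",
    "authors": [
        {"given-names": "Diego", "family-names": "Arenas"},
        {"given-names": "Jon", "family-names": "Atkins"},
        {"given-names": "Claire", "family-names": "Austin"},
        {"given-names": "David", "family-names": "Beavan"},
    ],
}


def format_reference(ref, max_authors=3):
    """Build a short citation string from a CFF-style reference mapping."""
    authors = ref["authors"]
    # "Family, G." for each of the first max_authors authors.
    names = [
        f"{a['family-names']}, {a['given-names'][0]}."
        for a in authors[:max_authors]
    ]
    author_text = ", ".join(names)
    if len(authors) > max_authors:
        author_text += " et al."
    year = ref["date-published"][:4]  # ISO date, so the year comes first
    return f"{author_text} ({year}). {ref['title']}. https://doi.org/{ref['doi']}"


print(format_reference(reference))
```

The citation style here is arbitrary; follow whatever format your venue requires.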
GitHub Events
Total
- Create event: 107
- Release event: 7
- Issues event: 121
- Watch event: 7
- Delete event: 88
- Issue comment event: 476
- Push event: 346
- Pull request review comment event: 278
- Pull request review event: 362
- Pull request event: 305
- Fork event: 1
Last Year
- Create event: 107
- Release event: 7
- Issues event: 121
- Watch event: 7
- Delete event: 88
- Issue comment event: 476
- Push event: 346
- Pull request review comment event: 278
- Pull request review event: 362
- Pull request event: 305
- Fork event: 1
Committers
Last synced: 10 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| James Robinson | j****n@g****m | 3,917 |
| Jim Madge | j****e@t****k | 1,542 |
| Martin O'Reilly | d****r@m****t | 1,173 |
| Matt Craddock | 5****m | 949 |
| Ed Chalstrey | e****y@g****m | 308 |
| dependabot[bot] | 4****] | 122 |
| Tom Doel | t****l@c****k | 83 |
| Oscar Giles | o****s@t****k | 77 |
| James Cunningham | j****m@g****m | 67 |
| Daniel | d****n@s****o | 53 |
| Miguel Morin | 3****n | 49 |
| cathiest | 3****t | 47 |
| Rob Clarke | r****e@c****m | 36 |
| Carlos Gavidia-Calderon | c****n@t****k | 36 |
| bw-faststream | 5****m | 33 |
| wwood | w****d@t****k | 31 |
| warwick26_wood | w****d@t****k | 29 |
| oforrest | 4****t | 29 |
| James Hetherington | j****h@g****m | 28 |
| Kirstie Whitaker | k****1@c****k | 22 |
| Tim Hobson | t****n@t****k | 20 |
| George Holmes | g****s@e****m | 18 |
| fedenanni | n****o@g****m | 13 |
| Tomas Lazauskas | 1****z | 11 |
| rwinstanley1 | 5****1 | 7 |
| Alvaro Cabrejas Egea | a****a@w****k | 6 |
| harisood | 6****d | 6 |
| daniel | d****n@t****k | 6 |
| icarter2 | i****r@t****k | 5 |
| JulesMarz | j****g@g****m | 4 |
| and 13 more... | | |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 298
- Total pull requests: 506
- Average time to close issues: 5 months
- Average time to close pull requests: 7 days
- Total issue authors: 16
- Total pull request authors: 8
- Average comments per issue: 3.36
- Average comments per pull request: 1.65
- Merged pull requests: 418
- Bot issues: 0
- Bot pull requests: 186
Past Year
- Issues: 92
- Pull requests: 239
- Average time to close issues: 15 days
- Average time to close pull requests: 5 days
- Issue authors: 12
- Pull request authors: 7
- Average comments per issue: 2.97
- Average comments per pull request: 1.78
- Merged pull requests: 186
- Bot issues: 0
- Bot pull requests: 109
Top Authors
Issue Authors
- jemrobinson (146)
- JimMadge (76)
- craddm (64)
- edwardchalstrey1 (22)
- helendduncan (21)
- cptanalatriste (10)
- dsj976 (10)
- mattwestby (6)
- DDelbarre (3)
- martintoreilly (3)
- edchapman88 (3)
- dependabot[bot] (2)
- callummole (1)
- J0shev (1)
- FruityND (1)
Pull Request Authors
- jemrobinson (237)
- dependabot[bot] (173)
- JimMadge (162)
- craddm (120)
- github-actions[bot] (104)
- cptanalatriste (21)
- edwardchalstrey1 (6)
- llewelld (3)
- dsj976 (1)
Packages
- Total packages: 2
- Total downloads: pypi 42 last-month
- Total dependent packages: 0 (may contain duplicates)
- Total dependent repositories: 0 (may contain duplicates)
- Total versions: 45
- Total maintainers: 3
proxy.golang.org: github.com/alan-turing-institute/data-safe-haven
- Documentation: https://pkg.go.dev/github.com/alan-turing-institute/data-safe-haven#section-documentation
- License: bsd-3-clause
- Latest release: v5.5.0+incompatible (published 8 months ago)
pypi.org: data-safe-haven
An open-source framework for creating secure environments to analyse sensitive data.
- Documentation: https://data-safe-haven.readthedocs.io
- License: BSD-3-Clause
- Latest release: 5.5.1 (published 6 months ago)
- Maintainers: 3
Dependencies
- lxml *
- natsort *
- requests *
- GitPython ==3.1.28
- Jinja2 ==3.1.2
- Pygments ==2.13.0
- Sphinx ==5.2.3
- emoji ==2.1.0
- myst-parser ==0.18.1
- pydata-sphinx-theme ==0.11.0
- rinoh-typeface-symbola ==0.1.1
- rinohtype ==0.5.4
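The dependency list above mixes unpinned entries (shown with `*`) and exact pins (shown with `==`). As a minimal sketch, assuming that listing format, the code below splits each entry into a name and an optional pinned version and rebuilds a pip-style requirement string. The helper names `parse_requirement` and `to_pip_line` are invented for this example. The `*` marker is this page's convention for "any version" and is not pip syntax.

```python
def parse_requirement(line):
    """Split a listing entry such as '- Jinja2 ==3.1.2' or '- lxml *'.

    Returns (name, version), where version is None for unpinned entries.
    """
    # Drop the leading list marker, then split the name from the specifier.
    name, _, spec = line.strip().lstrip("- ").partition(" ")
    spec = spec.strip()
    if spec.startswith("=="):
        return name, spec[2:].strip()
    return name, None


def to_pip_line(name, version):
    """Render a pip-style requirement: 'Name==1.2.3', or just 'Name'."""
    return f"{name}=={version}" if version else name


# A few entries copied from the list above.
entries = ["- lxml *", "- Jinja2 ==3.1.2", "- Sphinx ==5.2.3"]
for entry in entries:
    print(to_pip_line(*parse_requirement(entry)))
```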