msb1015_assignment2
Repository to keep track of progress on MSc Systems Biology - MSB1015 2019 Assignment 2
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
MSB1015 Assignment 2
Welcome to the repository of MSB1015 Assignment 2! Here I keep track of my progress on MSB1015 2019 Assignment 2 at Maastricht University. The result of this assignment is published as an HTML notebook (index.html) through GitHub Pages.
Project Description
Chemical properties, such as the boiling point, can be estimated from the structure of a chemical compound. As early as 1947, Harry Wiener proposed a correlation model that links structural features of paraffins to their boiling points (ref1). The idea of using mathematical models to predict chemical properties from compound structures has been expanded since then.
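To make "structural features" concrete: the quantity from Wiener's paper that is now usually called the Wiener index is the sum of the bond distances between all pairs of carbon atoms in the hydrogen-suppressed skeleton. The block below is a minimal Python sketch of that definition, not code from this repository (which is written in R); the function name `wiener_index` and the edge-list representation are assumptions made for illustration.

```python
from collections import deque


def wiener_index(bonds, n_atoms):
    """Sum of shortest-path bond counts over all unordered pairs of carbon atoms.

    bonds   -- list of (i, j) tuples, one per C-C bond, atoms numbered from 0
    n_atoms -- number of carbon atoms in the skeleton
    """
    neighbours = {atom: [] for atom in range(n_atoms)}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    total = 0
    for start in range(n_atoms):
        # Breadth-first search gives the bond distance from `start` to every atom.
        distance = {start: 0}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for nxt in neighbours[current]:
                if nxt not in distance:
                    distance[nxt] = distance[current] + 1
                    queue.append(nxt)
        total += sum(distance.values())

    # Each pair was counted twice, once from each end.
    return total // 2


n_butane = [(0, 1), (1, 2), (2, 3)]    # SMILES: CCCC
isobutane = [(0, 1), (1, 2), (1, 3)]   # SMILES: CC(C)C

print(wiener_index(n_butane, 4))   # 10
print(wiener_index(isobutane, 4))  # 9
```

The two isomers have the same formula but different indices, which is the kind of structural signal a correlation model can use.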
In this project I use a SPARQL query to obtain the SMILES strings and boiling points of simple alkanes from Wikidata (ref2). I use the SMILES to compute descriptors with the Chemistry Development Kit (CDK) (ref3-6). These descriptors contain information on the structural properties of the alkanes (see section 2 of the notebook for more details). Finally, I train a Partial Least Squares (PLS) model to predict the boiling points from these structural descriptors and plot the results.
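The data retrieval step can be sketched as follows. This is a hedged Python illustration, not the query used in the notebook (which runs from R). It assumes the Wikidata identifiers `Q41581` (alkane), `P233` (canonical SMILES) and `P2102` (boiling point), and the public endpoint `https://query.wikidata.org/sparql`; the helper names `build_request`, `parse_bindings` and `fetch_alkanes` and the User-Agent string are made up for this example. The unit is requested alongside the value because boiling points may be recorded in different units and may need converting before modelling.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"

# Assumed identifiers: Q41581 = alkane, P233 = canonical SMILES, P2102 = boiling point.
QUERY = """
SELECT ?compound ?smiles ?bp ?unit WHERE {
  ?compound wdt:P31/wdt:P279* wd:Q41581 ;
            wdt:P233 ?smiles ;
            p:P2102/psv:P2102 ?value .
  ?value wikibase:quantityAmount ?bp ;
         wikibase:quantityUnit ?unit .
}
"""


def build_request(query=QUERY, endpoint=ENDPOINT):
    """Build a GET request that asks the endpoint for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": "msb1015-readme-example/0.1",  # hypothetical identifier
    }
    return urllib.request.Request(endpoint + "?" + params, headers=headers)


def parse_bindings(payload):
    """Flatten a SPARQL JSON result into a list of plain dictionaries."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append({
            "compound": binding["compound"]["value"],
            "smiles": binding["smiles"]["value"],
            "bp": float(binding["bp"]["value"]),
            "unit": binding["unit"]["value"],
        })
    return rows


def fetch_alkanes():
    """Run the query over the network and return the parsed rows."""
    with urllib.request.urlopen(build_request(), timeout=60) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_alkanes()` needs network access; `parse_bindings` can be tried offline on any dictionary that follows the SPARQL JSON results layout.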
Files
MSB1015Assignment2SuzannetenHage.rmd <- Code that contains models to predict the boiling point from structural properties of alkanes. The code has been developed to explain the creation of the model and the results step-by-step.
index.html <- HTML notebook rendered from MSB1015Assignment2SuzannetenHage.rmd, used to publish the project as a website with GitHub Pages.
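For readers unfamiliar with the modelling step in the notebook, the block below sketches the idea behind PLS regression in its simplest form: a single latent component and a single response (PLS1). It is a dependency-free Python illustration under stated assumptions, not the notebook's R code, which presumably relies on the pls package listed under Installation. The function names `fit_pls1`, `predict` and `rmse` are hypothetical, and the toy numbers are made up; they are not real descriptors or boiling points.

```python
def fit_pls1(X, y):
    """Fit a one-component PLS1 regression.

    X -- list of rows, one row of descriptor values per compound
    y -- list of response values (e.g. boiling points), one per compound
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [value - y_mean for value in y]

    # Weight vector: the direction in descriptor space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(value * value for value in w) ** 0.5
    if norm == 0:
        raise ValueError("descriptors have no covariance with the response")
    w = [value / norm for value in w]

    # Scores: projection of each centred compound onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]

    # Regress the centred response on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(value * value for value in t)
    coef = [q * wj for wj in w]
    return x_mean, y_mean, coef


def predict(model, X):
    """Predict responses for new rows of descriptors."""
    x_mean, y_mean, coef = model
    return [
        y_mean + sum((row[j] - x_mean[j]) * coef[j] for j in range(len(coef)))
        for row in X
    ]


def rmse(actual, predicted):
    """Root-mean-square error between observed and predicted values."""
    squared = [(a - b) ** 2 for a, b in zip(actual, predicted)]
    return (sum(squared) / len(actual)) ** 0.5


# Toy data (made up): one descriptor, response exactly 3 * x + 1.
toy_X = [[1.0], [2.0], [3.0], [4.0]]
toy_y = [4.0, 7.0, 10.0, 13.0]

model = fit_pls1(toy_X, toy_y)
print(predict(model, [[5.0]]))
print(rmse(toy_y, predict(model, toy_X)))
```

With several correlated descriptors, a full PLS fit extracts further components after deflating the data; the number of components is then a tuning choice, typically made by cross-validation.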
Installation
Java
The rJava package requires Java to be installed. The code has been developed using Java version 1.8.0_191. A tutorial on how to install this Java version on Windows can be found here.
Required R packages:
The project requires several R packages. The code automatically checks for missing packages and installs them. The required packages are:
* WikidataQuery
* rJava
* rcdk
* stringi
* caTools
* pls
* Metrics
Authors
Suzanne ten Hage
Additional Information
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
References
1. Wiener H. Structural Determination of Paraffin Boiling Points. Journal of the American Chemical Society. 1947 Jan;69(1):17–20.
2. https://www.wikidata.org/wiki/Wikidata:Main_Page (accessed 12-10-2019)
3. Willighagen et al. The Chemistry Development Kit (CDK) v2.0: atom typing, depiction, molecular formulas, and substructure searching. J. Cheminform. 2017; 9(3), doi:10.1186/s13321-017-0220-4
4. May and Steinbeck. Efficient ring perception for the Chemistry Development Kit. J. Cheminform. 2014, doi:10.1186/1758-2946-6-3
5. Steinbeck et al. Recent Developments of the Chemistry Development Kit (CDK) - An Open-Source Java Library for Chemo- and Bioinformatics. Curr. Pharm. Des. 2006; 12(17):2111-2120, doi:10.2174/138161206777585274
6. Steinbeck et al. The Chemistry Development Kit (CDK): An Open-Source Java Library for Chemo- and Bioinformatics. J. Chem. Inf. Comput. Sci. 2003 Mar-Apr; 43(2):493-500, doi:10.1021/ci025584y
Owner
- Login: setenhage
- Kind: user
- Location: Utrecht
- Website: https://www.linkedin.com/in/suzannetenhage/
- Repositories: 1
- Profile: https://github.com/setenhage
Citation (CITATION.cff)
cff-version: 1
message: If you use this software, please cite it as below.
authors:
  - family-names: ten Hage
    given-names: Suzanne Eva
title: MSB1015_Assignment_2
version: 1
date-released: 2019-09-25