pharmit
Open-source online virtual screening tools for large databases
Repository
Basic Info
- Host: GitHub
- Owner: dkoes
- License: apache-2.0
- Language: HTML
- Default Branch: master
- Homepage: http://pharmit.csb.pitt.edu
- Size: 40.8 MB
Statistics
- Stars: 19
- Watchers: 2
- Forks: 2
- Open Issues: 2
- Releases: 1
Metadata Files
README.md
Pharmit
Copyright (c) David Ryan Koes, University of Pittsburgh and contributors. All rights reserved.
Pharmit is licensed under both the BSD 3-clause license and the GNU General Public License version 2. Any use of the code that retains its reliance on the GPL-licensed OpenBabel library is subject to the terms of the GPL2.
If you use the Pharmit code independently of OpenBabel (or any other GPL2-licensed software), you may choose between the BSD and GPL licenses.
See the LICENSE file provided with the distribution for more information.
Pharmit is available for searching and creating libraries at http://pharmit.csb.pitt.edu. It generally isn't necessary to build it from source.
BUILDING
CMake is required to build Pharmit. Starting from the src directory:
```
mkdir build
cd build
# most likely you will have to specify the location of smina and lemon
cmake .. -DSMINA_DIR=$HOME/git/smina -DLEMON_DIR=/usr/lib/cmake/
make -j12
```
UBUNTU 22.04
```bash
sudo apt install git autoconf automake libtool ghostscript liblemon-dev libeigen3-dev libann-dev bmagic libcgicc-dev libgoogle-perftools-dev libglpk-dev coinor-* libjsoncpp-dev cmake libboost-dev swig libxml2-dev libcairo2-dev libboost-all-dev libcurl4-openssl-dev

git clone https://github.com/FastCGI-Archives/fcgi2.git
cd fcgi2
autoreconf -i
./configure
sudo make
sudo make install
cd ..

git clone https://github.com/openbabel/openbabel.git
cd openbabel
mkdir build
cd build
sudo cmake .. -DPYTHON_BINDINGS=1 -DRUN_SWIG=1 -DWITH_MAEPARSER=0 -DWITH_COORDGEN=0
sudo make -j12
sudo make install
cd ../..

git clone http://git.code.sf.net/p/smina/code smina
cd smina
mkdir build
cd build
sudo cmake ..
sudo make -j8
sudo cp libsmina.a /usr/local/lib/libsmina.a
cd ../..

git clone http://git.code.sf.net/p/pharmit/code pharmit
cd pharmit/src
mkdir build
cd build
# change $path/smina to the smina directory cloned above
sudo cmake .. -DSMINALIB=/usr/local/lib/libsmina.a -DSMINA_DIR=$path/smina/ -DLEMON_DIR=/usr/lib/x86_64-linux-gnu/cmake/lemon/
sudo make -j12
```
UBUNTU 18.04
```
apt install git autoconf automake libtool ghostscript liblemon-dev libeigen3-dev libann-dev bmagic libcgicc-dev libgoogle-perftools-dev libglpk-dev coinor-* libjsoncpp-dev cmake libboost-dev swig python-dev libxml2-dev libcairo2-dev libboost-all-dev libcurl4-openssl-dev

# NOTE: Currently the liblemon-dev package has an incorrect cmake config file that does
# not set LEMON_LIBRARY to the correct location. Either edit /usr/lib/cmake/LEMONConfig.cmake
# so the correct location (/usr/lib/x86_64-linux-gnu/) is set or set LEMON_LIBRARY directly
# on the cmake command line

# fastcgi
git clone https://github.com/FastCGI-Archives/fcgi2.git
cd fcgi2
autoreconf -i
./configure
make
make install
cd ..

# openbabel
# you can probably use the libopenbabel-dev package, but it doesn't install cmake files by default, so you'd have to provide a custom FindOpenBabel module
# In general, I recommend using openbabel 3.0 or later as it fixed a number of bugs.
git clone https://github.com/openbabel/openbabel.git
cd openbabel
mkdir build
cd build
# presumably at some point maeparser and coordgen won't be broken...
cmake .. -DPYTHON_BINDINGS=1 -DRUN_SWIG=1 -DWITH_MAEPARSER=0 -DWITH_COORDGEN=0
make -j12
make install
cd ../..

# smina
git clone http://git.code.sf.net/p/smina/code smina
cd smina
mkdir build
cd build
cmake ..
make -j8

cd smina/build/linux/release/
make -j8
cp libsmina.a /usr/local/lib/
cd ../../../..

# pharmit - ensure SMINA_DIR is the install location above
git clone http://git.code.sf.net/p/pharmit/code pharmit
cd pharmit/src
mkdir build
cd build
# /usr/lib/cmake/LEMONConfig.cmake is edited to point LEMON_LIBRARY to /usr/lib/x86_64-linux-gnu/
cmake .. -DSMINALIB=/usr/local/lib/libsmina.a -DSMINA_DIR=$HOME/git/smina -DLEMON_DIR=/usr/lib/cmake/
make -j12
```
CENTOS (these instructions may be out of date)
```
yum groupinstall 'Development Tools'
yum install git boost-devel autoconf automake libtool make cmake gperftools-libs gperftools-devel ghostscript libcurl-devel

yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
yum install glpk-devel eigen3-devel fcgi coin-or-*

# fastcgi
git clone https://github.com/FastCGI-Archives/fcgi2.git
cd fcgi2
autoreconf -i
./configure
make
make install
cd ..

# jsoncpp
git clone https://github.com/open-source-parsers/jsoncpp.git
cd jsoncpp
# using the older version mostly to avoid having to upgrade cmake
git checkout 1.7.7
mkdir build
cd build
cmake -DBUILD_SHARED_LIBS=1 ..
make
make install
cd ../..

# cgicc
wget http://ftp.gnu.org/gnu/cgicc/cgicc-3.2.19.tar.gz
tar xvfz cgicc-3.2.19.tar.gz
cd cgicc-3.2.19
./configure
make
make install
cd ..

# openbabel
# I'm getting the development version since it fixes some serious bugs with aromaticity detection,
# but the version in the epel repository is probably fine
git clone https://github.com/openbabel/openbabel.git
cd openbabel
mkdir build
cd build
cmake ..
make
make install
cd ../..

# eigen
wget http://bitbucket.org/eigen/eigen/get/3.3.5.tar.gz
tar xvfz 3.3.5.tar.gz
cd eigen-eigen-b3f3d4950030/
mkdir build
cd build
cmake ..
make install
cd ../..

# lemon
wget http://lemon.cs.elte.hu/pub/sources/lemon-1.3.1.tar.gz
tar xvfz lemon-1.3.1.tar.gz
cd lemon-1.3.1
mkdir build
# remove incompatible cmake policy
sed -i '/POLICY/d' CMakeLists.txt
cd build
cmake ..
make
make install
cd ../..

# libann
wget https://www.cs.umd.edu/~mount/ANN/Files/1.1.2/ann_1.1.2.tar.gz
tar xvfz ann_1.1.2.tar.gz
cd ann_1.1.2
make linux-g++
cp -r include/ANN /usr/include/
cp lib/libANN.a /usr/lib/libann.a
cd ..

# bitmagic
git clone https://github.com/tlk00/BitMagic.git
cd BitMagic
# ubuntu version is actually 3.7
git checkout v3.8.0
# include as part of pharmit - can also install if you wish, but that's not what cmake is expecting
cp -r src/ ~/pharmit/src/bm

# smina
git clone http://git.code.sf.net/p/smina/code smina
cd smina/build/linux/release/
make
cp libsmina.a /usr/local/lib/

# pharmit
git clone http://git.code.sf.net/p/pharmit/code pharmit
cd pharmit/src
mkdir build
cd build
cmake .. -DCMAKE_CXX_FLAGS=-std=c++0x -DSMINA_DIR=$HOME/smina
# lots of warnings about missing directory which can be ignored for now
```
USING
The --help option will provide the following:
```
USAGE: pharmitserver [options] --cmd command [pharma, dbcreate, dbcreateserverdir, dbsearch, server]

OPTIONS:
-cmd=

To IDENTIFY the pharmacophore features of a molecule ex.pdb:
pharmitserver pharma -in ex.pdb
pharmitserver pharma -in ex.pdb -out out.sdf
pharmitserver pharma -in ex.pdb -out out.json
For interactive, graphical editting of pharmacophore features, try
http://pharmit.csb.pitt.edu
The saved session file is a json file that can be used as input for Pharmer.

To CREATE a database DB from library.sdf:
pharmitserver dbcreate -dbdir DB -in library.sdf
If you have multiple disk drives, you can improve performance by striping the
database across the drives:
pharmitserver dbcreate -dbdir /drive1/DB -dbdir /drive2/DB -in library.sdf
If you pre-split the input file things will go faster:
pharmitserver dbcreate -dbdir /drive1/DB -dbdir /drive2/DB -file-partition -in library1.sdf -in library2.sdf

To SEARCH a database:
pharmitserver dbsearch -dbdir DB -in query.json
pharmitserver dbsearch -dbdir DB -in query.sdf
pharmitserver dbsearch -dbdir DB -in query.ph4

When setting up a SERVER you create separate directories for each database to search:
pharmitserver dbcreateserverdir -ligs molport.ligs -prefixes ../dbprefixes -dbinfo dbinfo.json
The ligs files provides the files to add with unique ids and names (see extractsmisubset.py). e.g.:
/data22/conformers/541/5416265.sdf.gz 5416265 MolPort-020-216-564 MCULE-1493214818 PubChem-56905170 ZINC000072151660
/data17/conformers/53/534584.sdf.gz 534584 MolPort-000-887-730 MolPort-035-708-781 MCULE-2607511133 MCULE-2857980910 PubChem-19618431 PubChem-90484256 ZINC000002535004 ZINC02535004
/data08/conformers/18890/188901458.sdf.gz 188901458 MolPort-007-641-322 MCULE-8587443370
The prefixes file specifies the directories (on different drives) to partition the database across. e.g.:
/data00/databases
/data01/databases
/data02/databases
The dbinfo.json file describes the library. e.g.:
{
"name" : "MolPort",
"html" : "MolPort<span><a target=\"_blank\" href=\"https://www.molport.com\" class=\"ui-icon-info ui-icon\"></a></span>",
"subdir" : "molport",
"prefix" : "MolPort"
}

To run a SERVER:
pharmitserver server -port 16000 -prefixes /home/dkoes/vendors/dbprefixes -logdir /home/dkoes/log -min-server localhost -min-port 18000

To run a MINIMIZATION server:
~/git/smina/build/linux/release/server -port 18000 -logfile /home/dkoes/log/minlog
```
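The input files for dbcreateserverdir described in the help text can be sanity-checked with a few lines of Python before a long database build. The following is a minimal sketch and not part of Pharmit: the helper names `parse_ligs_line` and `make_dbinfo` are hypothetical, and the ligs field order (file path, unique id, then any number of names) is inferred only from the example lines in the help text.

```python
import json


def parse_ligs_line(line):
    """Split one ligs line into (conformer path, unique id, list of names).

    Assumption: fields are whitespace-separated, in the order shown in the
    help text example (path, id, names...).
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a path and an id: %r" % line)
    return fields[0], fields[1], fields[2:]


def make_dbinfo(name, html, subdir, prefix):
    """Return dbinfo.json text containing the four keys from the example."""
    info = {"name": name, "html": html, "subdir": subdir, "prefix": prefix}
    return json.dumps(info, indent=2)


# Example line copied from the help text.
line = "/data08/conformers/18890/188901458.sdf.gz 188901458 MolPort-007-641-322 MCULE-8587443370"
path, unique_id, names = parse_ligs_line(line)
print(path, unique_id, names)

# The html value is simplified here; the help text example uses a longer snippet.
dbinfo_text = make_dbinfo("MolPort", "MolPort", "molport", "MolPort")
print(dbinfo_text)
```

Writing `dbinfo_text` to a file named dbinfo.json gives a file with the same keys as the example; whether pharmitserver accepts other keys is not covered here.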
To HOST a server, add a fastcgi rule connecting the webserver hosting /web to the storage server running the backend. e.g. (apache):
```
ln -s ~/pharmit/web /var/www/html
mkdir /var/www/html/fcgi-bin

apt install apache2 libapache2-mod-php php-mysql apache2-dev

git clone https://github.com/FastCGI-Archives/mod_fastcgi.git
cd mod_fastcgi
cp Makefile.AP2 Makefile
# change /usr/local/apache2 to /usr/share/apache2 in Makefile
make
make install

# create /etc/apache2/mods-available/fastcgi.load
LoadModule fastcgi_module /usr/lib/apache2/modules/mod_fastcgi.so

# fastcgi.conf
Alias /fcgi-bin/ /var/www/html/fcgi-bin/
FastCgiExternalServer /var/www/html/fcgi-bin/pharmitserv.fcgi -host 127.0.0.1:16000 -idle-timeout 300
FastCgiExternalServer /var/www/html/fcgi-bin/createlib.fcgi -host 127.0.0.1:11111 -idle-timeout 300
FastCgiConfig -autoUpdate -maxClassProcesses 1

# enable
a2enmod fastcgi
mkdir /var/www/html/fcgi-bin #this directory has to exist
```
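Because the FastCgiExternalServer rules only forward requests to a host and port, a quick way to debug a non-responsive site is to check that something is actually listening on the backend port. The sketch below is an illustration rather than part of Pharmit; the function name `backend_reachable` is hypothetical, and the check only confirms that a TCP connection can be opened, not that the listener speaks FastCGI.

```python
import socket


def backend_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host not resolvable.
        return False


# 16000 is the pharmitserver port used in the Apache example above.
print(backend_reachable("127.0.0.1", 16000))
```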
LIBRARY CREATION
The library creation scripts require rdkit to be installed.
```
apt install python3-numpy mysql-server mysql-client python3-mysqldb python3-pip
pip3 install flup psutil
mysql < ~/pharmit/scripts/pharmit.sql
mysql < ~/pharmit/scripts/conformers.sql

# also, run the mysql commands:
CREATE USER 'pharmit'@'localhost';
GRANT ALL PRIVILEGES ON pharmit.* TO 'pharmit'@'localhost';
GRANT ALL PRIVILEGES ON conformers.* TO 'pharmit'@'localhost';

INSERT INTO pharmit.users (email,name,maxprivatedbs,maxprivateconfs,maxdbs,maxconfs) VALUES ("guest","Anonymous",0,0,0,10000);

git clone https://github.com/rdkit/rdkit.git
cd rdkit
mkdir build
cd build
cmake .. -DPYTHON_EXECUTABLE=/usr/bin/python3
make -j12
```
BUGS (Not Really)
When working with large datasets spread across multiple hard drives, you will likely hit a number of system limits that cause crashes (signal 6, abort, or signal 11, segmentation fault). The following changes must be in place before you build the database; otherwise the build will silently create incomplete databases that crash when you search them.
Make sure nofile is set high enough (1000000) in /etc/security/limits.conf
Make sure vm.max_map_count is set high enough by adding the following to /etc/sysctl.conf:
vm.max_map_count=1000000
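Both limits can be checked from the account that will build the database before starting a long run. The following is a rough sketch under stated assumptions, not a Pharmit tool: the function names are hypothetical, the threshold of 1000000 is simply the value suggested above, and `/proc/sys/vm/max_map_count` is the standard Linux location for the current setting.

```python
import resource

REQUIRED = 1000000  # value suggested in the text above


def find_limit_problems(nofile_soft, max_map_count, required=REQUIRED):
    """Return human-readable warnings for limits below the required value.

    A max_map_count of None means the value could not be read and is skipped.
    """
    problems = []
    if nofile_soft != resource.RLIM_INFINITY and nofile_soft < required:
        problems.append("nofile soft limit is %d, expected at least %d"
                        % (nofile_soft, required))
    if max_map_count is not None and max_map_count < required:
        problems.append("vm.max_map_count is %d, expected at least %d"
                        % (max_map_count, required))
    return problems


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Read the current vm.max_map_count from procfs (Linux only)."""
    with open(path) as handle:
        return int(handle.read().strip())


soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
try:
    map_count = read_max_map_count()
except (OSError, ValueError):
    map_count = None  # /proc not available, for example on a non-Linux host

for problem in find_limit_problems(soft_nofile, map_count):
    print(problem)
```

Note that the soft limit reported here is the one inherited by the current process, so run the check from the same shell or service account that will run pharmitserver.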
Make sure you build against libcurl4-openssl-dev, not gnutls, or you will get fortify_fail errors.
Continuing with curl bugs, DNS timeouts can cause a longjmp error (see http://stackoverflow.com/questions/9191668/error-longjmp-causes-uninitialized-stack-frame). To resolve these, rebuild curl with --enable-ares.
You can ignore all the curl issues entirely if you build with -DSKIP_REGISTERZINC which is probably what you want unless you are using the full ZINC database.
Some versions of eigen3 will trigger an assert, OBJECT_ALLOCATED_ON_STACK_IS_TOO_BIG, at compile time. This can be ignored in the latest versions of eigen3 with -DEIGEN_DISABLE_STACK_SIZE_ASSERT. If your version hasn't had this patch applied (which is currently true with Ubuntu 14.04), then you will have to modify Eigen/src/Core/DenseStorage.h appropriately.
Currently, on Ubuntu 14.04, g++ 4.8 experiences an internal compiler error. Use g++-4.6 to get around this.
BUGS (Really)
Send any bug reports to dkoes@pitt.edu with a complete test case.
CITING
Please use the citation from http://pubs.acs.org/doi/abs/10.1021/ci200097m
Default Pharmacophore Definitions
```` Aromatic 18 0 1.1 1 0.1 a1aaaaa1 a1aaaa1
HydrogenDonor 1 1 0.5 1 0.1 [#7!H0&!$(N-SX4(=O)CX4(F)F)] [#8!H0&!$([OH][C,S,P]=O)] [#16!H0]
HydrogenAcceptor 89 2 0.5 1 0.1 [#7&!$([nX3])&!$([NX3]-=[!#6])&!$([NX3]-[a])&!$([NX4])&!$(N=C([C,N])N)] [$([O])&!$(OX2C=O)&!$((~a)~a)]
PositiveIon 7 3 0.75 0 0.1 [+,+2,+3,+4] $(CC)N [$(C(N)(N)=N)] [$(n1cc[nH]c1)]
NegativeIon 8 4 0.75 0 0.1 [-,-2,-3,-4] C(=O)[O-,OH,OX1] [$(S,P[O-,OH,OX1])] c1[nH1]nnn1 c1nn[nH1]n1 C(=O)N[OH1,O-,OX1] C(=O)N[OH1,O-] CO(=N[OH1,O-]) [$(N-SX4(=O)CX4(F)F)]
Hydrophobic 6 5 1 0 2 a1aaaaa1 a1aaaa1 [$([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])&!$(*[CH3X4,CH2X3,CH1X2,F,Cl,Br,I])] [$(([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])[CH3X4,CH2X3,CH1X2,F,Cl,Br,I])&!$(*([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])[CH3X4,CH2X3,CH1X2,F,Cl,Br,I])]([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])[CH3X4,CH2X3,CH1X2,F,Cl,Br,I] *([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])([CH3X4,CH2X3,CH1X2,F,Cl,Br,I])[CH3X4,CH2X3,CH1X2,F,Cl,Br,I] [C&r3]1~[C&r3]~[C&r3]1 [C&r4]1~[C&r4]~[C&r4]~[C&r4]1 [C&r5]1~[C&r5]~[C&r5]~[C&r5]~[C&r5]1 [C&r6]1~[C&r6]~[C&r6]~[C&r6]~[C&r6]~[C&r6]1 [C&r7]1~[C&r7]~[C&r7]~[C&r7]~[C&r7]~[C&r7]~[C&r7]1 [C&r8]1~[C&r8]~[C&r8]~[C&r8]~[C&r8]~[C&r8]~[C&r8]~[C&r8]1 [CH2X4,CH1X3,CH0X2]~[CH3X4,CH2X3,CH1X2,F,Cl,Br,I] [$([CH2X4,CH1X3,CH0X2]~[$([!#1]);!$([CH2X4,CH1X3,CH0X2])])]~[CH2X4,CH1X3,CH0X2]~[CH2X4,CH1X3,CH0X2] [$([CH2X4,CH1X3,CH0X2]~[CH2X4,CH1X3,CH0X2]~[$([CH2X4,CH1X3,CH0X2]~[$([!#1]);!$([CH2X4,CH1X3,CH0X2])])])]~[CH2X4,CH1X3,CH0X2]~[CH2X4,CH1X3,CH0X2]~[CH2X4,CH1X3,CH0X2] [$([S]~[#6])&!$(S~[!#6])] ````
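For scripts that need to inspect or edit these definitions, the text can be split into feature names, numeric parameters, and SMARTS patterns. The sketch below is an assumption-laden illustration, not Pharmit's own parser: it assumes each feature starts with a name followed by five numeric fields, that every other whitespace-separated token is a SMARTS pattern, and it keeps the numeric fields as opaque strings because their meaning is not documented here. The function name `parse_definitions` is hypothetical and the sample is an abbreviated subset of the definitions above.

```python
def _is_number(token):
    """Return True if the token parses as a number."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def parse_definitions(text):
    """Parse feature headers ('Name' plus five numbers) and their SMARTS."""
    definitions = {}
    current = None
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # blank separator line
        is_header = len(tokens) >= 6 and all(_is_number(t) for t in tokens[1:6])
        if is_header:
            # Patterns may follow the header on the same line or on later lines.
            current = {"params": tokens[1:6], "smarts": tokens[6:]}
            definitions[tokens[0]] = current
        elif current is not None:
            current["smarts"].extend(tokens)
        else:
            raise ValueError("pattern found before any feature header: %r" % line)
    return definitions


sample = """Aromatic 18 0 1.1 1 0.1
a1aaaaa1
a1aaaa1

HydrogenDonor 1 1 0.5 1 0.1 [#16!H0]
"""
definitions = parse_definitions(sample)
for name, entry in definitions.items():
    print(name, entry["params"], len(entry["smarts"]))
```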
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'Pharmit'
message: 'If you find pharmit useful, please cite our paper'
authors:
  - given-names: Jocelyn
    family-names: Sunseri
    affiliation: University of Pittsburgh
  - given-names: David Ryan
    family-names: Koes
    email: dkoes@pitt.edu
    affiliation: University of Pittsburgh
    orcid: 'https://orcid.org/0000-0002-6892-6614'
identifiers:
  - type: doi
    value: https://doi.org/10.1093/nar/gkw287
license: GPL-2.0
repository-code: 'https://github.com/pharmit/pharmit'
preferred-citation:
  title: 'Pharmit: interactive exploration of chemical space'
  type: article
  authors:
    - given-names: Jocelyn
      family-names: Sunseri
      affiliation: University of Pittsburgh
    - given-names: David Ryan
      family-names: Koes
      email: dkoes@pitt.edu
      affiliation: University of Pittsburgh
      orcid: 'https://orcid.org/0000-0002-6892-6614'
  identifiers:
    - type: doi
      value: https://doi.org/10.1093/nar/gkw287
  abstract: >-
    Pharmit (http://pharmit.csb.pitt.edu) provides an online, interactive environment for the virtual screening of large compound databases using pharmacophores, molecular shape and energy minimization. Users can import, create and edit virtual screening queries in an interactive browser-based interface. Queries are specified in terms of a pharmacophore, a spatial arrangement of the essential features of an interaction, and molecular shape. Search results can be further ranked and filtered using energy minimization. In addition to a number of pre-built databases of popular compound libraries, users may submit their own compound libraries for screening. Pharmit uses state-of-the-art sub-linear algorithms to provide interactive screening of millions of compounds. Queries typically take a few seconds to a few minutes depending on their complexity. This allows users to iteratively refine their search during a single session. The easy access to large chemical datasets provided by Pharmit simplifies and accelerates structure-based drug design. Pharmit is available under a dual BSD/GPL open-source license.
  keywords:
    - Pharmacophore search
    - Structure-based drug design
  journal: "Nucleic acids research"
  volume: 44
  number: W1
  start: W442
  end: W448
  year: 2016
  doi: https://doi.org/10.1093/nar/gkw287
  publisher: "Oxford University Press"
GitHub Events
Total
- Issues event: 5
- Watch event: 15
- Issue comment event: 12
Last Year
- Issues event: 5
- Watch event: 15
- Issue comment event: 12
Dependencies
- jquery >=1.7.0
- jquery >=1.7
- coveralls github:phated/node-coveralls#2.x development
- eslint ^2.13.1 development
- eslint-config-gulp ^3.0.1 development
- expect ^1.20.2 development
- mkdirp ^0.5.1 development
- mocha ^3.0.0 development
- nyc ^10.3.2 development
- rimraf ^2.6.3 development
- glob-watcher ^5.0.3
- gulp-cli ^2.2.0
- undertaker ^1.2.1
- vinyl-fs ^3.0.0
- gulp ^3.8.7 development
- gulp-sourcemaps ^2.2.0 development
- istanbul ^0.4.5 development
- mocha ^3.0.0 development
- mocha-lcov-reporter ^1.2.0 development
- should ^11.0.0 development
- stream-array ^1.0.1 development
- stream-assert ^2.0.1 development
- concat-with-sourcemaps ^1.0.0
- through2 ^2.0.0
- vinyl ^2.0.0
- gulp ^3.8.10 development
- jshint ^2.9.4 development
- mocha ^3.0.0 development
- should ^11.0.0 development
- vinyl ^2.1.0 development
- lodash ^4.12.0
- minimatch ^3.0.3
- plugin-error ^0.1.2
- rcloader ^0.2.2
- through2 ^2.0.0
- gulp ^4.0.0 development
- gulp-sourcemaps ^2.6.4 development
- jscs ^3.0.0 development
- jshint ^2.0.0 development
- map-stream ^0.0.7 development
- mocha ^5.0.0 development
- should ^13.0.0 development
- vinyl ^2.0.0 development
- eslint ^3.18.0 development
- eslint-config-prettier ^2.1.0 development
- eslint-config-xo ^0.18.1 development
- eslint-plugin-no-use-extend-native ^0.3.12 development
- eslint-plugin-prettier ^2.0.1 development
- eslint-plugin-unicorn ^2.1.0 development
- power-assert ^1.4.1 development
- prettier ^1.1.0 development
- source-list-map ^1.1.2 development
- tape ^4.9.1 development
- tape-catch ^1.0.6 development
- testdouble ^2.1.2 development
- vinyl ^2.0.0 development
- array-each ^1.0.1
- extend-shallow ^3.0.2
- gulplog ^1.0.0
- has-gulplog ^0.1.0
- isobject ^3.0.1
- make-error-cause ^1.1.1
- safe-buffer ^5.1.2
- through2 ^2.0.0
- uglify-js ^3.0.5
- vinyl-sourcemaps-apply ^0.2.0
- gulp ^4.0.2 development
- gulp-cli ^2.3.0 development
- gulp-concat ^2.5.2 development
- gulp-jshint ^2.1.0 development
- gulp-rename ^1.2.0 development
- gulp-uglify ^3.0.2 development
- jshint ^2.11.1 development