https://github.com/ammar257ammar/molvec

A feeble attempt at molecular recognition (in the literal sense)

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: ammar257ammar
  • License: lgpl-2.1
  • Language: Java
  • Default Branch: master
  • Homepage: https://molvec.ncats.io
  • Size: 268 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of ncats/molvec
Created about 5 years ago · Last pushed about 4 years ago

# MolVec
MolVec is the NCATS chemical OCR engine. It vectorizes images of
chemical structures into chemical objects, preserving the 2D layout as much as
possible. The code is still very raw in terms of utility. See
[https://molvec.ncats.io](https://molvec.ncats.io) for a
demonstration of MolVec.

## Molvec is on Maven Central

The easiest way to start using Molvec is to include it as a dependency in your build tool of choice.
For Maven:

```xml
<dependency>
  <groupId>gov.nih.ncats</groupId>
  <artifactId>molvec</artifactId>
  <version>0.9.8</version>
</dependency>
```
   
## Example Usage: convert image into mol file format
```java
    File image = ...
    String mol = Molvec.ocr(image);
```
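
A quick sanity check on the returned string is to read the atom and bond counts from the molfile counts line. The sketch below assumes the output is a V2000 molfile (three header lines, then a counts line with two 3-character fields); this is an assumption about the output format, not something stated above. The class name `MolfileCounts` and the hand-written sample are illustrative only, so the snippet runs without Molvec on the classpath.

```java
/** Reads atom and bond counts from the counts line of a V2000 molfile string. */
class MolfileCounts {

    /** Returns {atomCount, bondCount} parsed from the fourth line of the molfile. */
    static int[] counts(String mol) {
        // A V2000 molfile has three header lines; the counts line is line four.
        String[] lines = mol.split("\\R", -1);
        if (lines.length < 4 || lines[3].length() < 6) {
            throw new IllegalArgumentException("molfile is missing a valid counts line");
        }
        String countsLine = lines[3];
        // Atom and bond counts occupy fixed 3-character columns.
        int atoms = Integer.parseInt(countsLine.substring(0, 3).trim());
        int bonds = Integer.parseInt(countsLine.substring(3, 6).trim());
        return new int[] {atoms, bonds};
    }

    public static void main(String[] args) {
        // Hand-written two-atom example standing in for the string returned by Molvec.ocr(image).
        String mol = String.join("\n",
                "example",
                "  hand-written",
                "",
                "  2  1  0  0  0  0  0  0  0  0999 V2000",
                "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
                "    1.5000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
                "  1  2  1  0  0  0  0",
                "M  END");
        int[] c = counts(mol);
        System.out.println(c[0] + " atoms, " + c[1] + " bonds");
    }
}
```

A result with zero atoms is a cheap signal that nothing was recognized in the image.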
    
## Async Support

  MolVec also supports asynchronous calls:
```java
    CompletableFuture<String> future = Molvec.ocrAsync(image);
    String mol = future.get(5, TimeUnit.SECONDS);
```
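
The timed `get` above throws a `TimeoutException` when recognition takes too long, so callers usually want a fallback path. The following is a minimal sketch of one way to handle that. The helper `getOrFallback` and the class name are hypothetical, and the futures are stand-ins for the one returned by `Molvec.ocrAsync`, so the snippet runs without Molvec on the classpath.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class AsyncTimeoutSketch {

    /** Waits up to timeoutMillis for the future; on timeout, cancels it and returns the fallback. */
    static String getOrFallback(CompletableFuture<String> future, long timeoutMillis, String fallback) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Marks the future as cancelled; this does not necessarily stop the underlying work.
            future.cancel(true);
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("recognition failed", e.getCause());
        }
    }

    public static void main(String[] args) {
        // Stand-in for Molvec.ocrAsync(image): completes with a placeholder string.
        CompletableFuture<String> fast = CompletableFuture.supplyAsync(() -> "mol text");
        System.out.println(getOrFallback(fast, 5000, "timed out"));

        // A future that never completes simulates a recognition that runs too long.
        CompletableFuture<String> stuck = new CompletableFuture<>();
        System.out.println(getOrFallback(stuck, 100, "timed out"));
    }
}
```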
  
## Support for producing molfile and SDfiles
  Since version 0.9.8, Molvec has a more robust API that allows for creating both molfiles and SDfiles and adding properties to the SDfiles.
  
```java
MolvecOptions options = new MolvecOptions()
                                    .setName(name)
                                    .center(true)
                                    .averageBondLength(2);

MolvecResult result = Molvec.ocr(f, options);

// write out an SDfile without any properties
String sdfile = result.getSDfile().get();

// create a map of key-value pairs and include them as properties in an SDfile
Map<String, String> props = new HashMap<>();

props.put("File Name", f.getName());

String sdfileWithProperties = result.getSDfile(props).get();
```
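
To make the role of the properties map concrete, here is a small sketch of how key-value pairs conventionally appear in an SD record: each one becomes a `> <key>` line, a value line, and a blank line, and the record ends with `$$$$`. This illustrates the general file layout only; it is not Molvec's implementation, and `SdRecordSketch` and `toSdRecord` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SdRecordSketch {

    /** Appends data items and the record terminator to a molfile block. */
    static String toSdRecord(String molfile, Map<String, String> props) {
        StringBuilder sb = new StringBuilder(molfile);
        if (!molfile.endsWith("\n")) {
            sb.append('\n');
        }
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append("> <").append(e.getKey()).append(">\n"); // data header
            sb.append(e.getValue()).append('\n');              // data value
            sb.append('\n');                                   // blank line ends the item
        }
        sb.append("$$$$\n"); // record terminator
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("File Name", "image.png"); // hypothetical file name
        // A truncated placeholder molfile; real output would contain atom and bond blocks.
        System.out.print(toSdRecord("placeholder\n\n\n  0  0  0  0  0  0  0  0  0  0999 V2000\nM  END", props));
    }
}
```
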
## Commandline interface
  The Molvec jar has a runnable Main class with the following options
  
    usage: molvec ([-gui],[[(-f <path> [-o <path>]) | (-dir <path> [[-outDir <path> | -outSdf <path>]],[-parallel <count>])]],[-scale
                  <value>],[-h])
    Image to Chemical Structure Extractor Analyzes the given image and tries to find the chemical structure drawn and
    convert it into a Mol format.
    
    options:
         -dir <path>         path to a directory of image files to process. Supported formats include png, jpeg, tiff and
                             svg. Each image file found will be attempted to be processed. If -out or -outDir is not
                             specified then each processed mol will be put in the same directory and named
                             $filename.mol. This option or -f is required if not using -gui
    
         -f,--file <path>    path of image file to process. Supported formats include png, jpeg, tiff.  This option or -dir
                             is required if not using -gui
    
         -gui                Run Molvec in GUI mode. file and scale option may be set to preload file
    
         -h,--help           print helptext
    
         -o,--out <path>     path of output processed mol. Only valid when not using gui mode. If not specified output is
                             sent to STDOUT
    
         -outDir <path>      path to output directory to put processed mol files. If this path does not exist it will be
                             created
    
         -outSdf <path>      Write output to a single sdf formatted file instead of individual mol files
    
         -parallel <count>   Number of images to process simultaneously, if not specified defaults to 1
    
         -scale <value>      scale of image to show in viewer (only valid if gui mode AND file are specified)
    
    Examples:
    
          $molvec -f /path/to/image.file
    
       parse the given image file and print out the structure mol to STDOUT
    
          $molvec -dir /path/to/directory
    
       serially parse all the image files inside the given directory and write out a new mol file for each image named
       $image.file.mol the new files will be put in the input directory
    
          $molvec -dir /path/to/directory -outDir /path/to/outputDir
    
       serially parse all the image files inside the given directory and write out a new mol file for each image named
       $image.file.mol the new files will be put in the directory specified by outDir
    
          $molvec -dir /path/to/directory -outSdf /path/to/output.sdf
    
       serially parse all the image files inside the given directory and write out a new sdf file to the given path that
       contains all the structures from the input image directory
    
          $molvec -dir /path/to/directory -outSdf /path/to/output.sdf -parallel 4
    
       parse in 4 concurrent parallel threads all the image files inside the given directory and write out a new sdf file to
       the given path that contains all the structures from the input image directory
    
          $molvec -dir /path/to/directory -parallel 4
    
       parse in 4 concurrent parallel threads all the image files inside the given directory and write out a new mol file for
       each image named $image.file.mol the new files will be put in the input directory
    
          $molvec -gui
    
       open the Molvec Graphical User interface without any image preloaded
    
          $molvec -gui -f /path/to/image.file
    
       open the Molvec Graphical User interface  with the given image file preloaded
    
          $molvec -gui -f /path/to/image.file -scale 2.0
    
       open the Molvec Graphical User interface  with the given image file preloaded zoomed in/out to the given scale
    
    Developed by NIH/NCATS
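
The `-dir`, `-outDir` and `-parallel` behaviour described in the help text can be approximated from Java code as well. The sketch below is not the CLI's actual implementation. It filters a directory by the extensions listed above (with `jpg` and `tif` added as assumed aliases), processes the files on a fixed-size thread pool, and writes each result to `$filename.mol`. The `recognizer` function is a hypothetical stand-in for a call such as `Molvec.ocr`, so the snippet runs without Molvec on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class BatchSketch {

    // Extensions named in the help text, plus "jpg" and "tif" as assumed aliases.
    static final Set<String> IMAGE_EXTENSIONS =
            new HashSet<>(Arrays.asList("png", "jpeg", "jpg", "tiff", "tif", "svg"));

    static boolean isImage(Path file) {
        String name = file.getFileName().toString().toLowerCase(Locale.ROOT);
        int dot = name.lastIndexOf('.');
        return dot >= 0 && IMAGE_EXTENSIONS.contains(name.substring(dot + 1));
    }

    /** Runs recognizer on every image in dir using `parallel` threads; returns the files written. */
    static List<Path> processDirectory(Path dir, Path outDir, int parallel,
                                       Function<Path, String> recognizer)
            throws IOException, InterruptedException, ExecutionException {
        Files.createDirectories(outDir); // create the output directory if it does not exist
        List<Path> images;
        try (Stream<Path> files = Files.list(dir)) {
            images = files.filter(Files::isRegularFile)
                          .filter(BatchSketch::isImage)
                          .sorted()
                          .collect(Collectors.toList());
        }
        ExecutorService pool = Executors.newFixedThreadPool(parallel);
        try {
            List<Future<Path>> pending = new ArrayList<>();
            for (Path image : images) {
                pending.add(pool.submit(() -> {
                    // Output is named after the full image file name, e.g. a.png.mol
                    Path out = outDir.resolve(image.getFileName() + ".mol");
                    Files.write(out, recognizer.apply(image).getBytes(StandardCharsets.UTF_8));
                    return out;
                }));
            }
            List<Path> written = new ArrayList<>();
            for (Future<Path> f : pending) {
                written.add(f.get()); // throws ExecutionException if the worker failed
            }
            return written;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("molvec-sketch");
        Files.write(dir.resolve("a.png"), new byte[0]);
        Files.write(dir.resolve("notes.txt"), new byte[0]); // skipped: not an image extension
        List<Path> written = processDirectory(dir, dir, 4,
                image -> "placeholder for " + image.getFileName());
        System.out.println(written);
    }
}
```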
                    
### GUI
  Molvec comes with a Swing viewer that lets you walk
  through each step of the structure recognition process.

![Primitives](sample1.png)

## Publications that Mention Molvec
* [Rajan, K., Brinkhaus, H.O., Zielesny, A. et al. A review of optical chemical structure recognition tools.
 J Cheminform 12, 60 (2020).](https://doi.org/10.1186/s13321-020-00465-0)
 
 
## How to Report Issues

  You can report issues or feature requests either by creating issue tickets on our GitHub page or
   by emailing questions or problems to [daniel.katzel@nih.gov](mailto:daniel.katzel@nih.gov)

Owner

  • Name: Ammar Ammar
  • Login: ammar257ammar
  • Kind: user
  • Location: The Netherlands
  • Company: Maastricht University
