https://github.com/broadinstitute/wdl4s

Scala bindings for WDL

Keywords

cwl scala wdl

Repository

Basic Info

Host: GitHub
Owner: broadinstitute
Language: HTML
Default Branch: develop
Size: 3.56 MB

Statistics

Stars: 3
Watchers: 64
Forks: 5
Open Issues: 0
Releases: 0

Archived

Created about 10 years ago · Last pushed about 8 years ago

The WDL4S repository is now part of the Cromwell repository: https://github.com/broadinstitute/cromwell

Scala binding API for WDL

This repository provides Scala tools to parse a WDL file and transform it into a Scala object hierarchy.

Installation

wdl4s is hosted on the Broad Institute's Artifactory repository. Add it to your sbt resolvers:

resolvers ++= Seq( "Broad Artifactory Releases" at "https://broadinstitute.jfrog.io/broadinstitute/libs-release/" )

Add the following to libraryDependencies:

"org.broadinstitute" %% "wdl4s" % version,

The latest version is

Or add a snapshot release in the format <version>-<git-hash7>-SNAP:

"org.broadinstitute" %% "wdl4s" % "0.123-fff0246-SNAP",

To use wdl4s in a Maven project, add the following dependency:

<dependency>
  <groupId>org.broadinstitute</groupId>
  <artifactId>wdl4s_2.12</artifactId>
  <version>${version}</version>
</dependency>

Scaladoc

Usage

All examples are located in src/main/scala/wdl4s/examples and particular examples can be run via sbt:

$ sbt "run-main wdl4s.examples.ex1"

Loading WDL Code

The main entry point into the parser is the WdlNamespace object. A WDL file is considered a namespace, and other namespaces can be included by using the import statement (but only with an as clause).

The WdlNamespace object has a few load…() functions for turning WDL source into WdlNamespace objects.

If the WDL source being loaded contains a workflow definition, the load…() functions return a WdlNamespaceWithWorkflow; otherwise they return a WdlNamespaceWithoutWorkflow.

Example src/main/scala/wdl4s/examples/ex1.scala

```scala
val wdl = """
  |task a {
  |  command { ps }
  |}
  |workflow wf {
  |  call a
  |}""".stripMargin

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq.empty).get

println(s"Workflow: ${ns.workflow.unqualifiedName}")
ns.workflow.calls foreach {call =>
  println(s"Call: ${call.unqualifiedName}")
}

ns.tasks foreach {task =>
  println(s"Task: ${task.name}")
  println(s"Command: ${task.commandTemplate}")
}
```

Using An Import Resolver

WDL code can have import statements but wdl4s does not know how to resolve these import statements into WDL source code.

When using WdlNamespace.load(), one can pass an optional import resolver which is a String => WdlSource function (WdlSource is a type alias for String). If the import resolver cannot resolve the import string to WDL source, then it is expected to throw an exception.

Example src/main/scala/wdl4s/examples/ex2.scala

```scala
val wdl = """
  |import "some_string"
  |task a {
  |  command { ps }
  |}
  |workflow wf {
  |  call a
  |}""".stripMargin

def resolver(importString: String): WdlSource = {
  importString match {
    case "some_string" => "task imported { command {ps} }"
    case s if s.startsWith("http://") =>
      // issue HTTP request
      throw new NotImplementedError("not implemented")
  }
}

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq(resolver _)).get

ns.tasks foreach {task =>
  println(s"Task: ${task.name}")
}
```

Since the resolver is set up to resolve some_string to some static WDL code (task imported { command {ps} }), the output of this program will show two tasks in this namespace:

Task: a
Task: imported
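A resolver does not have to hard-code its cases in a match expression. As a minimal sketch, assuming only the String => WdlSource contract described above, a resolver can be backed by an in-memory map. WdlSource is redefined locally as an alias for String so that the snippet runs without wdl4s on the classpath; mapResolver and known are hypothetical names and not part of wdl4s.

```scala
object ResolverSketch {
  // Local stand-in for the wdl4s alias, so this compiles without wdl4s.
  type WdlSource = String

  // Hypothetical helper: builds a resolver from a map of
  // import string -> WDL source. Unknown import strings raise an
  // exception, which is what the resolver contract above expects.
  def mapResolver(known: Map[String, WdlSource]): String => WdlSource =
    importString =>
      known.getOrElse(
        importString,
        throw new NoSuchElementException(s"Cannot resolve import: $importString")
      )

  def main(args: Array[String]): Unit = {
    val resolver = mapResolver(Map("some_string" -> "task imported { command {ps} }"))
    println(resolver("some_string"))
    println(scala.util.Try(resolver("unknown_import")).isFailure)
  }
}
```

Assuming the load signature used in the examples, a function built this way could be passed as Seq(resolver) in place of Seq(resolver _).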

WDL also supports the import "something" as namespace_name format for imports. In this case, a sub-namespace is created in which the imported tasks live.

Example src/main/scala/wdl4s/examples/ex3.scala

```scala
val wdl = """
  |import "some_string" as my_namespace
  |task a {
  |  command { ps }
  |}
  |workflow wf {
  |  call a
  |}""".stripMargin

def resolver(importString: String): WdlSource = {
  importString match {
    case "some_string" => "task imported { command {ps} }"
    case _ => throw new NotImplementedError()
  }
}

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq(resolver _)).get

ns.tasks foreach {task =>
  println(s"Task: ${task.name}")
}

ns.namespaces foreach { n =>
  n.tasks.foreach { t =>
    println(s"Imported Task: ${t.name} (from ${n.importedAs})")
  }
}
```

Since the WDL import statement now has an as clause, the top-level namespace only has one task. The top-level namespace also has a sub-namespace called my_namespace, which has one task. The output of the program will be:

Task: a
Imported Task: imported (from my_namespace)

Resolving Fully-Qualified Names

WdlNamespace has a resolve method which takes a fully-qualified name string and returns the object that it refers to.

Example src/main/scala/wdl4s/examples/ex4.scala

```scala
val wdl = """
  |task a {
  |  command { ps }
  |}
  |workflow wf {
  |  call a
  |  call a as b
  |}""".stripMargin

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq.empty).get

println(ns.resolve("wf.a")) // resolves to Call object for `call a`
println(ns.resolve("wf.b")) // resolves to Call object for `call a as b`
println(ns.findTask("a")) // resolves to Task object for `task a`
```

Getting Dependencies

Call objects can have prerequisites: other Calls that need to be completed in order to start executing the Call in question.

Example src/main/scala/wdl4s/examples/ex5.scala

```scala
val wdl = """
  |task a {
  |  command { ps }
  |  output { File procs = stdout() }
  |}
  |
  |task b {
  |  File s
  |  command { wc -l ${s} }
  |}
  |
  |workflow wf {
  |  call a
  |  call b {input: s=a.procs}
  |}""".stripMargin

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq.empty).get

Seq(ns.resolve("wf.a"), ns.resolve("wf.b")) foreach {
  case Some(c: TaskCall) => println(s"Call '${c.fullyQualifiedName}' prerequisites: ${c.upstream}")
  case _ =>
}
```

Since call b depends on call a, the set of prerequisites for b has one element and the set for a has none, as seen in the standard output:

Call 'wf.a' prerequisites: Set()
Call 'wf.b' prerequisites: Set([Call name=a, task=[Task name=a commandTemplate=Vector( ps )}]])
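Prerequisite sets like these are what a caller needs in order to decide when each call may start. The following sketch is independent of wdl4s: calls are modelled as plain strings, the upstream map is written by hand rather than read from TaskCall.upstream, and executionOrder is a hypothetical helper used only for illustration.

```scala
object UpstreamSketch {
  // Returns the calls in an order where every call follows its prerequisites.
  // `upstream` maps each call name to the names of the calls it depends on.
  def executionOrder(upstream: Map[String, Set[String]]): List[String] = {
    @annotation.tailrec
    def loop(remaining: Map[String, Set[String]], done: List[String]): List[String] =
      if (remaining.isEmpty) done.reverse
      else {
        // A call is ready once all of its prerequisites have completed.
        val ready = remaining.collect {
          case (name, deps) if deps.subsetOf(done.toSet) => name
        }.toList.sorted
        require(ready.nonEmpty, "cyclic or unsatisfiable prerequisites")
        loop(remaining -- ready, ready.reverse ::: done)
      }
    loop(upstream, Nil)
  }

  def main(args: Array[String]): Unit =
    println(executionOrder(Map("wf.a" -> Set.empty[String], "wf.b" -> Set("wf.a"))))
}
```

For the two calls in the example above, this ordering places wf.a before wf.b.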

Evaluating Expressions

WDL has its own expression language. The unevaluated expressions are stored in a WdlExpression object, which has an evaluate() method.

The evaluate() method takes two parameters: a lookup function (String => WdlValue) and an implementation of the standard library functions (see trait WdlStandardLibraryFunctions).

The lookup function is called for each variable that is encountered during expression evaluation. For each function invocation, the corresponding method in the standard library functions implementation is called.

Example src/main/scala/wdl4s/examples/ex6.scala

```scala
val wdl = """
  |workflow wf {
  |  String a = "foo" + "bar"
  |  String b = "hello " + variable
  |  String c = "hello " + other_variable
  |}""".stripMargin

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq.empty).get
def lookup(name: String): WdlValue = {
  name match {
    case "variable" => WdlString("world")
    case _ => throw new NoSuchElementException
  }
}
ns.workflow.declarations foreach { decl =>
  val value = decl.expression.get.evaluate(lookup, NoFunctions)
  println(s"Declaration '${decl.toWdlString}' evaluates to: $value")
}
```

This evaluates each of the declarations in the workflow. The last one fails because other_variable is not resolved by the lookup function that we defined:

Declaration 'String a = "foo" + "bar"' evaluates to: Success(WdlString(foobar))
Declaration 'String b = "hello " + variable' evaluates to: Success(WdlString(hello world))
Declaration 'String c = "hello " + other_variable' evaluates to: Failure(java.util.NoSuchElementException)
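When the variable bindings are already available as a map, the lookup function can be derived from that map instead of being written as a match. The sketch below is independent of wdl4s: the type parameter V stands in for WdlValue so that the snippet has no dependencies, and lookupFrom is a hypothetical helper name.

```scala
import scala.util.Try

object LookupSketch {
  // Hypothetical helper: turns a map of bindings into a lookup function.
  // V stands in for wdl4s's WdlValue; unknown names raise
  // NoSuchElementException, like the hand-written lookup above.
  def lookupFrom[V](bindings: Map[String, V]): String => V =
    name =>
      bindings.getOrElse(
        name,
        throw new NoSuchElementException(s"Unresolved variable: $name")
      )

  def main(args: Array[String]): Unit = {
    val lookup = lookupFrom(Map("variable" -> "world"))
    println(Try(lookup("variable")))       // a Success wrapping the bound value
    println(Try(lookup("other_variable"))) // a Failure wrapping the exception
  }
}
```

A Scala Map is itself a function from key to value that throws NoSuchElementException for a missing key, so the helper mainly adds a more descriptive error message.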

Instantiating Commands

Each Task's command needs to be instantiated from the abstract form in the WDL file to a concrete form. That means that each expression inside of ${...} blocks needs to be evaluated.

Example src/main/scala/wdl4s/examples/ex7.scala

```scala
val wdl = """
  |task a {
  |  String prefix
  |  Array[Int] ints
  |  command {
  |    python script.py ${write_lines(ints)} > ${prefix + ".out"}
  |  }
  |}
  |workflow wf {
  |  call a
  |}""".stripMargin

val ns = WdlNamespaceWithWorkflow.load(wdl, Seq.empty).get
val inputs = Map(
  "prefix" -> WdlString("some_prefix"),
  "ints" -> WdlArray(WdlArrayType(WdlIntegerType), Seq(1,2,3,4,5).map(WdlInteger(_)))
)

class CustomFunctions extends WdlFunctions[WdlValue] {
  def write_lines(params: Seq[Try[WdlValue]]): Try[WdlValue] = {
    // Validate params, write the result to a file, return file path
    Success(WdlFile("/tmp/array.txt"))
  }
}

ns.taskCalls.find( _.unqualifiedName == "a") foreach { call =>
  val wdlFunctions: CustomFunctions = new CustomFunctions
  val evaluatedInputs = call.evaluateTaskInputs(inputs, wdlFunctions).get
  println(call.task.instantiateCommand(evaluatedInputs, wdlFunctions).get)
}
```

This will produce the following output:

python script.py /tmp/array.txt > some_prefix.out
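To make the substitution step concrete, here is a deliberately simplified sketch that is independent of wdl4s. It only replaces ${name} placeholders with values from a map; it does not evaluate expressions such as ${prefix + ".out"} or function calls such as write_lines(...), which is what the real instantiateCommand does with the help of the functions object. The instantiate helper is a hypothetical name.

```scala
import scala.util.matching.Regex

object InstantiateSketch {
  // Matches ${identifier}; real WDL placeholders may hold arbitrary expressions.
  private val Placeholder: Regex = """\$\{(\w+)\}""".r

  // Hypothetical helper: substitutes each ${name} with its value from `inputs`.
  // A name that is missing from `inputs` raises NoSuchElementException.
  def instantiate(template: String, inputs: Map[String, String]): String =
    Placeholder.replaceAllIn(
      template,
      m => Regex.quoteReplacement(inputs(m.group(1)))
    )

  def main(args: Array[String]): Unit =
    println(instantiate("wc -l ${s}", Map("s" -> "/tmp/procs.txt")))
}
```

Regex.quoteReplacement keeps characters such as $ and \ in the substituted value from being interpreted as group references.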

Accessing the WDL Parser Directly

To access only the parser, use the AstTools library, as follows:

Example src/main/scala/wdl4s/examples/ex8.scala

```scala
/* Create syntax tree from contents of file */
val ast = AstTools.getAst(Paths.get(args(0)))

/* Second parameter is a descriptor about where the first string came from.
 * Most of the time this would be the URI of where the text was loaded from,
 * but there are no restrictions on what the string can be.
 */
AstTools.getAst("workflow simple {}", "string")

/* Print the AST */
println(ast.toPrettyString)

/* Traverse the tree to find all Task definitions */
AstTools.findAsts(ast, "Task") foreach {ast =>
  println(s"Task name: ${ast.getAttribute("name").sourceString}")
}
```

Owner

Name: Broad Institute
Login: broadinstitute
Kind: organization
Location: Cambridge, MA

Website: http://www.broadinstitute.org/
Twitter: broadinstitute
Repositories: 1,083
Profile: https://github.com/broadinstitute

Broad Institute of MIT and Harvard

