andychu edited this page Feb 11, 2021 · 142 revisions

Resources on Unix Shells, Programming Language Design, and Implementation

You probably came here for one of these pages:

  • The BIG LIST of Alternative Shells -- fish, Elvish, NGS, etc.
  • Internal DSLs for Shell -- libraries in languages like Python, Perl, Ruby, JavaScript, Scheme, Racket, Common Lisp, Scala, OCaml, Haskell, etc.

Other Related Pages

Shell parsers

ShellCheck -- Written in Haskell, using the parser combinator style. (No separate lexer.)

shfmt -- Shell auto-formatter like gofmt, written in Go.

sh-parser -- parsing POSIX shell with Lua's LPEG

morbig -- from Colis project, parsing POSIX shell, FOSDEM Talk. alias makes parsing undecidable too!

https://github.com/idank/bashlex -- bashlex is a Python port of the parser used internally by GNU bash. For the most part it's transliterated from C ... I wrote this library for another project of mine, explainshell which needed a new parsing backend to support complex constructs such as process/command substitutions.

tree-sitter-bash -- grammar.js is 512 lines? There's also C++ code?

Academic Projects

Awk/Sed-Like Languages

Tab Language -- An interesting statically-typed, non Turing complete language that apparently fills the niche of Awk. Written in C++.

Miller (Language Reference) -- Miller is like awk, sed, cut, join, and sort for name-indexed data such as CSV, TSV, and tabular JSON. Written in C.
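
To make "name-indexed" concrete: awk addresses fields by position ($1, $2), while tools in this style address them by header name. Here is a rough Python sketch of that idea using only the standard library csv module. The sample data and the `select` helper are invented for illustration; this is not Miller's syntax or implementation.

```python
import csv
import io

# Made-up sample data; the column names are assumptions for this sketch.
SAMPLE = "host,status,latency_ms\nalpha,ok,12\nbeta,fail,250\ngamma,ok,40\n"


def select(rows, **equals):
    """Keep records whose named fields equal the given values (hypothetical helper)."""
    return [r for r in rows if all(r[k] == v for k, v in equals.items())]


# DictReader turns each line into a dict keyed by the header row,
# so fields are looked up by name rather than by column number.
rows = list(csv.DictReader(io.StringIO(SAMPLE)))
ok_rows = select(rows, status="ok")
total = sum(int(r["latency_ms"]) for r in ok_rows)

print([r["host"] for r in ok_rows], total)
```

Reordering the columns in the input would not change the result, which is the main practical difference from positional fields.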

TXR -- TXR is a pragmatic, convenient tool ready to take on your daily hacking challenges with its dual personality: its whole-document pattern matching and extraction language for scraping information from arbitrary text sources, and its powerful data-processing language to slice through problems like a hot knife through butter. Many tasks can be accomplished with TXR "one liners" directly from your system prompt. There is a TXR Lisp and then it is embedded in a pattern language. Sort of like the reverse of a template language?

jq -- jq is like sed for JSON data - you can use it to slice and filter and map and transform structured data with the same ease that sed, awk, grep and friends let you play with text. Written in C.

Dormant

r17 (GitHub, written in C++) -- A flexible, scalable, relational data mining language. No releases since 2013.

  • r17's syntax is a cross between UNIX shell and SQL.
  • Built-in concurrency, including cross-machine concurrency.
  • Strong type checking at stream-header-read time.

streem (by Matz, creator of Ruby, written in C with yacc grammar) -- Streem is a stream based concurrent scripting language. It is based on a programming model similar to the shell, with influences from Ruby, Erlang, and other functional programming languages.

Internal Shell DSLs in Various Languages

Every language seems to have an internal DSL for shell commands. This approach is probably OK for small things, but I haven't seen it in the wild in major pieces of software.
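
As a rough illustration of what such a DSL tends to look like, here is a minimal Python sketch that overloads `|` to chain commands. The `Cmd` class and its `run` method are hypothetical names made up for this page, not the API of any library listed here. It also assumes each stage can run to completion before the next one starts, which is simpler than a real shell pipeline, where the processes run concurrently.

```python
import subprocess
import sys


class Cmd:
    """Hypothetical wrapper: one external command, or a pipeline of them."""

    def __init__(self, *argv):
        self.stages = [list(argv)]

    def __or__(self, other):
        # `a | b` builds a new pipeline; neither operand is modified.
        piped = Cmd()
        piped.stages = self.stages + other.stages
        return piped

    def run(self, stdin_text=""):
        # Feed each stage's stdout to the next stage's stdin, one at a time.
        data = stdin_text.encode()
        for argv in self.stages:
            done = subprocess.run(argv, input=data, stdout=subprocess.PIPE, check=True)
            data = done.stdout
        return data.decode()


# Use the running Python interpreter as the "programs" so the sketch
# does not depend on which Unix utilities are installed.
py = sys.executable
upper = Cmd(py, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())")
count = Cmd(py, "-c", "import sys; print(len(sys.stdin.read().split()))")

print(upper.run("hi there"))
print((upper | count).run("a b c").strip())
```

The host language's operator overloading carries the syntax, but quoting, redirection, and concurrency all have to be reinvented on top of it.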

EShell in Emacs Lisp -- A bash-like shell embedded in Emacs. Example Syntax, Mastering EShell.

plumbum in Python -- The motto of the library is "Never write shell scripts again", and thus it attempts to mimic the shell syntax ("shell combinators") where it makes sense, while keeping it all Pythonic and cross-platform.

python-mario -- Have you ever wanted to use Python functions directly in your Unix shell? Mario can read and write csv, json, and yaml; traverse trees, and even do xpath queries. Plus, it supports async commands right out of the box. This looks a bit like perl -e for Python. That is, you're using Python from shell, not embedding shell-like code inside Python.

sh in Python -- sh is a full-fledged subprocess replacement for [multiple Python versions] that allows you to call any program as if it were a function. I wouldn't call this a shell because it doesn't support pipelines and such, but it's an example of programmers preferring the syntax of their language to the syntax of Unix shell.

pysh -- Dormant project where the author encountered problems in the approach of embedding shell in another language.

Shell in Ruby -- It provides users the ability to execute commands with filters and pipes, like sh/csh by using native facilities of Ruby. This is in the standard library?

psh in Perl -- Perl Shell (psh) combines aspects of bash and other shells with the power of Perl scripting. This one is notable because Perl has a heavy influence from shell, sed, and awk. It appears it's still not close enough.

scsh in Scheme -- Scsh has two main components: a process notation for running programs and setting up pipelines and redirections, and a complete syscall library for low-level access to the operating system. Oil also aims to have a complete syscall library.

inferior-shell in Common Lisp -- This CL library allows you to spawn local or remote processes and shell pipes. It lets me use CL in many cases where I would previously write shell scripts.

forsh in Forth -- forsh is a shell built on top of gforth. It allows one to easily operate a unix-like operating system without leaving the gforth environment

Ammonite-Ops in Scala -- a library to make common filesystem operations in Scala as concise and easy-to-use as from the Bash shell

HSH in Haskell -- HSH is designed to let you mix and match shell expressions with Haskell programs.

Caml-Shcaml: an OCaml Library for Unix Shell Programming

janestreet/shexp -- Shexp is composed of two parts: a library providing a process monad for shell scripting in OCaml as well as a simple s-expression based shell interpreter. Both provide good debugging support.

closh -- Bash-like shell based on Clojure. This may have some of its own syntax, but it also uses Clojure syntax.

rash -- Racket #lang for shell scripting and interaction. Allows pipelines to mix processes and Racket functions, has user-definable pipeline operators, lets you embed normal Racket and shell-style Rash code within each other, and inherits all of Racket's features.

xshell -- xshell makes it easy to write cross-platform "bash" scripts in Rust

Internal Awk DSLs

FuncShell – A Haskell-based alternative to awk -- Also has links to sqawk in SQL, luawk in Lua.

Internal Build Tool DSLs

As with shells, each language community has explored the idea of using their language to express build rules.
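
A minimal sketch of the common pattern, in Python: build rules are ordinary functions, and a small registry runs each rule after its dependencies. The `task` decorator, `build` function, and task names are hypothetical and are not taken from any tool on this page. The sketch assumes there are no dependency cycles and does no timestamp checking, which real build tools do.

```python
TASKS = {}  # task name -> (dependency names, function)
LOG = []    # records the order in which tasks ran


def task(*deps):
    """Hypothetical decorator: register a function as a build task."""
    def register(func):
        TASKS[func.__name__] = (deps, func)
        return func
    return register


def build(name, done=None):
    """Run `name` after its dependencies, each task at most once per build."""
    done = set() if done is None else done
    if name in done:
        return done
    deps, func = TASKS[name]
    for dep in deps:
        build(dep, done)
    func()
    done.add(name)
    return done


@task()
def compile_objects():
    LOG.append("compile_objects")


@task("compile_objects")
def link():
    LOG.append("link")


@task("compile_objects", "link")
def package():
    LOG.append("package")


build("package")
print(LOG)
```

Because the rules are plain code, they can loop, read files, or call libraries, which is the "power of a real programming language" argument these tools make.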

SCons in Python -- Configuration files are Python scripts--use the power of a real programming language to solve build problems

Rake in Ruby -- Rakefiles (rake's version of Makefiles) are completely defined in standard Ruby syntax. The book Beautiful Code has an essay by Matz which discusses why this is possible and nice in Ruby.

Jake in JavaScript -- A Jakefile is just executable JavaScript. You can include whatever JavaScript you want in it.

Grunt in JavaScript -- This is called a "task runner" rather than a build tool. A shell is also a task runner! The Gruntfile can execute arbitrary code and do I/O, i.e. read package.json files.

Shake in Haskell -- Shake is implemented as a Haskell library, and Shake build systems are structured as Haskell programs which make heavy use of the Shake library functions

sbt in Scala -- Scala-based build definition that can use the full flexibility of Scala code

Shell Complements

ShellJs -- This is the opposite of a shell in JavaScript -- it's all the Unix utilities in JavaScript, and you use JS as your shell language.

Scientific Workflow Languages

Scientific Workflow System on Wikipedia has a huge list.

  • Cuneiform -- Cuneiform combines the strong points of functional programming languages, distributed databases, and workflow management systems.
  • Nextflow -- an external DSL, e.g. process { }. Nextflow enables scalable and reproducible scientific workflows using software containers. It allows the adaptation of pipelines written in the most common scripting languages.
  • Common Workflow Language -- The Common Workflow Language (CWL) is a specification for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.
  • SciPipe -- SciPipe is a library for writing scientific workflows (sometimes also called "pipelines") of shell commands that depend on each other, in the Go programming language
  • HN thread on dgsh mentions many similar systems.

POSIX Shells

See list starting at http://www.oilshell.org/cross-ref.html#bash

TODO: add links

GNU bash -- most popular shell in the world, on Linux, Mac, Windows

ksh / pdksh / mksh -- ksh was a proprietary extension of the original Bourne shell; pdksh was an open source clone of ksh; mksh is a fork of pdksh and used on Android.

dash / busybox ash (same lineage)

busybox hush -- shell in one file.

yash -- Yash, yet another shell, is a POSIX-compliant command line shell written in C99 (ISO/IEC 9899:1999). Yash is intended to be the most POSIX-compliant shell in the world while supporting features for daily interactive and scripting use. -- Has Debian and Fedora packages.

zsh (not POSIX compatible by default)

Publications and Academic Papers

TODO:

  • Thompson shell
  • Bourne Shell
  • ksh paper (Usenix)
  • bash papers

A Pipe Has Two Ends. Using APL in a multiprocess/multiprocessor environment. -- 1989. A proposal for a flexible but easy to use syntax. ... I offer for general consideration a device that allows data to be piped out of APL, through (a series of) shell commands, and back into APL.

TODO

Videos

Programming Language Design and Implementation

TODO

Books

Programming Language Design and Implementation

See http://www.oilshell.org/cross-ref.html

TODO

Tools / Software

See http://www.oilshell.org/cross-ref.html

Pages with Collections of Links

Unix Shell on Wikipedia -- Some useful links, most of which are on this page.

List of build automation software on Wikipedia

UNIX Shell Implementations on rosettacode.org

comp.unix.shell thread -- A huge list of links to shell implementations.

Other Interesting Links

libc manual: Implementing a Job Control Shell

Notes on coprocesses
