seq is an actively used programming language created in 2019. A High-Performance Language for Bioinformatics. Here, we introduce Seq, the first language tailored specifically to bioinformatics, which marries the ease and productivity of Python with C-like performance. Seq is a subset of Python—and in many cases a drop-in replacement—yet also incorporates novel bioinformatics- and computational genomics-oriented data types, language constructs and optimizations. Seq enables users to write high-level, Pythonic code without having to worry about low-level or domain-specific optimizations, and allows for seamless expression of the algorithms, idioms and patterns found in many genomics or bioinformatics applications. On equivalent CPython code, Seq attains a performance improvement of up to two orders of magnitude, and a 175× improvement once domain-specific language features and optimizations are used. With parallelism, we demonstrate up to a 650× improvement. Compared to optimized C++ code, which is already difficult for most biologists to produce, Seq frequently attains up to a 2× improvement, and with shorter, cleaner code. Thus, Seq opens the door to an age of democratization of highly-optimized bioinformatics software.
- seq first appeared in 2019
- Have a question about seq not answered here? Email me and let me know how I can help.
Last updated June 22nd, 2020