Monday, November 23, 2015

execsql

execsql is a Python application that reads a text file of SQL commands and executes them against a database. The supported databases are:
  • PostgreSQL
  • MS-Access
  • MS-SQL Server
  • SQLite
  • MySQL or MariaDB
  • Firebird
  • ODBC DSN connections.
In addition to executing SQL statements, execsql also implements a number of metacommands that allow:
  • Import of data from text files and OpenDocument spreadsheets
  • Export of data to delimited text, HTML, JSON, LaTeX tables, and OpenDocument spreadsheets
  • Copying of data between databases, even of different DBMS types
  • Conditional execution of SQL code and metacommands based on data values or user input.

execsql allows variables to be defined and used to dynamically modify SQL code and metacommands. An INCLUDE metacommand can be used to modularize SQL scripts and facilitate code re-use. Variables can be used to parameterize included scripts. Conditional inclusion of scripts can be used to implement loops. Automatically incrementing counter variables can be used to control loops or generate unique values to be used in input or output.

execsql is fundamentally a command-line application that can be used to automate database operations, either by itself or as part of a toolchain that may include other steps to either pre-process data before loading, or post-process or analyze data that are extracted from a database. However, execsql also has the ability to present data to the user in dialog boxes, request several types of input from the user either on the terminal or in GUI dialogs, and display messages to the user either on the terminal or in a GUI console. These features allow flexible interactive applications to be created.

A guiding principle in the development of execsql is documentation of all data operations. The use of SQL scripts, rather than ad-hoc interactive operations in a GUI, is key to meeting that goal. In addition, execsql logs all usage information, including databases used, script files executed (including script file version information), variable assignments, user input, and errors. Chain-of-custody procedures are used in some disciplines to ensure traceability from collection through (typically) laboratory analysis. However, no formal chain of custody procedures ordinarily are applied to data during or after entry into a database. SQL script files and execsql log files can be used to produce a complete record of data operations so that traceability of data can be extended into the database environment.

execsql and its documentation are available from the Python Package Index (PyPI). Note that the search functionality in PyPI and in pip search is broken, and will return links to obsolete versions of execsql that are no longer available. The direct link will take you to the latest available version.