Monday, November 23, 2015

execsql

execsql is a Python application that reads a text file of SQL commands and executes them against a database. The supported databases are:
  • PostgreSQL
  • MS-Access
  • MS-SQL Server
  • SQLite
  • MySQL or MariaDB
  • Firebird
  • ODBC DSN connections.
In addition to executing SQL statements, execsql also implements a number of metacommands that allow:
  • Import of data from text files and OpenDocument spreadsheets
  • Export of data to delimited text, HTML, JSON, LaTeX tables, and OpenDocument spreadsheets
  • Copying of data between databases, even of different DBMS types
  • Conditional execution of SQL code and metacommands based on data values or user input.

execsql allows variables to be defined and used to dynamically modify SQL code and metacommands. An INCLUDE metacommand can be used to modularize SQL scripts and facilitate code re-use. Variables can be used to parameterize included scripts. Conditional inclusion of scripts can be used to implement loops. Automatically incrementing counter variables can be used to control loops or generate unique values to be used in input or output.

execsql is fundamentally a command-line application that can be used to automate database operations, either by itself or as part of a toolchain that may include other steps to either pre-process data before loading, or post-process or analyze data that are extracted from a database. However, execsql also has the ability to present data to the user in dialog boxes, request several types of input from the user either on the terminal or in GUI dialogs, and display messages to the user either on the terminal or in a GUI console. These features allow flexible interactive applications to be created.

A guiding principle in the development of execsql is documentation of all data operations. The use of SQL scripts, rather than ad-hoc interactive operations in a GUI, is key to meeting that goal. In addition, execsql logs all usage information, including databases used, script files executed (including script file version information), variable assignments, user input, and errors. Chain-of-custody procedures are used in some disciplines to ensure traceability from collection through (typically) laboratory analysis. However, no formal chain of custody procedures ordinarily are applied to data during or after entry into a database. SQL script files and execsql log files can be used to produce a complete record of data operations so that traceability of data can be extended into the database environment.

execsql and its documentation are available from the Python Package Index (PyPI). Note that the search functionality in PyPI and in pip search is broken, and will return links to obsolete versions of execsql that are no longer available. The direct link will take you to the latest available version.

5 comments:

  1. I have been playing with this application for about a week now and I absolutely LOVE IT!! Thank you SO MUCH for taking the time to develop something so awesome!! There is just one thing I am wondering about. All of the databases to which I need to connect require usernames and passwords to be supplied and the scripts need to run under 100% automation. No user interaction at all. At first I though I could choose to specify nothing but the script files as command line params and then use decrypted passwords via CONNECT commands inside said scripts. Unfortunately this does not appear to work. I've gone through the documentation over and over again, but I can't figure out how to do what I need to do. Am I missing something?

    ReplyDelete
    Replies
    1. I'm glad to hear you're finding it useful. The approach you describe should work if you make an initial connection a SQLite database (which need never be used thereafter) and then use the CONNECT metacommand with passwords for the databases you really want to use. Except unless they're SQL Server, because it doesn't yet support the PASSWORD argument to the CONNECT metacommand...but that's easily remedied, if you need it.

      Delete
    2. Haha!! Yup, most of the databases are SQL Server. The others are RedShift, which works perfectly via a postgres connection by the way.

      Delete
    3. WOW!! Thanks!! This is fantastic!

      Delete