how to combine Excel, Python and SQL using open source tools

how to combine Excel, Python and SQL using open source tools

Introduction

Joel Spolsky once said in his famous You Suck at Excel presentation that there are hundreds of commercial products that could be used instead of an Excel spreadsheet.

There is a lot of truth in this joke. Ennobling the intuitive UI, formulas and VBA, Excel enables users to solve a wide range of tasks themselves, eliminating the need to ask developers for help every time or to implement a new product in the company. Excel is used everywhere from small retailers to the Fortune 500 and CERN.

Excel’s user interface turned out to be so successful – none of the competitors moved away from the usual tables. For an Excel user, the Google Spreadsheet interface will be familiar and intuitive. But in terms of its functionality, Excel still surpasses most of its competitors.

How can two powerful open source tools combine a familiar user interface, the reliability and power of SQL, the flexibility of Python, and the teamwork of Google Spreadsheet? Let’s consider this with a simple example.

Grist

https://github.com/gristlabs/grist-core

It is an analogue of Google Spreadsheet that you can host on your own infrastructure. Grist’s interface is similar to Excel, but uses Python instead of VBA, and instead of Google infrastructure, each Grist document is SQLite.

You can enter Python formulas in each cell:

Formulas

https://support.getgrist.com/formulas

Essentially, the entire Grist document is a Python script that stores data in an SQLite database:

Python

This solution has a huge plus – your data always remains with you. SQLite is very easy to set up for backups – unlike “serious” databases, SQLite does not require much maintenance, but at the same time it is almost in no way inferior to the functionality of MySQL or Postgres.

Let’s not analyze in detail all the capabilities of Grist, the development of this product went much further than Google Spreadsheet and Excel. Essentially, Grist can be used as a mini CRM in a small organization. And this solution will not be much inferior to expensive (in terms of maintenance) solutions like Salesforce.

In terms of administration, Grist is very undemanding, the service can be deployed in a container as follows:

mkdir persist
docker run -p 8484:8484 -v $$PWD/persist:/persist -it gristlabs/grist

Data in SQLite will be stored in the persist directory – this is necessary so that they can survive a service restart.

Metabase

Girst is great for a small team of specialists, but what about a business analyst or data engineer who needs to analyze data from dozens of different sources and needs to create complex reports and data showcases?

Another open source tool, Metabase, is perfect for this.

https://www.metabase.com/data_sources

Metabase

Metabase has monitoring and mailing capabilities. Metabase graphs can be embedded on third-party resources. It is possible to configure interactive data showcases and heuristics (Query Builder) with which the user can ask questions that are automatically converted into SQL queries.

Charts

Metabase settings

You will need Docker compose to install. You need to create a docker-compose.yml file with the following content:

version: '3.9'
services:
  metabase:
    image: metabase/metabase:latest
    container_name: metabase
    hostname: metabase
    volumes:
    - /dev/urandom:/dev/random:ro
    - /srv/grist:/srv/grist/persist:ro
    ports:
      - 3000:3000
    environment:
      MB_DB_TYPE: postgres
      MB_DB_DBNAME: metabase
      MB_DB_PORT: 5432
      MB_DB_USER_FILE: /run/secrets/db_user
      MB_DB_PASS_FILE: /run/secrets/db_password
      MB_DB_HOST: postgres
    networks:
      - metanet1
    secrets:
      - db_password
      - db_user
    healthcheck:
      test: curl --fail -I http://localhost:3000/api/health || exit 1
      interval: 15s
      timeout: 5s
      retries: 5
  postgres:
    image: postgres:latest
    container_name: postgres
    hostname: postgres
    environment:
      POSTGRES_USER_FILE: /run/secrets/db_user
      POSTGRES_DB: metabase
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
    networks:
      - metanet1
    secrets:
      - db_password
      - db_user
networks:
  metanet1:
    driver: bridge
secrets:
   db_password:
     file: db_password.txt
   db_user:
     file: db_user.txt

Then create two files in the current directory:

  • db_user.txt

  • db_password.txt

In these files, you need to record the user name and password for the Metabase database.
Let it be postgres and postgrespassword.

The next config line is very important: we mount the directory where Grist stores its documents into the Metabase container. This path will then be used by Metabase to connect to the database:

    volumes:
    - /dev/urandom:/dev/random:ro
    - /srv/grist/persist:/srv/grist:ro

In the Metabase settings, in the “Admin Settings” section, you need to add a new SQLite database, specifying the path to the mounted Grist directory /srv/grist/doc/НАЗВАНИЕ_ДОКУМЕНТА_GRIST. In this section, you can set parameters for automatic scanning of the scheme and tables.


Related posts