Python with SQLite
- TechTutor
- Oct 16, 2023
- 3 min read

SQLite is a C library that provides a lightweight disk-based database that doesn’t require a separate server process and allows accessing the database using a nonstandard variant of the SQL query language. Some applications can use SQLite for internal data storage. It’s also possible to prototype an application using SQLite and then port the code to a larger database such as PostgreSQL or Oracle.
The sqlite3 module was written by Gerhard Häring. It provides an SQL interface compliant with the DB-API 2.0 specification described by PEP 249, and requires SQLite 3.7.15 or newer.
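Before diving into concurrency, here is a minimal sketch of the DB-API style interface the sqlite3 module exposes. It uses an in-memory database; the `student` table matches the schema used in the examples below, and the sample row is illustrative.

```python
import sqlite3

# In-memory database: exists only for the lifetime of this connection
connection = sqlite3.connect(":memory:")
cursor = connection.cursor()

# Create a table and insert one row using a parameterized query
cursor.execute("CREATE TABLE student (studentname TEXT, age INTEGER)")
cursor.execute(
    "INSERT INTO student (studentname, age) VALUES (?, ?)",
    ("John Doe", 21),
)
connection.commit()

# Read the row back
row = cursor.execute("SELECT studentname, age FROM student").fetchone()
print(row)  # ('John Doe', 21)
connection.close()
```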
Parallel read and write in SQLite
In many cases, you need to read from and write to a SQLite database at the same time. For example, you might want to sync data in the background inside an open transaction while still letting the user view data in the UI. This article demonstrates how to run one writing thread and multiple reading threads at the same time. Concurrent writing from several threads, however, is not permitted.
Why can't we read when we're writing in a transaction?
The short answer is locking. The longer answer is the rollback journal mode. These are the mechanisms that ensure atomic commit and rollback in transactions. Before digging deeper, remember that every write to SQLite happens inside a transaction, whether implicit or explicit.
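In the default rollback-journal mode, an open write transaction blocks readers. Switching the database to write-ahead logging (WAL) lets readers proceed concurrently with a single writer, and each reader sees the last committed state. Here is a sketch of that behavior; the file path and `student` schema are illustrative (WAL requires a file-backed database, not `:memory:`):

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None gives us manual transaction control via BEGIN/COMMIT
writer = sqlite3.connect(path, isolation_level=None)
writer.execute("PRAGMA journal_mode=WAL")
writer.execute("CREATE TABLE student (studentname TEXT, age INTEGER)")

# Open a write transaction and leave it uncommitted
writer.execute("BEGIN")
writer.execute("INSERT INTO student VALUES ('John Doe', 21)")

# In rollback-journal mode this read could fail with "database is locked";
# under WAL it succeeds and sees the last committed state (no rows yet)
reader = sqlite3.connect(path)
rows = reader.execute("SELECT * FROM student").fetchall()
print(rows)  # []

writer.execute("COMMIT")

# A fresh read transaction now sees the committed row
rows_after = reader.execute("SELECT * FROM student").fetchall()
print(rows_after)  # [('John Doe', 21)]
```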
SQLite indeed has a global write lock, which means that concurrent write transactions are serialized, and only one write transaction can occur at a time. However, SQLite provides a way to achieve parallelism in write operations through various techniques and strategies. Here are some approaches you can consider for parallel insert operations in SQLite:
Multi-Threading: Although SQLite enforces a global write lock, you can use multi-threading to parallelize read and write operations. SQLite allows multiple threads to read from the database simultaneously. You can set up separate threads to handle different insert operations; each thread opens its own SQLite connection and performs inserts independently.
```python
import os
import sqlite3
import threading

def clear_screen():
    # 'cls' on Windows, 'clear' elsewhere
    os.system('cls' if os.name == 'nt' else 'clear')

# Establish a connection to the database
def establish_database_connection(connection_string):
    try:
        # A timeout makes a thread wait for the write lock instead of
        # failing immediately with "database is locked"
        connection = sqlite3.connect(connection_string, timeout=30)
        print(f"Connection established to database '{connection_string}'")
        return connection
    except sqlite3.Error as e:
        print(f"Database error: {e}")
        return None

# Execute an INSERT query and commit the transaction
def insert_data(connection_string, query, thread_id):
    connection = establish_database_connection(connection_string)
    if connection is None:
        return
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        connection.commit()
        print(f"Thread {thread_id} - Data inserted successfully")
    except sqlite3.Error as e:
        print(f"Thread {thread_id} - Error inserting data: {e}")
    finally:
        connection.close()

# Connection string
connection_string = "files/sqlite/my_database.db"

# Clear the screen
clear_screen()

# List of threads
threads = []

# Create and start threads
for i in range(100):
    # Query to insert data
    insert_query = f"INSERT INTO student (studentname, age) VALUES ('John Doe', {i})"
    thread = threading.Thread(target=insert_data, args=(connection_string, insert_query, i))
    threads.append(thread)
    thread.start()

# Wait for all threads to finish
for thread in threads:
    thread.join()
```
Multiprocessing: Instead of multi-threading, you can use Python's multiprocessing module to create separate processes, each with its own SQLite connection. Each process can perform insert operations independently. This approach takes advantage of multiple CPU cores, although the actual writes are still serialized by SQLite's write lock.
By wrapping the process creation and execution code inside the if __name__ == '__main__': block, you ensure that it only runs in the main module. Without this guard, platforms that spawn new interpreters for child processes (such as Windows) would re-execute the spawning code in every child and raise an error.
```python
import os
import multiprocessing
import sqlite3

def clear_screen():
    # 'cls' on Windows, 'clear' elsewhere
    os.system('cls' if os.name == 'nt' else 'clear')

# Function to insert data into the database within a process
def insert_data(connection_string, query, process_id):
    connection = None
    try:
        # A timeout makes a process wait for the write lock instead of
        # failing immediately with "database is locked"
        connection = sqlite3.connect(connection_string, timeout=30)
        cursor = connection.cursor()
        cursor.execute(query)
        connection.commit()
        print(f"Process {process_id} - Data inserted successfully")
    except sqlite3.Error as e:
        print(f"Process {process_id} - Error inserting data: {e}")
    finally:
        if connection is not None:
            connection.close()

if __name__ == '__main__':
    # Connection string
    connection_string = "files/sqlite/my_database.db"

    # Clear the screen
    clear_screen()

    # List of processes
    processes = []

    # Create and start processes
    for i in range(4):
        # Query to insert data
        insert_query = f"INSERT INTO student (studentname, age) VALUES ('John Doe', {i})"
        process = multiprocessing.Process(target=insert_data, args=(connection_string, insert_query, i))
        processes.append(process)
        process.start()

    # Wait for all processes to finish
    for process in processes:
        process.join()
```
Batch Insert: Another way to improve insert performance in SQLite is to use batch insert operations. Instead of inserting one row at a time, you can prepare and execute a single SQL statement that inserts multiple rows in a single transaction. This reduces the overhead of individual transactions.
```python
import os
import sqlite3

def clear_screen():
    # 'cls' on Windows, 'clear' elsewhere
    os.system('cls' if os.name == 'nt' else 'clear')

# Connection string
connection_string = "files/sqlite/my_database.db"

def batch_insert(connection_string, data):
    connection = sqlite3.connect(connection_string)
    cursor = connection.cursor()
    # executemany inserts all rows within a single transaction
    cursor.executemany("INSERT INTO student (studentname, age) VALUES (?, ?)", data)
    connection.commit()
    connection.close()

# Clear the screen
clear_screen()

# Each tuple is (studentname, age)
batch_data = [('Alice', 21), ('Bob', 22), ('Carol', 23)]
batch_insert(connection_string, batch_data)
```