Skip to content

cnstlungu/pd2mssql

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pd2mssql

Pandas DataFrame to SQL Server tables

What is pd2mssql

pd2mssql is an utility enabling fast import of Pandas DataFrames into SQL Server tables.

Rationale

The built-in solution, pandas to_sql is slow in its current implementation, taking too long even for a modest dataframe.

Discussions with further details are available here.

How it works

In a nutshell, pd2mssql exports the dataframe into a temporary file and bulk inserts into SQL Server using the builtin SQL Server Bulk insert functionality.

Requirements

Given the fact that the package is dependent on SQL Server Bulk insert, the following conditions have to be met:

  • The account connecting to SQL Server should have the bulk permission on that particular server.
  • The SQL Server instance/user have to have access to CSV destination

See requirements for package requirements.

Quick start

Installation

cd pd2mssql

pip install .

Usage

import pd2mssql

#Set directory for temporary file. Should be accessible to SQL user.
pd2mssql.DIR = 'C://Users/MSSQLServer'

base = pd2mssql.BaseEngine('yourserver','yourdatabase','youruser','yourpassword')

engine = base.engine

#Creates the table. 
pd2mssql.create_table(dataframe, table_name, engine)

#Bulk inserts into the table
pd2mssql.bulk_insert(dataframe, table_name, engine)

Further reading

Further information is available in the documentation.

About

Pandas DataFrame to SQL Server tables

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published