The SQLAlchemy ORM gives us a powerful API for describing database tables, expressing the relationships between them, and building queries.
In this class we will:
-
start with some predefined Models and create and inspect the resulting tables
-
add data to our tables by instanciating the models and understand SQLAlchemy’s
session
object -
learn to make queries with our models that:
-
fetch records with simple conditions
-
fetch related records across relationships
We won’t:
-
Explain in detail relational database concepts
-
Explore the SQLAlchemy core
-
Investigate integration with other frameworks (eg flask-sqlalchemy).
-
Be able to answer every question. This is a big API!
Check out http://jjinux.blogspot.com/2014/04/pycon-notes-introduction-to-sqlalchemy.html - my friends notes on Mike Bayer’s PyCon tutorial with a link to the video.
Please clone https://github.com/simeonf/sqlalchemy-tutorial.git to get the class files and these slides.
Be sure you have Python 2.7 installed.
Install SQLAlchemy. In decreasing order of desirability:
$ virtualenv sqlalchemy-class
$ source sqlalchemy-class/bin/activate
$ pip install sqlalchemy
If you don’t know what a virtualenv is:
$ sudo pip install sqlalchemy
If you don’t know what a virtualenv is:
$ sudo pip install sqlalchemy
$ sudo easy_install sqlalchemy
If none of this works for you try installing Anaconda Python from https://store.continuum.io/cshop/anaconda/
It comes with SQLAlchemy preinstalled. Be sure to select Python 2.
Let’s get some necessary configuration boilerplate out of the way. To use sqlalchemy we need to create a database backend and start a session which we’ll use to manage transactions.
SQLAlchemy uses uri’s which embed the database type, database driver, and connection details like username, password, and database when connecting to a server.
SQLAlchemy suppports a variety of common relational databases (mysql,
postgres, oracle, mssql, etc). For this class we’ll use sqlite3
which doesn’t require a separate server or python db driver
installation.
Some necessary boilerplate:
link:config.py[role=include]
We can directly execute raw sql via the engine. You might be familiar with this sort of thing if you’ve ever used the DB-API directly.
>>> from config import * >>> engine.execute("""CREATE TABLE foo ( id integer primary key autoincrement, message text );""") >>> engine.execute("insert into foo (message) values('testing, testing, 123');") >>> list(engine.execute("select * from foo")) [(1, u'testing, testing, 123')]
We can use the sqlite3
command line client to inspect our
database. You may be familiar with using the mysql
or psql
cli:
$ sqlite3 db.sqlite3
SQLite version 3.7.12 2012-04-03 19:43:07
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
foo
sqlite> .schema foo
CREATE TABLE foo (
id integer primary key autoincrement,
message text
);
sqlite> select * from foo;
1|testing, testing, 123
SQL is a great language for doing explorations of our data. Every developer who uses databases should know sql. Feel free to use it in the database CLI - but we don’t want sql in our Python code.
The sort of engine.execute()
code is horrible to read and write -
composing sql statements in Python is cumbersome and susceptible to
all sorts of quoting bugs, sql syntax errors that don’t show up until
you execute them…
We can do better.
SQLAlchemy comes with a declarative layer. The declarative layer let’s us describe Python classes and SQLAlchemy maps them to databnase tables.
SQLAlchemy can generate the appropriate create table
statements for
our selected backend and create the tables for us.
Models inherit from a specially generated Base
class which does some
Python magic on our class definitions. The Base
also serves as a
ptable registry and can create the database tables. A sample model
looks like:
link:models1.py[role=include]
We can use the models and base to create our tables.
>>> from models1 import User, Base, engine
>>> User.__table__ # the generated Table instance
Table('user', MetaData(bind=None), Column('id', Integer(), table=<user>, primary_key=True, nullable=False), Column('email', String(length=255), table=<user>), Column('is_active', Boolean(), table=<user>, default=ColumnDefault(True)), schema=None)
>>> Base.metadata.create_all(engine)
... snip ...
CREATE TABLE user (
id INTEGER NOT NULL,
email VARCHAR(255),
is_active BOOLEAN,
PRIMARY KEY (id),
CHECK (is_active IN (0, 1))
)
-
Use the python shell to create your models.
-
Check out your
user
table definition in the sqlite3 shell.
We can add items to our table by instanciating our class, setting its attributes, and telling the session to add the object. Don’t forget to commit!
>>> from models1 import User, Session
>>> session = Session() # The session object is used to start/commit transactions
>>> bob = User(email='bob@aol.com') # We can pass keyword args to the constructor
>>> bob.email
'bob@aol.com'
>>> bob.id # No PK. This object hasn't been saved.
>>> bob.is_active = True # we can assign to object attributes
>>> session.add(bob) # Save the object
>>> bob.id # And now it has a PK
1
>>> session.commit() # don't quit without committing!
-
Interactively create several user objects.
-
Use
sqlite3
to see that the objects exist as rows in theuser
table.
SQLAlchemy’s declarative objects can also be used to make queries. For
now let’s just look at all()
and filter_by()
queries.
Queries are formed by calling the .query()
method of the session
object. Some libraries add the .query
methods to your models a la
Django’s Model.objects
pattern - we’ll see how to do that later.
We can pass a model to the .query
method and get back a Query
instance. This object represents an SQL query. The query doesn’t
actually run until you iterate over it or call a terminating method.
>>> from models1 import User, Session
>>> session = Session()
>>> q = session.query(User)
>>> q
<sqlalchemy.orm.query.Query object at 0x10b426350>
>>> list(q)
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active
AS user_is_active
FROM user
[<models1.User object at 0x10b421b90>]
The Query
object has a "fluent" style interface where many methods return
new Query
objects.
However .all()
and .first()
do not return new query objects. They
return lists of instances and instances respectively. There are other
such convenience methods (try count()
!)
>>> q = session.query(User)
>>> q.all() # like "select * from user"
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active FROM user
[<models1.User object at 0x10b421b90>]
>>> q.first()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user LIMIT ? OFFSET ?
INFO: (1, 0)
<models1.User object at 0x10b421b90>
filter_by()
does return a new Query
object. The filter_by
method
takes named arguments corresponding to column names and values to
search for.
>>> q = session.query(User)
>>> q.filter_by(id=1)
<sqlalchemy.orm.query.Query object at 0x10b429b50>
>>> q.filter_by(email='bob@hotmail.com')
<sqlalchemy.orm.query.Query object at 0x10b429ad0>
>>> q.filter_by(email='bob@hotmail.com').all() # You may have to call a terminating method
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user
WHERE user.email = ?
INFO: ('bob@hotmail.com',)
[]
>>> q.filter_by(no_such_column=1) # SQL errors are caught before they're run
Traceback (most recent call last):
...snip...
InvalidRequestError: Entity '<class 'models1.User'>' has no property 'no_such_column'
-
At the console lists of our User instances aren’t very helpful. Add a
repr
method to theUser
class to fix this. -
Interactively get the user with an id of 1.
-
Query objects always evaluate to True. The following pattern isn’t a good idea:
if session.query(User): print "There are users"
How can you tell if your query matched any records?
But we want to make more comlicated queries!
To do that we need the filter()
method. filter
takes positional
arguments that are sql expressions. To create these expressions we can
use Python operators with model fields.
>>> User.id > 1
<sqlalchemy.sql.elements.BinaryExpression object at 0x103de2f90>
>>> session.query(User).filter(User.id > 1).all()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user
WHERE user.id > ?
INFO: (1,)
Sometimes the thing we want to do doesn’t have a Python operator. The
fields on our models have methods that also create sql expressions. To
do a like
query we might try
>>> session.query(User).filter(User.email.like('%bob%')).all()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user
WHERE user.email LIKE ?
INFO: ('%bob%',)
We can use the order_by
method to modify a query’s ORDER BY
clause. order_by()
can take either string column names (placed into
the sql statement as is) or more safely can take a column expression or method.
>>> list(session.query(User).order_by('email'))
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user ORDER BY email
[<models1.User object at 0x103eb6310>]
>>> list(session.query(User).order_by(User.email.desc()))
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
FROM user ORDER BY user.email DESC
[<models1.User object at 0x103eb6310>]
We can also create limit queries by slicing. Note that this is not retrieving a list of instances and then slicing the list!
>>> session.query(User)[:10] INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active FROM user LIMIT ? OFFSET ? INFO: (10, 0)
-
How do you retrieve the first 10 users whose email ends in 'gmail.com'? How many are there?
-
How do you retrieve users whose email ends in 'gmail.com' and have an id < 100?
-
How do you retrieve users whose id is < 10 or greater than 100? SEE:
sqlalchemy.or_
But relational databases are all about relationships! Our models can define foreign key columns and SQLAlchemy is also happy to provide automatic interfaces to make it easy to access related objects.
link:models2.py[role=include]
We can create relationships by setting the user_id
field:
>>> from models2 import User, Page, Session, engine, Base
>>> session = Session()
>>> p = Page(url="/about")
>>> u = session.query(User).filter_by(id=1).first()
>>> p.user_id = u.id
>>> session.add(p)
>>> session.commit()
>>> p.user # And the user object is directly available
<User instance: 1>
Or we can set the user
object directly:
>>> u = session.query(User).filter_by(id=2).first() >>> p.user = u >>> session.add(p) >>> session.commit() >>> p.user <User instance: 2>
And we can access the relationship the other way thanks to our backref!
>>> u.pages INFO: SELECT page.id AS page_id, page.user_id AS page_user_id, page.url AS page_url, page.content AS page_content FROM page WHERE page.user_id = ? INFO: (2,) [<Page instance: 1>]
Note that this gets us the "many" side of our relationship. There could be many pages with user_id field pointing to the same user!
-
Use model2.py to create your new table. Add some pages by instanciating
Page
andUser
objects. -
Print all the users and the count of pages they own.
EXTRA: This isn’t the most efficient way to do it! Try:
from sqlalchemy import func
and pass func.count(Page.id)
to .query
along with the User
class.