Skip to content

Latest commit

 

History

History
469 lines (357 loc) · 12.6 KB

slides.asc

File metadata and controls

469 lines (357 loc) · 12.6 KB

SQLAlchemy ORM

What we’re going to cover

The SQLAlchemy ORM gives us a powerful API for describing database tables, expressing the relationships between them, and building queries.

In this class we will:

  • start with some predefined Models and create and inspect the resulting tables

  • add data to our tables by instanciating the models and understand SQLAlchemy’s session object

  • learn to make queries with our models that:

  • fetch records with simple conditions

  • fetch related records across relationships

We won’t:

  • Explain in detail relational database concepts

  • Explore the SQLAlchemy core

  • Investigate integration with other frameworks (eg flask-sqlalchemy).

  • Be able to answer every question. This is a big API!

Check out http://jjinux.blogspot.com/2014/04/pycon-notes-introduction-to-sqlalchemy.html - my friends notes on Mike Bayer’s PyCon tutorial with a link to the video.

Getting setup

Please clone https://github.com/simeonf/sqlalchemy-tutorial.git to get the class files and these slides.

Be sure you have Python 2.7 installed.

Installation

Install SQLAlchemy. In decreasing order of desirability:

Use a virtualenv
$ virtualenv sqlalchemy-class
$ source sqlalchemy-class/bin/activate
$ pip install sqlalchemy

If you don’t know what a virtualenv is:

Just use sudo
$ sudo pip install sqlalchemy

If you don’t know what a virtualenv is:

Just use sudo.
$ sudo pip install sqlalchemy
If you don’t have pip use easy_install
$ sudo easy_install sqlalchemy

If none of this works for you try installing Anaconda Python from https://store.continuum.io/cshop/anaconda/

It comes with SQLAlchemy preinstalled. Be sure to select Python 2.

Connecting to a database

Let’s get some necessary configuration boilerplate out of the way. To use sqlalchemy we need to create a database backend and start a session which we’ll use to manage transactions.

SQLAlchemy uses uri’s which embed the database type, database driver, and connection details like username, password, and database when connecting to a server.

SQLAlchemy suppports a variety of common relational databases (mysql, postgres, oracle, mssql, etc). For this class we’ll use sqlite3 which doesn’t require a separate server or python db driver installation.

Some necessary boilerplate:

config.py
link:config.py[role=include]

Using the engine

We can directly execute raw sql via the engine. You might be familiar with this sort of thing if you’ve ever used the DB-API directly.

>>> from config import *
>>> engine.execute("""CREATE TABLE foo (
                   id integer primary key autoincrement,
                   message text
                   );""")
>>> engine.execute("insert into foo (message) values('testing, testing, 123');")
>>> list(engine.execute("select * from foo"))
[(1, u'testing, testing, 123')]

Using the db client

We can use the sqlite3 command line client to inspect our database. You may be familiar with using the mysql or psql cli:

$ sqlite3 db.sqlite3
SQLite version 3.7.12 2012-04-03 19:43:07
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> .tables
foo
sqlite> .schema foo
CREATE TABLE foo (
  id integer primary key autoincrement,
  message text
);
sqlite> select * from foo;
1|testing, testing, 123

Do not want!

SQL is a great language for doing explorations of our data. Every developer who uses databases should know sql. Feel free to use it in the database CLI - but we don’t want sql in our Python code.

The sort of engine.execute() code is horrible to read and write - composing sql statements in Python is cumbersome and susceptible to all sorts of quoting bugs, sql syntax errors that don’t show up until you execute them…​

We can do better.

Model Basics

SQLAlchemy comes with a declarative layer. The declarative layer let’s us describe Python classes and SQLAlchemy maps them to databnase tables.

SQLAlchemy can generate the appropriate create table statements for our selected backend and create the tables for us.

Models inherit from a specially generated Base class which does some Python magic on our class definitions. The Base also serves as a ptable registry and can create the database tables. A sample model looks like:

config.py
link:models1.py[role=include]

We can use the models and base to create our tables.

>>> from models1 import User, Base, engine
>>> User.__table__  # the generated Table instance
Table('user', MetaData(bind=None), Column('id', Integer(), table=<user>, primary_key=True, nullable=False), Column('email', String(length=255), table=<user>), Column('is_active', Boolean(), table=<user>, default=ColumnDefault(True)), schema=None)
>>> Base.metadata.create_all(engine)
... snip ...
CREATE TABLE user (
        id INTEGER NOT NULL,
        email VARCHAR(255),
        is_active BOOLEAN,
        PRIMARY KEY (id),
        CHECK (is_active IN (0, 1))
)

Exercise

  1. Use the python shell to create your models.

  2. Check out your user table definition in the sqlite3 shell.

Populating our Table

We can add items to our table by instanciating our class, setting its attributes, and telling the session to add the object. Don’t forget to commit!

>>> from models1 import User, Session
>>> session = Session()  # The session object is used to start/commit transactions
>>> bob = User(email='bob@aol.com')  # We can pass keyword args to the constructor
>>> bob.email
'bob@aol.com'
>>> bob.id  # No PK. This object hasn't been saved.
>>> bob.is_active = True  # we can assign to object attributes
>>> session.add(bob)  # Save the object
>>> bob.id   # And now it has a PK
1
>>> session.commit() # don't quit without committing!

Exercise

  1. Interactively create several user objects.

  2. Use sqlite3 to see that the objects exist as rows in the user table.

query()

SQLAlchemy’s declarative objects can also be used to make queries. For now let’s just look at all() and filter_by() queries.

Queries are formed by calling the .query() method of the session object. Some libraries add the .query methods to your models a la Django’s Model.objects pattern - we’ll see how to do that later.

We can pass a model to the .query method and get back a Query instance. This object represents an SQL query. The query doesn’t actually run until you iterate over it or call a terminating method.

>>> from models1 import User, Session
>>> session = Session()
>>> q = session.query(User)
>>> q
<sqlalchemy.orm.query.Query object at 0x10b426350>
>>> list(q)
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active
       AS user_is_active
       FROM user
[<models1.User object at 0x10b421b90>]

all and first

The Query object has a "fluent" style interface where many methods return new Query objects.

However .all() and .first() do not return new query objects. They return lists of instances and instances respectively. There are other such convenience methods (try count()!)

>>> q = session.query(User)
>>> q.all() # like "select * from user"
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active FROM user
[<models1.User object at 0x10b421b90>]
>>> q.first()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user LIMIT ? OFFSET ?
INFO: (1, 0)
<models1.User object at 0x10b421b90>

filter_by

filter_by() does return a new Query object. The filter_by method takes named arguments corresponding to column names and values to search for.

>>> q = session.query(User)
>>> q.filter_by(id=1)
<sqlalchemy.orm.query.Query object at 0x10b429b50>
>>> q.filter_by(email='bob@hotmail.com')
<sqlalchemy.orm.query.Query object at 0x10b429ad0>
>>> q.filter_by(email='bob@hotmail.com').all() # You may have to call a terminating method
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user
      WHERE user.email = ?
INFO: ('bob@hotmail.com',)
[]
>>> q.filter_by(no_such_column=1)  # SQL errors are caught before they're run
Traceback (most recent call last):
  ...snip...
InvalidRequestError: Entity '<class 'models1.User'>' has no property 'no_such_column'

Exercise

  1. At the console lists of our User instances aren’t very helpful. Add a repr method to the User class to fix this.

  2. Interactively get the user with an id of 1.

  3. Query objects always evaluate to True. The following pattern isn’t a good idea:

       if session.query(User):
           print "There are users"

    How can you tell if your query matched any records?

filter

But we want to make more comlicated queries!

To do that we need the filter() method. filter takes positional arguments that are sql expressions. To create these expressions we can use Python operators with model fields.

>>> User.id > 1
<sqlalchemy.sql.elements.BinaryExpression object at 0x103de2f90>
>>> session.query(User).filter(User.id > 1).all()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
         FROM user
         WHERE user.id > ?
INFO: (1,)

Sometimes the thing we want to do doesn’t have a Python operator. The fields on our models have methods that also create sql expressions. To do a like query we might try

>>> session.query(User).filter(User.email.like('%bob%')).all()
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user
      WHERE user.email LIKE ?
INFO: ('%bob%',)

ORDER BY

We can use the order_by method to modify a query’s ORDER BY clause. order_by() can take either string column names (placed into the sql statement as is) or more safely can take a column expression or method.

>>> list(session.query(User).order_by('email'))
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user ORDER BY email
[<models1.User object at 0x103eb6310>]
>>> list(session.query(User).order_by(User.email.desc()))
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user ORDER BY user.email DESC
[<models1.User object at 0x103eb6310>]

LIMIT

We can also create limit queries by slicing. Note that this is not retrieving a list of instances and then slicing the list!

>>> session.query(User)[:10]
INFO: SELECT user.id AS user_id, user.email AS user_email, user.is_active AS user_is_active
      FROM user
      LIMIT ? OFFSET ?
INFO: (10, 0)

Exercise

  1. How do you retrieve the first 10 users whose email ends in 'gmail.com'? How many are there?

  2. How do you retrieve users whose email ends in 'gmail.com' and have an id < 100?

  3. How do you retrieve users whose id is < 10 or greater than 100? SEE: sqlalchemy.or_

Relationships

But relational databases are all about relationships! Our models can define foreign key columns and SQLAlchemy is also happy to provide automatic interfaces to make it easy to access related objects.

config.py
link:models2.py[role=include]

We can create relationships by setting the user_id field:

>>> from models2 import User, Page, Session, engine, Base
>>> session = Session()
>>> p = Page(url="/about")
>>> u = session.query(User).filter_by(id=1).first()
>>> p.user_id = u.id
>>> session.add(p)
>>> session.commit()
>>> p.user  # And the user object is directly available
<User instance: 1>

Or we can set the user object directly:

>>> u = session.query(User).filter_by(id=2).first()
>>> p.user = u
>>> session.add(p)
>>> session.commit()
>>> p.user
<User instance: 2>

And we can access the relationship the other way thanks to our backref!

>>> u.pages
INFO: SELECT page.id AS page_id, page.user_id AS page_user_id, page.url AS page_url, page.content AS page_content
       FROM page
       WHERE page.user_id = ?
INFO: (2,)
[<Page instance: 1>]

Note that this gets us the "many" side of our relationship. There could be many pages with user_id field pointing to the same user!

Exercise

  1. Use model2.py to create your new table. Add some pages by instanciating Page and User objects.

  2. Print all the users and the count of pages they own.

EXTRA: This isn’t the most efficient way to do it! Try:

from sqlalchemy import func

and pass func.count(Page.id) to .query along with the User class.