This repository has been archived by the owner on Dec 13, 2023. It is now read-only.

Merge pull request #84 from fastly/zma/add_journal
Initial commit for build journal. We will use tools/saftw.py to gener…
zmallen committed Feb 14, 2017
2 parents 057af7f + 9cfaaf4 commit 7db7c22
Showing 12 changed files with 355 additions and 44 deletions.
12 changes: 12 additions & 0 deletions MANIFEST
@@ -0,0 +1,12 @@
# file GENERATED by distutils, do NOT edit
setup.cfg
setup.py
ftw/__init__.py
ftw/errors.py
ftw/http.py
ftw/logchecker.py
ftw/ruleset.py
ftw/testrunner.py
ftw/util.py
test/test_default.py
test/test_modsecurityv2.py
4 changes: 2 additions & 2 deletions README.md
@@ -24,8 +24,8 @@ Goals / Use cases include:
## Provisioning Apache+Modsecurity+OWASP CRS
If you require an environment for testing WAF rules, there has been one created with Apache, Modsecurity and version 3.0.0 of the OWASP core ruleset. This can be deployed by:

-* Checking out the repository: ``git clone https://github.com/fastly/waf_testbed.git```
-* Typeing ```vagrant up```
+* Checking out the repository: ``git clone https://github.com/fastly/waf_testbed.git``
+* Typing ```vagrant up```

## Running Tests while overriding destination address in the yaml files to custom domain
* *start your test web server*
106 changes: 106 additions & 0 deletions docs/Journaling.md
@@ -0,0 +1,106 @@
===============
Journaling
===============

FTW supports creating journal entries for your HTTP tests. The idea stems from the need to decouple sending attacks from testing the responses. The following use cases illustrate this:

1. A pentester needs to issue attacks against a WAF but does not have access to the logs at the time of the test or series of attacks. A journal of attack requests and responses lets the pentester correlate a database of FTW requests and responses with customer logs at a later time.

2. A security engineer integrating FTW into their WAF environment does not want to check each FTW attack/response pair against a log as it happens. This is especially painful when logs are sent to a network service and the tool _beats_ the log service, checking for a log while it is still being sent or indexed. This is not ideal because we cannot know ahead of time how long to wait before querying the log service for the log. With journaling, the security engineer can fire off the attacks and then batch check the logs at a later date, querying a window of time without having to worry about network latency.

Workflow
==================
The workflow is twofold. Run the FTW tool `build_journal.py` against a service with a WAF in front of it and collect response data. Once all of the response data is retrieved, run FTW as you would in any other integration scenario, but write a plugin that opens the sqlite database to retrieve logs instead of a file or a network API.

Usage - Build the Journal
==================

1. `git clone git@github.com:fastly/ftw.git`
2. `virtualenv ftwenv`
3. `source ./ftwenv/bin/activate`
4. `pip install -r requirements.txt`
5. `./tools/build_journal.py --ruledir=dir`
    * This will produce `journal.sqlite`
    * Check out the options in `build_journal.py` for specifying journal files and table names

Once these steps are complete, you will have a `sqlite` file that you can explore and query by rule-id, time etc.
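
As a rough illustration of exploring the journal, the sketch below builds a throwaway in-memory database and queries it by test and stage. The table name `ftw`, the column names `rule_id`, `time_start`, `time_end`, `response` and `status`, and the single made-up row are all assumptions for this example (only `test_id` and `stage` appear in the query the testrunner uses). Check the schema that `build_journal.py` actually creates before relying on any of them.

```python
import datetime
import sqlite3

# Hypothetical schema: column names other than test_id and stage are assumed.
SCHEMA = """
CREATE TABLE ftw (
    rule_id TEXT, test_id TEXT, time_start TEXT, time_end TEXT,
    response TEXT, status INTEGER, stage INTEGER
)
"""

def make_example_journal():
    """Create an in-memory journal holding one made-up entry."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    start = datetime.datetime(2017, 2, 14, 12, 0, 0)
    end = start + datetime.timedelta(seconds=2)
    conn.execute(
        "INSERT INTO ftw VALUES (?, ?, ?, ?, ?, ?, ?)",
        ("example.yaml", "example-test-1", start.isoformat(), end.isoformat(),
         "HTTP/1.1 403 Forbidden", 403, 0),
    )
    conn.commit()
    return conn

def entries_for_stage(conn, test_id, stage):
    """Return (time_start, time_end, status) rows for one stage of one test."""
    cur = conn.execute(
        "SELECT time_start, time_end, status FROM ftw "
        "WHERE stage = ? AND test_id = ?",
        (stage, test_id),
    )
    return cur.fetchall()

conn = make_example_journal()
rows = entries_for_stage(conn, "example-test-1", 0)
```

Against a real journal you would replace `":memory:"` with the path to `journal.sqlite` and skip the table creation and insert.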


Usage - Using the Journal
==================

Because FTW was built with custom integrations in mind, testers can follow steps similar to those in Step 4 of `ExtendingFTW.md`.

A new API in the testrunner accepts a journal file to run FTW against. The testrunner still needs the `logchecker_obj` so it can call `get_logs()`, since it correlates the sqlite output with log output. Implement a logchecker just like the ones outlined in `ExtendingFTW.md`, and FTW will handle retrieving the correct entries from sqlite for you.

The following example is based on `SpiderLabs/OWASP-CRS-regressions`:

```python
from ftw import ruleset, logchecker, testrunner
import pytest
import sys
import re
import os
import ConfigParser

def test_crs(ruleset, test, logchecker_obj, with_journal, tablename):
    runner = testrunner.TestRunner()
    for stage in test.stages:
        runner.run_stage_with_journal(test.ruleset_meta['name'], test, with_journal, tablename, logchecker_obj)

class FooLogChecker(logchecker.LogChecker):

    def reverse_readline(self, filename):
        # Yield the lines of the file from last to first
        with open(filename) as f:
            f.seek(0, os.SEEK_END)
            position = f.tell()
            line = ''
            while position >= 0:
                f.seek(position)
                next_char = f.read(1)
                if next_char == "\n":
                    yield line[::-1]
                    line = ''
                else:
                    line += next_char
                position -= 1
            yield line[::-1]

    def get_logs(self):
        import datetime
        config = ConfigParser.ConfigParser()
        config.read("settings.ini")
        log_location = config.get('settings', 'log_location')
        our_logs = []
        # Matches a bracketed timestamp such as [Tue Feb 14 12:00:00.123456 2017]
        pattern = re.compile(r"\[([A-Z][a-z]{2} [A-Z][a-z]{2} \d{1,2} \d{1,2}\:\d{1,2}\:\d{1,2}\.\d+? \d{4})\]")
        for lline in self.reverse_readline(log_location):
            # Extract dates from each line
            match = re.match(pattern,lline)
            if match:
                log_date = match.group(1)
                # Convert our date
                log_date = datetime.datetime.strptime(log_date, "%a %b %d %H:%M:%S.%f %Y")
                ftw_start = self.start
                ftw_end = self.end
                # If we have a log date in range
                if log_date <= ftw_end and log_date >= ftw_start:
                    our_logs.append(lline)
                # If our log is from before FTW started stop
                if(log_date < ftw_start):
                    break
        return our_logs

@pytest.fixture
def logchecker_obj():
    return FooLogChecker()
```

Some notes here:
* `FooLogChecker` inherits from `logchecker.LogChecker` so FTW knows it can call the `get_logs()` method
* We define a function decorated with `@pytest.fixture` so a `logchecker_obj` is passed in when `test_crs` is called
* The `test_crs()` function looks similar to most FTW integrations, except it takes two extra fixtures: `with_journal` and `tablename`
* When running `py.test`, pass in `--with-journal=/path/to/journal` and `--tablename=name` so they reach the testrunner correctly. This ensures FTW queries the correct journal file and table for the FTW response data
* Since each stage must be tested and queried, we pass in the `test` fixture and run through each stage with `for stage in test.stages`
* `runner.run_stage_with_journal` requires the name of the test, the test object, the with_journal path, tablename and the corresponding logchecker_obj

Once you adhere to the new API call for the testrunner, that should be it! FTW will query the sqlite table for the matching test, stage and times, then call your `get_logs()` with that time window so the results can be checked against your log file.
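
To make that hand-off concrete, here is a minimal, self-contained sketch of the time-window idea: the journal supplies a start and end time for a stage, the checker stores them, and `get_logs()` returns only the lines stamped inside that window. `WindowLogChecker` is a hypothetical stand-in written for this example, not FTW's `logchecker.LogChecker`, and the ISO timestamp prefix on each made-up log line is likewise an assumption.

```python
import datetime

class WindowLogChecker(object):
    """Hypothetical stand-in for a log checker; not part of FTW."""

    def __init__(self, lines):
        self.lines = lines
        self.start = None
        self.end = None

    def set_times(self, start, end):
        # Record the window taken from a journal entry.
        self.start = start
        self.end = end

    def get_logs(self):
        # Keep only lines whose leading timestamp falls inside the window.
        selected = []
        for line in self.lines:
            stamp = datetime.datetime.strptime(line.split(" ", 1)[0],
                                               "%Y-%m-%dT%H:%M:%S")
            if self.start <= stamp <= self.end:
                selected.append(line)
        return selected

# Made-up log lines for illustration only.
log_lines = [
    "2017-02-14T11:59:59 unrelated request",
    "2017-02-14T12:00:01 ModSecurity: Access denied",
    "2017-02-14T12:00:05 unrelated request",
]
checker = WindowLogChecker(log_lines)
checker.set_times(datetime.datetime(2017, 2, 14, 12, 0, 0),
                  datetime.datetime(2017, 2, 14, 12, 0, 2))
matching = checker.get_logs()
```

Because the window comes from the journal rather than from a live request, the same check can be repeated long after the attacks were sent.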
45 changes: 21 additions & 24 deletions ftw/pytest_plugin.py
@@ -3,27 +3,6 @@
 import util
 import os

-def get_rulesets(ruledir, recurse):
-    """
-    List of ruleset objects extracted from the yaml directory
-    """
-    if os.path.isdir(ruledir) and recurse:
-        yaml_files = []
-        for root, dirs, files in os.walk(ruledir):
-            for name in files:
-                filename, file_extension = os.path.splitext(name)
-                if file_extension == '.yaml':
-                    yaml_files.append(os.path.join(root, name))
-    if os.path.isdir(ruledir) and not recurse:
-        yaml_files = util.get_files(ruledir, 'yaml')
-    elif os.path.isfile(ruledir):
-        yaml_files = [ruledir]
-    extracted_files = util.extract_yaml(yaml_files)
-    rulesets = []
-    for extracted_yaml in extracted_files:
-        rulesets.append(ruleset.Ruleset(extracted_yaml))
-    return rulesets
-
 def get_testdata(rulesets):
     """
     In order to do test-level parametrization (is this a word?), we have to
@@ -65,6 +44,20 @@ def http_serv_obj():
     """
     return HTTPServer(('localhost', 80), SimpleHTTPRequestHandler)

+@pytest.fixture
+def with_journal(request):
+    """
+    Return full path of the testing journal
+    """
+    return request.config.getoption('--with-journal')
+
+@pytest.fixture
+def tablename(request):
+    """
+    Set table name for journaling
+    """
+    return request.config.getoption('--tablename')
+
 def pytest_addoption(parser):
     """
     Adds command line options to py.test
@@ -77,6 +70,10 @@ def pytest_addoption(parser):
                      help='fully qualified path to one rule')
     parser.addoption('--ruledir_recurse', action='store', default=None,
                      help='walk the directory structure finding YAML files')
+    parser.addoption('--with-journal', action='store', default=None,
+                     help='pass in a journal database file to test')
+    parser.addoption('--tablename', action='store', default=None,
+                     help='pass in a tablename to parse journal results')

 def pytest_generate_tests(metafunc):
     """
@@ -87,11 +84,11 @@ def pytest_generate_tests(metafunc):
     # Check if we have any arguments by creating a list of supplied args we want
     if [i for i in options if i in args and args[i] != None] :
         if metafunc.config.option.ruledir:
-            rulesets = get_rulesets(metafunc.config.option.ruledir, False)
+            rulesets = util.get_rulesets(metafunc.config.option.ruledir, False)
         if metafunc.config.option.ruledir_recurse:
-            rulesets = get_rulesets(metafunc.config.option.ruledir_recurse, True)
+            rulesets = util.get_rulesets(metafunc.config.option.ruledir_recurse, True)
         if metafunc.config.option.rule:
-            rulesets = get_rulesets(metafunc.config.option.rule, False)
+            rulesets = util.get_rulesets(metafunc.config.option.rule, False)
         if 'ruleset' in metafunc.fixturenames and 'test' in metafunc.fixturenames:
             metafunc.parametrize('ruleset,test', get_testdata(rulesets),
                                  ids=test_id)
97 changes: 97 additions & 0 deletions ftw/testrunner.py
@@ -1,10 +1,12 @@
 import datetime
+from dateutil import parser
 import errors
 import http
 import pytest
 import ruleset
 import util
 import re
+import sqlite3

 class TestRunner(object):
     """
@@ -54,6 +56,101 @@ def test_response(self, response_object, regex):
         else:
             assert False

+    def test_response_str(self, response, regex):
+        """
+        Checks if the response response contains a regex specified in the
+        output stage. It will assert that the regex is present.
+        """
+        if regex.search(response):
+            assert True
+        else:
+            assert False
+
+    def query_for_stage_results(self, tablename):
+        """
+        Construct query for sqlite database for a specific stage run from a journal
+        Possible SQL injection here, but since its sqlite and if someone had control of the python script
+        and the sqlite database, they can just open the database/modify it without using our program
+        """
+        q = 'SELECT * FROM %s WHERE stage = ? AND test_id = ?' % tablename
+        return q
+
+    def run_stage_with_journal(self, rule_id, test, journal_file, tablename, logger_obj):
+        """
+        Compare entries and responses in a journal file with a logger object
+        This will follow similar logic as run_stage, where a logger_obj.get_logs()
+        MUST be implemented by the user so times can be retrieved and compared
+        against the responses logged in the journal db
+        """
+        assert logger_obj is not None
+        conn = sqlite3.connect(journal_file)
+        conn.text_factory = str
+        cur = conn.cursor()
+        for i, stage in enumerate(test.stages):
+            '''
+            Query DB here for rule_id & test_title
+            Compare against logger_obj
+            '''
+            q = self.query_for_stage_results(tablename)
+            results = cur.execute(q, [i, test.test_title]).fetchall()
+            if len(results) == 0:
+                raise errors.TestError(
+                    'SQL Query did not return results for test',
+                    {
+                        'rule_id': rule_id,
+                        'test': test.test_title,
+                        'query': q,
+                        'stage_num': i,
+                        'function': 'testrunner.TestRunner.run_stage_with_journal'
+                    })
+            result = results[0]
+            start = parser.parse(result[2])
+            end = parser.parse(result[3])
+            response = result[4]
+            status = result[5]
+            if (stage.output.log_contains_str or stage.output.no_log_contains_str):
+                logger_obj.set_times(start, end)
+                lines = logger_obj.get_logs()
+                if stage.output.log_contains_str:
+                    self.test_log(lines, stage.output.log_contains_str, False)
+                if stage.output.no_log_contains_str:
+                    # The last argument means that we should negate the resp
+                    self.test_log(lines, stage.output.no_log_contains_str, True)
+            if stage.output.response_contains_str:
+                self.test_response_str(response,
+                                       stage.output.response_contains_str)
+            if stage.output.status:
+                self.test_status(stage.output.status, status)
+
+    def run_test_build_journal(self, rule_id, test, journal_file, tablename):
+        """
+        Build journal entries from a test within a specified rule_id
+        Pass in the rule_id, test object, and path to journal_file
+        DB MUST already be instantiated from util.instantiate_database()
+        """
+        conn = sqlite3.connect(journal_file)
+        conn.text_factory = str
+        cur = conn.cursor()
+        for i, stage in enumerate(test.stages):
+            response = None
+            status = None
+            try:
+                print 'Running test %s from rule file %s' % (test.test_title, rule_id)
+                http_ua = http.HttpUA()
+                start = datetime.datetime.now()
+                http_ua.send_request(stage.input)
+                response = http_ua.response_object.response
+                status = http_ua.response_object.status
+            except errors.TestError as e:
+                print '%s got error. %s' % (test.test_title, str(e))
+                response = str(e)
+                status = -1
+            finally:
+                end = datetime.datetime.now()
+                ins_q = util.get_insert_statement(tablename)
+                cur.execute(ins_q, (rule_id, test.test_title, start, end, response, status, i))
+                conn.commit()
+
     def run_stage(self, stage, logger_obj=None, http_ua=None):
         """
         Runs a stage in a test by building an httpua object with the stage
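
The docstring of `query_for_stage_results` acknowledges that interpolating `tablename` into the SQL string leaves room for injection. One possible mitigation, sketched here under the assumption that journal table names are plain identifiers, is to validate the name before building the query. The helper name `safe_stage_query` is hypothetical and does not exist in FTW.

```python
import re

# Assumption: table names are simple identifiers (letters, digits, underscore).
_TABLE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def safe_stage_query(tablename):
    """Build the stage lookup query, rejecting suspicious table names."""
    if not _TABLE_NAME.match(tablename):
        raise ValueError("invalid table name: %r" % (tablename,))
    # Values are still bound with ? placeholders; only the identifier is interpolated.
    return "SELECT * FROM %s WHERE stage = ? AND test_id = ?" % tablename
```

SQLite placeholders cannot stand in for identifiers, which is why the table name is checked rather than bound as a parameter.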
