Acosix/alfresco-audit

About

This addon aims to provide some general use functionality and utilities relating to the Auditing feature within Alfresco Content Services.

Compatibility

This module is built to be compatible with Alfresco 5.0 and above. It may be used on either Community or Enterprise Edition.

Features

User Login auditing (aka active user audit log)

In some use cases (e.g. Alfresco Enterprise license management) it is important to know which users actually use an Alfresco system. While auditing actions or changes to nodes (default alfresco-access audit application) can provide that information, it has potential gaps: entries may be filtered based on user names, and login events only cover password-based logins, not SSO.

This addon provides a purely authentication-focused audit application (acosix-audit-activeUsersLogin) that records each instance of a successful password-based authentication, as well as any authentication that properly informs authentication listeners (web script authenticators and NTLM / Kerberos SSO for WebDAV / Alfresco Office Services). Only the user name and type of web credentials (if any) are recorded.

Support for SSO and web script authenticators is dependent on custom authentication listeners that plug into pre-defined beans. This requires that the module de.acosix.alfresco.utility.repo has been installed and configured to augment the default no-op listeners with a multiple listeners-aware facade (active by default, configured via acosix-utility.web.auth.multipleAuthenticationListeners.enabled). The listeners of this addon must also be enabled (active by default, configured via acosix-audit..auth.listener.enabled).

Depending on the authentication method and frequency of HTTP calls to the Repository, a large number of audit entries may be created in a short time. Entries are only kept for a limited time (14 days, defined as ISO 8601 period via acosix-audit.job.activeUserLoginCleanup.cutOffPeriod) and cleared at specific intervals (1 AM every day, defined as CRON via acosix-utility.job.activeUserLoginCleanup.cron). For long-term use the audit data is regularly consolidated (every 5 minutes, defined as CRON via acosix-audit.job.consolidateActiveUsersAudit.cron) into a separate audit application (acosix-audit-activeUsers). This stores the user name and the start/end of a reporting time frame (1 hour, defined as a number of hours that must be a divisor of 24 via acosix-utility.job.consolidateActiveUsersAudit.timeframeHours) in which the user has logged into Alfresco at least once.
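
The consolidation described above can be pictured as assigning each login to a fixed-size reporting time frame. The following Java sketch is not the addon's actual implementation: the class and method names are hypothetical, and it assumes that time frames are aligned to the start of the day. Only the default values (a 1 hour time frame and a 14 day cut-off period) are taken from the text above.

```java
import java.time.Period;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Hypothetical helper, not part of the addon: illustrates time frame bucketing only.
final class ActiveUserTimeframeSketch {

    // Start of the reporting time frame that contains the given login time.
    // Assumption: time frames are aligned to midnight, which is why the
    // frame size has to be a divisor of 24.
    static ZonedDateTime timeframeStart(ZonedDateTime login, int timeframeHours) {
        if (timeframeHours <= 0 || 24 % timeframeHours != 0) {
            throw new IllegalArgumentException("timeframeHours must be a divisor of 24");
        }
        int startHour = (login.getHour() / timeframeHours) * timeframeHours;
        return login.toLocalDate().atTime(startHour, 0).atZone(login.getZone());
    }

    // End of the reporting time frame that starts at the given time.
    static ZonedDateTime timeframeEnd(ZonedDateTime start, int timeframeHours) {
        return start.plusHours(timeframeHours);
    }

    // Oldest timestamp still kept by a cleanup that uses an ISO 8601 period such as "P14D".
    static ZonedDateTime cutOff(ZonedDateTime now, String isoPeriod) {
        return now.minus(Period.parse(isoPeriod));
    }

    public static void main(String[] args) {
        ZonedDateTime login = ZonedDateTime.of(2024, 3, 5, 14, 37, 12, 0, ZoneOffset.UTC);
        ZonedDateTime start = timeframeStart(login, 1);
        System.out.println("time frame: " + start + " to " + timeframeEnd(start, 1));
        System.out.println("cut-off:    " + cutOff(login, "P14D"));
    }
}
```

With a time frame of 1 hour, any number of logins by the same user between 14:00 and 15:00 would collapse into a single consolidated entry, which is what keeps the long-term audit application small.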

Incremental cleanup of alf_prop_* tables

When Auditing entries or AttributeService entries are being deleted, Alfresco does not actually delete all of the associated data. The structures of the alf_prop_* tables are designed to heavily reuse individual textual or numerical data elements, much to the point that cascade deletion upon removal of audit entries or attributes is no longer possible as the same values could be referenced in other elements.

Since Alfresco 4.1.9, 4.2.2 (Enterprise) and 5.0 (Community), Alfresco includes a default job to clean up dangling data in alf_prop_* tables. This job is disabled via a CRON expression that is guaranteed never to fire, and must be re-configured before it can run. It takes a brute-force approach to deleting dangling data: it tries to clear all entries in a single transaction. This may be inappropriate for systems where data has accumulated over many years or which cannot be bogged down by expensive database operations.

The incremental cleanup provided by this addon is composed of multiple jobs that iteratively check sub-sets of data entries for being actively referenced. By default they are configured to run between 9 PM and 5 AM in a staggered pattern. The following jobs are part of this feature:

  • propertyRootsCleanup
  • propertyValuesCleanup
  • propertyStringValuesCleanup
  • propertySerializableValuesCleanup
  • propertyDoubleValuesCleanup

Each job can be configured via alfresco-global.properties using the key pattern acosix-audit.<jobName>.<setting>. The following settings are supported:

  • cron - the CRON expression determining the time to run
  • batchSize - the number of sub-sets of entries to process in a batch
  • workerCount - the number of parallel threads to process the job
  • idsPerWorkItem - the size of entry sub-sets to process as an individual work item
  • checkItemsLimit - the number of entries to check in one run of the job to limit the execution time / time of load on the database
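
To illustrate how idsPerWorkItem and checkItemsLimit might interact, the following Java sketch splits a range of entry IDs into work items and stops once the per-run limit is reached. It is an assumption-laden illustration, not the addon's code: the class, the method and the idea that a run works on a contiguous ID range are all hypothetical. In such a scheme, batchSize work items would be handed out together and processed by workerCount threads.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical planner, not part of the addon: shows one way the settings could relate.
final class CleanupWorkItemSketch {

    // Inclusive range of entry IDs checked as one work item.
    static final class WorkItem {
        final long fromId;
        final long toId;

        WorkItem(long fromId, long toId) {
            this.fromId = fromId;
            this.toId = toId;
        }

        @Override
        public String toString() {
            return "[" + fromId + ", " + toId + "]";
        }
    }

    // Splits the IDs from firstId up to maxId into work items of idsPerWorkItem IDs,
    // covering at most checkItemsLimit IDs in total for a single run.
    static List<WorkItem> planRun(long firstId, long maxId, int idsPerWorkItem, int checkItemsLimit) {
        if (idsPerWorkItem <= 0 || checkItemsLimit <= 0) {
            throw new IllegalArgumentException("idsPerWorkItem and checkItemsLimit must be positive");
        }
        List<WorkItem> items = new ArrayList<>();
        long lastId = Math.min(maxId, firstId + checkItemsLimit - 1);
        for (long from = firstId; from <= lastId; from += idsPerWorkItem) {
            long to = Math.min(lastId, from + idsPerWorkItem - 1);
            items.add(new WorkItem(from, to));
        }
        return items;
    }

    public static void main(String[] args) {
        // Example values only; they are not the addon's defaults.
        System.out.println(planRun(1L, 1000L, 10, 35));
    }
}
```

Under these assumptions a lower checkItemsLimit shortens each nightly run, while the next run would continue with the IDs that were not yet checked.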

Web Scripts to query active / inactive users

The Repository-tier web scripts at URLs /alfresco/s/acosix/api/audit/activeUsers and /alfresco/s/acosix/api/audit/inactiveUsers provide reports about (in)active users based on audit data. These web scripts check each user that exists as a cm:person node against the audit data within a particular time frame and include them in the report when they can / cannot be associated with a single audit entry in that time frame. The web scripts utilise batch execution to avoid issues with overflowing transactional caches.

Parameters:

  • lookBackMode - mode/unit for defining the time frame; default value: months, allowed values: days, months, years
  • lookBackAmount - number of units for defining the time frame, default value: 1 (mode=years), 3 (mode=months), 90 (mode=days)
  • workerThreads - the number of parallel worker threads, default value: 4
  • batchSize - the size of individual batches, default value: 20
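
The way lookBackMode and lookBackAmount define the reporting time frame can be sketched as follows. The defaults (months as the mode; 90 days, 3 months, 1 year as amounts) come from the parameter list above, while the class and method names are hypothetical and the use of whole calendar dates is an assumption; the web scripts may well work with exact timestamps instead.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of the addon: resolves the start of the look-back time frame.
final class LookBackSketch {

    // Default lookBackAmount for each allowed lookBackMode, as documented above.
    static int defaultAmount(String mode) {
        switch (mode) {
            case "days":
                return 90;
            case "months":
                return 3;
            case "years":
                return 1;
            default:
                throw new IllegalArgumentException("Unsupported lookBackMode: " + mode);
        }
    }

    // Start of the time frame; a null mode or amount stands for a missing URL parameter.
    static LocalDate timeframeStart(LocalDate today, String mode, Integer amount) {
        String effectiveMode = mode == null ? "months" : mode;
        int effectiveAmount = amount == null ? defaultAmount(effectiveMode) : amount;
        switch (effectiveMode) {
            case "days":
                return today.minusDays(effectiveAmount);
            case "months":
                return today.minusMonths(effectiveAmount);
            case "years":
                return today.minusYears(effectiveAmount);
            default:
                throw new IllegalArgumentException("Unsupported lookBackMode: " + effectiveMode);
        }
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2024, 6, 15);
        System.out.println(timeframeStart(today, null, null));
        System.out.println(timeframeStart(today, "days", null));
        System.out.println(timeframeStart(today, "years", 2));
    }
}
```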

By default the web scripts will use the acosix-audit-activeUsers audit application as the source of data. This can be reconfigured to use any audit application, e.g. the default alfresco-access. All configuration properties share the same prefix of acosix-audit.web.script.activeUser.. The following properties are supported:

  • auditApplicationName - the name of the audit application to use (default: acosix-audit-activeUsers)
  • userAuditPath - the path to the user name within the audit data to filter queries against; if empty, the user name associated with the audit entry will be used to query
  • dateFromAuditPath - the path to a date or ISO 8601 string value within the audit data that denotes the start of a timeframe in which the user was active; must be set together with dateToAuditPath (default: /acosix-audit-activeUsers/timeframeStart)
  • dateToAuditPath - the path to a date or ISO 8601 string value within the audit data that denotes the end of a timeframe in which the user was active; must be set together with dateFromAuditPath (default: /acosix-audit-activeUsers/timeframeEnd)
  • dateAuditPath - the path to a date or ISO 8601 string value within the audit data that denotes an effective date at which the user was active
  • defaultLookBackMode - the default lookBackMode if no parameter is provided in the web script call (default: months)
  • defaultLookBackDays - the number of days to look back if lookBackMode is "days" and no parameter is provided in the web script call (default: 90)
  • defaultLookBackMonths - the number of months to look back if lookBackMode is "months" and no parameter is provided in the web script call (default: 3)
  • defaultLookBackYears - the number of years to look back if lookBackMode is "years" and no parameter is provided in the web script call (default: 1)
  • defaultBatchSize - the size of an atomic batch of users to process if no parameter is provided in the web script call (default: 10)
  • defaultWorkerThreads - the number of parallel worker threads to use if no parameter is provided in the web script call (default: 4)
  • defaultLoggingInterval - the number of processed users after which to log process information (default: 50)

If the date-related configuration properties are not set to a valid combination, the date of the audit entries will be used as input to the report of the web scripts.
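
The fallback between the date-related properties can be summarised with a small decision function. This Java sketch is an interpretation, not the addon's code: the class and enum names are hypothetical, and the precedence of the dateFromAuditPath / dateToAuditPath pair over dateAuditPath is an assumption that the text above does not state.

```java
// Hypothetical decision helper, not part of the addon.
final class ReportDateSourceSketch {

    enum DateSource { TIMEFRAME_PATHS, SINGLE_DATE_PATH, AUDIT_ENTRY_DATE }

    // A property counts as set when it is neither missing nor blank.
    static boolean isSet(String value) {
        return value != null && !value.trim().isEmpty();
    }

    // Picks the source of activity dates for the report.
    // Assumption: the from/to pair wins over the single date path.
    static DateSource resolve(String dateFromAuditPath, String dateToAuditPath, String dateAuditPath) {
        if (isSet(dateFromAuditPath) && isSet(dateToAuditPath)) {
            return DateSource.TIMEFRAME_PATHS;
        }
        if (isSet(dateAuditPath)) {
            return DateSource.SINGLE_DATE_PATH;
        }
        // No valid combination: fall back to the date of the audit entry itself.
        return DateSource.AUDIT_ENTRY_DATE;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/acosix-audit-activeUsers/timeframeStart",
                "/acosix-audit-activeUsers/timeframeEnd", null));
        System.out.println(resolve("/acosix-audit-activeUsers/timeframeStart", null, null));
    }
}
```

The second call in main shows the case where only one half of the from/to pair is configured, which this sketch treats as invalid and therefore resolves to the audit entry date.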

Reports are provided in JSON or CSV format, with JSON being the default if a specific format is not requested by using the URL parameter ?format=xxx or adding a file extension to the URL. The report of active users will include the earliest and latest date within the reporting time frame at which the user was active - this may be the abstract boundaries of "user interaction time frames" if defined and extracted from the underlying audit application.
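
As a rough illustration of how a report request could be put together, the following Java sketch builds the URL for one of the two web scripts. The class and method names are hypothetical, the host and port are placeholders, and the value csv for the format parameter is an assumption based on the formats named above. Authentication and the actual HTTP call are deliberately left out.

```java
import java.net.URI;

// Hypothetical helper, not part of the addon: builds report URLs only.
final class ActiveUsersReportUriSketch {

    // Builds the URL of the active or inactive users report.
    // All parameter values are plain tokens or numbers, so no URL encoding is applied here.
    static URI reportUri(String repositoryBaseUrl, boolean active, String lookBackMode,
            int lookBackAmount, String format) {
        String script = active ? "activeUsers" : "inactiveUsers";
        String query = "lookBackMode=" + lookBackMode
                + "&lookBackAmount=" + lookBackAmount
                + "&format=" + format;
        return URI.create(repositoryBaseUrl + "/alfresco/s/acosix/api/audit/" + script + "?" + query);
    }

    public static void main(String[] args) {
        // http://localhost:8080 is a placeholder for the Repository host.
        System.out.println(reportUri("http://localhost:8080", false, "days", 30, "csv"));
    }
}
```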

Build

This project uses a Maven build using templates from the Acosix Alfresco Maven project and produces module AMPs, regular Java classes JARs, JavaDoc and source attachment JARs, as well as installable (Simple Alfresco Module) JAR artifacts for the Alfresco Content Services and Share extensions. If the installable JAR artifacts are used for installing this module, developers / users are advised to consult the 'Dependencies' section of this README.

Maven toolchains

By inheritance from the Acosix Alfresco Maven framework, this project uses the Maven Toolchains plugin to allow cross-compilation against different Java versions. Setting only the source/target options of the Maven compiler plugin can result in inconsistent compiler and library versions; as an example, this has caused issues with some Alfresco releases in the past, where Alfresco compiled for Java 7 using the Java 8 libraries. The Toolchains plugin avoids this problem. To build the project, a basic toolchain configuration must be provided in the user-specific Maven configuration home (usually ~/.m2/). That file (toolchains.xml) only needs to list the path to a compatible JDK for the Java version required by this project. The following sample file defines a Java 8 and a Java 7 development kit.

<?xml version='1.0' encoding='UTF-8'?>
<toolchains xmlns="http://maven.apache.org/TOOLCHAINS/1.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/TOOLCHAINS/1.1.0 http://maven.apache.org/xsd/toolchains-1.1.0.xsd">
  <toolchain>
    <type>jdk</type>
    <provides>
      <version>1.8</version>
      <vendor>oracle</vendor>
    </provides>
    <configuration>
      <jdkHome>C:\Program Files\Java\jdk1.8.0_112</jdkHome>
    </configuration>
  </toolchain>
  <toolchain>
    <type>jdk</type>
    <provides>
      <version>1.7</version>
      <vendor>oracle</vendor>
    </provides>
    <configuration>
      <jdkHome>C:\Program Files\Java\jdk1.7.0_80</jdkHome>
    </configuration>
  </toolchain>
</toolchains>

The master branch requires Java 8.

Docker-based integration tests

In a default build using mvn clean install, this project will build the extension for Alfresco Content Services, executing regular unit-tests without running integration tests. The integration tests of this project are based on Docker and require a Docker engine to run the necessary components (PostgreSQL database as well as Alfresco Content Services). Since a Docker engine may not be available in all environments of interested community members / collaborators, the integration tests have been made optional. A full build, including integration tests, can be run by executing

mvn clean install -Ddocker.tests.enabled=true

This project currently does not contain any integration tests, but may do so in the future.

Dependencies

This module depends on the following projects / libraries:

  • Acosix Alfresco Utility (Apache License, Version 2.0) - core extension

When the installable JAR produced by the build of this project is used for installation, the developer / user is responsible to either manually install all the required components / libraries provided by the listed projects, or use a build system to collect all relevant direct / transitive dependencies.

The following example shows how all dependencies, with the artifacts delivered as AMPs, can be declared in the pom.xml file of the *-platform-docker project.

...
  <dependencies>
    ...
    <dependency>
        <groupId>de.acosix.alfresco.utility</groupId>
        <artifactId>de.acosix.alfresco.utility.core.repo</artifactId>
        <version>1.3.2</version>
        <type>amp</type>
    </dependency>
    <dependency>
        <groupId>de.acosix.alfresco.audit</groupId>
        <artifactId>de.acosix.alfresco.audit.repo</artifactId>
        <version>1.1.0</version>
        <type>amp</type>
    </dependency> 
    ...
  </dependencies>
  ...

Note: The Acosix Alfresco Utility project is also built using templates from the Acosix Alfresco Maven project, and as such produces similar artifacts. Automatic resolution and collection of (transitive) dependencies using Maven / Gradle will resolve the Java classes JAR as a dependency, and not the installable (Simple Alfresco Module) variant. It is recommended to exclude Acosix Alfresco Utility from transitive resolution and instead include it directly / explicitly.

Note: The feature to audit user login events requires the full extension of Acosix Alfresco Utility, which adds a patch to support more than one authentication listener to Alfresco.