Lisa Zhang edited this page Dec 3, 2013 · 2 revisions

Aesthetics

The term aesthetics is used throughout the library to denote the visible properties of an object on the screen. For example, the colour of a bar, the y-position of a circle, the numerical value displayed in a text object, and the column in a table are all aesthetics. Throughout the Polychart.js library, an aesthetic can be mapped to a data column, a constant, or a data series derived from the existing data. This page outlines the possibilities by defining the attributes of the aesthetic mapping object o.

Mapping to Data Column

The simplest and most common case is mapping an aesthetic to an existing data column, for example mapping the x-axis to a column year in a time series data set. This can be done by setting the attribute o.var.

o = {
  'var': 'year'
}

Equivalently and as a shortcut, you can also directly assign the aesthetic mapping object o to be a string representing the data column.

o = 'year'

If the name of a data column contains a space or another special character, the data column name should be wrapped in square brackets. Any square brackets or backslashes inside the name should be escaped with a backslash (e.g. '[Year Of Birth]', '[This Has A Square Bracket \[ Here]').

NOTE: In Polychart.js version 1.1 and earlier, the data column name had to be wrapped in quotes. This no longer works in version 1.2 and later.
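The escaping rule for column names can be sketched in plain JavaScript. The helper escapeColumn below is hypothetical and not part of Polychart.js; it only builds the string that would then be used as the mapping.

```javascript
// Hypothetical helper, not part of Polychart.js: builds a bracketed column
// reference following the escaping rule described above.
function escapeColumn(name) {
  // Prefix every backslash and square bracket in the name with a backslash.
  const escaped = name.replace(/[\\\[\]]/g, '\\$&');
  return '[' + escaped + ']';
}

const o = { 'var': escapeColumn('Year Of Birth') };
console.log(o['var']); // [Year Of Birth]
```

Note that inside a JavaScript string literal the backslash itself has to be written as two backslashes.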

Mapping to Constant

Sometimes, we would like to set an aesthetic to a constant value. For example, we may wish to colour all the points in a scatter plot blue. This can be done by setting the o.const attribute. Note that only one of o.var and o.const should be set.

o = {
  'const': 'blue'
}
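As a minimal sketch of the two forms seen so far, the hypothetical helper below (normalizeMapping is not a Polychart.js function) expands the string shortcut into object form and rejects a mapping that sets both o.var and o.const.

```javascript
// Hypothetical helper, not part of Polychart.js: expands the string shortcut
// into object form and rejects mappings that set both 'var' and 'const'.
function normalizeMapping(o) {
  if (typeof o === 'string') {
    return { 'var': o };
  }
  if (o['var'] !== undefined && o['const'] !== undefined) {
    throw new Error("Only one of 'var' and 'const' should be set");
  }
  return o;
}

console.log(normalizeMapping('year'));              // { var: 'year' }
console.log(normalizeMapping({ 'const': 'blue' })); // { const: 'blue' }
```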

Functions/Transforms

In addition to mapping aesthetics to data columns, one can also map an aesthetic to a data series derived from the original data set. Specifically, one can map an aesthetic to a function of one or more data columns. For example:

o = {
  'var': 'log([Gross Domestic Product] + 1)'
}

or equivalently

o = 'log([Gross Domestic Product] + 1)'

calculates the logarithm of one more than the data column "Gross Domestic Product".

Functions/Transforms can be as simple or as deeply nested as necessary. A list of all available functions is below.

One function of particular note is the bin function, which groups continuous (numerical and date) columns into bins of a given width. This function is required if you wish to map an aesthetic which is discrete in nature to continuous data. For example, if one wishes to map the x-axis of a bar chart to a numerical column (e.g. to create a histogram), then that numerical column needs to be binned to help determine how wide each bar should be. In this case, instead of mapping x: 'numeric_column', one would map x: 'bin(numeric_column, 10)'.

Other examples of aesthetics for which binning is required are pivot table columns and rows, x- and y-position of tile charts, and x-position of box plots.

Note: If you have pre-binned values or discrete numeric values (e.g. integers) to be plotted in a chart, then you should either bin the values as if they were continuous, or provide the appropriate binwidth when setting the guides. (See the binwidth section of the guides page.)

Aggregate Statistics

Polychart.js can also compute aggregate statistics such as sum, mean, and count, making it easier to create charts with normalized data sources.

Syntactically, aggregate statistics look exactly like the function transforms described above. However, they differ in that a function transform always takes one data point and returns one data point, whereas a statistic performs aggregation over the data points in some grouping. Transforms can be thought of as mathematical operations, whereas statistics are aggregate measures.
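The difference can be illustrated in plain JavaScript, outside of Polychart.js. The rows and column names below are made up for this sketch: the transform yields one value per row, while the statistic yields a single value for the group.

```javascript
// Illustrative rows; the column names are made up for this example.
const rows = [
  { price: 2, quantity: 3 },
  { price: 5, quantity: 1 },
  { price: 4, quantity: 2 }
];

// Transform, analogous to a mapping such as '[price] * [quantity]':
// one output value per input row.
const revenue = rows.map(row => row.price * row.quantity);

// Statistic, analogous to 'sum(quantity)': one output value per group
// (here all rows form a single group).
const totalQuantity = rows.reduce((acc, row) => acc + row.quantity, 0);

console.log(revenue);       // [ 6, 5, 8 ]
console.log(totalQuantity); // 6
```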

Statistics Grouping

Statistic calculation produces one aggregated value per group. Groups are inferred automatically from the aesthetic mappings applied. Specifically, data points are grouped by the values of their non-aggregated mappings. For example, both of the following sets of mappings in a chart:

x: 'bin(date, week)',
y: 'sum(num_events)',
color: 'event_type'

and

x: 'bin(date, week)',
y: 'event_type',
color: 'sum(num_events)'

will have the summation statistic calculated for each unique combination of values of bin(date, week) and event_type. As another example, the mappings below

x: 'event_type',
y: 'sum(num_events)',
color: 'mean(num_events)',
size: 'sum(event_duration)'

will have all three aggregations sum(num_events), mean(num_events) and sum(event_duration) performed for each unique value of event_type.

In general, grouping will be performed using all non-statistical aesthetic mappings. This is true for both pivot tables and charts.
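The grouping rule can be sketched in plain JavaScript. This is an illustration of the behaviour described above, not the library's implementation; the data is made up, and the week column stands in for an already computed bin(date, week).

```javascript
// Illustrative data; 'week' stands in for the result of bin(date, week).
const rows = [
  { week: '2013-W48', event_type: 'click', num_events: 3 },
  { week: '2013-W48', event_type: 'click', num_events: 4 },
  { week: '2013-W48', event_type: 'view',  num_events: 10 },
  { week: '2013-W49', event_type: 'click', num_events: 1 }
];

// Sum valueCol within each unique combination of the groupCols values.
function sumByGroup(rows, groupCols, valueCol) {
  const totals = new Map();
  for (const row of rows) {
    const key = JSON.stringify(groupCols.map(col => row[col]));
    totals.set(key, (totals.get(key) || 0) + row[valueCol]);
  }
  return totals;
}

// The non-statistical mappings (week and event_type) define the groups.
const totals = sumByGroup(rows, ['week', 'event_type'], 'num_events');
console.log(totals.get(JSON.stringify(['2013-W48', 'click']))); // 7
```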

Sorting and Filtering

One can also sort one data series based on another data series. For example, sorting a data column called 'region' by the value of 'sum(sales)' orders the regions by their total sales. One can also keep only the top n regions by sales.

o.sort

A different data series (i.e. data column, or some function transforms or statistics derived from one or more data columns) to sort by. (optional)

o.asc

Whether the sort should be ascending or descending. (optional)

o.limit

The number of values to keep. (optional)
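A mapping that combines these attributes might look like the object below. Treating o.asc as a boolean and o.limit as an integer is an assumption of this sketch, and topRegionsBySales is a hypothetical plain-JavaScript emulation of the intended result rather than library code.

```javascript
// Mapping object using the attributes described above. Treating asc as a
// boolean and limit as an integer is an assumption of this sketch.
const o = {
  'var': 'region',
  'sort': 'sum(sales)',
  'asc': false,
  'limit': 2
};

// Hypothetical emulation: total the sales per region, sort, keep the top n.
function topRegionsBySales(rows, limit, asc) {
  const totals = new Map();
  for (const row of rows) {
    totals.set(row.region, (totals.get(row.region) || 0) + row.sales);
  }
  return Array.from(totals.entries())
    .sort((a, b) => (asc ? a[1] - b[1] : b[1] - a[1]))
    .slice(0, limit)
    .map(entry => entry[0]);
}

// Illustrative data.
const rows = [
  { region: 'North', sales: 10 },
  { region: 'South', sales: 40 },
  { region: 'North', sales: 5 },
  { region: 'East',  sales: 25 }
];
console.log(topRegionsBySales(rows, o.limit, o.asc)); // [ 'South', 'East' ]
```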

Function Transforms List

Calling Functions

Call a function by writing its name followed by its parameters enclosed in parentheses.

e.g. log([x]), indexOf([x], "a")

Control Flow

To evaluate one of two expressions based on a condition, use the expression if [cond] then [conseq] else [altern]. Make sure that [conseq] and [altern] are of the same type (both numbers, both categories, or both dates).

e.g. if indexOf([x], "a") == -1 then "aNotInX" else "aInX"
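For readers more familiar with JavaScript, the expression above behaves like a conditional (ternary) expression. The function name labelForX is made up for this sketch.

```javascript
// Plain-JavaScript analogue of:
//   if indexOf([x], "a") == -1 then "aNotInX" else "aInX"
// Both branches return a string, matching the same-type requirement.
function labelForX(x) {
  return x.indexOf('a') === -1 ? 'aNotInX' : 'aInX';
}

console.log(labelForX('banana')); // aInX
console.log(labelForX('hello'));  // aNotInX
```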

Numeric Functions

+, -, *, /, %

adding, subtracting, multiplying, dividing, and taking the modulo of two numbers

<, <=, >, >=, !=, ==

evaluations of equalities and inequalities

log(num)

takes the logarithm of a number

bin(value, binwidth)

bins a number by the bin width. The binwidth should be a number

e.g.

bin([sales], 1000)
bin(10023, 1000) = 10000
bin(12523, 1000) = 12000
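The two numeric results above are consistent with rounding a value down to the nearest multiple of the bin width. A minimal sketch follows, assuming that is the rule; binNumber is a hypothetical stand-in, and its behaviour for negative values is not confirmed by this page.

```javascript
// Hypothetical stand-in for numeric bin(value, binwidth): returns the lower
// edge of the bin that contains value, assuming bins start at multiples of
// binwidth.
function binNumber(value, binwidth) {
  return Math.floor(value / binwidth) * binwidth;
}

console.log(binNumber(10023, 1000)); // 10000
console.log(binNumber(12523, 1000)); // 12000
```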

String Functions

++

concatenates two strings

e.g. "A" ++ "B" = "AB"

substr(str, start, length)

takes the substring of a string that begins at the given (zero-based) starting location and has the given length

e.g. substr("hello", 1, 2) = "el"

length(str)

finds the length of a string

e.g. length("hello") = 5

upper(str)

make all letters in a string upper case

e.g. upper("myString500") = "MYSTRING500"

lower(str)

make all letters in a string lower case

e.g. lower("myString500") = "mystring500"

indexOf(str, substr)

return the first index at which a substring occurs in a string, or -1 if it does not exist

e.g.

indexOf("myString500", "String") = 2
indexOf("haystack", "needle") = -1

parseNum(str)

turn a string into a number

e.g. parseNum("500") = 500
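The examples in this section line up with standard JavaScript string operations. The object below is only an analogy for orientation; the name stringFns is made up, and edge cases (such as what parseNum returns for non-numeric input) are not specified on this page and may differ.

```javascript
// Plain-JavaScript analogues of the string functions listed above.
const stringFns = {
  concat: (a, b) => a + b,                                          // a ++ b
  substr: (str, start, length) => str.slice(start, start + length),
  length: str => str.length,
  upper: str => str.toUpperCase(),
  lower: str => str.toLowerCase(),
  indexOf: (str, sub) => str.indexOf(sub),
  parseNum: str => Number(str)
};

console.log(stringFns.substr('hello', 1, 2));            // el
console.log(stringFns.indexOf('myString500', 'String')); // 2
console.log(stringFns.parseNum('500'));                  // 500
```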

Date Functions

bin(value, binwidth)

bins a date by the bin width, which is a string representing a time period. The binwidth can be: second, minute, hour, day, week, month, twomonth, quarter, sixmonth, year, twoyear, fiveyear, decade

year(dt)

extracts the year from a date

month(dt)

extracts the month from a date, in integers, where January is month 1

dayOfMonth(dt)

extracts the day of the month from a date

dayOfYear(dt)

extracts the day of the year from a date

dayOfWeek(dt)

extracts the day of the week from a date, where Sunday is 0, Saturday is 6

hour(dt)

extracts the hour from a date, from 0 to 23

minute(dt)

extracts the minute from a date

second(dt)

extracts the second from a date
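The conventions above (January is month 1, Sunday is day 0) can be compared with JavaScript's own Date getters. The sketch below uses the UTC getters so that the result does not depend on the local time zone; whether Polychart.js itself works in UTC or local time is not stated here, and the name dateFns is made up.

```javascript
// Plain-JavaScript analogues of some of the date functions listed above.
const dateFns = {
  year: dt => dt.getUTCFullYear(),
  month: dt => dt.getUTCMonth() + 1, // JavaScript months start at 0
  dayOfMonth: dt => dt.getUTCDate(),
  dayOfWeek: dt => dt.getUTCDay(),   // Sunday is 0, Saturday is 6
  hour: dt => dt.getUTCHours(),      // 0 to 23
  minute: dt => dt.getUTCMinutes(),
  second: dt => dt.getUTCSeconds()
};

// Tuesday, 3 December 2013, 14:30:15 UTC
const dt = new Date(Date.UTC(2013, 11, 3, 14, 30, 15));
console.log(dateFns.month(dt));     // 12
console.log(dateFns.dayOfWeek(dt)); // 2
```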

Statistics List

sum(col)

The sum of a numeric variable over some grouping.

mean(col)

The mean of a numeric variable over some grouping.

count(col)

The total number of defined, non-null values over some grouping.

unique(col)

The total number of unique defined, non-null values over some grouping.

min(col)

The minimum of a numeric variable over some grouping.

max(col)

The maximum of a numeric variable over some grouping.

median(col)

The median of a numeric variable over some grouping.

box(col)

Calculates the quantiles and outliers required for a box plot.

Note that a box plot assumes that the box statistic is calculated for the y-mapping. Thus, the y-mapping for a box layer should be in the form box(var). The box statistic calculates the quantiles and outliers for each grouping. If you would like to specify the quantiles and outliers directly, you can use the following format:

{
  q1: line_bottom,
  q2: box_bottom,
  q3: median,
  q4: box_top,
  q5: line_top,
  outliers: [list, of, outliers]
}
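If the quantiles are computed outside of Polychart.js, an object in this format could be produced along the following lines. This is a sketch under stated assumptions: quantiles use linear interpolation, and whiskers and outliers follow the common 1.5 × IQR rule. The page does not say which method box(col) uses internally, so its results may differ. Both helper names are hypothetical.

```javascript
// Quantile of an ascending-sorted array by linear interpolation (assumption).
function quantile(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Hypothetical helper: builds an object in the q1..q5/outliers format above.
// Expects a non-empty array of numbers.
function boxSummary(values) {
  const sorted = values.slice().sort((a, b) => a - b);
  const boxBottom = quantile(sorted, 0.25);
  const median = quantile(sorted, 0.5);
  const boxTop = quantile(sorted, 0.75);
  // Assumed outlier rule: more than 1.5 * IQR beyond the box edges.
  const fence = 1.5 * (boxTop - boxBottom);
  const isInside = v => v >= boxBottom - fence && v <= boxTop + fence;
  const inside = sorted.filter(isInside);
  return {
    q1: inside[0],                 // line_bottom
    q2: boxBottom,                 // box_bottom
    q3: median,                    // median
    q4: boxTop,                    // box_top
    q5: inside[inside.length - 1], // line_top
    outliers: sorted.filter(v => !isInside(v))
  };
}

console.log(boxSummary([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]));
// { q1: 1, q2: 3.25, q3: 5.5, q4: 7.75, q5: 9, outliers: [ 100 ] }
```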