Aesthetics
The term aesthetics is used throughout the library to denote the visible properties of an object on the screen. For example, the colour of a bar, the y-position of a circle, the numerical value displayed in a text object, and the column in a table are all aesthetics. In Polychart.js, an aesthetic can be mapped to a data column, a constant, or a data series derived from the existing data. This page outlines the possibilities by defining the attributes of the aesthetic mapping object o.
The simplest and most common case is mapping an aesthetic to an existing data column, for example mapping the x-axis to a column year in a time series data set. This is done by setting the attribute o.var.
o = {
'var': 'year'
}
Equivalently, as a shortcut, you can assign the aesthetic mapping object o directly to a string naming the data column.
o = 'year'
If the name of a data column contains a space or some other special character, wrap the name in square brackets. Square brackets or backslashes within the name must be escaped with a backslash (e.g. '[Year Of Birth]', '[This Has A Square Bracket \[ Here]').
NOTE: In Polychart.js version 1.1 and earlier, the data column name had to be wrapped in quotes. This no longer works in version 1.2 and later.
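The bracket and escaping rules above can be sketched as a small helper. This is only an illustration: escapeColumn is a hypothetical function, not part of Polychart.js, and it assumes that backslashes and both square brackets are the only characters needing an escape.

```javascript
// Hypothetical helper (not a Polychart.js function): quote a data column
// name so it can be embedded in a mapping expression.
function escapeColumn(name) {
  // Backslash-escape every backslash and square bracket in the name,
  // then wrap the result in square brackets.
  return '[' + name.replace(/[\\\[\]]/g, '\\$&') + ']';
}

// Build a mapping expression around an escaped column name.
const gdpMapping = {
  'var': 'log(' + escapeColumn('Gross Domestic Product') + ' + 1)'
};

console.log(gdpMapping['var']);
console.log(escapeColumn('This Has A Square Bracket [ Here'));
```

Keep in mind that inside a JavaScript string literal a backslash must itself be doubled, so building the expression with a helper like this avoids writing the escapes by hand.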
Sometimes we would like to set an aesthetic to a constant value. For example, we may wish to colour all the points in a scatter plot blue. This is done by setting the o.const attribute. Note that only one of o.var and o.const should be set.
o = {
'const': 'blue'
}
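The two forms of the mapping object, plus the string shortcut, can be summarised with a small validation sketch. normalizeMapping is a hypothetical helper written for this page, not a function exported by Polychart.js, and it interprets "only one of o.var and o.const should be set" as "exactly one".

```javascript
// Hypothetical helper (not part of Polychart.js): expand the string
// shortcut into an object and check that exactly one of `var` and
// `const` is present.
function normalizeMapping(o) {
  if (typeof o === 'string') {
    // 'year' is shorthand for { 'var': 'year' }.
    return { 'var': o };
  }
  const hasVar = o['var'] !== undefined;
  const hasConst = o['const'] !== undefined;
  if (hasVar === hasConst) {
    throw new Error('Set exactly one of var and const');
  }
  return o;
}

console.log(normalizeMapping('year'));              // column mapping
console.log(normalizeMapping({ 'const': 'blue' })); // constant mapping
```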
In addition to mapping aesthetics to data columns, one can also map an aesthetic to a data series derived from the original data set. Specifically, one can map an aesthetic to a function of one or more data columns. For example:
o = {
'var': 'log([Gross Domestic Product] + 1)'
}
or equivalently
o = 'log([Gross Domestic Product] + 1)'
calculates the logarithm of one more than the data column "Gross Domestic Product".
Functions/transforms can be as simple or as nested as necessary. A list of all available functions is given below.
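To make the idea of a derived data series concrete, the sketch below computes the same series by hand in plain JavaScript. It assumes log means the natural logarithm, which this page does not state, and the sample rows are made up for illustration.

```javascript
// Made-up sample rows; only the column name comes from the example above.
const gdpRows = [
  { 'Gross Domestic Product': 0 },
  { 'Gross Domestic Product': 99 },
  { 'Gross Domestic Product': 999 }
];

// A function transform produces one derived value per input row.
// Assumption: log is the natural logarithm (Math.log).
const derivedSeries = gdpRows.map(function (row) {
  return Math.log(row['Gross Domestic Product'] + 1);
});

console.log(derivedSeries.length === gdpRows.length); // one value per row
```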
One function of particular note is bin, which bins continuous (numerical and date) columns into intervals of a given width. This function is required if you wish to map an aesthetic that is discrete in nature to continuous data. For example, if one wishes to map the x-axis of a bar chart to a numerical column (e.g. to create a histogram), then that numerical column needs to be binned to determine how wide each bar should be. In this case, instead of mapping x: 'numeric_column', one would map x: 'bin(numeric_column, 10)'.
Other examples of aesthetics for which binning is required are pivot table columns and rows, x- and y-position of tile charts, and x-position of box plots.
Note: If you have pre-binned values or discrete numeric values (e.g. integers) to be plotted in a chart, then you should either bin the values as if they were continuous, or provide the appropriate binwidth when setting the guides. (See the binwidth section of the guides page.)
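As a rough sketch of what numeric binning does for a histogram, the code below rounds each value down to the start of its bin and counts rows per bin. binNumber and histogram are hypothetical names, and rounding down with Math.floor is an assumption chosen to agree with the bin examples listed further down this page (bin(12523, 1000) = 12000).

```javascript
// Hypothetical helper: map a number to the lower edge of its bin.
// Assumption: bins start at multiples of binwidth and values round down.
function binNumber(value, binwidth) {
  return Math.floor(value / binwidth) * binwidth;
}

// Count how many values fall into each bin, keyed by the bin's lower edge.
function histogram(values, binwidth) {
  const counts = new Map();
  for (const value of values) {
    const bin = binNumber(value, binwidth);
    counts.set(bin, (counts.get(bin) || 0) + 1);
  }
  return counts;
}

const counts = histogram([1, 5, 12, 18, 25], 10);
console.log(Array.from(counts.entries())); // one entry per non-empty bin
```

Each key of the resulting map corresponds to one bar, and the bin width determines how wide that bar is drawn.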
Polychart.js can also compute aggregate statistics such as sum, mean, and count, making it easier to create charts from normalized data sources.
Syntactically, aggregate statistics look exactly like the function transforms described above. They differ in that a function transform always takes one data point and returns one data point, whereas a statistic aggregates over the data points in some grouping. Transforms can be thought of as mathematical operations, whereas statistics are aggregate measures.
Calculating a statistic produces one aggregated value per group. Groups are inferred automatically from the aesthetic mappings applied. Specifically, data points are grouped based on the values of their non-aggregated mappings. For example, both of the following sets of mappings in a chart:
x: 'bin(date, week)',
y: 'sum(num_events)',
color: 'event_type'
and
x: 'bin(date, week)',
y: 'event_type',
color: 'sum(num_events)'
will have the summation statistic calculated for each unique combination of values of bin(date, week) and event_type. As another example, the mappings
x: 'event_type',
y: 'sum(num_events)',
color: 'mean(num_events)',
size: 'sum(event_duration)'
will have all three aggregations sum(num_events), mean(num_events) and sum(event_duration) performed for each unique value of event_type.
In general, grouping will be performed using all non-statistical aesthetic mappings. This is true for both pivot tables and charts.
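The grouping rule can be sketched in plain JavaScript: rows are bucketed by the values of every non-statistical mapping, and the statistic is computed once per bucket. sumByGroup is a hypothetical function, the sample rows are made up, and the week column is assumed to hold dates that have already been binned by week.

```javascript
// Hypothetical sketch: group rows by the given keys and sum one column.
function sumByGroup(rows, groupKeys, valueKey) {
  const groups = new Map();
  for (const row of rows) {
    // The combination of all grouping values identifies the group.
    const key = JSON.stringify(groupKeys.map(function (k) { return row[k]; }));
    if (!groups.has(key)) {
      const group = { sum: 0 };
      for (const k of groupKeys) group[k] = row[k];
      groups.set(key, group);
    }
    groups.get(key).sum += row[valueKey];
  }
  return Array.from(groups.values());
}

// Made-up data; `week` stands in for bin(date, week).
const eventRows = [
  { week: '2013-01-07', event_type: 'click', num_events: 3 },
  { week: '2013-01-07', event_type: 'click', num_events: 4 },
  { week: '2013-01-07', event_type: 'view', num_events: 10 },
  { week: '2013-01-14', event_type: 'click', num_events: 1 }
];

// Grouping by both non-statistical mappings gives three groups.
const totals = sumByGroup(eventRows, ['week', 'event_type'], 'num_events');
console.log(totals);
```

Which aesthetic the aggregated value is assigned to (y or color in the examples above) does not change the grouping; only the set of non-statistical mappings does.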
One can also sort one data series based on another data series. For example, sorting a data column called 'region' by the value of 'sum(sales)' would order the regions by the total sales each region achieved. One can also choose to keep only the top n regions by sales.
A different data series (i.e. data column, or some function transforms or statistics derived from one or more data columns) to sort by. (optional)
Whether the sort should be ascending or descending. (optional)
The number of values to keep. (optional)
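The effect of these three options can be sketched outside the library as a sort followed by a truncation. sortAndLimit and its parameter names are hypothetical and are not the attribute names used by Polychart.js; the sales figures are made up.

```javascript
// Hypothetical sketch: order groups by another series, then keep the
// first `limit` of them. Leaving `limit` undefined keeps everything.
function sortAndLimit(groups, sortKey, ascending, limit) {
  const sorted = groups.slice().sort(function (a, b) {
    return ascending ? a[sortKey] - b[sortKey] : b[sortKey] - a[sortKey];
  });
  return limit === undefined ? sorted : sorted.slice(0, limit);
}

// Made-up totals, standing in for sum(sales) per region.
const salesByRegion = [
  { region: 'North', sales: 120 },
  { region: 'South', sales: 300 },
  { region: 'East', sales: 80 },
  { region: 'West', sales: 210 }
];

// Keep the two regions with the highest sales.
const topRegions = sortAndLimit(salesByRegion, 'sales', false, 2)
  .map(function (r) { return r.region; });
console.log(topRegions);
```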
Call functions using the function name, and brackets enclosing parameters.
e.g. log([x]), indexOf([x], "a")
To evaluate one of two expressions based on a conditional, use the expression if [cond] then [conseq] else [altern]. Make sure that the [conseq] and [altern] are of the same type (both should be numbers, categories, or dates)
e.g. if indexOf([x], "a") == -1 then "aNotInX" else "aInX"
adding, subtracting, multiplying, dividing, and taking modulos of two numbers
evaluations of equalities and inequalities
takes the logarithm of a number
bins a number by the bin width. The binwidth should be a number
e.g.
bin(sales, 1000)
bin(10023, 1000) = 10000
bin(12523, 1000) = 12000
concatenates two strings
e.g. "A" ++ "B" = "AB"
take a substring of a string at the desired starting location and length
e.g. substr("hello", 1, 2) = "el"
finds the length of a string
e.g. length("hello") = 5
make all letters in a string upper case
e.g. upper("myString500") = "MYSTRING500"
make all letters in a string lower case
e.g. lower("myString500") = "mystring500"
return the first index at which a substring occurs in a string, or -1 if it does not exist
e.g.
indexOf("myString500", "String") = 2
indexOf("haystack", "needle") = -1
turn a string into a number
e.g. parseNum("500") = 500
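For readers who know JavaScript, the string functions and the conditional expression described above behave much like the built-ins below. This is an analogy written in plain JavaScript, not the library's implementation; the fn object and its arrow functions are defined here only for comparison, and edge cases (such as non-numeric input to parseNum) may differ in Polychart.js.

```javascript
// Plain-JavaScript analogues of the expression-language functions.
const fn = {
  concat: (a, b) => a + b,                                  // "A" ++ "B"
  substr: (s, start, length) => s.slice(start, start + length),
  length: (s) => s.length,
  upper: (s) => s.toUpperCase(),
  lower: (s) => s.toLowerCase(),
  indexOf: (s, sub) => s.indexOf(sub),                      // -1 if absent
  parseNum: (s) => Number(s)
};

console.log(fn.substr('hello', 1, 2));            // "el"
console.log(fn.indexOf('myString500', 'String')); // 2

// Analogue of: if indexOf([x], "a") == -1 then "aNotInX" else "aInX"
const x = 'banana';
const label = fn.indexOf(x, 'a') === -1 ? 'aNotInX' : 'aInX';
console.log(label);                               // "aInX"
```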
bins a date by the bin width, which is a string representing a time period. The binwidth can be: second, minute, hour, day, week, month, twomonth, quarter, sixmonth, year, twoyear, fiveyear, decade
extracts the year from a date
extracts the month from a date, in integers, where January is month 1
extracts the day of the month from a date
extracts the day of the year from a date
extracts the day of the week from a date, where Sunday is 0, Saturday is 6
extracts the hour from a date, from 0 to 23
extracts the minute from a date
extracts the second from a date
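The numbering conventions above (January is month 1, Sunday is day 0, hours run from 0 to 23) can be checked against the JavaScript Date object. dateParts and its property names are hypothetical and do not correspond to Polychart.js function names. The sketch works in UTC and assumes the day of the year starts at 1, which this page does not specify.

```javascript
// Hypothetical sketch: extract the date parts described above, in UTC.
function dateParts(d) {
  const startOfYear = Date.UTC(d.getUTCFullYear(), 0, 1);
  const msPerDay = 24 * 60 * 60 * 1000;
  return {
    year: d.getUTCFullYear(),
    month: d.getUTCMonth() + 1,   // JavaScript months are 0-based
    dayOfMonth: d.getUTCDate(),
    // Assumption: 1 January is day 1 of the year.
    dayOfYear: Math.floor((d.getTime() - startOfYear) / msPerDay) + 1,
    dayOfWeek: d.getUTCDay(),     // Sunday is 0, Saturday is 6
    hour: d.getUTCHours(),        // 0 to 23
    minute: d.getUTCMinutes(),
    second: d.getUTCSeconds()
  };
}

// Tuesday 5 March 2013, 14:30:15 UTC.
const parts = dateParts(new Date(Date.UTC(2013, 2, 5, 14, 30, 15)));
console.log(parts);
```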
The sum of a numeric variable over some grouping.
The total number of defined, non-null values over some grouping.
The mean of a numeric variable over some grouping.
The total number of unique defined, non-null values over some grouping.
The minimum of a numeric variable over some grouping.
The maximum of a numeric variable over some grouping.
The median of a numeric variable over some grouping.
Calculate the quantiles and outliers required for box plot.
Note that a box plot assumes that the box statistic is calculated for the y-mapping. Thus, the y-mapping for a box layer should be of the form box(var). The box statistic calculates the quantiles and outliers for each grouping. If you would like to specify the quantiles and outliers directly, you can use the following format:
{
q1: line_bottom,
q2: box_bottom,
q3: median,
q4: box_top,
q5: line_top,
outliers: [list, of, outliers]
}
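If you compute these values yourself, an object in the format above could be produced along the following lines. This is a sketch under stated assumptions, not the method Polychart.js uses internally: quartiles are taken by linear interpolation, and the whiskers follow the common 1.5 × IQR rule, with anything beyond them reported as an outlier. boxStats and quantile are hypothetical helpers, and the input is assumed to be a non-empty array of numbers.

```javascript
// Quantile of an ascending-sorted array, using linear interpolation.
function quantile(sorted, p) {
  const pos = (sorted.length - 1) * p;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Hypothetical sketch: build the q1..q5 and outliers object for one group.
// Assumption: whiskers end at the most extreme values within 1.5 * IQR.
function boxStats(values) {
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const boxBottom = quantile(sorted, 0.25);
  const median = quantile(sorted, 0.5);
  const boxTop = quantile(sorted, 0.75);
  const reach = 1.5 * (boxTop - boxBottom);
  const isInside = function (v) {
    return v >= boxBottom - reach && v <= boxTop + reach;
  };
  const inside = sorted.filter(isInside);
  return {
    q1: inside[0],                  // line_bottom
    q2: boxBottom,                  // box_bottom
    q3: median,                     // median
    q4: boxTop,                     // box_top
    q5: inside[inside.length - 1],  // line_top
    outliers: sorted.filter(function (v) { return !isInside(v); })
  };
}

const stats = boxStats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]);
console.log(stats);
```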