Skip to content
MB6 edited this page Mar 3, 2016 · 4 revisions

Commons Control Data Scraping

This document describes the different data types and protocols of each of the building systems.

Overview

The server copy of this repository is in /home/dataupload/code/commonssite. The server uses Django to handle HTTP requests and serving pages. It was written with Django 1.6, which is of course no longer supported, but upgrading would not be straightforward.

Django uses the concept of 'apps' to manage sub-modules of a site. We have one app per building system, plus a generic timeseries app that defines base classes from which other systems' classes are defined. In an attempt to make things extensible (that may have backfired), each building system's classes derive from timeseries base classes, and the the base class uses introspection to automatically build web forms to access data from any system (models defined in other apps that derive from timeseries.models.TimeseriesBase inherit a the time, temporary, and error fields automatically). We use UTC time stamps in the database.

The timeseries base classes also handle the logic for averaging data; we keep one data point ever 30 or 60 seconds, flagged with a temporary=1 field. The ScraperBase class in timeseries.scrapers has a function to average together multiple of these temporary data points into a single aggregate point (one per 20 minutes). For numeric data, it integrates over time and takes the average. For non-numeric data it computes the plurality vote. The results of this averaging are saved as a new data point flagged with temporary=0. Data flagged with temporary=1 that are older than 24 hours are removed.

A more robust system would have MySQL do these averaging calculations for us instead of pulling the data into python, possibly storing the results into a table or maybe using a cache. It would also allow more flexible time spans (instead of 20min aggregates, allow 1-week or 1-month worth of data to be summarized).

Look at /commonssite/settings.py - it contains local IP addresses of each subsystem and table names.

HVAC

Mitsubishi provides a web interface using Java Applets. Safe to say it is not well-designed. Authentication on the home page is a simple redirect, so it's super easy to get into the system without a password if you know where to point your requests. This same security-less interface lets you request data AND change settings, so anyone with the right address could, in theory, set all the rooms to 80 degrees remotely. Consider yourself warned..? The bright side of this is that it makes it, in theory, easy to write our own temperature control scripts if we ever wanted to.

The HVAC system (at the time of writing) serves data at 10.1.6.200:80. Using packet sniffing and some online resources, we were able to reverse-engineer the HTTP+XML protocol for getting and setting system values. There is a protocol doc you can purchase from Mitsubishi, but we got a lot of our information by reading someone else's open-source project to scrape data. That project no longer exists..

To query data, POST to `10.1.6.200:80/servlet/MIMEReceiveServlet' with the body:

<?xml version="1.0" encoding="UTF-8"?>
<Packet>
	<Command>getRequest</Command>
	<DatabaseManager>
		<Mnet Group="1" Bulk="*" />
		<Mnet Group="2" Bulk="*" />
		...
		<Mnet Group="15" Bulk="*" />
	</DatabaseManager>
</Packet>

and headers:

{
	'Content-type': 'text/xml',
	'Accept': 'text/html, image/gif, image/jpeg, *;'
}

Each Mnet tag specifies an air handler in the building. We have 15 of them in the commons, and their ids are:

1  : CCE
2  : Classroom 202
3  : Workshop
4  : 3rd Floor Office
5  : CMEE
6  : 2nd Floor Hall
7  : Classroom 201
8  : Mezzanine
9  : Project Space
10 : Control Room
11 : Mechanical Room
12 : Workshop ERV
13 : Classroom 202 ERV
14 : Project Space ERV
15 : Classroom 201 ERV

(1-11 are VRF). The response of this request is also in XML. Looking at the Mnet tags we get in the response, each will also have a Group attribute. No guarantee that they come back in the same order they were requested. Things get really messy here. Each Mnet tag reports back its status as one big concatenated hexadecimal string inside an attribute called Bulk. (Can you see why /hvac/scrapers.py is a big file? We did our best to abstract away all of this XML and hex mess, and even unit conversions...)

Example Request body

<?xml version="1.0" encoding="UTF-8"?>
<Packet>
	<Command>getRequest</Command>
	<DatabaseManager>
		<Mnet Group="1" Bulk="*"/>
		<Mnet Group="2" Bulk="*" />
		<Mnet Group="12" Bulk="*" />
		<Mnet Group="15" Bulk="*" />
	</DatabaseManager>
</Packet>

Example Response

<?xml version="1.0" encoding="UTF-8"?>\\r\\n<Packet>\\r\\n  <Command>getResponse</Command>\\r\\n  <DatabaseManager>\\r\\n    <Mnet Group="1" Bulk="010102110500C8000100000100000001001F0100640101010100010100000000192626171926055005000101010000000000FFFF000000000000000001000000000000011C050063000000000000FF000100007FFF7FFF7FFF7FFF7FFF0000000000007FFF0000007FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF"  />\\r\\n    <Mnet Group="2" Bulk="010102120000C3010100000000000001001F0100640101010100010100000000192626171926055005000101010000000000FFFF000000000000000001000000000000011C050063000000000000FF000100007FFF7FFF7FFF7FFF7FFF0000000000007FFF0000007FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF    <Mnet Group="12" Bulk="010080000000FA04010000000000000000020100640100000000000000010101000000000000000000000100010000000000FFFF000000000000000000000000000000011C05006300000000000080000000007FFF7FFF7FFF7FFF7FFF0000000000007FFF0000007FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF" />\\r\\n    <Mnet Group="13" Bulk="010080000000FA04010000000000000000020100640100000000000000010101000000000000000000000100010000000000FFFF000000000000000000000000000000011C05006300000000000080000000007FFF7FFF7FFF7FFF7FFF0000000000007FFF0000007FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF7FFF"  />\\r\\n  </DatabaseManager>\\r\\n</Packet>

This is not the easiest protocol to describe concisely. Take a look at server/hvac/scrapers.py - it is full of lookup tables and the like and may actually clarify some things. Here is how to decode the 124 bytes (248 ascii chars) in Bulk.

Address: 0  1  2  3  4  5  6  7  8  9  10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
Value:   01 01 02 11 05 00 C8 00 01 00 00 01 00 00 00 01 00 1F 01 00 64 01 01 01 01 00 01 01 00 00 00 00 19 26 26 17 19 26 05 50 05 00 01 01 01 00 00 00 00 00 FF FF 00 00 00 00 00 00 00 00 01 00 00 00 00 00 00 01 1C 05 00 63 00 00 00 00 00 00 FF 00 01 00 00 7F FF 7F FF 7F FF 7F FF 7F FF 00 00 00 00 00 00 7F FF  00  00  00  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF  7F  FF

Address | Length | Value
------------------------
0       | 1B     | n/a
1       | 1B     | hex id of 'Drive' (lookup table from id to value)
2       | 1B     | hex id of 'Mode' (lookup table from id to value)
3       | 2B     | hex val of 'SetTemp', where the second byte is the 'tenths' place. 0x1105 is 17.5 degrees C
5       | 2B     | hex val of 'InletTemp' x 10. 0x00C8 is 20.0 degrees C.
7       | 1B     | hex id 'AirDirection' (lookup table from id to value)
8       | 1B     | hex id 'FanSpeed' (lookup table from id to value)
9       | 1B     | hex val of 'RemoteControl'; special case with 0 ENABLED and 1 DISABLED
10      | 1B     | hex val of 'DriveItem'; 0=OFF 1=ON
11      | 1B     | hex val of 'ModeItem'; 0=OFF 1=ON
12      | 1B     | hex val of 'SetTempItem'; 0=OFF 1=ON
13      | 1B     | hex val of 'FilterItem'; 0=OFF 1=ON
14      | 1B     | hex id of 'Ventilation' (lookup table from id to value)
15      | 1B     | hex val of 'FilterSign' 0=OFF 1=ON 2=RESET
16      | 1B     | hex val of 'ErrorSign' 0=OFF 1=ON 2=RESET
17      | 1B     | hex id of 'Model' (lookup table from id to value)
18      | 1B     | hex val of 'ModeStatus' 0=DISABLED 1=ENABLED
19      | 1B     | hex val of 'MidTemp' 0=DISABLED 1=ENABLED
20      | 1B     | the field is 'ControlValue' but we don't know how to use it
21      | 1B     | hex val of 'Timer' 0=OFF 1=ON
22      | 1B     | hex id of 'IcKind' (lookup table from id to value)
23      | 1B     | hex val of 'AutoModeSW' 0=DISABLED 1=ENABLED
24      | 1B     | hex val of 'DryModeSW' 0=DISABLED 1=ENABLED
25      | 1B     | hex id of 'FanSpeedSW' (lookup table from id to value)
26      | 1B     | hex val of 'AirDirectionSW' 0=DISABLED 1=ENABLED
27      | 1B     | hex val of 'SwingSW' 0=DISABLED 1=ENABLED
28      | 1B     | hex val of 'VentilationSW' 0=DISABLED 1=ENABLED
29      | 1B     | hex val of 'BypassSW' 0=DISABLED 1=ENABLED
30      | 1B     | hex val of 'LcAutoSW' 0=DISABLED 1=ENABLED
31      | 1B     | hex val of 'HeatRecoverySW' 0=DISABLED 1=ENABLED

Now 32-40 are especially weird. Address 32-37 are base 10 integers for the temperature limits (celcius, of course) in the order

32 CoolMin
33 HeatMax
34 CoolMax
35 HeatMin
36 AutoMin
37 AutoMax

then bytes 38-40 are each subdivided into two 4-bit values that are the tenths place for each of the above. Basically they interleaved the bytes of each value, as if they didn't want us reverse-engineering it..

38, 1st 4 bits: hex tenths place of CoolMin
38, 2nd 4 bits: hex tenths place of HeatMax
39, 1st 4 bits: hex tenths place of CoolMax
39, 2nd 4 bits: hex tenths place of HeatMin
40, 1st 4 bits: hex tenths place of AutoMin
40, 2nd 4 bits: hex tenths place of AutoMax

An example would help here. Take the values from above:

... 32 33 34 35 36 37 38 39 40 ...
... 19 26 26 17 19 26 05 50 05 ...
    ^^                ^        CoolMin
       ^^              ^       HeatMax

Means CoolMin has a value of 19 + 0.1 * 0 = 19.0 degrees and HeatMax has a value of 26 + 0.1 * 5 = 26.5 degrees. Now, continuing with the table above with address 41 and above:

Address | Length | Value
------------------------
41      | 1B     | hex val of 'TurnOff' 0=OFF 1=ON
42      | 1B     | hex val of 'TempLimit' 0=DISABLED 1=ENABLED
43      | 1B     | hex val of 'TempDetail' 0=DISABLED 1=ENABLED
44      | 1B     | hex val of 'FanModeSW' 0=DISABLED 1=ENABLED
45      | 1B     | hex val of 'AirStageSW' 0=DISABLED 1=ENABLED
46      | 1B     | hex val of 'AirAutoSW' 0=DISABLED 1=ENABLED
47      | 1B     | hex val of 'FanAutoSW' 0=DISABLED 1=ENABLED
48+     | ???    | ???

...That was a lot. See hvac/scrapers.py for the lookup tables. The example above is parsed by our python code as

{
	'AirAutoSW': 'DISABLED',
	'AirDirection': 'Swing',
	'AirDirectionSW': 'ENABLED',
	'AirStageSW': 'DISABLED',
	'AutoMax': 26.5,
	'AutoMin': 19.0,
	'AutoModeSW': 'ENABLED',
	'BypassSW': 'DISABLED',
	'ControlValue': '0x64 (Unknown)',
	'CoolMax': 26.5,
	'CoolMin': 19.0,
	'Drive': 'ON',
	'DriveItem': 'OFF',
	'DryModeSW': 'ENABLED',
	'ErrorSign': 'OFF',
	'FanAutoSW': 'DISABLED',
	'FanModeSW': 'ENABLED',
	'FanSpeed': 'Mid-Low',
	'FanSpeedSW': '2-Stage',
	'FilterItem': 'OFF',
	'FilterSign': 'ON',
	'HeatMax': 26.5,
	'HeatMin': 17.0,
	'HeatRecoverySW': 'DISABLED',
	'IcKind': 'Normal',
	'InletTemp': 20.0,
	'LcAutoSW': 'DISABLED',
	'MidTemp': 'DISABLED',
	'Mode': 'Heat',
	'ModeItem': 'ON',
	'ModeStatus': 'ENABLED',
	'Model': 'IC',
	'RemoteControl': 'ENABLED',
	'SetTemp': 17.5,
	'SetTempItem': 'OFF',
	'SwingSW': 'ENABLED',
	'TempDetail': 'ENABLED',
	'TempLimit': 'ENABLED',
	'Timer': 'ON',
	'TurnOff': 'OFF',
	'Ventilation': 'OFF',
	'VentilationSW': 'DISABLED'
}

All of this is collected into two tables; one for vrf and one for erv. We index the tables both on timestamp and on the location in the building (the name field is a foreign key to the hvac_rooms table), and require each such pair to be unique.

Veris Electric

We get data from the Veris system by making HTTP GET requests to the local address 10.1.6.202:80. Basic username and password authentication is required. The username and password are kept in a private.py file on the server and ignored by version control.

GET requests are made to the URL http://10.1.6.202:80/setup/devicexml.cgi?ADDRESS=XYZ&TYPE=DATA, where XYZ is 2, 3, or 4 depending on which circuit breaker box you want to query (I have no idea why they aren't 1, 2, and 3). What you get back is XML. Here is an example response from querying address 2:

<?xml version="1.0" encoding="US-ASCII" ?>
<DAS>
	<name>Subcircuit Power Monitor</name>
	<serial>001EC6051892</serial>
	<devices><device>
		<name></name>
		<address>2</address>
		<type>Veris Branch Circuit Monitor Model E30A142, S/N=0x4E3A4B1C, Location= Panel #1</type>
		<class>39</class>
		<status>Unconfigured</status>
		<numpoints>374</numpoints>
		<records><record>
			<time zone="UTC">2016-01-31 16:22:21</time>
			<age units="seconds">29</age>
			<error text="Ok">0</error>
			<point number="0" name="Channel #1" units="Amps" value="0"  /> 
			<point number="1" name="Channel #1 Max" units="Amps" value="0"  /> 
			<point number="2" name="Channel #1 Demand" units="Amps" value="0"  /> 
			<point number="3" name="Channel #1 Energy" units="kWh" value="0"  /> 
			<point number="4" name="Channel #1 Power" units="kW" value="0"  /> 
			<point number="5" name="Channel #1 Power Max" units="kW" value="0"  /> 
			<point number="6" name="Channel #1 Power Demand" units="kW" value="0"  /> 
			<point number="7" name="Channel #1 Power Factor" units="" value="0"  />
			...
			<point number="328" name="Channel #42" units="Amps" value="0.530" alarm="low" /> 
			<point number="329" name="Channel #42 Max" units="Amps" value="0.530"  /> 
			<point number="330" name="Channel #42 Demand" units="Amps" value="0.530"  /> 
			<point number="331" name="Channel #42 Energy" units="kWh" value="1362.501"  /> 
			<point number="332" name="Channel #42 Power" units="kW" value="0.050"  /> 
			<point number="333" name="Channel #42 Power Max" units="kW" value="0.050"  /> 
			<point number="334" name="Channel #42 Power Demand" units="kW" value="0.050"  /> 
			<point number="335" name="Channel #42 Power Factor" units="" value="0.706"  /> 
			<point number="336" name="Frequency" units="Hz" value="59.88"  /> 
			<point number="337" name="Volts L-N 3ph Ave" units="Volts" value="122.16"  /> 
			<point number="338" name="Volts L-L 3ph Ave" units="Volts" value="211.60"  /> 
			<point number="339" name="Volts A-N" units="Volts" value="121.84"  /> 
			<point number="340" name="Volts B-N" units="Volts" value="122.72"  /> 
			<point number="341" name="Volts C-N" units="Volts" value="121.93"  /> 
			<point number="342" name="Volts A-B" units="Volts" value="212.01"  /> 
			<point number="343" name="Volts B-C" units="Volts" value="212.00"  /> 
			<point number="344" name="Volts C-A" units="Volts" value="210.79"  /> 
			<point number="345" name="3ph kWh" units="kWh" value="41928.91"  /> 
			<point number="346" name="3ph Total kW" units="kW" value="0.78"  /> 
			<point number="347" name="3ph Total PF" units="" value="0.83"  /> 
			<point number="348" name="3ph Ave Current" units="Amps" value="2.56"  /> 
			<point number="349" name="kW Phase 1" units="kW" value="0.26"  /> 
			<point number="350" name="kW Phase 2" units="kW" value="0.25"  /> 
			<point number="351" name="kW Phase 3" units="kW" value="0.27"  /> 
			<point number="352" name="PF Phase 1" units="" value="0.86"  /> 
			<point number="353" name="PF Phase 2" units="" value="0.86"  /> 
			<point number="354" name="PF Phase 3" units="" value="0.78"  /> 
			<point number="355" name="Current Phase 1" units="Amps" value="2.49"  /> 
			<point number="356" name="Current Phase 2" units="Amps" value="2.39"  /> 
			<point number="357" name="Current Phase 3" units="Amps" value="2.81"  /> 
			<point number="358" name="Current Phase 4" units="Amps" value="0.32"  /> 
			<point number="359" name="Current Demand Phase 1" units="Amps" value="2.47"  /> 
			<point number="360" name="Current Demand Phase 2" units="Amps" value="2.41"  /> 
			<point number="361" name="Current Demand Phase 3" units="Amps" value="2.49"  /> 
			<point number="362" name="Current Demand Phase 4" units="Amps" value="0.32"  /> 
			<point number="363" name="Max Current Demand Phase 1" units="Amps" value="43.39"  /> 
			<point number="364" name="Max Current Demand Phase 2" units="Amps" value="50.98"  /> 
			<point number="365" name="Max Current Demand Phase 3" units="Amps" value="41.49"  /> 
			<point number="366" name="Max Current Demand Phase 4" units="Amps" value="0.35"  /> 
			<point number="367" name="3ph Present KW Total Demand" units="kW" value="0.73"  /> 
			<point number="368" name="3ph Max KW Total Demand" units="kW" value="16.24"  /> 
			<point number="369" name="Max Current Phase 1" units="Amps" value="2.51"  /> 
			<point number="370" name="Max Current Phase 2" units="Amps" value="2.46"  /> 
			<point number="371" name="Max Current Phase 3" units="Amps" value="4.07"  /> 
			<point number="372" name="Max Current Phase 4" units="Amps" value="0.32"  /> 
			<point number="373" name="3ph Max KW Total" units="kW" value="0.90"  /> 
		</record></records>
	</device></devices>
</DAS>

(where channels 2 through 41 were removed). Each "channel" data point corresponds to a single electrical line. 2- or 3-phase circuits will show up across 2 or 3 channels. Starting with "Frequency," we get summary data that describes the entire circuit box. In our database, there is one table for circuits and one for this summary data. The circuits table, analogous to the HVAC system, is indexed both by time and by channel number, and these must be jointly unique.

SMA Solar

SMA uses "RPC" (remote procedure call) to query data. Basically this means that we make HTTP POST requests to commonscontrol.harleyschool.org:80/solar/rpc, and the thing we're querying for is encoded in the POST body as RPC={...}, where we fill in the curly braces with a JSON object. Like the electric system, we have separate tables that distinguish between per-panel data and one that summarizes across all of them. In total, there are three tables containing data for this system: per-panel, overview, and weather (the solar panels came with their own wind-speed and temperature sensors, not to be confused with the standalone Weather system which is by Vantage).

Each JSON object begins with this basic content:

{
	'version' : '1.0',
	'proc' : '', # set by procedure functions
	'id' : '0',  # should probably set something smart
	'format' : 'JSON',
	'passwd' : <MD5 Hash of the password found in private.py>
	'params' : {}
}

..to which we add other fields depending on which type of data we are requesting. When it is initialized, our scraper starts by querying which "devices" are connected by adding proc=GetDevices and id=2 fields. SMA provides its own weather station in addition to power data, and these are the two "devices" we should see in the result of this query. In practice, we shouldn't need to make this query often since the result is always the following, assuming nothing has gone terribly wrong:

{
	"format" : "JSON",
	"result" : {
		"totalDevicesReturned" : 2,
		"devices" : [
			{"key" : "SENS0700 : 26877",
			 "name" : "SENS0700 : 26877"},
			{"key" : "WRHV5K84 : 191201464",
			 "name" : "WRHV5K84 : 191201464"}
		]
	},
	"proc" : "GetDevices",
	"version" : "1.0",
	"id" : null
}

Where device "SENS0700 : 26877" is weather data and device "WRHV5K84 : 191201464" is the panel power data. In the file solar/scrapers.py, we hard-coded these keys. To query the system for actual data, we make a very similar request.. the full request body looks like this:

RPC={
	'version' : '1.0',
	'proc' : 'GetProcessData',
	'id' : '1',
	'format' : 'JSON',
	'passwd' : <MD5 Hash of the password found in private.py>
	'params' : {
		'devices' : [{'key' : 'SENS0700 : 26877'},{'key' : 'WRHV5K84 : 191201464'}]
	}
}

(note that only 'id', 'proc', and 'params' have to be set and the rest is identical to the 'basic content' described above). Here is an example response to this query:

{"format":"JSON",
"result":{
	"devices":[
		{"key":"SENS0700:26877",
		"channels":[
			{"unit":"W/m^2","meta":"ExlSolIrr","name":"ExlSolIrr","value":""},
			{"unit":"W/m^2","meta":"IntSolIrr","name":"IntSolIrr","value":""},
			{"unit":"h","meta":"SMA-h-On","name":"SMA-h-On","value":""},
			{"unit":"\xb0C","meta":"TmpAmb C","name":"TmpAmb C","value":""},
			{"unit":"\xb0C","meta":"TmpMdul C","name":"TmpMdul C","value":""},
			{"unit":"m/s","meta":"WindVel m/s","name":"WindVel m/s","value":""}]},
		{"key":"WRHV5K84:191201464",
		"channels":[
			{"unit":"A","meta":"A.Ms.Amp","name":"A.Ms.Amp","value":"8.620"},
			{"unit":"V","meta":"A.Ms.Vol","name":"A.Ms.Vol","value":"348.200"},
			{"unit":"W","meta":"A.Ms.Watt","name":"A.Ms.Watt","value":"2999"},
			{"unit":"A","meta":"A1.Ms.Amp","name":"A1.Ms.Amp","value":"8.620"},
			{"unit":"A","meta":"B.Ms.Amp","name":"B.Ms.Amp","value":"6.578"},
			{"unit":"V","meta":"B.Ms.Vol","name":"B.Ms.Vol","value":"345.830"},
			{"unit":"W","meta":"B.Ms.Watt","name":"B.Ms.Watt","value":"2273"},
			{"unit":"A","meta":"B1.Ms.Amp","name":"B1.Ms.Amp","value":"8.620"},
			{"unit":"","meta":"Error","name":"Error","value":"-------"},
			{"unit":"kWh","meta":"E-Total","name":"E-Total","value":"47103.943"},
			{"unit":"","meta":"GM.TotS0Out","name":"GM.TotS0Out","value":"0"},
			{"unit":"Wh","meta":"GM.TotWhOut","name":"GM.TotWhOut","value":"0"},
			{"unit":"A","meta":"GridMs.A.phsA","name":"GridMs.A.phsA","value":"5.940"},
			{"unit":"A","meta":"GridMs.A.phsB","name":"GridMs.A.phsB","value":"5.996"},
			{"unit":"A","meta":"GridMs.A.phsC","name":"GridMs.A.phsC","value":"5.954"},
			{"unit":"Hz","meta":"GridMs.Hz","name":"GridMs.Hz","value":"59.960"},
			{"unit":"V","meta":"GridMs.PhV.phsA","name":"GridMs.PhV.phsA","value":"285.820"},
			{"unit":"V","meta":"GridMs.PhV.phsB","name":"GridMs.PhV.phsB","value":"285.070"},
			{"unit":"V","meta":"GridMs.PhV.phsC","name":"GridMs.PhV.phsC","value":"284.530"},
			{"unit":"","meta":"GridMs.TotPF","name":"GridMs.TotPF","value":"0.999"},
			{"unit":"VA","meta":"GridMs.TotVA","name":"GridMs.TotVA","value":"5101"},
			{"unit":"VAr","meta":"GridMs.TotVAr","name":"GridMs.TotVAr","value":"-5"},
			{"unit":"VA","meta":"GridMs.VA.phsA","name":"GridMs.VA.phsA","value":"1698"},
			{"unit":"VA","meta":"GridMs.VA.phsB","name":"GridMs.VA.phsB","value":"1709"},
			{"unit":"VA","meta":"GridMs.VA.phsC","name":"GridMs.VA.phsC","value":"1694"},
			{"unit":"VAr","meta":"GridMs.VAr.phsA","name":"GridMs.VAr.phsA","value":"-1"},
			{"unit":"VAr","meta":"GridMs.VAr.phsB","name":"GridMs.VAr.phsB","value":"-4"},
			{"unit":"VAr","meta":"GridMs.VAr.phsC","name":"GridMs.VAr.phsC","value":"0"},
			{"unit":"W","meta":"GridMs.W.phsA","name":"GridMs.W.phsA","value":"1698"},
			{"unit":"W","meta":"GridMs.W.phsB","name":"GridMs.W.phsB","value":"1709"},
			{"unit":"W","meta":"GridMs.W.phsC","name":"GridMs.W.phsC","value":"1694"},
			{"unit":"","meta":"Inv.TmpLimStt","name":"Inv.TmpLimStt","value":"NoneDrt"},
			{"unit":"","meta":"InvCtl.Stt","name":"InvCtl.Stt","value":"On"},
			{"unit":"A","meta":"Iso.FltA","name":"Iso.FltA","value":"0.022"},
			{"unit":"","meta":"Mode","name":"Mode","value":"Mpp"},
			{"unit":"h","meta":"Mt.TotOpTmh","name":"Mt.TotOpTmh","value":"8781.3257832"},
			{"unit":"h","meta":"Mt.TotTmh","name":"Mt.TotTmh","value":"9111.4599696"},
			{"unit":"","meta":"Op.BckOpStt","name":"Op.BckOpStt","value":"ModGri"},
			{"unit":"","meta":"Op.EvtCntIstl","name":"Op.EvtCntIstl","value":"24"},
			{"unit":"","meta":"Op.EvtCntUsr","name":"Op.EvtCntUsr","value":"4497"},
			{"unit":"","meta":"Op.EvtNo","name":"Op.EvtNo","value":"0"},
			{"unit":"","meta":"Op.EvtNoDvlp","name":"Op.EvtNoDvlp","value":"0"},
			{"unit":"","meta":"Op.GriSwCnt","name":"Op.GriSwCnt","value":"861"},
			{"unit":"","meta":"Op.GriSwStt","name":"Op.GriSwStt","value":"Cls"},
			{"unit":"","meta":"Op.Health","name":"Op.Health","value":"Ok"},
			{"unit":"","meta":"Op.Prio","name":"Op.Prio","value":"NonePrio"},
			{"unit":"s","meta":"Op.TmsRmg","name":"Op.TmsRmg","value":"0"},
			{"unit":"W","meta":"Pac","name":"Pac","value":"5101"},
			{"unit":"","meta":"PCM-DigInStt","name":"PCM-DigInStt","value":"D0"},
			{"unit":"","meta":"PlntCtl.Stt","name":"PlntCtl.Stt","value":"On"},
			{"unit":"kOhm","meta":"Riso","name":"Riso","value":"3000.000"},
			{"unit":"","meta":"Serial Number","name":"Serial Number","value":"191201464"}]
		}
	]
},
"proc":"GetProcessData",
"version":"1.0",
"id":null;}

We used these references to decode the result: http://files.sma.de/dl/1348/NG_PAR-TB-en-22.pdf, http://files.sma.de/dl/15330/SB3-6TL-21-Parameter-TI-en-10W.pdf, and the solar/scrapers.py script has a map of SMA keys to field names for the data objects (remember, django uses ORM). Not all of these fields are stored in our database (none of the "meta" ones, e.g.).

So that is how we query for per-panel and weather data. Solar overview data takes a separate query, in which we set proc="GetPlantOverview" and id="4" (params is left as an empty object). Here is an example response from this request:

{"format":"JSON",
"result":{
	"overview":[
		{"unit":"W","meta":"GriPwr","name":"GriPwr","value":"14496"},
		{"unit":"kWh","meta":"GriEgyTdy","name":"GriEgyTdy","value":"13.683"},
		{"unit":"kWh","meta":"GriEgyTot","name":"GriEgyTot","value":"47106.451"},
		{"unit":"","meta":"OpStt","name":"OpStt","value":""},
		{"unit":"","meta":"Msg","name":"Msg","value":""}
	]
},
"proc":"GetPlantOverview",
"version":"1.0",
"id":"4"}

Greenhouse Arduino

This is probably the easiest system to interface with because of how simple it is under the hood. Just point a GET request to the local address http://10.1.6.204:80, and you'll get a JSON response back:

{
	"HUM_EAST_CHIMNEY_LOW":{"t":1454261405,"v":"40.00"},
	"TEMP_EAST_CHIMNEY_LOW":{"t":1454261405,"v":"60.80"},
	"HUM_EAST_CHIMNEY_HIGH":{"t":1454261406,"v":"27.60"},
	"TEMP_EAST_CHIMNEY_HIGH":{"t":1454261406,"v":"71.06"},
	"HUM_WEST_CHIMNEY_LOW":{"t":1454261402,"v":"37.90"},
	"TEMP_WEST_CHIMNEY_LOW":{"t":1454261402,"v":"61.70"},
	"HUM_WEST_CHIMNEY_HIGH":{"t":1454261403,"v":"26.60"},
	"TEMP_WEST_CHIMNEY_HIGH":{"t":1454261403,"v":"72.50"},
	"HUM_GREENHOUSE_LOW":{"t":1454261403,"v":"33.30"},
	"TEMP_GREENHOUSE_LOW":{"t":1454261403,"v":"67.82"},
	"HUM_MEZZANINE":{"t":1454261404,"v":"35.10"},
	"TEMP_MEZZANINE":{"t":1454261404,"v":"68.00"},
	"HUM_GREENHOUSE_HIGH":{"t":1454261405,"v":"34.00"},
	"TEMP_GREENHOUSE_HIGH":{"t":1454261405,"v":"69.62"}
}

where each field tells you the timestamp ('t') of when it was last measured and a value ('v'). Humidity is in percent, and temperature is in degrees farenheit.

The catch is that the Arduino driving this system has no internal clock. If it is repowered, the timestamps will start counting from zero again. One workaround would be to just assume all measurements are current. Or, use the interface we added to tell the Arduino what time it is: just add /<timestamp of right now> to the URL (all timestamps in epoch seconds, not milliseconds). Since the Arduino knows how long it has been live, it can go back and update the timestamps. From there, it will try to keep track of time on its own, but it's not a bad idea to send a timestamp with every request in case it drifts over time (it's rather simple microcontroller, after all).

Note: I put a good deal of effort into making an Arduino library that abstracts away the messy details of handling requests, JSON formatting, and updating timestamps. The goal of making this simple is to make the process of adding new custom-built sensors to the building as easy as possible, provided you're willing to use an Arduino+ethernet. This special Arduino library is on the HarleySchool github, in which these greenhouse sensors are included as an example.

Weather Station

The weather station is a different beast. It rarely responds to requests quickly and this lag is responsible for crashing our scraper program a few times in its early days. Interfacing with the Vantage box ourselves would have been a big undertaking. Luckily, there is an open-source community called WeeWX that has built a very good high-level API for it. WeeWX is written in Python, which made it super easy to drop into our scraper program:

from weewx.drivers.vantage import Vantage

...

v = Vantage(type='ethernet', host=<HOST>, max_retries=2, wait_before_retry=2.4)
data = next(v.genDavisLoopPackets())

Where <HOST> is "10.1.6.203" locally. After waiting a few seconds for the protocol dance to finish, the data we get back is a simple python dictionary (key-value pairs). We then just map the WeeWX keywords to our own (ok, we just set them all to lower case), then create and store a row. Easy.

Given the large number of problems, we wrapped this in a try-catch block that closed and reopened connections whenever we had a "Wakeup" error (i.e. couldn't connect).

Solar Hot Water

The solar hot water system logs data onto an SD card. We wrote a scraper to process the data files that are on it and add the rows to the database, but this is rarely done (nobody was put in charge of physically taking the SD card and running this script on it once a month). Potentially someday we could use a wifi SD card to make data "live" like the other systems.