Plugin Results Format

Initial stage

Currently only Python 3.7 code is supported, so this is the first thing to install on your local workstation. Then download this requirements.txt file so you have the correct libraries at their correct versions, and install them using pip install -r requirements.txt.

A plugin receives the input manifest as its first command-line argument and the output file path as its second argument. A report plugin run would look like this:

python main.py manifest.json results.json

A basic Python plugin would look like this:

import sys
import json

with open(sys.argv[1]) as data_file:
    manifest = json.load(data_file)

output = {
    "data": {
        "greeting": "Hello",
        "name": "world"
    },
    "status": {
        "code": "success",
        "title": None,
        "explanation": None,
        "backtrace": None
    },
    "js": "return {greetWith: results.data.greeting}",
    "jsx": '''
        <Insight>
            { results.helpers.renderHello(data.greetWith, results.data.name) }
        </Insight>
    ''',
    "helper": '''
        class {
            constructor(results) {
            }

            renderHello(greeting, name) {
                return <p>{greeting} {name}!</p>;
            }
        }
    '''
}

with open(sys.argv[2], 'w') as f:
    f.write(json.dumps(output))

It's important to note that the output JSON should always be written to argv[2]; in production this location will be writable. In general, any files created during your plugin run should be written to the current working directory, not to absolute or other paths, because those locations will not exist or will not be writable when your plugin runs on the platform in production.

First we read the JSON manifest file. Then we construct a basic JSON plugin results object, shown in the output variable in the example above. The data key can contain anything you like and will be available as-is in the context of the js, jsx and helper keys.

A plugin has different JavaScript contexts to use code within, each defined by their respective keys as a string in the plugin JSON result object. When a plugin runs as stand-alone, it first evaluates the js context and then jsx to display the visualizations — each of which can call any optionally defined helpers:

  1. js — here we have access to a global results variable: an object containing the plugin results. This results object has two properties: data, which is whatever data was returned from Python in the output JSON, and helpers, a helper class that can optionally be provided in the output JSON.
    The object returned in js will be available in the jsx context as the data variable. Normally the js property is not used; it's best to use the helper instead, because that allows for better re-use of any helper code. If used, it's important that the JS code ends with actual code, not comments. See the third item below.

  2. jsx — here we render any JSX components/tags to visualize our plugin. For a full overview of JSX components and API functions, see the API library. Here we have access to the same global results variable. In addition we also have access to a variable called data, which contains whatever was returned from the js code. In most cases js should not be used, so there is no returned data available; use the helper class instead, as this allows for easier code re-use, especially when this plugin is available to third-parties.

  3. helper — this is the main glue of the plugin, and defines functions that can be re-used in the jsx or js code parts. Within the helper code, you can use this to call another helper function. All helpers are also pre-initialized with this.results, which gives you access to the plugin results object. The helper functions can also be documented in a JSON format, making them available to third-parties, where they can be re-used in the Insight API with fully ready-to-use code examples. See the Plugin Helpers documentation for more details.

Finally, a plugin output should always contain a valid status object. In case of no errors, it can simply look like this:

"status": {
"code": "success"
},

In case of an error, its most basic format would be:

"status": {
"code": "error",
},

You can also provide a friendly title for end-users in the title key, together with a more detailed error message for the end-user in explanation. Any backtrace can be put into the backtrace field. All fields are optional, so you can supply whatever you have available. A full error would look like this:

"status": {
"code": "error",
"title": "Not enough days of data.",
"explanation": "You need at least 14 days of data to make a forecast",
"backtrace": '''
Traceback (most recent call last):
File "main.py", line 4, in <module>
make_forecast(manifest)
File "main.py", line 2, in forecast
forecast(data, 20)
DataError: less than 20 samples, cannot predict.
'''
},
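In practice a plugin would build this status object from a caught exception. A minimal sketch, where error_status is a hypothetical helper (not part of the platform API) that just fills the status fields described above:

```python
import traceback

# Hypothetical helper: builds the documented status object from a caught
# Python exception. The field names come from the docs above; the helper
# itself is an illustration, not a platform function.
def error_status(exc, title=None, explanation=None):
    return {
        "code": "error",
        "title": title,
        "explanation": explanation or str(exc),
        "backtrace": "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        ),
    }

try:
    raise ValueError("less than 20 samples, cannot predict")
except ValueError as exc:
    output = {"data": {}, "status": error_status(exc, title="Not enough days of data.")}
```

The resulting output dict can then be serialized to the argv[2] results file as usual.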

While the plugin code above shows how to use the full js, jsx and helper formats, normally you would only use the jsx and helper parts. The code would then be structured more like this:

{
    ...,
    "data": {
        "greeting": "Hello",
        "name": "world",
        "count": 1
    },
    "jsx": '''
        <Insight>
            { results.helpers.renderHello() }
        </Insight>
    ''',
    "helper": '''
        class {
            constructor(results) {
                this.someCalculation = results.data.count * 2;
            }

            multiplySomeCalculation(factor) {
                return this.someCalculation * factor;
            }

            renderHello() {
                return <p>
                    {this.results.data.greeting} {this.results.data.name}!
                    My result is {this.multiplySomeCalculation(5)}.
                </p>;
            }
        }
    '''
}

Helper functions should not be used to do intensive work; that belongs in the plugin itself. The helpers should be a glue layer that renders components, visualizes data by mapping your custom objects to the formats of the different visualization components, and exposes convenience functions for users of the plugin, which could be yourself or third-parties in case you publish the template.

Within js, jsx and helper there is access to a range of visualization components, utility methods and libraries to make developing plugins as easy as possible. See the API library for more details.

Additional stages

Specifying datasets

If you need to run additional stages, the plugin results JSON of the initial stage is the place to indicate so. You can use a process object on the root object to indicate what additional stages are needed, and what datasets each of them needs.

If you don't provide a process object, the plugin finishes after the initial stage is executed.

In the example below we want to process one more stage named trainMore, using a dataset that reflects the latest state of users, accessible via the latestData key when the plugin receives the JSON manifest in the next trainMore run:

{
    ...,
    "process": {
        "trainMore": {
            "dataSets": {
                "latestData": {"type": "latest"}
            }
        }
    }
}

We can specify other moment types for datasets. Below we show all possible combinations; the stage will have access to all four datasets. A dataset of type latest means the current state of users. Type since takes seconds, stating the number of seconds since user creation.

pctOfConvertedToMeasure means that we first calculate, for all users, the time in seconds it took them to convert (since their creation), and then take the 5th percentile of that (1.0 - 0.95). The resulting 5th percentile is then used as the number of seconds since user creation for the dataset, so that only actions done up to that number of seconds are measured. Conversion is by default "where": "y_value='true'", but where can be omitted in most cases. If you have a special query you want to run on the dataset, in terms of which users to filter for, you can do that there. The where is run on the table and columns as described in Dataset and features. Finally, be aware that pctOfConvertedToMeasure and its "where": "y_value='true'" filter are always run against whatever the initial dataset was (0 seconds, latest or 95%).

{
    ...,
    "process": {
        "trainMore": {
            "dataSets": {
                "latestData": {"type": "latest"},
                "5pctData": {"type": "since", "pctOfConvertedToMeasure": 0.95, "where": "y_value='true'"},
                "60secData": {"type": "since", "seconds": 60},
                "0secData": {"type": "since", "seconds": 0}
            }
        }
    }
}
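The percentile arithmetic behind pctOfConvertedToMeasure can be sketched in plain Python. This is illustrative only: the platform computes this server-side, and the simple nearest-rank percentile used here is an assumption about the method, not a documented implementation:

```python
# Illustrative only: given the seconds-to-convert of the converted users,
# pctOfConvertedToMeasure: 0.95 means we keep the 5th percentile
# (1.0 - 0.95) as the measurement cutoff in seconds.
def seconds_cutoff(convert_times_seconds, pct_of_converted_to_measure):
    times = sorted(convert_times_seconds)
    q = 1.0 - pct_of_converted_to_measure  # 0.95 -> 0.05 (5th percentile)
    index = max(0, min(len(times) - 1, int(round(q * (len(times) - 1)))))
    return times[index]

# With these conversion times, the 5th percentile lands on the fastest user,
# so only the first 30 seconds of each user's actions would be measured.
cutoff = seconds_cutoff([30, 60, 90, 120, 600, 1800, 3600, 7200], 0.95)
```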

Specifying stages

After the initial stage, we can run a few more stages. Each additional key under process means one additional stage, and each stage can have a number of (different) dataSets. The additional stages run in parallel and do not have access to data from other additional stages; they can only access data from the initial stage. More on that in the next section on Plugin Storage.

To know which stage the current plugin run is in, parse the JSON manifest and read the stage key. It returns a string, set to initial for the initial stage, and for any additional stages whatever was set as the stage key on the process object. This can be used to run a different analysis, with different datasets, depending on the stage. For a prediction plugin, the initial stage can be used to figure out what additional datasets are needed; then an actual training stage trains the model. Finally, a server stage can be used to run an HTTP server that responds to realtime prediction requests or sets a prediction score for all users in the dataset; more on that in the Deployment section and Batch updating section.
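The stage dispatch described above can be sketched as follows. Only the stage manifest field and the output shape come from this documentation; the stage names and branch bodies are placeholder assumptions:

```python
# Hypothetical dispatch on the manifest's "stage" key. The stage name
# "train" and the branch bodies are placeholders for illustration.
def run(manifest):
    stage = manifest["stage"]
    if stage == "initial":
        # First run: request one extra stage named "train".
        return {
            "data": {},
            "status": {"code": "success"},
            "process": {"train": {"dataSets": {"latestData": {"type": "latest"}}}},
        }
    if stage == "train":
        # Additional run: do the actual work with the extra dataset.
        return {"data": {"model": "trained"}, "status": {"code": "success"}}
    return {"status": {"code": "error", "title": "Unknown stage: " + stage}}

initial_result = run({"stage": "initial"})
train_result = run({"stage": "train"})
```

In a real plugin the manifest would be loaded from argv[1] and the returned dict written to argv[2], as shown in the initial-stage example.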

Below is an example that runs three additional stages, where each one has access to its own dataset plus the latest one.

Note that each process key needs to be unique. Within dataSets, re-using a key in another stage means that the same dataset is re-used there. If you need a different dataset specification, make sure the key under dataSets is unique across the whole JSON object.

{
    ...,
    "process": {
        "stage1": {
            "dataSets": {
                "60secData": {"type": "since", "seconds": 60},
                "latestData": {"type": "latest"}
            }
        },
        "stage2": {
            "dataSets": {
                "180secData": {"type": "since", "seconds": 180},
                "latestData": {"type": "latest"}
            }
        },
        "stage3": {
            "dataSets": {
                "300secData": {"type": "since", "seconds": 300},
                "latestData": {"type": "latest"}
            }
        }
    }
}
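In the example above, latestData is re-used with the same spec in every stage, which is allowed. A quick sanity check for the uniqueness rule can be sketched like this; validate_datasets is a hypothetical helper, not a platform function:

```python
# Hypothetical sanity check: a dataSets key may be re-used across stages
# only if it refers to the exact same dataset specification.
def validate_datasets(process):
    seen = {}
    for stage_name, stage_spec in process.items():
        for key, dataset in stage_spec["dataSets"].items():
            if key in seen and seen[key] != dataset:
                raise ValueError(
                    "dataSets key %r re-used with a different spec" % key
                )
            seen[key] = dataset

process = {
    "stage1": {"dataSets": {"60secData": {"type": "since", "seconds": 60},
                            "latestData": {"type": "latest"}}},
    "stage2": {"dataSets": {"180secData": {"type": "since", "seconds": 180},
                            "latestData": {"type": "latest"}}},
}
validate_datasets(process)  # passes: "latestData" is the same spec everywhere
```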

The JSON manifest that the plugin receives as input for each additional stage has the specified dataUrls accessible under the keys that were outputted under dataSets in the process object. In addition, the initial dataset is always available too.

Preparing stages and datasets during development

Normally your plugin will be run on the platform, and the preparation of datasets for additional stages is done automatically. During local development of your plugin, you need to trigger the dataset preparation and get the new JSON manifest for the next stages.

Send the plugin's JSON output file as the JSON body, as shown below, where we first run the plugin's initial stage. Note again that these two steps only need to be run manually when developing your plugin locally; once your plugin is imported in a template onto the platform, these steps are executed automatically:

python main.py manifest.json results.json
curl -X "POST" "https://www.stormly.com/api/developer/process_result/initial" \
    -H 'X-Project-Key: abcd12345' \
    -H 'X-Dataset-Key: abcdefghjklm12346789' \
    -H 'Content-Type: application/json' \
    -d @results.json

This returns {"status": "preparing"} or {"status": "ready"}. This step may take anywhere from a few seconds to a few minutes, depending on the size of your dataset.

Note that the last part of the URL indicates which stage we are processing for, in this case initial, because we are processing results for the initial plugin run stage. Later we can POST results for other stages, as specified under the process object; this is mostly used for deployable and batching plugins. More on this later in their respective sections.
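During local development, the preparing/ready loop can be automated. A minimal sketch, with the HTTP POST injected as a plain callable so nothing here depends on a live endpoint; in practice post_result might be something like lambda: requests.post(url, headers=..., data=body).json():

```python
import time

# Sketch of polling process_result until the dataset is ready. The HTTP
# call is injected as a callable so the loop itself can be shown and
# tested offline; the {"status": ...} shape comes from the docs above.
def wait_until_ready(post_result, attempts=30, delay=1.0):
    for _ in range(attempts):
        if post_result().get("status") == "ready":
            return True
        time.sleep(delay)
    return False

# Simulated endpoint: two "preparing" polls, then "ready".
responses = iter([{"status": "preparing"}, {"status": "preparing"}, {"status": "ready"}])
ready = wait_until_ready(lambda: next(responses), delay=0)
```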

The next step will be to simulate a plugin run of any additional stage, like stage1 for example. To get your JSON manifest for stage1 make a GET request to https://www.stormly.com/api/developer/get_manifest/stage1. You will get a status message in case the dataset is still being prepared. For example with curl:

curl "https://www.stormly.com/api/developer/get_manifest/stage1" \
-H 'X-Project-Key: abcd12345' \
-H 'X-Dataset-Key: abcdefghjklm12346789'

Below is shown how an additional-stage JSON manifest could look. It is the same as for the initial stage, with the difference that we now have "stage": "stage1", and dataUrls containing initial plus any additional datasets requested for stage1 (60secData and latestData in this case):

{
    "stage": "stage1",
    "dataUrls": {
        "initial": "https://data.stormly.com/api/plugin/query_dataset/abcdef123456",
        "60secData": "https://data.stormly.com/api/plugin/query_dataset/ghijkl789012",
        "latestData": "https://data.stormly.com/api/plugin/query_dataset/mnopqrst345678"
    },
    "downloadUrls": {
        "initial": "https://s3.eu-central-1.amazonaws.com/storage.stormly.com/abcd/1234",
        "stage1": "https://s3.eu-central-1.amazonaws.com/storage.stormly.com/abcd/5678"
    },
    "getUploadUrls": {
        "initial": "https://www.stormly.com/api/developer/upload_url/abcd/1234",
        "stage1": "https://www.stormly.com/api/developer/upload_url/abcd/5678"
    },
    "inputData": {},
    "inputParams": {
        "max_items": 4.0
    },
    "metadata": {
        "datasets": {
            "initial": ...,
            "60secData": ...,
            "latestData": ...
        },
        "goal": ...,
        "features": ...
    }
}
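Inside the plugin, picking the right dataset URL out of such a manifest is a dictionary lookup. A small sketch using the placeholder URLs from the example above (in a real run the manifest would be loaded from argv[1]):

```python
import json

# The JSON below mirrors the example manifest above, with the same
# placeholder URLs; only the fields shown in the docs are used.
manifest = json.loads("""
{
  "stage": "stage1",
  "dataUrls": {
    "initial": "https://data.stormly.com/api/plugin/query_dataset/abcdef123456",
    "60secData": "https://data.stormly.com/api/plugin/query_dataset/ghijkl789012",
    "latestData": "https://data.stormly.com/api/plugin/query_dataset/mnopqrst345678"
  },
  "inputParams": {"max_items": 4.0}
}
""")

stage = manifest["stage"]
# The initial dataset is always present; stage-specific ones are keyed
# by the names given under dataSets in the process object.
dataset_url = manifest["dataUrls"].get("60secData")
```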

Once you're ready to test the plugin run for that stage, execute it with the manifest for the new additional stage:

python main.py manifest-stage1.json stage1.json

Then POST your results for that stage. This is only necessary if your plugin supports being deployed or can do batching updates of existing user data. More on this in the deployable and batching plugin sections.

curl -X "POST" "https://www.stormly.com/api/developer/process_result/stage1" \
    -H 'X-Project-Key: abcd12345' \
    -H 'X-Dataset-Key: abcdefghjklm12346789' \
    -H 'Content-Type: application/json' \
    -d @stage1.json

Success of stages

The initial stage always has to succeed, as indicated by a success status code being returned, for the plugin run to be considered successful. By default all additional stages also need to run successfully for the whole plugin run to be successful. If one of the additional stages has an error, for example because the requested dataset doesn't have enough conversion samples, the whole plugin run is unsuccessful.

For any additional stage, we can indicate that it doesn't have to be successful by setting the field successRequired to false on that stage object under process. Below is an example where only stage1 needs to succeed, while stage2 and stage3 can have an error and the plugin run is still considered successful.

{
    ...,
    "process": {
        "stage1": {
            "dataSets": {
                "60secData": {"type": "since", "seconds": 60},
                "latestData": {"type": "latest"}
            }
        },
        "stage2": {
            "dataSets": {
                "180secData": {"type": "since", "seconds": 180},
                "latestData": {"type": "latest"}
            },
            "successRequired": false
        },
        "stage3": {
            "dataSets": {
                "300secData": {"type": "since", "seconds": 300},
                "latestData": {"type": "latest"}
            },
            "successRequired": false
        }
    }
}

When we set all additional stages to "successRequired": false, we only need the initial stage to succeed for a successful plugin run.

When there are multiple stages and all of them, including the initial one, have an error, only the error for the initial stage is shown to the end-user. For warnings, the explanation and backtrace are collected from all stages where successRequired is not false, and then concatenated using newlines; note that the title of the warning is the first one available, first from initial and then from the additional stages.

Accessing results in JS, JSX and helpers

As described at the beginning of this page, within the js and jsx code the results from the plugin run can be accessed under the variable results, and in the helper via this.results. But when there are multiple stages, the results object has one extra layer: the first key is the stage key as defined in process. The initial stage is always accessible on the results via initial. Even if all but the initial stage fail, we still have this format with initial and keys for the additional stages.

The results object for the process example above will roughly look like this:

{
    ...,
    "initial": {
        // same as 'data' for the plugin result JSON of the initial stage plugin run.
    },
    "stage1": {
        // same as 'data' for the plugin result JSON of the 'stage1' plugin run.
    },
    "stage2": {
        // same as 'data' for the plugin result JSON of the 'stage2' plugin run.
    },
    "stage3": {
        // same as 'data' for the plugin result JSON of the 'stage3' plugin run.
    }
}

So to access the initial stage results in the JSX code, use results.initial.something, while in the helper code we use this.results.initial.something. For stage3 we use results.stage3.something in JS/JSX, while in the helper we use this.results.stage3.something.

Limitations

  • The number of additional stages is limited to a maximum of 25.
  • The number of unique datasets that can be requested among all dataSets is limited to a maximum of 25. Take note of this mostly when using hyper-parameters to experiment with a large number of datasets.
  • The js, jsx and helper are only taken from the initial plugin run, never from any additional stages.

User Segments

The Insight API has a <Segment ... /> component that allows any end-user to quickly save a segment of users, such as Country is US AND Number of photos uploaded > 10 AND Came back in 2nd week.

While <Segment ... /> is strictly an Insight API component, it is used so commonly, and depends more on Intelligence Plugins than other components do, that it is described here too.

A Segment can be specified with a simple nested array format. Each element within the array should be an array containing three elements. The first element is the name of the feature, as found in Dataset and features. The second element contains an operator such as = != == > < >= <=. The third element contains the value to compare on, where "[n/a]" is used to indicate missing values.

A few examples:

  • ["feature_e9dea1034", ">", 1.0]
  • ["feature_e9dea1034", "=", "[n/a]"]
  • ["feature_f9e8jf", "=", "US"].

There is also a shortcut to negate the condition, by adding "NOT" as the first element, like this:

  • ["NOT", "feature_f9e8jf", "=", "US"].

These parts can be joined together with conjunctions and parentheses to make more complex segments. The conjunctions and parentheses can be ( ) AND OR.

A full example could look like this:

output = {
    "data": {
        "segmentArray": [
            "(",
            ["feature_e9dea1034", ">", 1.0],
            "AND",
            ["feature_f9e8jf", "=", "US"],
            ")",
            "OR",
            ["feature_e9dea1034", "=", "[n/a]"]
        ]
    }
}

Parentheses can be nested as many levels as you like, but the array must stay flat, so no nested arrays:

output = {
    "data": {
        "segmentArray": [
            "(",
            "(",
            ["feature_e9dea1034", ">", 1.0],
            "AND",
            ["feature_f9e8jf", "=", "US"],
            ")",
            "OR",
            ["feature_e9dea1034", "=", "[n/a]"],
            ")",
            "AND",
            "(",
            ...
            ")"
        ]
    }
}

Then inside your JS/JSX or plugin helper code, supply the filter like this via a utility function called createUserFilter:

<Segment
    name="Segment XYZ"
    filter={ Utils.createUserFilter(results.data.segmentArray, results) }>
    Save Segment
</Segment>