serverless-sagemaker-groundtruth
This serverless plugin provides a set of utilities for implementing custom workflows for AWS Sagemaker Groundtruth.
It currently includes:
- Serving a liquid template rendered from a manifest file and the pre-lambda, the same way AWS Sagemaker Groundtruth does
- Running end-to-end tests: pre-lambda -> labelling simulation -> post-lambda
Pull requests are warmly welcome!
Ideas for future implementation:
- Create tasks from the serverless CLI
- Test chained tasks
- Expose a nodejs API to integrate with testing suites
Installation
npm install --save-dev serverless-sagemaker-groundtruth
Usage as a serverless plugin
Example serverless.yml
To use this module, add a groundtruthTasks key to your serverless.yml file:
    ...
    plugins:
      - serverless-sagemaker-groundtruth
    functions:
      pre-example:
        handler: handler.pre
        name: pre
      post-example:
        handler: handler.postObjectDetection
        name: post
    groundtruthTasks:
      basic:
        pre: pre-example
        post: post-example
        template: app/templates/object-detection/basic.liquid.html
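As an illustration of what the handler.pre function referenced in this configuration might look like, here is a minimal sketch of a pre-annotation lambda. It assumes that each manifest row stores the image location under the source-ref key, and the taskObject and labels names are hypothetical: they only need to match what your own liquid template reads from task.input.

```javascript
// handler.js (hypothetical sketch of the `pre-example` function)
// Assumption: each manifest row carries the image URI under `source-ref`.
const pre = async event => {
  // Ground Truth passes one manifest row as `event.dataObject`
  const row = event.dataObject;
  return {
    // Everything under `taskInput` is exposed to the template as `task.input`
    taskInput: {
      taskObject: row['source-ref'],
      labels: ['Cat', 'Dog'] // hypothetical label set
    },
    // Ground Truth expects this flag as a string, not a boolean
    isHumanAnnotationRequired: 'true'
  };
};

module.exports = {pre};
```

The post-example function (handler.postObjectDetection) would be exported from the same file.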
Serve a liquid template against a manifest file
    serverless groundtruth serve \
      --groundtruthTask <groundtruthTask-name> \
      --manifest <s3-uri or local file> \
      --row <row index>
Test e2e behavior of sagemaker groundtruth workflow
The puppeteer module example
Here, we create a puppeteer module that draws random bounding boxes (using the hasard library):
    const BbPromise = require('bluebird');
    const h = require('hasard');

    /**
     * This function binds the sequence of actions made by the user before submitting the form.
     * This example shows how to simulate a user's bounding box actions.
     * @param {Page} page puppeteer page instance, see https://github.com/puppeteer/puppeteer
     * This page is open and running the annotation page
     * @param {Object} manifestRow the object from the manifest file row
     * @param {Object} prelambdaOutput the output object from the prelambda result
     * @param {String} workerId the id of the simulated worker
     * @returns {Promise} the promise is resolved once the user has done all needed actions on the form
     */
    module.exports = function ({
        page,
        manifestRow,
        workerId
    }) {
        // we draw 5 boxes for each worker
        const nBoxes = 5;
        // Cat and Dog
        const nCategories = 2;

        // Using the technique from https://github.com/puppeteer/puppeteer/issues/858#issuecomment-438540596 to select the node
        return page.evaluateHandle(`document.querySelector("body > crowd-form > form > crowd-bounding-box").shadowRoot.querySelector("#annotation-area-container > div > div > div")`)
            .then(imageCanvas => {
                return imageCanvas.boundingBox();
            }).then(boundingBox => {
                // define a random bounding box over the image canvas using hasard library
                // see more examples in https://www.npmjs.com/package/hasard
                const width = h.reference(h.integer(0, Math.floor(boundingBox.width)));
                const height = h.reference(h.integer(0, Math.floor(boundingBox.height)));
                // left is an x coordinate and top is a y coordinate, both in page pixels
                const left = h.add(h.integer(0, h.substract(Math.floor(boundingBox.width), width)), Math.floor(boundingBox.x));
                const top = h.add(h.integer(0, h.substract(Math.floor(boundingBox.height), height)), Math.floor(boundingBox.y));

                const randomAnnotation = h.object({
                    box: h.array([
                        left,
                        top,
                        width,
                        height
                    ]),
                    category: h.integer(0, nCategories - 1)
                });

                const workerAnnotations = randomAnnotation.run(nBoxes);

                // concurrency: 1 draws the boxes one after the other, as a real user would
                return BbPromise.map(workerAnnotations, ({box, category}) => {
                    return page.evaluateHandle(`document.querySelector("body > crowd-form > form > crowd-bounding-box").shadowRoot.querySelector("#react-mount-point > div > div > awsui-app-layout > div > div.awsui-app-layout__tools.awsui-app-layout--open > aside > div > span > div > div.label-pane-content > div:nth-child(${category + 1})")`)
                        .then(categoryButton => categoryButton.click())
                        .then(() => page.mouse.move(box[0], box[1]))
                        .then(() => page.mouse.down())
                        .then(() => page.mouse.move(box[0] + box[2], box[1] + box[3]))
                        .then(() => page.mouse.up());
                }, {concurrency: 1});
            }).then(() => {
                console.log(`${workerId} actions simulation done on ${JSON.stringify(manifestRow)}`);
                // at the end we return nothing, serverless-sagemaker-groundtruth will automatically request the output from the page
            });
    };
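When a reproducible run matters more than random coverage, a deterministic variant of the module above can be useful. The following is only a sketch: it reuses the selectors from the example above, always picks the first category, and draws a single box covering the central half of the image canvas. The drawCenteredBox name and the select helper are hypothetical, and the sketch assumes the module only needs the page argument.

```javascript
// Shadow root of the crowd-bounding-box element, addressed as in the random example
const SHADOW_ROOT =
  'document.querySelector("body > crowd-form > form > crowd-bounding-box").shadowRoot';
const CANVAS = '#annotation-area-container > div > div > div';
const FIRST_CATEGORY =
  '#react-mount-point > div > div > awsui-app-layout > div > ' +
  'div.awsui-app-layout__tools.awsui-app-layout--open > aside > div > span > div > ' +
  'div.label-pane-content > div:nth-child(1)';

// Hypothetical helper: resolve a node inside the shadow root
const select = (page, selector) =>
  page.evaluateHandle(`${SHADOW_ROOT}.querySelector("${selector}")`);

// Hypothetical deterministic puppeteer module: one centered box, first category
const drawCenteredBox = async ({page}) => {
  const canvas = await select(page, CANVAS);
  const {x, y, width, height} = await canvas.boundingBox();

  const categoryButton = await select(page, FIRST_CATEGORY);
  await categoryButton.click();

  // Drag from the upper-left quarter point to the lower-right quarter point
  await page.mouse.move(x + width / 4, y + height / 4);
  await page.mouse.down();
  await page.mouse.move(x + (3 * width) / 4, y + (3 * height) / 4);
  await page.mouse.up();
};

module.exports = drawCenteredBox;
```

Because the function only touches page.evaluateHandle and page.mouse, it can also be exercised against a small fake page object without launching a browser.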
The end to end command
    serverless groundtruth test e2e \
      --groundtruthTask <groundtruthTask-name> \
      --manifest <s3-uri or local file> \
      --puppeteerModule <path to the module> \
      --workerIds a,b,c
Usage programmatically
You can use serverless-sagemaker-groundtruth functions in your nodejs code by using
const gtLibs = require('serverless-sagemaker-groundtruth/lib')
endToEnd
    /**
     * @param {String} template path to the liquid template file
     * @param {String} labelAttributeName labelAttributeName to use as output of the postLambda function
     * @param {Object} manifestRow js object representing the manifest row
     * @param {Function} preLambda js function to use as pre lambda function
     * @param {Number} [port=3000] port to use to serve the web page
     * @param {Function} postLambda js function to use as post lambda function
     * @param {Array.<String>} workerIds ids of the workers to simulate
     * @param {PuppeteerModule} puppeteerMod module that simulates the behavior of a worker
     * @returns {Promise.<PostLambdaOutput>}
     */
    return gtLibs.endToEnd({
        template,
        labelAttributeName,
        manifestRow,
        preLambda,
        port,
        postLambda,
        workerIds,
        puppeteerMod
    });
Remarks
Local consolidation request file compatibility
Make sure that your post-lambda function still works when event.payload.s3Uri contains a local filename instead of an S3 URI.
You can use gtLibs.loadFile if you need such a function.
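To make the requirement concrete, here is a rough sketch of a post-lambda that accepts either kind of value in event.payload.s3Uri. It does not use gtLibs.loadFile, whose signature is not documented here. Instead it relies on a hypothetical loadConsolidationRequest helper and an injected fetchFromS3 function that you would implement with your S3 client. It also assumes that a local file is passed as a plain filesystem path and that each worker answer arrives as a JSON string in annotationData.content.

```javascript
const fs = require('fs');

// Hypothetical helper (not the plugin's gtLibs.loadFile):
// read the consolidation request from S3 or from a local path.
const loadConsolidationRequest = async (uri, fetchFromS3) => {
  const raw = uri.startsWith('s3://')
    ? await fetchFromS3(uri) // assumption: resolves to the object body as a string
    : await fs.promises.readFile(uri, 'utf8');
  return JSON.parse(raw);
};

// Build a post-lambda handler around an S3 fetcher so it can be swapped in tests
const makePost = fetchFromS3 => async event => {
  const request = await loadConsolidationRequest(event.payload.s3Uri, fetchFromS3);
  return request.map(({datasetObjectId, annotations}) => ({
    datasetObjectId,
    consolidatedAnnotation: {
      content: {
        [event.labelAttributeName]: {
          // Naive consolidation: keep every worker's parsed answer as-is
          workerAnswers: annotations.map(a => JSON.parse(a.annotationData.content))
        }
      }
    }
  }));
};

module.exports = {loadConsolidationRequest, makePost};
```

Injecting the S3 fetcher keeps the handler free of AWS calls when the plugin runs it locally, so the same code path serves both the deployed lambda and the end-to-end test.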
Template
Your template should be submitted using a button that matches the button.awsui-button[type="submit"] selector.