This repository has been archived by the owner on May 24, 2019. It is now read-only.
Thomas J. Leeper edited this page Jan 9, 2015 · 2 revisions

MTurk centers around Human Intelligence Tasks (HITs), which workers complete. Thus, creating HITs is synonymous with using MTurk. This page walks through how to create HITs.

Some terminology is worth noting up front. A HIT is a single task on MTurk. A worker completing a HIT is taking on an assignment; a HIT can be configured to have multiple assignments (i.e., multiple workers completing an identical task). If you need to construct multiple versions of a task (e.g., coding different pictures, finding information for different websites), each of those versions is a separate HIT. The MTurk Requester User Interface (RUI) calls a set of such versions a batch, but the API does not operate in terms of batches (they simply don't exist outside the RUI). So if you need workers to complete multiple similar tasks, you must create a separate HIT for each task, even if all of them are built on an identical template.

These HITs can still be grouped visually in the MTurk worker interface by attaching each of them to the same HITType. A HITType is the title, description, keywords, qualification requirements, and reward amount attached to a HIT; the worker interface groups HITs by HITType. Creating a "batch" therefore involves registering a single HITType for the common title, description, reward, etc. and then creating several HITs attached to that HITType. This distinction between HITTypes and individual HITs is absent from the RUI and is one of the challenging parts of using MTurkR, but the package is designed to simplify the process as much as possible.

Registering a HITType

There are two ways to register a HITType: calling RegisterHITType directly, or specifying the requisite parameters (title, description, reward, HIT duration) in the call to CreateHIT, which then registers the HITType internally. If you are creating multiple HITs, the former method is better, because all of the HITs can share one registered HITType. If you are creating a single HIT (e.g., a survey link), the latter approach is easier.
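The two routes can be sketched as follows. This is a minimal illustration, assuming MTurkR is installed and requester credentials are already configured. The helper names `make_batch` and `make_single_hit`, and all titles, rewards, and counts, are hypothetical. The argument names and the `$string` element of the generated question reflect the MTurkR documentation as commonly shown in its examples and should be checked against `?RegisterHITType`, `?CreateHIT`, and `?GenerateExternalQuestion` for your installed version.

```r
# Both functions only define the workflow; nothing contacts MTurk until
# they are called, which requires MTurkR and valid requester credentials.

# Route 1: register one HITType, then attach several HITs to it (a "batch").
make_batch <- function(urls) {
  hittype <- MTurkR::RegisterHITType(
    title       = "Code a picture",                 # hypothetical values
    description = "Describe the content of one image",
    reward      = ".10",
    duration    = MTurkR::seconds(hours = 1),
    keywords    = "coding, images"
  )
  # One HIT per URL, all sharing the same HITTypeId
  lapply(urls, function(u) {
    MTurkR::CreateHIT(
      hit.type    = hittype$HITTypeId,
      # assumption: the generated question exposes its XML as $string
      question    = MTurkR::GenerateExternalQuestion(u, frame.height = 400)$string,
      expiration  = MTurkR::seconds(days = 4),
      assignments = 3
    )
  })
}

# Route 2: a single HIT; CreateHIT registers the HITType internally
# from the title, description, reward, and duration.
make_single_hit <- function(url) {
  MTurkR::CreateHIT(
    title       = "10 Question Survey",             # hypothetical values
    description = "Complete a short survey",
    reward      = ".20",
    duration    = MTurkR::seconds(hours = 1),
    question    = MTurkR::GenerateExternalQuestion(url, frame.height = 400)$string,
    expiration  = MTurkR::seconds(days = 4),
    assignments = 100
  )
}
```

Calling `make_batch(c("https://example.com/task1", "https://example.com/task2"))` would create two HITs that workers see grouped together, because both carry the same HITTypeId.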

Creating a HIT

From the perspective of the MTurk API (and thus MTurkR), a HIT is simply some content to display to the worker (the question content) and either a HITType or the parameters necessary to register one.

The confusing part of this is that there are four different ways to specify the question content of a HIT:

  1. A QuestionForm data structure (a proprietary XML markup for designing HIT content)
  2. An HTMLQuestion data structure (a HIT markup that uses XHTML elements)
  3. An ExternalQuestion data structure (essentially just a link to another website that is designed to handle receiving a redirect from MTurk)
  4. A HITLayoutId (a code referring to a HIT template created in the RUI)

The Amazon Mechanical Turk blog includes some advice on how to effectively design HITs, regardless of which of these methods you use.

The easiest of these methods is 4, which involves setting up a HIT template in the RUI and then looking up its HITLayoutId. The other methods require creating the question content manually, but offer considerably more flexibility.

The QuestionForm method is the most restrictive because it involves use of a relatively strict and limited XML schema.

The HTMLQuestion method is best for requesters who are familiar with XHTML markup (indeed, this is what the RUI generates), but is probably somewhat difficult for new users.

The ExternalQuestion method provides the cleanest interface for MTurk workers but requires the question content to be handled on a server that can operate SSL (HTTPS) and redirect the worker to a particular URL. (This is discussed in some detail on the Surveys page.)
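To make "essentially just a link" concrete, here is a base R sketch of roughly what an ExternalQuestion looks like. It is for illustration only: the function name `external_question_xml` is hypothetical, and in practice GenerateExternalQuestion builds this structure for you. The HTTPS check mirrors the SSL requirement described above.

```r
# Hypothetical helper that assembles an ExternalQuestion XML string by hand.
external_question_xml <- function(url, frame_height = 400) {
  # The external page must be served over HTTPS
  stopifnot(grepl("^https://", url))
  xmlns <- paste0(
    "http://mechanicalturk.amazonaws.com/",
    "AWSMechanicalTurkDataSchemas/2006-07-14/ExternalQuestion.xsd"
  )
  sprintf(
    paste0(
      '<ExternalQuestion xmlns="%s">',
      "<ExternalURL>%s</ExternalURL>",
      "<FrameHeight>%d</FrameHeight>",
      "</ExternalQuestion>"
    ),
    xmlns,
    gsub("&", "&amp;", url, fixed = TRUE),  # escape ampersands for XML
    as.integer(frame_height)
  )
}

xml <- external_question_xml("https://example.com/survey?a=1&b=2")
cat(xml, "\n")
```

The structure carries only a URL and a frame height; everything the worker sees is served from your own site.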

The question argument in CreateHIT accepts a data structure of any of types 1-3. Appropriate data structures can be created using GenerateQuestionForm, GenerateHTMLQuestion, and GenerateExternalQuestion, respectively. If using method 4 (HITLayoutId), question should be NULL in CreateHIT and the hitlayoutid argument should be specified instead.
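The either/or rule for `question` and `hitlayoutid` can be sketched with a small helper. The name `hit_content_args` is hypothetical and not part of MTurkR; it only assembles the content-related arguments and refuses ambiguous input. The commented-out call at the end assumes configured credentials and placeholder IDs.

```r
# Hypothetical helper: build the content arguments for CreateHIT.
# Exactly one of `question` (methods 1-3) or `hitlayoutid` (method 4)
# may be supplied; the other stays NULL.
hit_content_args <- function(question = NULL, hitlayoutid = NULL) {
  if (is.null(question) == is.null(hitlayoutid)) {
    stop("Supply exactly one of 'question' or 'hitlayoutid'")
  }
  list(question = question, hitlayoutid = hitlayoutid)
}

# Method 4: a template created in the RUI (placeholder ID)
args <- hit_content_args(hitlayoutid = "YOUR_LAYOUT_ID")
str(args)

# Requires MTurkR and credentials, so it is left commented out:
# do.call(MTurkR::CreateHIT,
#         c(list(hit.type   = "YOUR_HITTYPE_ID",
#                expiration = MTurkR::seconds(days = 1)),
#           args))
```

Keeping the check in one place makes it harder to pass both arguments, or neither, by accident when scripting many HITs.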