This repository has been archived by the owner on May 24, 2019. It is now read-only.

Batches

Thomas J. Leeper edited this page Jun 8, 2015 · 7 revisions

Mimicking the RUI Features

MTurkR supplies many features that the Requester User Interface (RUI) does not offer. But the RUI also has a few features that are hard to emulate in R. This page describes some techniques for mimicking a couple of those features. You may also be interested in the Bulk HIT Creation page, which describes how to create multiple HITs from a template (similar to the batch creation feature in the RUI).


Performing MTurkR Operations on RUI Batches

As of v0.5.32, MTurkR can perform many operations at the batch level. A batch is a feature unique to the RUI that groups a set of HITs of the same HITType that were created at the same time. It is not possible to create batches via MTurkR (because batches are not part of the MTurk Requester API), but it is possible to perform some operations on batches that were created in the RUI. Specifically, HITStatus, ApproveAllAssignments, ChangeHITType, DisposeHIT, DisableHIT, ExpireHIT, ExtendHIT, GetAssignment, GetBonuses, and SetHITAsReviewing now include an optional annotation argument that can be used to apply the function to all HITs of a given batch. This works just like supplying a hit.type argument did in earlier versions of MTurkR.

For example, to add two assignments to every HIT in an RUI-created batch, first find the "batch number": click on the batch on the "Manage HITs" page of the RUI and copy the batch number from the URL. Then create a character string of the form "BatchId:78382;" and supply it as the annotation argument to ExtendHIT:

ExtendHIT(annotation = "BatchId:78382;", add.assignments = 2)

The same format applies to other MTurkR functions, for example:

ExpireHIT(annotation = "BatchId:78382;")
GetAssignments(annotation = "BatchId:78382;")
DisposeHIT(annotation = "BatchId:78382;")

This allows MTurkR to perform many operations equivalent to those performed in the RUI without having to apply an operation to an entire HITType (for which there might be many distinct batches).
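
Because the annotation string has to follow the "BatchId:<number>;" pattern exactly, it can be convenient to build it with a small helper. The following is a minimal sketch in base R: batch_annotation() is a hypothetical helper, not part of MTurkR, and 78382 is just the example batch number used above.

```r
# Hypothetical helper (not part of MTurkR): build the annotation string
# that identifies an RUI batch, e.g. "BatchId:78382;".
batch_annotation <- function(batch_id) {
    # Avoid scientific notation if the batch number is passed as a numeric
    if (is.numeric(batch_id)) {
        batch_id <- format(batch_id, scientific = FALSE, trim = TRUE)
    }
    batch_id <- as.character(batch_id)
    # A batch number copied from the RUI URL should contain digits only
    if (length(batch_id) != 1 || !grepl("^[0-9]+$", batch_id)) {
        stop("batch_id must be a single batch number, e.g. 78382")
    }
    paste0("BatchId:", batch_id, ";")
}

batch <- batch_annotation(78382)
print(batch)

# With credentials loaded, the string could then be passed on, e.g.:
# ExtendHIT(annotation = batch, add.assignments = 2)
```

Validating the input up front means a mistyped batch number fails locally rather than being sent to MTurk as an annotation that matches no HITs.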

The annotation field can also be used to create batch-like groups of HITs of the same HITType using MTurkR. CreateHIT has an annotation argument that can be used to attach an identifier to one or more HITs. As with a batch, that annotation value can then be used to perform operations on all HITs that share it. (Note: Unfortunately, specifying the same annotation for multiple HITs will not create a batch in the RUI. Batches can only be created via the RUI.)
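
As a rough sketch of this idea, one could generate a single label and reuse it for every HIT in the group. The helper make_group_label() and the "MyBatch" prefix below are hypothetical conventions, not part of MTurkR; the API calls are left as comments because they require credentials, and the HITType ID and the questions object are placeholders.

```r
# Hypothetical helper (not part of MTurkR): build a label that is shared by
# every HIT in a group. The "MyBatch" prefix is an arbitrary convention.
make_group_label <- function(prefix = "MyBatch", time = Sys.time()) {
    paste0(prefix, ":", format(time, "%Y%m%d%H%M%S"), ";")
}

group_label <- make_group_label()
print(group_label)

# With credentials loaded, each HIT in the group would be created with the
# same label (`questions` is a placeholder list of question strings):
# for (q in questions) {
#     CreateHIT(hit.type = "ANEXAMPLEHITTYPEID", question = q,
#               expiration = seconds(days = 4), annotation = group_label)
# }
# ...and the whole group could later be handled together:
# ExpireHIT(annotation = group_label)
```

Storing the label alongside your other project records makes it possible to address the same group of HITs in a later R session.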


HIT Layout Parameter Inputs and Outputs

Through HITLayout parameters, MTurkR can mimic the batch HIT creation features of the RUI. However, because the MTurk application does not preserve HITLayout inputs (at least not in a way that is accessible via the API), it is not straightforward to match HITLayout parameter input values to the assignment results of HITs.

We can work around this limitation using the following workflow:

  1. Store the HITLayout parameters in a dataframe.
  2. Create each HIT with the HITLayoutID and HITLayout parameters and then store the HITId (from CreateHIT or BulkCreateFromHITLayout) for each new HIT into that dataframe.
  3. Save the dataframe locally.
  4. After retrieving assignments with GetAssignments, reload your original dataframe.
  5. Then, merge your assignment dataframe with the original dataframe to combine input values and results.

In MTurkR, this workflow could be implemented as follows:

# first load credentials with `credentials()`
# create a dataframe of HITLayout parameters:
inputvalues <- 
data.frame(hitvar1 = c("Input for HIT 1 for var1","Input for HIT 2 for var1","Input for HIT 3 for var1"),
           hitvar2 = c("Input for HIT 1 for var2","Input for HIT 2 for var2","Input for HIT 3 for var2"),
           hitvar3 = c("Input for HIT 1 for var3","Input for HIT 2 for var3","Input for HIT 3 for var3"))

# initialize a HITId variable:
inputvalues$HITId <- NA

# Create each HIT:
h <- 
BulkCreateFromHITLayout(hitlayoutid = "ANEXAMPLEHITLAYOUTID",
                        input = inputvalues,
                        annotation = paste("Bulk From Layout", Sys.Date()),
                        title = "Categorize an image",
                        description = "Categorize this image",
                        reward = ".05",
                        expiration = seconds(days = 4),
                        duration = seconds(minutes = 5),
                        keywords = "categorization, image, moderation, category")
inputvalues$HITId <- do.call(rbind, h)$HITId

# Save the `inputvalues` dataframe:
save(inputvalues, file='inputvalues.RData')

# Later, load the `inputvalues` dataframe:
load(file='inputvalues.RData')

# Then, get assignments:
assignmentresults <- GetAssignment(hit.type = "ANEXAMPLEHITTYPEID", return.all = TRUE)

# Then, merge `inputvalues` and `assignmentresults`, keeping the result:
results <- merge(inputvalues, assignmentresults, all = TRUE, by = "HITId")
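
To see what the merge step produces without contacting MTurk, here is a small offline sketch using base R only. Both dataframes are mock stand-ins with invented values; real assignment results contain many more columns.

```r
# Mock input values, one row per HIT (all values are invented)
inputvalues <- data.frame(
    HITId   = c("HIT1", "HIT2", "HIT3"),
    hitvar1 = c("Input for HIT 1", "Input for HIT 2", "Input for HIT 3"),
    stringsAsFactors = FALSE
)

# Mock assignment results, one row per assignment (all values are invented)
assignmentresults <- data.frame(
    HITId        = c("HIT1", "HIT1", "HIT2"),
    AssignmentId = c("A1", "A2", "A3"),
    WorkerId     = c("W1", "W2", "W1"),
    stringsAsFactors = FALSE
)

# all = TRUE keeps HITs that have no assignments yet (HIT3 here)
combined <- merge(inputvalues, assignmentresults, all = TRUE, by = "HITId")
print(combined)
```

The merged dataframe has one row per assignment, so the input values for a HIT are repeated for each of its assignments, and a HIT without any assignments appears once with NA in the assignment columns.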

HIT Pay and Efficiency Statistics

Another nice feature of the RUI is the ability to quickly see how long workers are spending on HITs and how that work translates into dollars/hour figures (see also: Wages). The data necessary to calculate this is all returned by MTurkR, but it needs a little bit of simple wrapping to output it nicely. Here's a function that calculates all of the relevant information for a HIT:

hitstats <- function(hit){
    # HIT-level details (title, reward, number of completed assignments)
    info <- status(hit=hit)
    # all assignments for the HIT, including the time spent on each
    asgn <- assignments(hit=hit,return.all=TRUE)
    out <- list(    HITId=info$HITId,
                    HITTypeId=info$HITTypeId,
                    CreationDate=info$CreationTime,
                    Title=info$Title,
                    Description=info$Description,
                    RewardAmount=info$Amount,
                    Assignments=info$NumberOfAssignmentsCompleted,
                    MeanTimeOnHIT=mean(asgn$SecondsOnHIT),
                    MedianTimeOnHIT=median(asgn$SecondsOnHIT))
    # hourly wage = reward per assignment / hours spent per assignment
    out$MeanHourlyWage <- round(as.numeric(info$Amount)/(out$MeanTimeOnHIT/3600),2)
    out$MedianHourlyWage <- round(as.numeric(info$Amount)/(out$MedianTimeOnHIT/3600),2)
    return(out)
}

The result is a list containing basic details of the HIT, along with the number of completed assignments, the mean and median time spent on the HIT, and the translation of those times into hourly wages. The median time and median wage are less influenced by outliers (e.g., individuals who take a very long time on a HIT).
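
The wage arithmetic can be checked offline with a few invented numbers. The sketch below uses base R only; the reward and the times are made up, with one slow worker included to show how an outlier pulls the mean-based wage down.

```r
reward <- 0.05                              # reward per assignment, in dollars
seconds_on_hit <- c(40, 50, 60, 70, 600)    # invented times; 600 is an outlier

mean_time   <- mean(seconds_on_hit)         # 164 seconds
median_time <- median(seconds_on_hit)       # 60 seconds

# hourly wage = reward per assignment / hours spent per assignment
mean_wage   <- round(reward / (mean_time / 3600), 2)
median_wage <- round(reward / (median_time / 3600), 2)

print(c(mean_wage = mean_wage, median_wage = median_wage))
```

With these numbers the mean-based wage is about $1.10 per hour while the median-based wage is $3.00 per hour, even though four of the five workers finished in roughly a minute.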

While minimizing hourly wage may be a strong business goal, in other settings it may be reasonable to use this information to target a "fair" wage for workers. For example, in academic research, it is probably ethical to design HITs that pay at least minimum wage for the target population in order to avoid compensation-related coercion.