Refonte AI

Text Collection Attachment

An array of TextCollectionAttachment objects to be labeled.

Video Support

The video attachment should have content that is a link. Supported media types are listed on the MDN Web Docs.

HTML Support in TextCollection Attachments:

Customers can pass Markdown as the string content when creating a job in TextCollection. The Markdown syntax supports the use of HTML tags as well.

However, to protect the security of the TextCollection platform, we use the sanitize-html JavaScript package to sanitize all HTML tags passed within the Markdown. This package removes every tag except the permitted HTML tags listed below.

Only the permitted HTML tags pass through in the string; during sanitization, any tag that is not on the list is removed. This keeps the content shown to the tasker secure and compliant, and avoids security issues that unapproved HTML tags could introduce.

HTML tags allowed:

Content sectioning: 'address', 'article', 'aside', 'footer', 'header', 'h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'hgroup', 'main', 'nav', 'section'

Inline text semantics: 'a', 'abbr', 'b', 'bdi', 'bdo', 'br', 'cite', 'code', 'data', 'dfn', 'em', 'i', 'kbd', 'mark', 'q', 'rb', 'rp', 'rt', 'rtc', 'ruby', 's', 'samp', 'small', 'span', 'strong', 'sub', 'sup', 'time', 'u', 'var'

Table content: 'caption', 'col', 'colgroup', 'table', 'tbody', 'td', 'tfoot', 'th', 'thead', 'tr'

Additional tags: 'img', 'iframe'
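As a rough pre-flight check, a client could scan its Markdown content for tags that are not on the list above before creating a task. The sketch below is an assumption-laden illustration in Python using only the standard library; it is not the platform's sanitizer, and the names ALLOWED_TAGS and find_disallowed_tags are hypothetical.

```python
from html.parser import HTMLParser

# Allowlist copied from the "HTML tags allowed" list above.
ALLOWED_TAGS = {
    "address", "article", "aside", "footer", "header", "h1", "h2", "h3",
    "h4", "h5", "h6", "hgroup", "main", "nav", "section",
    "a", "abbr", "b", "bdi", "bdo", "br", "cite", "code", "data", "dfn",
    "em", "i", "kbd", "mark", "q", "rb", "rp", "rt", "rtc", "ruby", "s",
    "samp", "small", "span", "strong", "sub", "sup", "time", "u", "var",
    "caption", "col", "colgroup", "table", "tbody", "td", "tfoot", "th",
    "thead", "tr",
    "img", "iframe",
}


class _TagCollector(HTMLParser):
    """Records the name of every start tag seen in the input."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lowercases tag names.
        self.tags.append(tag)


def find_disallowed_tags(content: str) -> list[str]:
    """Return the sorted, de-duplicated tags that are not on the allowlist.

    Hypothetical helper: it only reports tags, it does not remove them.
    """
    collector = _TagCollector()
    collector.feed(content)
    collector.close()
    return sorted(set(collector.tags) - ALLOWED_TAGS)
```

Any tag this helper reports would be stripped by the sanitization step described above, so it may be worth rewriting the content with permitted tags before submitting.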

Parameters

type string required

One of pdf, image, text, video, website, or audio.

content string required

Content or link to relevant file.

forms array

Array of field_id strings from FormField. If this value is set, only show the corresponding attachment if one of the referenced form fields is active.
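A minimal Python sketch of building one attachment object from the parameters above. The helper name make_attachment is hypothetical and the checks only mirror what this section states; the API may validate more than this.

```python
# Attachment types listed in the parameter description above.
ATTACHMENT_TYPES = {"pdf", "image", "text", "video", "website", "audio"}


def make_attachment(type_, content, forms=None):
    """Build a TextCollectionAttachment-style dict (hypothetical helper)."""
    if type_ not in ATTACHMENT_TYPES:
        raise ValueError(f"unsupported attachment type: {type_!r}")
    if not isinstance(content, str):
        raise ValueError("content must be a string (inline content or a link)")
    attachment = {"type": type_, "content": content}
    if forms is not None:
        # forms holds field_id strings of FormField objects.
        attachment["forms"] = list(forms)
    return attachment
```

For example, `make_attachment("video", "https://example.com/clip.mp4", forms=["form_query"])` yields an object that is only shown when the referenced form field is active; the URL and field_id here are placeholders.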

Unit Field

UnitField objects define simple components for data collection.

Conditional Fields

Sometimes a field should only appear when certain options are chosen for other fields. In these cases, you can specify the conditions, that is, the dependent questions and the matching sets of options.

The conditions property should be an array of objects, each of which defines one set of conditions that allows the field to be shown. The operators AND ( { } ), OR ( [ ] ), and NOT ( not ) are supported, so you can specify an arbitrary combination of fields and choices. Each set may contain objects or arrays with the following:

  • Key: the field_id of the dependent field
  • Value: an object specifying the desired choices for the dependent field.

See the example below for sample conditions. Currently, conditions are only supported when the dependent fields are of type CategoryField. The syntax is accepted for other field types, but it may cause problems or undefined behavior.

Parameters

type string required

One of text, boolean, number, datetime, category, select, or time_range.

field_id string required

A unique identifier for the field, which should not change among tasks within a project.

title string required

Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project. Must not be an empty string.

description string

A brief description about what the response should be. This may change among tasks within a project.

hint string

Longer explanation of why the field exists and how it should be used. Renders as a tooltip.

required boolean

Determines whether or not a response for this field is required. The default is false.

min_responses_required integer

The minimum number of separate annotations allowed for this field. Must be larger than 0. The default is 1.

max_responses_required integer

The maximum number of separate annotations allowed for this field. Must be larger than or equal to min_responses_required, with an upper bound of 100. The default is 1.

conditions array of objects

A set of conditions which must be satisfied for this field to be shown. Default is undefined.

Additional Fields objects

See the Text Field, Boolean Field, Number Field, Datetime Field, and Category Field sections.
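The bounds stated above for min_responses_required and max_responses_required can be checked before a task is submitted. The following is a minimal Python sketch of those stated rules only; the function name check_response_bounds is hypothetical and is not part of the API.

```python
def check_response_bounds(field: dict) -> list[str]:
    """Return a list of problems with the response-count settings.

    Applies the documented defaults (1 and 1) when a key is absent.
    """
    min_required = field.get("min_responses_required", 1)
    max_required = field.get("max_responses_required", 1)
    problems = []
    if min_required <= 0:
        problems.append("min_responses_required must be larger than 0")
    if max_required < min_required:
        problems.append(
            "max_responses_required must be >= min_responses_required"
        )
    if max_required > 100:
        problems.append("max_responses_required has an upper bound of 100")
    return problems
```

An empty list means the settings are consistent with the parameter descriptions in this section.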

Example

// Example of UnitField with conditions
{
  type: "category",
  field_id: "occlusion",
  title: "Is there occlusion in the image?",
  choices: [{label: 'None', value: '0' },
            {label: 'A little', value: '1'},
            {label: 'A lot', value: '2'}],
  conditions: [{}],
},
{
  type: "category",
  field_id: "occlusion_detail",
  title: "What is the cause of the occlusion?",
  choices: [{label: 'Rain', value: 'rain'},
            {label: 'Shadow', value: 'shadow'}],
  conditions: [{
    occlusion: ['1', '2'], // show if 1 or 2 are selected
    // equivalently {not: [[], ['0']]}
    // equivalently [{not: []}, {not: ['0']}]
    // equivalently [['1'],['2']]
  }],
},
{
  type: "text",
  field_id: "a_lot_of_shadow",
  title: "Please describe why there is so much shadow.",
  conditions: [{
    // show if 2 and shadow are selected in their respective fields
    occlusion: ['2'], 
    occlusion_detail: ['shadow'],
  }],
},
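To make the example above concrete, here is a simplified Python sketch of how such conditions could be evaluated on the client side. It rests on assumptions that are not confirmed by this page: each object in the conditions array is treated as one alternative set, every key inside a set must match (AND), and a key matches when any selected value appears in its list. The not operator and nested arrays are deliberately not handled, and the function names are hypothetical.

```python
def _matches(selected, wanted):
    """True if any selected value is among the wanted choices."""
    if selected is None:
        return False
    selected_values = selected if isinstance(selected, list) else [selected]
    return any(value in wanted for value in selected_values)


def is_field_visible(conditions, answers):
    """Evaluate flat conditions against answers keyed by field_id.

    Assumption: a field with undefined or empty conditions is always shown.
    """
    if not conditions:
        return True
    for condition_set in conditions:
        # An empty set ({}) has nothing to violate, so it is satisfied.
        if all(
            _matches(answers.get(field_id), wanted)
            for field_id, wanted in condition_set.items()
        ):
            return True
    return False
```

With the example fields, `is_field_visible([{"occlusion": ["1", "2"]}], {"occlusion": ["1"]})` evaluates to True, while the same conditions with `{"occlusion": ["0"]}` evaluate to False.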

Text Field

Subclass of UnitField and returns a string response.

Parameters

max_characters integer

The maximum number of characters allowed in the field.

show_word_counter boolean

Set to true to display a word count in the text field.

show_markdown_preview boolean

Set to true to enable a Markdown preview for the text field.

max_tokens integer

Sets the maximum number of words allowed in a text response. For example, `max_tokens = 1000` limits a response to 1000 words.

min_tokens integer

Sets the minimum number of words required in a text response. For example, `min_tokens = 100` requires at least 100 words.

disable_pasting boolean

Set to true to disable copying and pasting in the text field.

Example

{
  "type": "text",
  "field_id": "summary",
  "title": "Summary",
  "min_responses_required": 1,
  "max_responses_required": 3,
  "max_characters": 500,
  "required": true
}
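The length limits of a text field can be approximated locally, for instance when preparing pre-labels. The Python sketch below assumes that words are counted by splitting on whitespace, which may differ from how the platform counts them; the function name check_text_response is hypothetical.

```python
def check_text_response(field: dict, response: str) -> list[str]:
    """Return the limits from the field definition that the response breaks.

    Assumption: a "word" is any whitespace-separated run of characters.
    """
    problems = []
    word_count = len(response.split())
    if "max_characters" in field and len(response) > field["max_characters"]:
        problems.append("response exceeds max_characters")
    if "min_tokens" in field and word_count < field["min_tokens"]:
        problems.append("response has fewer words than min_tokens")
    if "max_tokens" in field and word_count > field["max_tokens"]:
        problems.append("response has more words than max_tokens")
    return problems
```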

Boolean Field

Subclass of UnitField and returns a boolean response. Has no additional parameters.

Example

{
  "type": "boolean",
  "field_id": "availability",
  "title": "Item Availability",
  "description": "Choose true if available."
}

Number Field

Subclass of UnitField and returns a string response based on the annotated number.

Parameters

use_slider boolean

Set to true to use a slider instead of textbox.

min float

Sets the minimum value of the slider.

max float

Sets the maximum value of the slider.

step float

Sets the step value of the slider.

prefix string

A string label for the lowest numerical value response.

suffix string

A string label for the greatest numerical value.

mid_label string

A string label for the middle numerical value.

Example

{
  "type": "number",
  "field_id": "item_price",
  "title": "Item Price",
  "description": "Leave empty if not applicable.",
  "required": false,
  "use_slider": true,
  "min": 0,
  "max": 100
}
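Because a number field returns its value as a string, a consumer usually has to parse it. The following Python sketch parses the response with Decimal and, as an assumption, treats min and max as bounds only when use_slider is true, since this page describes them as slider settings. The function name parse_number_response is hypothetical.

```python
from decimal import Decimal, InvalidOperation


def parse_number_response(field: dict, response: str) -> Decimal:
    """Parse a number-field string response into a Decimal."""
    try:
        value = Decimal(response)
    except InvalidOperation:
        raise ValueError(f"not a number: {response!r}") from None
    if not value.is_finite():
        raise ValueError(f"not a finite number: {response!r}")
    if field.get("use_slider"):
        # Assumption: slider min/max also bound the accepted value.
        if "min" in field and value < Decimal(str(field["min"])):
            raise ValueError("value is below the slider minimum")
        if "max" in field and value > Decimal(str(field["max"])):
            raise ValueError("value is above the slider maximum")
    return value
```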

Datetime Field

Subclass of UnitField and returns a DatetimeAnnotation response.

Definition: DatetimeSpec

An enum that consists of year, month, day, hour, and minute.

Definition: DatetimeAnnotation

An interface that contains optional number fields including year, month, day, hour, and minute.

Parameters

include array of objects required

An array of DatetimeSpec elements. Must contain at least one element.

Example

{
  "type": "datetime",
  "field_id": "release_date",
  "title": "Date of Product Release",
  "description": "Leave empty if not applicable.",
  "include": ["year", "month", "day"],
  "defaults": {
    "year": 2021,
    "month": 4,
    "day": 13
  }
}
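A DatetimeAnnotation can be checked against the field's include list with a few lines of Python. This sketch assumes that an annotation should only contain components named in include, which this page implies but does not state outright; the function name check_datetime_annotation is hypothetical.

```python
# The DatetimeSpec enum values listed above.
DATETIME_SPECS = ("year", "month", "day", "hour", "minute")


def check_datetime_annotation(include: list, annotation: dict) -> list[str]:
    """Return problems found in the include list or the annotation."""
    problems = []
    if not include:
        problems.append("include must contain at least one element")
    for spec in include:
        if spec not in DATETIME_SPECS:
            problems.append(f"unknown DatetimeSpec: {spec}")
    for key, value in annotation.items():
        if key not in include:
            # Assumption: components outside include are unexpected.
            problems.append(f"unexpected component: {key}")
        elif not isinstance(value, (int, float)):
            problems.append(f"component {key} must be a number")
    return problems
```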

Category Field

Subclass of UnitField and returns an array of selected CategoryChoiceValue elements in its response. CategoryChoice elements with subchoices are only used for navigation. The only selectable CategoryChoice elements are those with no subchoices.

Parameters

choices array of objects required

An array of CategoryChoice elements to define the relevant choice.

min_choices integer

Minimum number of choices to select.

max_choices integer

Maximum number of choices to select. If this value is greater than 1, the form renders a checkbox. Otherwise, it renders a radio button.

CategoryChoice

label string required

The label of the choice field. This description may change among tasks within a project.

value CategoryChoiceValue

The value of the choice field. Must be a string, number, or boolean.

hint string

The tooltip text shown for this choice.

subchoices array of objects

An array of CategoryChoice elements to define the relevant subchoices.

Example

{
  "type": "category",
  "field_id": "genre",
  "title": "Select all genres that apply.",
  "choices": [
    {
      "label": "Hip-Hop/Rap",
      "value": "hip-hop-rap",
      "hint": "It consists of a stylized rhythmic music that commonly accompanies rapping, a rhythmic and rhyming speech that is chanted.",
      "subchoices": [
        { "label": "Dirty South", "value": "dirty-south" },
        { "label": "Industrial Hip Hop", "value": "industrial-hip-hop" },
        { "label": "Nerdcore", "value": "nerdcore" },
        { "label": "Rap", "value": "rap" }
      ]
    },
    {
      "label": "R&B/Soul",
      "value": "rb-soul",
      "subchoices": [
        { "label": "Disco", "value": "disco" },
        { "label": "Funk", "value": "funk" },
        { "label": "Motown", "value": "motown" }
      ]
    }
  ],
  "min_choices": 1,
  "max_choices": 5
}
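Since only choices without subchoices are selectable, the set of values that can appear in a response is the set of leaves in the choice tree. A minimal Python sketch of collecting them, with a hypothetical function name:

```python
def selectable_values(choices: list) -> list:
    """Collect the values of all choices that have no subchoices."""
    values = []
    for choice in choices:
        subchoices = choice.get("subchoices")
        if subchoices:
            # Choices with subchoices are navigation only; recurse into them.
            values.extend(selectable_values(subchoices))
        else:
            values.append(choice["value"])
    return values
```

Applied to the genre example, this returns the seven subgenre values (dirty-south through motown) and omits hip-hop-rap and rb-soul, which only serve as navigation.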

Timerange Field

Subclass of UnitField.

Parameters

default_seconds array of integers required

Must have length 2, and each value must be in the range [0, 24 * 60 * 60].

increment_seconds integer

Must be between 1 and 60 * 60.

default_from_field string

Must be a valid field_id.

Example

{
  "type": "time_range",
  "field_id": "hours",
  "title": "Store Hours",
  "default_seconds": [
    28800,
    72000
  ],
  "increment_seconds": 300,
  "max_responses_required": 2,
  "min_responses_required": 0
}
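The time range values are seconds since midnight, so a pair such as 28800 and 72000 is easier to review once converted to clock times. A small Python sketch that validates the documented length and range and formats the pair; the function name format_time_range is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def format_time_range(seconds_pair) -> tuple:
    """Convert a [start, end] pair of seconds since midnight to HH:MM strings."""
    if len(seconds_pair) != 2:
        raise ValueError("a time range must have exactly 2 values")
    formatted = []
    for seconds in seconds_pair:
        if not 0 <= seconds <= SECONDS_PER_DAY:
            raise ValueError(f"value out of range: {seconds}")
        hours, remainder = divmod(seconds, 3600)
        formatted.append(f"{hours:02d}:{remainder // 60:02d}")
    return tuple(formatted)
```

For the example above, `format_time_range([28800, 72000])` gives `("08:00", "20:00")`.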

Select Field

Subclass of UnitField.

Parameters

choices array of objects

An array of selectable options. Not required if choices_from_field is present.

choices_from_field string

Must be a valid field_id.

Example

{
  "type": "select",
  "field_id": "sentiment",
  "title": "Sentiment",
  "description": "Choose a sentiment that best describes this text",
  "required": true,
  "choices_from_field": "Options"
}

Ranking Field

RankingField objects allow you to define a task that ranks task attachments. Returns a list response with the options in ranked order.

Parameters

title string

A brief description about what the response should be. This may change among tasks within a project.

hint string

Longer explanation of why the field exists and how it should be used. Renders as a tooltip.

first_label string

A string label for the first position in the ranking, such as "Best" in the example below.

num_items_to_rank integer

The number of options required to rank (can be less than number of attachments).

required boolean

Determines whether or not all num_items_to_rank fields must be filled.

Example

{
  "type": "ranking_order",
  "field_id": "relevance_ranking",
  "title": "Rank titles based on their relevance to the article",
  "hint": "From the most relevant to the least one",
  "first_label": "Best",
  "last_label": "Worst",
  "num_items_to_rank": 3
}

Form Field

FormField objects let you create multiple mini-forms with different attachments. The object's child fields populate these mini-forms. Returns a dictionary response of key-value pairs defined by its child fields.

Parameters

type string required

The field type. Use form for a FormField, as in the example below.

field_id string required

A unique identifier for the field, which should not change among tasks within a project.

title string required

Field title to be displayed to taskers. This should be short and singular. This may change among tasks within a project.

description string

A brief description about what the response should be. This may change among tasks within a project.

fields array of objects required

An array of child UnitField and FieldSet objects. Any FieldSet objects here must have inline set to true.

Example

{
  "type": "form",
  "field_id": "form_query",
  "title": "Query Intention",
  "fields": [
    {
      "type": "text",
      "field_id": "query_intention",
      "title": "Query Intention",
      "hint": "Please investigate the search links."
    }
  ]
}

Text Collection Response Format

The response object, which is part of the callback POST request and is saved permanently as part of the task object, will contain an annotations field. The annotations object is a dictionary whose keys are the field_id values given in the task parameters and whose values are the corresponding annotations.

Each annotation has the type specified for the corresponding field above. If max_responses_required is greater than 1, the annotation is an array of that type.

Example

{
  "response": {
    "annotations": {
      "category_name": "Soup", //TextField
      "category_items": [ //FieldSet with max_responses_required greater than one
        {
          "item_name": "Tom Yum Chicken Soup", //TextField
          "item_price": "11.79" //NumberField
        },
        {
          "item_name": "Tom Yum Beef Soup", //TextField
          "item_price": "11.79" //NumberField
        }
      ],
      "category_metadata": { //FieldSet
        "gluten_friendly": true, //BooleanField
        "labels": [ //TextField with max_responses_required greater than one
          "Free Range", 
          "All Natural"
        ] 
      }
    }
  },
  "task_id": "5774cc78b01249ab09f089dd",
  "task": {
    // populated task for convenience
  }
}
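Nested annotations like those in the example above can be awkward to inspect or export. The Python sketch below flattens an annotations dictionary into path-style keys; the path syntax (dots and bracketed indexes) and the function name flatten_annotations are illustrative choices, not something the API defines.

```python
def flatten_annotations(annotations: dict, prefix: str = "") -> dict:
    """Flatten nested annotation dicts and lists into {path: value} pairs."""
    flat = {}
    for key, value in annotations.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested field set: descend with a dotted prefix.
            flat.update(flatten_annotations(value, path + "."))
        elif isinstance(value, list):
            # Multiple responses: index each one.
            for index, item in enumerate(value):
                if isinstance(item, dict):
                    flat.update(flatten_annotations(item, f"{path}[{index}]."))
                else:
                    flat[f"{path}[{index}]"] = item
        else:
            flat[path] = value
    return flat
```

With the example response, the second soup's name ends up under the key `category_items[1].item_name`, and the first label under `category_metadata.labels[0]`.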

Text Collection Hypothesis

To save taskers time while annotating, pre-labels can be included in the hypothesis field when creating a text collection task.

To add pre-labels to a task using hypothesis, you must supply them in the task's hypothesis payload field at task creation. The schema of the hypothesis object must match the schema of the task response.

  • Check the chosen task type's task response field schema.
  • Examine the taxonomy of your project (label names, attribute criteria, types of annotations, etc.).
  • Produce pre-labels in the format specified by the previously mentioned taxonomy and schema.
  • Make a task with the pre-labels in the hypothesis field at the same top level as the project and instructions task fields.

The hypothesis format is similar to that of Refonte.Ai's task response. For this task type, the annotations field is required within the hypothesis object. The only difference from the response format is that each field you wish to pre-annotate must include two additional fields: type, which gives the field type (category, select, text, etc.), and field_id, which gives the identifier (field name) assigned to the field for tracking.

You can find these two fields in your task taxonomy.

Note: The response format for text fields differs from that of other types. Instead of an array of arrays containing strings, the response for a text field is an array containing a single string.

Tasks, payloads, and hypotheses

task_payload_with_hypothesis

{
 ...
 "batch": "regular_batch_name",
 "hypothesis": {
   "annotations": {
     "(EXAMPLE) Multiple Choice Question": {
       "type": "category",
       "field_id": "(EXAMPLE) Multiple Choice Question",
       "response": [
         [
           "B"
         ]
       ]
     }
   }
 },
 ...
}

task_taxonomy

{
   "fields": [
     {
       "type": "category",
       "field_id": "(EXAMPLE) Multiple Choice Question",
       "title": "Which option best fits this task?",
       "choices": [
         {
           "label": "A",
           "value": "A"
         },
         {
           "label": "B",
           "value": "B"
         },
         {
           "label": "C",
           "value": "C"
         }
       ],
       "min_choices": 1,
       "max_choices": 1,
       "description": "Select one of the following. "
     }
   ]
 }

task_payload_with_hypothesis_text_field

{
   ...
   "hypothesis": {
      "annotations": {
        "(EXAMPLE) Text Input Field": {
            "type": "text",
            "field_id": "(EXAMPLE) Text Input Field",
            "response": [
                "Dolore in dolor occaecat deserunt ex in qui non amet est."
            ]
        }
      }
   },
   ...
}
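The steps above can be sketched in Python as a helper that looks up each field's type in the task taxonomy and wraps the pre-labels in the shape described in this section: an array of arrays for most field types and a flat array with a single string for text fields. The function name build_hypothesis and the shape of the prelabels argument are assumptions made for illustration.

```python
def build_hypothesis(taxonomy: dict, prelabels: dict) -> dict:
    """Build a hypothesis payload from a task taxonomy and pre-labels.

    prelabels maps field_id to a string (text fields) or a list of values.
    """
    field_types = {f["field_id"]: f["type"] for f in taxonomy["fields"]}
    annotations = {}
    for field_id, value in prelabels.items():
        if field_id not in field_types:
            raise KeyError(f"field_id not in taxonomy: {field_id}")
        field_type = field_types[field_id]
        if field_type == "text":
            # Text fields: an array containing a single string.
            response = [value] if isinstance(value, str) else list(value)
        else:
            # Other fields: an array of arrays.
            response = [list(value)]
        annotations[field_id] = {
            "type": field_type,
            "field_id": field_id,
            "response": response,
        }
    return {"annotations": annotations}
```

The returned dictionary would be placed under the hypothesis key of the task payload, next to fields such as project and instructions.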

Named Entity Recognition Label

NamedEntityRecognitionLabel objects define the taxonomy of labels to use to annotate spans of text.

Parameters

name string required

A unique identifier for this label.

display_name string

An alias for this label to display to taskers.

description string

A description of what this label should represent. Displayed to taskers to improve quality.

children array of objects

An array of NamedEntityRecognitionLabel objects to group underneath this label. Specifying this field causes this label itself to no longer be used for labeling text spans.

NamedEntityRecognitionAttribute

NamedEntityRecognitionAttribute objects define form fields for individual annotations.

Parameters

type string

Only 'select' for now.

options array of objects

List of select option objects.

display_name string

Optional display name.

description string

Optional description.

AttributeSelectOption

AttributeSelectOption objects define possible values for select attributes.

Parameters

value string

The value that will show up in the response if this option is selected.

display_name string

Optional display name if different from the value.

Named Entity Recognition Relationship Definition

NamedEntityRecognitionRelationshipDefinition objects specify the types of relationship that can exist between two text spans. There are two types of relationships: named and unnamed. If you need to differentiate between several kinds of relationships that could occur between the same two text spans, a named relationship can be helpful. For example, you may want to distinguish between a "child of" and a "sibling of" relationship when annotating a description of someone's family history.

A task can only specify one type of relationship. Either all the relationships in a task must be named, or all must be unnamed.

Parameters

name string

A unique identifier for this type of relationship. Required for named relationships; disallowed for unnamed relationships.

display_name string

A description for this relationship to display to taskers. Should be able to be used to construct a short phrase describing the relationship. For example, a relationship between two text spans "A" and "B" with display_name "is parent of" would be rendered to taskers as "A is parent of B". Required for named relationships; disallowed for unnamed relationships.

is_directed boolean

A field indicating whether the directionality of this relationship matters. For example, a "is parent of" relationship would likely be directed, whereas a "is sibling of" relationship would likely not be directed. Optional for named relationships; disallowed for unnamed relationships.

source_label string

A string referencing the name field of a NamedEntityRecognitionLabel object. If set, mandates that the source text span of this field must be labeled with the corresponding NamedEntityRecognitionLabel, or one of its children. Optional for both named and unnamed relationships.

target_label string

A string referencing the name field of a NamedEntityRecognitionLabel object. If set, mandates that the target text span of this field must be labeled with the corresponding NamedEntityRecognitionLabel, or one of its children. Optional for both named and unnamed relationships.
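The named versus unnamed rules above lend themselves to a quick consistency check before task creation. The following Python sketch encodes only the rules stated in this section, plus the uniqueness of name; the function name check_relationship_definitions is hypothetical.

```python
def check_relationship_definitions(definitions: list) -> list[str]:
    """Return problems found in a list of relationship definitions."""
    problems = []
    named = [d for d in definitions if "name" in d]
    if named and len(named) != len(definitions):
        problems.append("named and unnamed relationships cannot be mixed")
    names = [d["name"] for d in named]
    if len(names) != len(set(names)):
        problems.append("relationship names must be unique")
    for definition in definitions:
        if "name" in definition:
            if "display_name" not in definition:
                problems.append(
                    f"named relationship {definition['name']!r} needs display_name"
                )
        else:
            for key in ("display_name", "is_directed"):
                if key in definition:
                    problems.append(f"{key} is disallowed for unnamed relationships")
    return problems
```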

Named Entity Recognition Callback Format

The response object is saved permanently as part of the task object and is included in the callback POST request. The response is a NamedEntityRecognitionResponse object composed of two arrays: one for the entity annotations and another for the relationships between these entities.

NamedEntityRecognitionAnnotation

The structure for a single entity annotation in the named entity recognition response. It includes the recognized text span's position, content, and unique identifier, as well as its label and any optional attributes.

NamedEntityRecognitionRelationship

In tasks with undirected relationships, the source_ref and target_ref fields are interchangeable. In tasks with unnamed relationships, the name field will be left blank.

Example

{
  "annotations": [
    {
      "id": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "start": 10,
      "end": 17,
      "text": "Alex Wang",
      "label": "person"
    },
    {
      "id": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "start": 22,
      "end": 31,
      "text": "Transform",
      "label": "conference"
    }
  ],
  "relationships": [
    {
      "id": "ade8e9e9-ef9c-4fc7-9517-62d79a15c1cb",
      "source_ref": "b86c22a3-1f7c-4be2-bb8f-899ee9324c0b",
      "target_ref": "a76da53e-4ebd-4466-aed7-80db6fb98329",
      "name": "speaker_at"
    }
  ]
}

NamedEntityRecognitionResponse

Fields

annotations array of objects

List of NamedEntityRecognitionAnnotation objects.

relationships array of objects

List of NamedEntityRecognitionRelationship objects.

NamedEntityRecognitionAnnotation

Fields

id string

Unique identifier.

start number

Start index of the text span.

end number

End index of the text span.

text string

Text of the text span.

label string

References the name field of a label in the task params.
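Relationships refer to annotations by id through source_ref and target_ref, so consumers typically join the two arrays. A minimal Python sketch of that join, with a hypothetical function name; it returns (source text, relationship name, target text) triples and uses None for the name when a relationship is unnamed.

```python
def resolve_relationships(response: dict) -> list:
    """Join relationships to their annotations by id."""
    by_id = {annotation["id"]: annotation for annotation in response["annotations"]}
    triples = []
    for relationship in response.get("relationships", []):
        source = by_id[relationship["source_ref"]]
        target = by_id[relationship["target_ref"]]
        # name may be absent or blank for unnamed relationships.
        triples.append((source["text"], relationship.get("name"), target["text"]))
    return triples
```

For the callback example in this section, the result is a single triple linking "Alex Wang" to "Transform" through speaker_at.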
