Annotations - Ragnerock Docs

Annotations are Ragnerock’s mechanism for turning raw data into structured, queryable results. Define how your data should be processed, run workflows, and query the outputs with SQL.

What Are Annotations?

The annotation system has three components:

Operators: Define how to process your data and what outputs to produce
Workflows: Orchestrate operators into processing pipelines
Annotations: The structured results attached to your data sources

The result is durable, table-like data you can query with SQL, export to your data warehouse, or use in your quantitative models.

Operators

Operators define how Ragnerock should process and analyze your data. Each operator specifies an output schema (what columns and fields to produce) and AI instructions (how to fill them in).

Every operator has:

Name: Identifier that becomes the SQL table name
JSON Schema: The structure of the output data (columns, types, constraints)
Generation Prompt: Instructions for the AI model
Scope: What unit of data to process

Creating Operators

Navigate to the Workflows section in the sidebar, click Operators, then click New Operator. The operator editor lets you configure all fields:

Enter a Name and optional Description
Select the Scope from the dropdown (Document, Page, Paragraph, Sentence, Sheet, or Row)
Write a Generation Prompt with instructions for the AI
Define the JSON Schema using the visual schema builder or the raw JSON editor

The operator editor form showing all fields filled in: name, description, scope, generation prompt, and JSON schema editor

Example Schema

{
  "type": "object",
  "properties": {
    "overall_sentiment": {
      "type": "string",
      "enum": ["very_negative", "negative", "neutral", "positive", "very_positive"],
      "description": "Overall sentiment of the content"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score for the sentiment"
    },
    "key_themes": {
      "type": "array",
      "items": {"type": "string"},
      "maxItems": 5,
      "description": "Main themes discussed"
    }
  },
  "required": ["overall_sentiment", "key_themes"]
}

For a complete guide to schema design, prompts, scopes, batch processing, and multi-annotation mode, see Operators.

Workflows

Workflows chain multiple operators together into processing pipelines. They’re built using a visual graph editor and can be triggered manually or automatically when new data is uploaded.

Building Workflows

Navigate to the Workflows section in the sidebar and click New Workflow. The visual editor lets you:

Add operator nodes by dragging them from the palette onto the canvas
Connect nodes by drawing edges to define data dependencies
Configure each node with conditions, error handling, and persistence settings
Enable auto-run to trigger the workflow automatically when new data is uploaded

The visual workflow editor showing an operator node on the canvas

For detailed workflow features including conditional execution, error handling, ephemeral annotations, and YAML import/export, see Workflows.

Running Workflows

To run a workflow:

Open the workflow in the editor and click Run, or right-click a workflow in the sidebar and select Run
Select the data sources to process from the dialog
Click Run Workflow to start execution
Monitor progress in the Jobs dashboard

The Jobs dashboard showing a completed workflow with per-document subtasks and status indicators

Querying Annotations

After workflows run, the results are queryable with SQL. Open the Query Explorer from the sidebar and use the operator name as the table name:

SELECT
    document_name,
    overall_sentiment,
    confidence,
    key_themes
FROM sentiment_analysis
WHERE overall_sentiment IN ('positive', 'very_positive')
ORDER BY confidence DESC
LIMIT 20

Click Execute to run the query. Results appear in a sortable, filterable table.

The Query Explorer showing a sentiment analysis query with results in a table below

Aggregations

SELECT
    overall_sentiment,
    COUNT(*) as count,
    AVG(confidence) as avg_confidence
FROM sentiment_analysis
GROUP BY overall_sentiment
ORDER BY count DESC

Provenance

Every annotation maintains full provenance:

Source: The original data source
Location: The specific page, paragraph, or row
Operator: Which operator produced the result

This enables complete audit trails and verification of any data point.

Best Practices

Start specific, then generalize: Begin with a focused schema, expand as needed
Use enums for categorical data: Enables consistent filtering and aggregation
Test on sample data: Validate operator design before batch processing
Query incrementally: Start with simple queries, add complexity as needed
Track provenance: Use annotation provenance for audit trails

Next Steps

Learn about Data Sources for upload and processing
Dive deeper into Operators design and configuration
Understand Workflow pipeline orchestration
Explore SQL Queries for advanced syntax
Use the Research Agent for interactive exploration