Instructions:Create/Source/Ingest

From Encyclopedia Ephemera
Revision as of 20:31, 7 May 2026 by EphemeraAdmin (talk | contribs) (Source ingestion workflow — graph enrichment only, no article edits)
Instruction Metadata
id create-source-ingest
type workflow
applies_to Sources
task_type source_ingest
priority high
status active
canonical true
include_by_default no
requires Instructions:World Bible,Instructions:Create/Source (Base Workflow),Instructions:Schema/Source Page,Instructions:Schema/Source Talk Page
tags source,ingest,create,graph-enrichment


Purpose

This workflow governs the creation of a new Sources: page from raw ingested material — a pasted document, transcript, article, report, or other primary text. It is the entry point for the source ingestion pipeline.

Core rule: Source ingestion enriches the wiki graph. It does not trigger article edits. Do not modify encyclopedia pages during ingestion. Do not create Project: queue tasks unless explicitly instructed. Create the Sources: page and its links, then stop.

Scope

This instruction applies when:

  • The agent receives raw source text to process
  • The task type is source_ingest
  • The Source Ingestion tool on the Maintenance tab submits a document

It does not apply to:

  • Editing existing Sources: pages
  • Creating encyclopedia articles
  • Running integration reviews

Step 1 — Classify the Source

Determine the source subtype from the content and any user-provided hint. Available subtypes:

  • News Article — journalism, press coverage, media reporting
  • Interview — Q&A, transcript, recorded conversation
  • Personal Log — diary, journal, first-person account
  • Official Statement — press release, public announcement, formal declaration
  • Academic Paper — research, analysis, scholarly work
  • Corporate Advertisement — marketing material, promotional content
  • Government Resolution — legislation, policy, formal resolution
  • Government Report — official findings, agency report, census
  • Propaganda Broadcast — biased mass communication, state media
  • Legal Document — contract, ruling, deposition, legal filing

If the subtype is ambiguous, choose the closest match. Record your reasoning in the source summary.
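
The subtype list and the handling of a user-provided hint can be sketched as follows. This is a minimal illustration, not the pipeline's actual code: the names `SUBTYPES` and `resolve_hint` are hypothetical, and case-insensitive matching is an assumption made for the sketch.

```python
from typing import Optional

# The ten subtypes listed in Step 1, in the order given there.
SUBTYPES = (
    "News Article",
    "Interview",
    "Personal Log",
    "Official Statement",
    "Academic Paper",
    "Corporate Advertisement",
    "Government Resolution",
    "Government Report",
    "Propaganda Broadcast",
    "Legal Document",
)


def resolve_hint(hint: Optional[str]) -> Optional[str]:
    """Return the canonical subtype that matches a user hint, or None.

    Matching ignores case and surrounding whitespace (an assumption of
    this sketch). A None result means the hint did not help and the
    subtype still has to be chosen from the content itself.
    """
    if not hint:
        return None
    wanted = hint.strip().lower()
    for subtype in SUBTYPES:
        if subtype.lower() == wanted:
            return subtype
    return None
```

A hint that matches none of the subtypes is not an error here; the agent falls back to the closest match and records its reasoning as described above.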

Step 2 — Extract Metadata

From the raw text, extract:

  • Title — a descriptive title for the Sources: page. Format: Sources:Publication or Author – Subject. Example: Sources:New Troy Tribune – Yuèmin District Unrest
  • Author — individual or organisation responsible for the document
  • Affiliation — the author's employer, faction, or institutional context
  • Date — publication or creation date. Use in-universe dates where applicable
  • Location — where the document originates or was published
  • Reliability — your assessment: high / medium / low / unknown
  • Bias — brief characterisation of the author's likely perspective or agenda
  • Canon status — primary / secondary / disputed / non-canon

For reliability and bias, reason from the affiliation and subtype. A corporate advertisement has inherent promotional bias. A government report from a faction with known interests has institutional bias. State this plainly.
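
One way to hold the extracted metadata is a small record type that rejects values outside the allowed sets. The sketch below is illustrative only: `SourceMetadata` and its field names are hypothetical, and defaulting missing text fields to "unknown" follows the Constraints section rather than any confirmed implementation.

```python
from dataclasses import dataclass

# Allowed values as listed in Step 2.
RELIABILITY = {"high", "medium", "low", "unknown"}
CANON_STATUS = {"primary", "secondary", "disputed", "non-canon"}


@dataclass
class SourceMetadata:
    publication_or_author: str
    subject: str
    canon_status: str
    author: str = "unknown"
    affiliation: str = "unknown"
    date: str = "unknown"
    location: str = "unknown"
    reliability: str = "unknown"
    bias: str = "unknown"

    def __post_init__(self) -> None:
        # Reject values outside the sets given in Step 2.
        if self.reliability not in RELIABILITY:
            raise ValueError(f"invalid reliability: {self.reliability!r}")
        if self.canon_status not in CANON_STATUS:
            raise ValueError(f"invalid canon status: {self.canon_status!r}")

    @property
    def title(self) -> str:
        # En dash (U+2013) separator, as in the Step 2 example title.
        return f"Sources:{self.publication_or_author} \u2013 {self.subject}"
```

For the example in Step 2, `SourceMetadata("New Troy Tribune", "Yuèmin District Unrest", canon_status="primary").title` yields the page title shown there.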

Step 3 — Extract Entities

Identify named entities in the source text:

  • People (named individuals)
  • Places (locations, habitats, regions, stations)
  • Organisations (corporations, governments, factions, institutions)
  • Events (named incidents, treaties, conflicts, discoveries)
  • Technologies or artefacts (named systems, ships, technologies)

These become the Related Pages links. For each entity, determine whether an encyclopedia page exists. Red links are expected and correct — they become stub generation candidates.
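
Separating red links from blue links can be sketched as a simple set lookup. The function name `classify_links` is hypothetical, and `existing_pages` stands in for whatever page-existence check the wiki provides; this instruction does not specify how that lookup is performed.

```python
from typing import Dict, Iterable, List, Set


def classify_links(entities: Iterable[str], existing_pages: Set[str]) -> Dict[str, List[str]]:
    """Split entity names into blue links (page exists) and red links.

    Blank names and repeats are dropped; first-seen order is kept.
    """
    result: Dict[str, List[str]] = {"blue": [], "red": []}
    seen = set()
    for name in entities:
        name = name.strip()
        if not name or name in seen:
            continue
        seen.add(name)
        key = "blue" if name in existing_pages else "red"
        result[key].append(name)
    return result
```

Red links are kept rather than filtered out, since they are the stub generation candidates mentioned above.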

Step 4 — Write the Sources: Page

Create the page at the title determined in Step 2. Use this structure:

Template block

Place the {{Source}} template at the top of the page:
{{Source
|id=
|type=<subtype from Step 1>
|author=
|affiliation=
|date=
|location=
|canonical=true
|reliability=<high|medium|low|unknown>
|bias=
|canon_status=<primary|secondary|disputed|non-canon>
|related=<comma-separated entity names>
|tags=<comma-separated lowercase tags>
}}
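
Filling the template block programmatically might look like the sketch below. The helper `render_source_template` is hypothetical. Treating every field as required and substituting "unknown" for blanks is an assumption based on the Quality Check section; the instruction does not state which fields are required.

```python
from typing import Mapping

# Field order as shown in the template block above.
TEMPLATE_FIELDS = (
    "id", "type", "author", "affiliation", "date", "location",
    "canonical", "reliability", "bias", "canon_status", "related", "tags",
)


def render_source_template(values: Mapping[str, str]) -> str:
    """Render the Source template call as wikitext, one field per line."""
    lines = ["{{Source"]
    for field in TEMPLATE_FIELDS:
        # canonical is fixed to true in the template; other blanks become "unknown".
        default = "true" if field == "canonical" else "unknown"
        value = str(values.get(field, "")).strip() or default
        lines.append(f"|{field}={value}")
    lines.append("}}")
    return "\n".join(lines)
```

The sketch does not escape values that contain `|` or `}}`; a real implementation would need to handle those before writing wikitext.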

Page sections

After the template, write these sections in order:

== Source Summary ==
A 2–4 sentence neutral description of what this document is, who created it,
and what it covers. Written out-of-universe (editorially), not in-universe voice.

== Document Information ==
; Type: <subtype>
; Author: <name>
; Affiliation: <organisation>
; Date: <date>
; Location: <place of origin>
; Reliability: <assessment and brief reason>
; Bias: <characterisation of perspective>

== Related Pages ==
* [[Entity One]]
* [[Entity Two]]
* [[Entity Three]]
(List all named entities from Step 3. Red links are correct and expected.)

== Content ==
The source document text, reproduced faithfully.
Preserve the in-universe voice and perspective of the original.
Do not editorially correct the content — bias and inaccuracy are features, not errors.
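
The section order above can be assembled mechanically. The following is a rough sketch under stated assumptions: `build_page_sections` and `INFO_FIELDS` are hypothetical names, and missing Document Information values fall back to "unknown" as required by the Constraints section.

```python
from typing import Mapping, Sequence

# Definition-list labels in the order shown under Document Information.
INFO_FIELDS = ("Type", "Author", "Affiliation", "Date", "Location", "Reliability", "Bias")


def build_page_sections(
    summary: str,
    info: Mapping[str, str],
    related: Sequence[str],
    content: str,
) -> str:
    """Return the four page sections as wikitext, in the required order."""
    parts = ["== Source Summary ==", summary.strip(), ""]
    parts.append("== Document Information ==")
    for label in INFO_FIELDS:
        parts.append(f"; {label}: {info.get(label, 'unknown')}")
    parts.append("")
    parts.append("== Related Pages ==")
    parts.extend(f"* [[{name}]]" for name in related)
    parts.append("")
    parts.append("== Content ==")
    # The source text is passed through untouched: no trimming, no correction.
    parts.append(content)
    return "\n".join(parts)
```

Only the editorial sections are normalised; the Content argument is appended exactly as received so that the original voice, bias, and errors survive.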

Step 5 — Do Not Edit Encyclopedia Pages

After creating the Sources: page, stop. Do not:

  • Edit or expand encyclopedia articles
  • Create new encyclopedia stubs (unless separately instructed)
  • Append citations to existing articles
  • Create Talk page entries for integration tasks

These are Stage B operations handled by the integration review workflow (Instructions:Maintenance/Source Integration Review) after human or agent review of the candidate list.

Step 6 — Report Back

After page creation, return:

  • The title of the created Sources: page
  • The list of related pages extracted (distinguishing red links from blue links)
  • The reliability and bias assessment
  • Any ambiguities or decisions made during classification

This output feeds the deterministic candidate discovery step in the UI.
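
The four report items could be returned as a structured payload. The sketch below assumes JSON and the key names shown; neither is specified by this instruction, and `build_report` is a hypothetical helper.

```python
import json
from typing import Sequence


def build_report(
    title: str,
    blue_links: Sequence[str],
    red_links: Sequence[str],
    reliability: str,
    bias: str,
    notes: Sequence[str],
) -> str:
    """Serialise the Step 6 report; key names are illustrative only."""
    report = {
        "title": title,
        "related_pages": {"blue": list(blue_links), "red": list(red_links)},
        "reliability": reliability,
        "bias": bias,
        "notes": list(notes),  # ambiguities and classification decisions
    }
    # ensure_ascii=False keeps names such as "Yuèmin" readable in the output.
    return json.dumps(report, ensure_ascii=False, indent=2)
```

Keeping red and blue links in separate lists preserves the distinction that the report is required to make.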

Constraints

  • Sources: pages are immutable records once created. Do not alter the Content section after initial creation. Corrections belong on the Talk page.
  • Write the Content section in the in-universe voice of the original document. The source may be wrong, biased, or propaganda. Preserve this faithfully.
  • The Source Summary and Document Information sections are written out-of-universe (editorially).
  • Do not invent metadata not present in or clearly inferable from the source text. Use "unknown" where necessary.
  • Related Pages must be real entity names from the text, not thematic associations.

Quality Check

Before submitting the page, verify:

  • The {{Source}} template is populated with no empty required fields (use "unknown", not blank)
  • Related Pages contains at least one link
  • Content section reproduces the source faithfully without editorial correction
  • Source Summary is written out-of-universe, not in the source's voice
  • Page title follows the Sources:Publication or Author – Subject format from Step 2