
Components of an AI Pipeline

Writer: Tim Burns

Soggy Titmouse

Good architecture has consistent themes that apply to many different systems. As I'm focusing on AI pipelines, here are the basic stages of a generic system.


Input Collection and Preprocessing

What kinds of technologies do we employ here? What are some advantages of each?


  • Web Forms


This path leads to the standard CRUD SaaS application: as a user, you log into a web page endpoint, specify your data, and submit. Typical technologies are React, jQuery, and HTML forms.


  • Documents in specific folders


Documents in specific folders are a more substantial basis for an accurate AI-driven system. AI can read inputs written in our own words, organized by our own structure, and produced by tools that we (as humans) find easy to use. Examples include documents, meeting transcripts, and verbal recordings.


  • Bringing Structure to Unstructured Text


OpenAI models, Claude, or other LLMs can interpret the raw text and give it structure.
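
A minimal sketch of this step, assuming the LLM is asked to reply with JSON. The `call_llm` function is a hypothetical stand-in that returns a canned reply (a real pipeline would call a vendor SDK here), and the field names `title`, `date`, and `action_items` are invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Callable

REQUIRED_FIELDS = {"title", "date", "action_items"}  # illustrative schema, not from the post

PROMPT = (
    "Extract the fields title, date, and action_items from the text below. "
    "Reply with a single JSON object and nothing else.\n\n{text}"
)


@dataclass
class StructuredInput:
    title: str
    date: str
    action_items: list[str]


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned JSON reply."""
    return json.dumps({
        "title": "Weekly sync",
        "date": "2025-01-15",
        "action_items": ["Send draft to legal"],
    })


def structure_text(raw_text: str, llm: Callable[[str], str] = call_llm) -> StructuredInput:
    """Ask the LLM for JSON, then validate it before anything downstream sees it."""
    reply = llm(PROMPT.format(text=raw_text))
    data = json.loads(reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("LLM reply is not a JSON object")
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"LLM reply is missing fields: {sorted(missing)}")
    return StructuredInput(
        title=str(data["title"]),
        date=str(data["date"]),
        action_items=[str(item) for item in data["action_items"]],
    )
```

Validating the reply at this boundary keeps a malformed or partial LLM answer from leaking into later stages.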


Goal: Turn Fuzzy Ideas into Clean, Structured Inputs


Reference and Existing State Searching

When we build a data system, the pipeline can inform its own decisions, provided it has a good representation of the existing system's state.


We should be able to:

  • Detect changes

  • Ignore or acknowledge duplicate (existing) data

  • Highlight anomalies for the user to examine more closely
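
One way to sketch the first two checks is content fingerprinting: hash a normalized copy of each input and compare it with the hash stored for the same identifier. The `Status` names, the `doc_id` key, and the in-memory `known` dictionary are assumptions for illustration; a real system would persist the fingerprints, and real anomaly detection needs more than a hash comparison. Here a changed item is simply the case surfaced for review.

```python
import hashlib
from enum import Enum


class Status(Enum):
    NEW = "new"
    DUPLICATE = "duplicate"
    CHANGED = "changed"


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and lowercasing it."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def classify(doc_id: str, text: str, known: dict[str, str]) -> Status:
    """Compare a document with the stored fingerprint for doc_id, then store the new one."""
    digest = fingerprint(text)
    previous = known.get(doc_id)
    if previous is None:
        status = Status.NEW
    elif previous == digest:
        status = Status.DUPLICATE
    else:
        status = Status.CHANGED  # candidate to highlight for the user
    known[doc_id] = digest
    return status
```

Normalizing before hashing means cosmetic differences such as spacing or capitalization are treated as duplicates rather than changes.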


Goal: Understand the Event in the Context of Previous, Similar Events


Drafting and Rules Engine

The next stage is a drafting engine. It relies on established rules and output templates to generate a clear, specific prompt for each part of the final document.


  1. Produce LLM output incorporating all of the previous states

  2. Drive formats using specific document output templates
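
The two steps above can be sketched as prompt assembly with the standard library's `string.Template`. The `incident_report` layout, the rule text, and the function name `build_draft_prompt` are hypothetical; the resulting string is what would be sent to whichever LLM the pipeline uses.

```python
from string import Template

# Hypothetical output templates, keyed by document type.
OUTPUT_TEMPLATES = {
    "incident_report": "Title:\nSummary:\nImpact:\nNext steps:",
}

PROMPT = Template(
    "Write a draft using exactly this layout:\n$layout\n\n"
    "Rules:\n$rules\n\n"
    "New input:\n$new_input\n\n"
    "Related prior records:\n$prior_state\n"
)


def build_draft_prompt(
    doc_type: str,
    new_input: str,
    prior_state: list[str],
    rules: list[str],
) -> str:
    """Combine the output template, the rules, and prior state into one drafting prompt."""
    return PROMPT.substitute(
        layout=OUTPUT_TEMPLATES[doc_type],
        rules="\n".join(f"- {rule}" for rule in rules),
        new_input=new_input,
        prior_state="\n".join(prior_state) or "(none)",
    )
```

Because `substitute` raises `KeyError` for any placeholder left unfilled, a template change that adds a new field fails loudly instead of producing a prompt with a hole in it.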


Goal: Draft a good document tailored to the use case


Document and Abstract Writer

This stage takes the drafting engine's output and creates a document and abstract that a human can read. The result contains what is needed for complete validation, including:

  • History and metadata

  • A template-based format

  • Specific style and voice instructions
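
As a rough illustration of keeping those pieces together, the sketch below bundles a document, its abstract, and its validation metadata in one record. The class name `DocumentSet` and all of its fields are assumptions, not something the pipeline prescribes.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentSet:
    title: str
    abstract: str
    body: str
    template_name: str
    style_notes: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render the readable document followed by a metadata section for reviewers."""
        lines = [
            self.title, "",
            "Abstract", self.abstract, "",
            self.body, "",
            "Metadata",
            f"Template: {self.template_name}",
        ]
        lines += [f"Style: {note}" for note in self.style_notes]
        lines += [f"History: {entry}" for entry in self.history]
        return "\n".join(lines)
```

Carrying the template name, style notes, and history alongside the text lets a reviewer see how a document was produced without leaving it.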


Goal: Generate a complete document set that feels human-written and addresses the use case


Human-in-the-Loop Review

We need a web- or Word-based interface that allows humans to review, annotate, and suggest rules-based enhancements.


The interface also needs version control and comment threading. Candidates are Word or Google Docs integrations, plus an integration with GitLab or another version control system.
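
Whatever tool is chosen, the data it has to track is small. A minimal in-memory sketch, with hypothetical class names, of numbered revisions that each carry threaded comments:

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    author: str
    text: str
    replies: list["Comment"] = field(default_factory=list)

    def reply(self, author: str, text: str) -> "Comment":
        """Add a reply under this comment and return it."""
        child = Comment(author, text)
        self.replies.append(child)
        return child


@dataclass
class Revision:
    number: int
    text: str
    comments: list[Comment] = field(default_factory=list)


class ReviewedDocument:
    """Keeps every revision so reviewers can see what changed and why."""

    def __init__(self, initial_text: str) -> None:
        self.revisions = [Revision(1, initial_text)]

    @property
    def current(self) -> Revision:
        return self.revisions[-1]

    def comment(self, author: str, text: str) -> Comment:
        """Attach a top-level comment to the current revision."""
        note = Comment(author, text)
        self.current.comments.append(note)
        return note

    def revise(self, new_text: str) -> Revision:
        """Append a new revision; earlier revisions and their comments are preserved."""
        revision = Revision(self.current.number + 1, new_text)
        self.revisions.append(revision)
        return revision
```

A Word, Google Docs, or GitLab integration would replace this storage, but the shape (revisions that own comment threads) stays the same.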


Goal: Keep humans in control while reducing the grunt work


Feedback Loop and Model Refinement

Feeding actual results back into the system is essential to any automated document or data generation app. The purpose is to refine the RAG, the prompts, and the rule-based inputs so that the documents we send out keep getting better.
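
One simple signal to feed back is how often reviewers accept a draft without rework, grouped by the prompt version that produced it. The sketch below assumes each review is recorded as a `ReviewOutcome`; that record and the `prompt_version` label are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReviewOutcome:
    prompt_version: str
    accepted: bool  # True if the reviewer approved the draft without rework


def acceptance_rates(outcomes: Iterable[ReviewOutcome]) -> dict[str, float]:
    """Return the fraction of accepted drafts for each prompt version."""
    counts: dict[str, list[int]] = defaultdict(lambda: [0, 0])  # [accepted, total]
    for outcome in outcomes:
        counts[outcome.prompt_version][1] += 1
        if outcome.accepted:
            counts[outcome.prompt_version][0] += 1
    return {version: accepted / total for version, (accepted, total) in counts.items()}
```

Comparing these rates across versions gives a rough, reviewable basis for deciding which prompt or rule change to keep.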


Goal: Continuously improve accuracy, style matching, and risk detection.

 
 
 


©2019 by Owl Mountain Software, LLC. Proudly created with Wix.com
