In Part 1 of this series, we explored how agentic systems serve as an operational layer between foundational models and business outcomes. We examined patterns across industries and how these systems help companies automate tasks, analyze data, and solve complex problems through orchestrated workflows.
A common question arises: Is applied AI exclusively about building agents? The answer is no. While agentic systems represent one significant opportunity, there exists another equally important dimension: vertical adoption through domain-specific models. This second pathway focuses on companies building and owning their own specialized AI capabilities rather than relying solely on general-purpose foundation models.
Agentic Systems vs. Vertical Adoption
Agentic systems excel in general-purpose domains such as mathematical reasoning, logical analysis, data interpretation, and code generation. These systems function as intelligent generalists, applying broad knowledge to solve common problems across various contexts.
However, challenges emerge when companies operate in specialized verticals with unique requirements. Consider a healthcare organization with proprietary datasets, compliance constraints (such as HIPAA), and domain-specific workflows. In such cases, general-purpose models may not suffice. Three primary motivations drive companies to develop custom models:
- Intellectual Property Ownership — retaining full control over the model's intelligence and capabilities
- Regulatory Compliance — meeting industry-specific regulations that general models cannot guarantee
- Competitive Advantage — creating differentiated capabilities that competitors cannot easily replicate
This represents a fundamental shift in strategy: from "using intelligence" to "owning intelligence." When organizations use general models like GPT or Claude, they own the application logic but not the underlying intelligence. Custom models enable complete ownership of the entire stack.
Companies Building Domain-Specific Models
Several companies exemplify this vertical adoption approach. Let's examine concrete examples:
Voice Intelligence
Companies like Smallest AI and ElevenLabs build specialized models for voice interaction. Rather than integrating third-party voice APIs, they develop proprietary models optimized for specific use cases—customer service, voice agents, or real-time conversation. This vertical integration allows them to:
- Optimize latency for real-time applications
- Reduce operational costs through efficient model design
- Maintain quality control across the entire pipeline
- Customize behavior for specific industry requirements
Healthcare Decision Support
In my previous work, we developed a custom model for medical document synthesis and healthcare decision support. The motivation extended beyond compliance—we needed domain expertise embedded directly into the model's weights. This approach ensures that the model understands medical terminology, clinical workflows, and evidence-based reasoning without requiring extensive prompt engineering for each query.
Applied Research vs. Fundamental Research
It is important to clarify expectations. Most organizations cannot conduct fundamental AI research—developing entirely new architectures or pre-training models from scratch. This requires resources available primarily to large research laboratories and frontier AI companies.
Instead, organizations engage in applied research, which involves:
- Architecture Adaptation — selecting and modifying existing architectures (e.g., transformers, state-space models) for specific domains
- Continued Pre-training — further training foundation models on domain-specific data
- Fine-tuning — adapting pre-trained models to specific tasks with supervised or reinforcement learning
- Alignment Training — using techniques like Direct Preference Optimization (DPO) or Group Relative Policy Optimization (GRPO) to align model behavior with desired outcomes (a minimal DPO sketch follows this list)
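To make the alignment step concrete, here is a minimal sketch of the DPO objective. It assumes you already have per-sequence log-probabilities for chosen and rejected responses under both the policy and a frozen reference model; in practice, libraries such as TRL handle that bookkeeping for you.

```python
# A minimal sketch of the DPO objective, assuming per-sequence log-probs
# for chosen/rejected responses under the policy and a frozen reference model.
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta: float = 0.1):
    """DPO: push the policy to prefer chosen over rejected responses,
    regularized toward the reference model by beta."""
    logits = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    return -F.logsigmoid(logits).mean()

# Toy usage with made-up log-probabilities
loss = dpo_loss(torch.tensor([-12.0]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-13.5]))
print(loss.item())
```

The beta term controls how strongly the policy is anchored to the reference model; larger values allow less drift.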
Applied research requires deep understanding of architectures, training paradigms, dataset preparation, and evaluation methodologies. It is not merely following tutorials—it demands deliberate practice and domain expertise.
Unlike the agentic systems work discussed in Part 1 (which emphasizes software engineering), vertical adoption demands machine learning expertise. Companies need professionals who understand training dynamics, can design experiments, and can iterate on model architectures.
Essential Skills for Domain-Specific AI
Building and deploying domain-specific models requires a distinct skill set. Let's break down the core competencies:
1. Understanding Modern Architectures
You should be familiar with:
- Transformer models — the foundation of most modern language models (GPT, BERT, T5)
- Vision-Language Models (VLMs) — architectures that process both images and text
- State Space Models — emerging alternatives to transformers for certain applications
- Mixture of Experts (MoE) — architectures that activate different sub-networks for different inputs (see the routing sketch after this list)
- Specialized Encoders/Decoders — for audio, video, or other modalities
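As a concrete illustration of the MoE idea, here is a minimal top-1 routing layer. All dimensions are illustrative, and production MoE layers add load balancing, capacity limits, and top-k dispatch.

```python
# Minimal sketch of Mixture-of-Experts routing with a top-1 router over
# small feed-forward experts. Illustrative only, not a production layer.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, d_model: int = 64, n_experts: int = 4):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)  # scores experts per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model); route each token to its best expert
        weights = torch.softmax(self.router(x), dim=-1)  # (B, S, E)
        top_w, top_idx = weights.max(dim=-1)             # (B, S)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            mask = top_idx == i                          # tokens routed here
            if mask.any():
                out[mask] = top_w[mask].unsqueeze(-1) * expert(x[mask])
        return out

moe = TinyMoE()
tokens = torch.randn(2, 10, 64)
print(moe(tokens).shape)  # torch.Size([2, 10, 64])
```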
2. Training Paradigms
Understanding when and how to apply different training approaches:
- Full Fine-tuning — updating all model parameters (requires significant computational resources)
- Parameter-Efficient Fine-Tuning (PEFT) — techniques like LoRA or QLoRA that update only a fraction of parameters (see the LoRA sketch after this list)
- Continued Pre-training — extending a foundation model's knowledge with domain-specific data
- Alignment Training — teaching models to follow instructions and align with human preferences
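To illustrate PEFT, here is a minimal standalone LoRA sketch: the pretrained weight is frozen and only a low-rank update is trained. This is a simplified version of what libraries like Hugging Face's peft implement; the hyperparameters below are illustrative.

```python
# Minimal LoRA sketch: wrap a frozen linear layer with a low-rank update
# W + (alpha/r) * B @ A. A hypothetical standalone module; real libraries
# add dropout, weight merging, and per-module target selection.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r               # standard LoRA scaling

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # only the low-rank A and B matrices train
```

Because B starts at zero, the wrapped layer initially behaves exactly like the frozen base layer, which keeps early training stable.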
3. Data Preparation and Curation
The quality of your model depends fundamentally on your data. Core steps include (a minimal curation sketch follows the list):
- Collecting relevant domain-specific datasets
- Cleaning and preprocessing data appropriately
- Creating high-quality training examples
- Balancing datasets to avoid bias
- Versioning and managing data pipelines
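As a starting point, here is a small curation sketch covering whitespace normalization, a length filter, and exact deduplication. The records and thresholds are made up; real pipelines add language identification, PII scrubbing, near-duplicate detection, and versioned storage.

```python
# A small sketch of dataset curation, assuming a list of raw text records.
# Exact-dedup plus a length filter is a common first pass before training.
import hashlib

def curate(records: list[str], min_chars: int = 50) -> list[str]:
    seen, kept = set(), []
    for text in records:
        text = " ".join(text.split())            # normalize whitespace
        if len(text) < min_chars:
            continue                             # drop near-empty examples
        digest = hashlib.sha256(text.encode()).hexdigest()
        if digest in seen:
            continue                             # exact-duplicate removal
        seen.add(digest)
        kept.append(text)
    return kept

raw = ["Patient presents with fever. " * 5, "ok", "Patient presents with fever. " * 5]
print(len(curate(raw)))  # 1: one duplicate and one too-short record removed
```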
Two Career Paths: Training vs. Optimization
The field naturally divides into two complementary specializations. Most professionals will focus on one path, though some knowledge of both remains valuable.
Path 1: Training and Validation
This path focuses on:
- Designing and selecting appropriate architectures
- Preparing training datasets
- Running experiments and comparing approaches
- Evaluating model performance
- Iterating on model design based on results
Path 2: Scaling and Optimization
This path emphasizes:
- GPU optimization and efficient compute utilization
- Kernel optimization for specific operations
- Inference optimization (reducing latency and cost)
- Distributed training across multiple machines
- Model quantization and compression (a minimal quantization sketch follows this list)
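As a taste of the optimization path, here is a minimal sketch of symmetric per-tensor int8 weight quantization. It is deliberately simplified; production stacks use per-channel scales, calibration data, and tooling such as torch.ao.quantization or bitsandbytes.

```python
# Minimal sketch of post-training 8-bit weight quantization for one tensor,
# assuming symmetric per-tensor scaling. Illustrative, not production code.
import torch

def quantize_int8(w: torch.Tensor):
    scale = w.abs().max() / 127.0                 # map max magnitude to int8 range
    q = torch.clamp((w / scale).round(), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    return q.float() * scale

w = torch.randn(4096, 4096)
q, scale = quantize_int8(w)
err = (w - dequantize(q, scale)).abs().mean()
print(f"mean abs error: {err:.5f} at 4x less memory than float32")
```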
You do not need to master both paths simultaneously. Choose one as your primary focus, then expand your knowledge into the other area as needed.
Applied AI vs. Applied ML: A Critical Distinction
Understanding the difference between Applied AI and Applied ML helps clarify career opportunities and skill requirements.
Applied AI: Owning Intelligence
Applied AI involves building systems that exhibit human-like cognitive capabilities:
- Understanding — comprehending user intent, context, and nuance
- Reasoning — performing logical analysis and multi-step problem-solving
- Generation — creating coherent text, images, or other content
- Interaction — engaging in dialogue and adapting to user needs
Examples include voice agents, chatbots, document synthesis systems, and creative AI tools. These systems aim to replicate or augment human intelligence for specific tasks.
Applied ML: Owning Algorithms
Applied ML focuses on developing algorithms for specific, well-defined tasks:
- Prediction — forecasting demand, prices, weather, or other quantities
- Classification — categorizing data into predefined groups
- Recommendation — suggesting relevant items to users
- Optimization — finding optimal solutions to constrained problems
- Anomaly Detection — identifying unusual patterns in data
These systems do not require human-level intelligence. Instead, they excel at specific algorithmic tasks through pattern recognition and mathematical optimization.
Applied AI owns intelligence. Applied ML owns algorithms. Both use similar underlying technologies (neural networks, transformers), but serve different purposes.
Real-World Applications of Applied ML
Let's examine concrete examples to illustrate Applied ML in practice:
Recommendation Systems
Companies like Spotify or Netflix use sophisticated neural networks to encode user preferences and content features into latent spaces. These systems do not "understand" music or movies in a cognitive sense—they identify statistical patterns in user behavior and content characteristics to make predictions about what users might enjoy.
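A minimal two-tower sketch illustrates the idea: users and items are embedded into a shared latent space, and recommendations are simply the highest dot-product scores. The feature dimensions and tower shapes below are invented for illustration; real systems train these towers on interaction logs and serve them with approximate nearest-neighbor indexes.

```python
# Illustrative two-tower retrieval sketch: score items by dot product
# between user and item embeddings. All shapes are made up.
import torch
import torch.nn as nn

d = 32
user_tower = nn.Sequential(nn.Linear(100, 64), nn.ReLU(), nn.Linear(64, d))
item_tower = nn.Sequential(nn.Linear(50, 64), nn.ReLU(), nn.Linear(64, d))

user_features = torch.randn(1, 100)       # e.g., listening-history features
item_features = torch.randn(1000, 50)     # e.g., content features per track

scores = user_tower(user_features) @ item_tower(item_features).T  # (1, 1000)
top_items = scores.topk(10).indices       # recommend the 10 best-scoring items
print(top_items.shape)
```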
Quantitative Finance
Algorithmic trading systems use machine learning for alpha signal generation—identifying market opportunities likely to outperform average returns. These models analyze market dynamics, technical indicators, and other features to generate trading signals. They require domain expertise but not human-like reasoning capabilities.
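As a toy illustration (not a trading strategy), here is a cross-sectional momentum signal computed from synthetic returns: z-score each asset's trailing return across the universe, then tilt long and short at the extremes.

```python
# Toy alpha-signal sketch: cross-sectional momentum over a synthetic
# (days x assets) return matrix. Purely illustrative; real signals add
# risk models, many features, and transaction-cost constraints.
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, size=(252, 50))       # one year, 50 assets

lookback = 20
momentum = returns[-lookback:].sum(axis=0)           # trailing 20-day return
signal = (momentum - momentum.mean()) / momentum.std()  # z-score across assets

# Long the strongest names, short the weakest (market-neutral tilt)
longs = np.argsort(signal)[-5:]
shorts = np.argsort(signal)[:5]
print(longs, shorts)
```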
Demand Forecasting
Retail and logistics companies use time-series models (including transformer-based architectures) to predict future demand. These systems analyze historical patterns, seasonality, and external factors to forecast quantities. The models are highly specialized but do not exhibit general intelligence.
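Before reaching for a transformer, forecasting work usually starts from simple baselines. The sketch below implements a seasonal-naive forecast on synthetic daily demand; any learned model must beat this kind of reference point to justify its complexity.

```python
# Seasonal-naive baseline for demand forecasting: predict next week from
# the same weekdays last week. Data here is synthetic, for illustration.
import numpy as np

rng = np.random.default_rng(1)
weeks = 12
demand = (100 + 20 * np.sin(2 * np.pi * np.arange(weeks * 7) / 7)
          + rng.normal(0, 5, weeks * 7))             # weekly seasonality + noise

history, holdout = demand[:-7], demand[-7:]
forecast = history[-7:]                              # repeat last observed week
mae = np.abs(forecast - holdout).mean()
print(f"seasonal-naive MAE: {mae:.2f}")
```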
Personal Example: Aquaculture
During my time at Captain Fresh, we used transformer models to analyze satellite imagery of aquaculture ponds. The goal was to predict harvest readiness and estimate biomass availability. This application combined computer vision with time-series analysis, demonstrating how modern architectures can solve highly specific problems outside traditional AI domains.
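One way such a pipeline can begin (a hedged sketch, not the production model) is ViT-style patch embedding: each satellite tile is split into fixed-size patches and projected into token embeddings that a transformer encoder can consume, with per-date tokens then stacked for time-series analysis.

```python
# Sketch of ViT-style patch embedding for one satellite tile. All shapes
# are illustrative assumptions, not the model described in the text.
import torch
import torch.nn as nn

patch, d_model = 16, 256
# Conv2d with stride == kernel size performs patchify + linear projection
to_tokens = nn.Conv2d(3, d_model, kernel_size=patch, stride=patch)

tile = torch.randn(1, 3, 224, 224)                   # one RGB satellite tile
tokens = to_tokens(tile).flatten(2).transpose(1, 2)  # (1, 196, 256)
print(tokens.shape)  # 14x14 = 196 patch tokens, ready for a transformer
```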
Educational Resources
To build expertise in applied AI and ML, I recommend focusing on three areas:
1. Machine Learning Fundamentals
Primary Resource: Kevin Murphy's Probabilistic Machine Learning
This comprehensive textbook covers modeling paradigms, generative models, decision-making frameworks, and inference techniques. It provides the mathematical foundation necessary for understanding modern AI systems.
Alternative: Stanford's CS229 Machine Learning course for a more practical introduction.
2. Modern Deep Learning Architectures
Primary Resource: Simon Prince's Understanding Deep Learning
This freely available book includes practical notebooks and covers modern architectures: transformers, diffusion models, autoencoders, energy-based models, and graph neural networks. It bridges theory and practice effectively.
Supplementary: Stanford's CS25 (Transformers United) and CMU's Advanced Deep Learning courses provide additional perspectives and cutting-edge research.
3. High-Performance Machine Learning (Optional)
If you choose the optimization path, study GPU programming, distributed training frameworks, and inference optimization techniques. Resources include:
- Stanford's CS149 (Parallel Computing)
- NVIDIA's CUDA programming guides
- PyTorch and JAX optimization documentation
Avoid relying solely on YouTube tutorials or interview preparation content. These resources rarely teach problem-first thinking or provide the depth necessary for applied work. Focus on comprehensive textbooks and university courses that build fundamental understanding.
Choosing Your Specialization
To avoid overwhelming yourself, consider specializing along several dimensions:
By Modality
- Text-only models (language understanding, generation)
- Vision-Language models (multimodal understanding)
- Audio models (speech, music, sound)
- Video models (temporal understanding, generation)
By Industry Vertical
- Healthcare (medical imaging, clinical decision support)
- Finance (trading, risk assessment, fraud detection)
- Life Sciences (drug discovery, protein prediction)
- Creative Industries (music, art, design)
- Enterprise Software (productivity, automation)
By Application Type
- Interactive systems (agents, chatbots)
- Analytical systems (forecasting, classification)
- Generative systems (content creation, synthesis)
- Optimization systems (resource allocation, planning)
Start with one combination (e.g., "text models for healthcare applications") and build deep expertise before expanding to adjacent areas.
Key Takeaways
- Applied AI is not limited to agents — vertical adoption through domain-specific models represents a major opportunity
- Companies build custom models for three reasons — IP ownership, compliance requirements, and competitive advantage
- Applied research differs from fundamental research — most work involves adapting architectures rather than inventing new ones
- Two career paths exist — training/validation specialists and scaling/optimization engineers
- Applied AI vs. Applied ML — the former builds intelligence, the latter builds algorithms for specific tasks
- Focus on fundamentals — study Kevin Murphy's and Simon Prince's textbooks rather than tutorial content
- PEFT techniques are important — LoRA and QLoRA enable efficient fine-tuning with limited resources
- Specialize strategically — choose a modality, industry, and application type to focus your learning
- Applied ML remains essential — forecasting, recommendations, and optimization still require specialized algorithms
- Build problem-solving skills — prioritize deep understanding over interview preparation
Conclusion
This article explored the second major dimension of applied AI: building and deploying domain-specific models. While Part 1 focused on agentic systems and orchestration, Part 2 emphasized model development, training methodologies, and the distinction between applied AI and applied ML.
The field offers multiple entry points depending on your interests and background. Whether you focus on training new models, optimizing existing ones, building agents, or developing specialized algorithms, the fundamental requirement remains the same: deep understanding of underlying principles combined with practical problem-solving skills.
In future installments of this series, we will explore additional dimensions of the applied AI landscape. Stay tuned for further guidance on navigating this rapidly evolving field.