Yi Xiang

Senior Applied Scientist · Amazon Bedrock

I research the modeling methods that enable foundation models and AI agents to represent, retrieve, and retain information, and translate those methods into production systems. I care about research that holds up under real product constraints, where model quality, latency, and rigorous evaluation matter from the start.

At Amazon Bedrock, I develop representation-learning methods, contrastive-learning objectives, benchmarks, and evaluation frameworks for foundation-model systems. My work has contributed to Amazon Titan Text Embeddings, multimodal retrieval systems over video, audio, and text, and agent systems with dynamic retrieval, context management, and self-evaluation. This work has led to production launches and an ICLR 2025 publication on embedding compression.

I see intelligent systems not as static generators, but as systems that close the loop between knowledge and experience: retrieving relevant knowledge through embedding models, grounding reasoning in retrieved information through RAG, taking action through agents, and learning from the outcomes of those actions through deployment-time continual learning with online reinforcement learning.

Career

2023 - Present

Research-to-product at Bedrock

Senior Applied Scientist, Amazon Bedrock

Build the science behind Bedrock embedding models, retrieval systems, and agent memory. Core contributor to Titan Text Embeddings, multimodal RAG, MultiKB Agentic RAG, embedding model customization, and stateful context management for agents. Work across the full path from research idea to deployed model behavior: large-scale distributed training and inference, evaluation, production code, engineering integration, and product launch.

2020 - 2023

Forward-deployed applied science

Applied Scientist, AWS Machine Learning Solutions Lab

Translated ambiguous enterprise problems into 0-to-1 production ML systems across healthcare, gaming, sports, finance, and recruiting. Trained models from scratch, including transformers, deep recommenders, coordinate-regression networks, and variational autoencoders. Delivered 10+ systems generating $20M+ ARR and seven public customer references. This phase gave me scientific breadth and the operating discipline to turn research ideas into deployed systems.

2018 - 2020

ML systems foundation

Data Scientist, Invesco Asset Management

Built deep learning and NLP models for forecasting, recommendation, and segmentation. Designed petabyte-scale ETL pipelines and production data quality systems. This phase built the systems-engineering foundation for my later applied science work.

What I've Built

Selected research-to-product systems

Amazon Titan Text Embeddings

Multilingual embedding model covering 100+ languages. Contributed to Titan Text Embeddings V2, including multilingual capabilities that improved 60% over V1. At release, Titan Text Embeddings V2 outperformed then-current OpenAI embedding models on MTEB retrieval tasks and Cohere Multilingual V2 on MIRACL. GA announcement: Amazon Titan Text Embeddings V2 now available in Amazon Bedrock

Multimodal RAG

Video and audio retrieval using text-auxiliary and parser-free multimodal embeddings. 20% performance gain on action-based datasets. Five benchmark datasets created. GA announcement: Multimodal retrieval for Bedrock Knowledge Bases now generally available

MultiKB Agentic RAG

RAG agent that answers complex queries across multiple vector stores with structured and unstructured data. 94% improvement in recall@5 over plain RAG through dynamic query planning, context management, and self-evaluation.

Embedding Model Customization

Platform enabling customers to adapt embedding models for domain-specific retrieval from raw or labeled data. 25.7% average performance improvement across nine retrieval datasets. Includes a custom false-negative-aware contrastive loss that delivered a 51% improvement for Bedrock Guardrails.

Research Areas

Memory as infrastructure. Retrieval as cognition. Agents as action. Continual learning as the consolidation of experience.

Representation learning and embedding models

Foundational embedding models as the representational substrate for retrieval, semantic search, RAG, recommendation, clustering, and domain adaptation, with work on contrastive training, post-training compression, quantization, batch-quality construction, synthetic data generation, and evaluation.

Retrieval-augmented generation

Information-enhanced LLM systems that bring external knowledge into generation, from text-only RAG to multimodal RAG over video, audio, and text.

Agentic retrieval and context management

Agents need mechanisms for accessing external knowledge and maintaining working memory over time. My work in this area includes MultiKB Agentic RAG, where an agent performs retrieval across multiple structured and unstructured knowledge bases, and stateful context management, where agents decide what information to retain, update, and reuse across tasks.

Deployment-time continual learning

Learning from interaction after deployment through online reinforcement learning, with a focus on how feedback, outcomes, and agent trajectories can be consolidated into durable model capability.

Embodied AI and world action models

An emerging research direction focused on robotic agents, data collection, evaluation, and world action models: models that predict how physical environments evolve in response to actions and support more capable robot policies.

Selected Publications

2025
Effective post-training embedding compression via temperature control in contrastive training. ICLR Spotlight.
2023
Towards building a robust toxicity predictor. ACL.
2023
A Coordinate-Regression-Based Deep Learning Model for Catheter Detection during Structural Heart Interventions. Applied Sciences.

Technical Writing

2023
Building an NLP-based job recommender at Talent.com with Amazon SageMaker. AWS Machine Learning Blog.
2023
Streamlining ETL data processing at Talent.com with Amazon SageMaker. AWS Machine Learning Blog.
2022
Build a robust text-based toxicity predictor. AWS Machine Learning Blog.
2021
Deploy variational autoencoders for anomaly detection with TensorFlow Serving on Amazon SageMaker. AWS Machine Learning Blog.

Selected Talks

2023
Automatic Prompt Optimization. AWS NLP Summit.
2022
Robust and fast detection of toxic speech content via machine learning. ICML Expo Demonstration.
2022
Content moderation talks and tutorial sessions at ACVC and AMLC.

Education

Master of Mathematics in Finance, Columbia University, 2016-2018.
B.Ec. in Financial Engineering, University of Science and Technology Beijing, 2012-2016.

Contact

For research, speaking, or professional inquiries, email elaine.yi.xiang@gmail.com.