Yi Xiang

Senior Applied Scientist · Amazon Bedrock

I research how models and AI agents represent, retrieve, and retain information, and I build that research into products. I care about research that holds up under product constraints, where quality, latency, and evaluation matter from the start.

My current work focuses on retrieval and memory for foundation-model applications: embedding models, RAG, multimodal retrieval, and stateful context management for agents. I frame memory as infrastructure, retrieval as agent cognition, evaluation as a product constraint, and multimodal context as the interface between models and real-world agents.

Career

2023 - Present

Research-to-product at Bedrock

Senior Applied Scientist, Amazon Bedrock

Build the science behind Bedrock embedding models, retrieval systems, and agent memory. Core contributor to Titan Text Embeddings, multimodal RAG, MultiKB Agentic RAG, embedding model customization, and stateful context management for agents. Work across the full path from research idea to deployed model behavior: large-scale distributed training and inference, evaluation, production code, engineering integration, and product launch.

2020 - 2023

Forward-deployed applied science

Applied Scientist, AWS Machine Learning Solutions Lab

Translated ambiguous enterprise problems into 0-to-1 production ML systems across healthcare, gaming, sports, finance, and recruiting. Trained models from scratch including transformers, deep recommenders, coordinate-regression networks, and variational autoencoders. Delivered 10+ systems generating $20M+ ARR and seven public customer references. This phase gave me scientific breadth and the operating discipline to turn research ideas into deployed systems.

2018 - 2020

ML systems foundation

Data Scientist, Invesco Asset Management

Built deep learning and NLP models for forecasting, recommendation, and segmentation. Designed petabyte-scale ETL pipelines and production data quality systems. This phase built the systems engineering foundation behind my later applied science work.

What I've Built

Selected research-to-product systems

Amazon Titan Text Embeddings

Multilingual embedding model covering 100+ languages. Contributed to Titan Text Embeddings V2, including multilingual capabilities that improved 60% over V1. At release, Titan Text Embeddings V2 outperformed then-current OpenAI embedding models on retrieval tasks in MTEB and Cohere Multilingual V2 on MIRACL. GA announcement: Amazon Titan Text Embeddings V2 now available in Amazon Bedrock

MultiKB Agentic RAG

RAG agent that answers complex queries across multiple vector stores with structured and unstructured data. 94% improvement in recall@5 over plain RAG through dynamic query planning, context management, and self-evaluation.

Embedding Model Customization

Platform enabling customers to adapt embedding models for domain-specific retrieval from raw or labeled data. 25.7% average performance improvement across nine retrieval datasets. Includes a customized false-negative-aware contrastive loss that delivered a 51% improvement for Bedrock Guardrails.

Research Areas

Representation learning and embedding models

Foundational embedding models as the representational substrate for retrieval, semantic search, RAG, recommendation, clustering, and domain adaptation, with work on contrastive training, post-training compression, quantization, batch-quality construction, synthetic data generation, and evaluation.

Retrieval-augmented generation

Information-enhanced LLM systems that bring external knowledge into generation, from text-only RAG to multimodal RAG over video, audio, and text.

Agentic retrieval and context management

Agents need mechanisms for accessing external knowledge and maintaining working memory over time. My work in this area includes MultiKB Agentic RAG, where an agent performs retrieval across multiple structured and unstructured knowledge bases, and stateful context management, where agents decide what information to retain, update, and reuse across tasks.

Selected Publications

  1. 2025
    Effective post-training embedding compression via temperature control in contrastive training. ICLR.
  2. 2023
    Towards building a robust toxicity predictor. ACL.
  3. 2023
    A Coordinate-Regression-Based Deep Learning Model for Catheter Detection during Structural Heart Interventions. Applied Sciences.
  4. 2023
    Building an NLP-based job recommender at Talent.com with Amazon SageMaker. AWS Machine Learning Blog.
  5. 2023
    Streamlining ETL data processing at Talent.com with Amazon SageMaker. AWS Machine Learning Blog.
  6. 2022
    Build a robust text-based toxicity predictor. AWS Machine Learning Blog.
  7. 2021
    Deploy variational autoencoders for anomaly detection with TensorFlow Serving on Amazon SageMaker. AWS Machine Learning Blog.

Selected Talks

Education

Contact

For research, speaking, or professional inquiries, email elaine.yi.xiang@gmail.com.