diff --git a/README.md b/README.md
index bebb039..ebe5d31 100644
--- a/README.md
+++ b/README.md
@@ -4,11 +4,19 @@
 In the realm of object manipulation, human engagement typically manifests through a constrained array of discrete maneuvers. This interaction can often be characterized by a handful of low-dimensional latent actions, such as opening and closing a drawer. Note that such interactions may differ across object types, but the interaction modes themselves, such as opening and closing, remain discrete. In this paper, we explore how a learned prior can emulate this limited repertoire of interactions and whether such a prior can be learned from unsupervised play data. We take a perspective that decomposes the policy into two distinct components: a skill selector and a low-level action predictor, where the skill selector operates within a discretely structured latent space.
-We introduce ActAIM2, which given an RGBD image of an articulated object and a robot, identifies meaningful interaction modes like opening drawer and closing drawer. ActAIM2 represents the interaction modes as discrete clusters of embedding. ActAIM2 then trains a policy that takes cluster embedding as input and produces control actions for the corresponding interactions.
+We introduce **ActAIM2**, which, given an RGBD image of an articulated object and a robot, identifies meaningful interaction modes such as opening and closing a drawer. ActAIM2 represents the interaction modes as discrete clusters of embeddings and trains a policy that takes a cluster embedding as input and produces the control actions for the corresponding interaction.
 teaser_3
+## Problem Formulation
+
+
+ $$\mathbb{P}(a|o) = \int \underbrace{\mathbb{P}(a|o,\epsilon)}_{\text{action predictor}}~ \underbrace{\mathbb{P}(\epsilon|o)}_{\text{mode selector}} d\epsilon $$
+
+where $o$ is the observation, $a$ is the control action, and $\epsilon$ is the latent interaction mode. A minimal code sketch of this factorization is shown below.
+
+
 ### Sample Object Data Collection
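+Below is a minimal, hypothetical sketch of the factorization above, assuming the RGBD observation has already been encoded into a flat feature vector. The module names, layer sizes, and the PyTorch implementation are illustrative assumptions, not ActAIM2's actual architecture:
+
+```python
+import torch
+import torch.nn as nn
+
+
+class ModeSelector(nn.Module):
+    """Mode selector P(eps | o): a categorical distribution over K discrete
+    interaction modes, each represented by a learned cluster embedding."""
+
+    def __init__(self, obs_dim: int, num_modes: int, embed_dim: int):
+        super().__init__()
+        self.mode_embeddings = nn.Embedding(num_modes, embed_dim)  # discrete clusters
+        self.scorer = nn.Linear(obs_dim, num_modes)  # logits for P(eps | o)
+
+    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
+        return torch.distributions.Categorical(logits=self.scorer(obs))
+
+
+class ActionPredictor(nn.Module):
+    """Action predictor P(a | o, eps): maps observation features plus a mode
+    embedding to a control action."""
+
+    def __init__(self, obs_dim: int, embed_dim: int, act_dim: int):
+        super().__init__()
+        self.net = nn.Sequential(
+            nn.Linear(obs_dim + embed_dim, 256),
+            nn.ReLU(),
+            nn.Linear(256, act_dim),
+        )
+
+    def forward(self, obs: torch.Tensor, mode_embed: torch.Tensor) -> torch.Tensor:
+        return self.net(torch.cat([obs, mode_embed], dim=-1))
+
+
+def sample_action(obs, selector, predictor):
+    """Draw a ~ P(a | o) by sampling a mode first, then predicting an action."""
+    mode = selector(obs).sample()           # eps ~ P(eps | o)
+    embed = selector.mode_embeddings(mode)  # look up the cluster embedding
+    return predictor(obs, embed)            # a = f(o, eps)
+
+
+# Example: a batch of 2 encoded observations, 6 discrete modes, 7-DoF actions.
+selector = ModeSelector(obs_dim=128, num_modes=6, embed_dim=32)
+predictor = ActionPredictor(obs_dim=128, embed_dim=32, act_dim=7)
+action = sample_action(torch.randn(2, 128), selector, predictor)  # shape (2, 7)
+```
+
+Sampling different modes from the same observation yields different discrete interactions, mirroring the mode-selector / action-predictor split in the equation above.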