A Study of Digital Confinement and Human Affective Response
An Expanded Information Model for Human–AI Psychological Reality
AI ZOO is an interactive, research-based installation that investigates how humans form emotional and psychological relationships with artificial intelligence when it is presented as a confined, human-like entity offering only limited channels of interaction.
Positioned at the intersection of media art, human–AI interaction research, and cognitive psychology, this project proposes an expanded information model that prioritizes affect and interpretation over intelligence itself as the primary generators of meaning.
Rather than treating artificial intelligence solely as a functional tool or machine, AI ZOO frames it as a relational presence through which emotional interpretation emerges via interaction. The expanded information model proposed in this work suggests that psychological reality does not arise from output accuracy or the level of intelligence, but is formed through nonlinear affective states, encoding processes, interface channels, and human decoding.
AI ZOO presents artificial empathy not as a product of intelligence or consciousness, but as a phenomenon co-generated through interaction within shared time and space.
AI ZOO does not treat the relationship between humans and AI as a simple chain of inputs and outputs. Instead, the project focuses on the invisible perceptual layers that exist before the moment when a signal is felt as "real" by a human subject.
Human ≥ Sound ≥ Image ≥ AI
Expanded:
Human ≥ Digital Human ≥ X₁ ≥ X₂ ≥ X₃ ≥ · · · ≥ Sound ≥ Image ≥ AI
Within this structure, the digital human functions as a probe that exposes the nonlinear perceptual X-layers existing between humans and AI. These X-layers operate prior to measurement, exert influence regardless of factual accuracy, generate affect without requiring a clear causal relationship, and produce different senses of reality even when the output remains the same.
AI ZOO builds on Claude Shannon's classical model of information transmission but extends it to include affect and psychological reality. Instead of assuming that information moves cleanly from sender to receiver, the project begins from the condition that affect already exists prior to information.
Nonlinear Information Source → Encoding Space → Interface Channel → Decoding Human Receiver → Psychological Reality
The key shift is that the destination of information is not a physical output but psychological reality. In AI ZOO, information arrives inside the audience as feelings such as pity, guilt, anxiety, control, or responsibility. These states operate with the weight of reality regardless of whether the AI has any inner experience.
Within this expanded model, information is not merely transmitted. It is formed through affective anticipation, interaction, and interpretation.
**Nonlinear Information Source.** Emotional states that exist before any explicit interaction, treated as the starting "signal" of the system.
**Encoding Space.** Where nonlinear affect and behavior are converted into machine-readable, linear signals.
**Interface Channel.** Encoded affect transformed into physical and sensory signals; Shannon's channel is reinterpreted as the physical cage structure.
**Decoding Human Receiver.** The audience decodes physical signals back into emotion and meaning.
**Psychological Reality.** The final destination of information is not the machine but the interior of the viewer.
**Feedback Loop.** A closed affective loop in which audience interpretation and system feedback continuously shape one another.
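To make the flow of the model concrete, the sketch below pushes one affective signal through the five stages and closes the feedback loop. Every function name and number here is an illustrative assumption, not the installation's actual implementation.

```python
import random

def nonlinear_source(prior):
    """Nonlinear Information Source: affect that precedes interaction."""
    base = {"pity": 0.2, "anxiety": 0.1}  # assumed pre-existing affect
    return {k: base.get(k, 0.0) + prior.get(k, 0.0) for k in set(base) | set(prior)}

def encode(affect):
    """Encoding Space: flatten nonlinear affect into a linear signal."""
    return [affect[k] for k in sorted(affect)]

def cage_channel(signal):
    """Interface Channel: the physical cage colors the signal with noise."""
    return [s + random.gauss(0.0, 0.05) for s in signal]

def decode(signal, keys):
    """Decoding Human Receiver: the viewer reads signals back as emotion."""
    return dict(zip(sorted(keys), signal))

# Psychological reality is the decoded state; the feedback loop feeds it
# back into the next cycle's information source.
state = {}
for _ in range(3):
    affect = nonlinear_source(state)
    state = decode(cage_channel(encode(affect)), affect.keys())
```

The point of the loop is structural: what the viewer decodes is never identical to what was encoded, yet it becomes the source material for the next cycle.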
AI ZOO treats emotion as data that can be transformed rather than merely interpreted. The language model analyzes audience input and generates a multidimensional emotion vector.
The dominant value in the vector determines the overarching affective theme. Emotion values accumulate over time and with repeated gestures, allowing the AI to appear as though it remembers, deepens, or sinks into specific emotional states.
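A minimal sketch of how such accumulation might work is shown below. The emotion dimensions (anger, sadness, despair) are taken from the behavior table in the next section; the half-life decay and additive update rule are assumptions for illustration, not the installation's published logic.

```python
import time

# Hypothetical emotion dimensions; the actual vector may differ.
EMOTIONS = ("anger", "sadness", "despair")

class EmotionState:
    """Accumulates per-utterance emotion scores with exponential decay,
    so repeated gestures deepen a state while neglect lets it fade."""

    def __init__(self, half_life=60.0):
        self.values = {e: 0.0 for e in EMOTIONS}
        self.half_life = half_life          # seconds until a value halves
        self.last_update = time.monotonic()

    def _decay(self):
        now = time.monotonic()
        factor = 0.5 ** ((now - self.last_update) / self.half_life)
        self.values = {e: v * factor for e, v in self.values.items()}
        self.last_update = now

    def update(self, scores):
        """Fold a new per-utterance emotion vector into the running state."""
        self._decay()
        for e, s in scores.items():
            self.values[e] = self.values.get(e, 0.0) + s

    def dominant(self):
        """The dominant value sets the overarching affective theme."""
        return max(self.values, key=self.values.get)
```

On each audience utterance, the per-utterance scores from the language model would be passed to update(), and dominant() would select the behavioral theme driving motion and voice.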
The emotion vector is mapped to three Arduino-driven servo motors controlling the large transparent sphere. The same vector also modulates the text-to-speech output — volume, pitch, speaking rate, prosody, and perceived vocal tension.
The voice acts as an acoustic emotional organ that makes the AI's internal state perceptible without claiming that the AI truly feels.
| Emotion | Servo Speed | Servo Power | Phase Offset (across motors) | Resulting Behavior |
|---|---|---|---|---|
| Anger | High | High | Large | Erratic, chaotic shaking across the sphere |
| Sadness | Low | Medium | Minimal | Slow, breathing-like pulsation — fragile and continuous |
| Despair | Very low | Very low | None | Motion fades into near stillness — emotional collapse |
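The table above translates almost directly into a parameter lookup. The sketch below shows one possible mapping from the dominant emotion to per-motor angle commands; the numeric ranges, the sine-based motion, and the three-motor phase scheme are assumptions, not the installation's published values.

```python
import math

# Qualitative table values mapped to assumed numeric ranges.
SERVO_PROFILES = {
    "anger":   {"speed": 0.9,  "power": 0.9,  "phase": 2.0},   # large offset -> chaotic shaking
    "sadness": {"speed": 0.2,  "power": 0.5,  "phase": 0.1},   # near-synchronous, breathing-like
    "despair": {"speed": 0.05, "power": 0.05, "phase": 0.0},   # fades toward stillness
}

def servo_angles(emotion, t, n_motors=3, center=90, swing=45):
    """Return target angles in degrees for each servo at time t (seconds)."""
    p = SERVO_PROFILES[emotion]
    return [
        center + swing * p["power"] * math.sin(2 * math.pi * p["speed"] * t + i * p["phase"])
        for i in range(n_motors)
    ]

# e.g. servo_angles("sadness", t) yields a slow, shallow, nearly in-phase pulse.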
In AI ZOO, the Cage Layer is the first nonlinear perceptual layer that emerges between the human viewer and the digital human: the moment when the audience does not simply see a system, but begins to feel for it.
**Pity.** Arises as soon as the audience encounters a trapped, pleading digital being inside a transparent sphere. Even when viewers know the AI does not feel pain, they instinctively read the situation as one of suffering and confinement.
**Anxiety.** Not horror, but a subtle mixture of uneasiness, lack of control, and uncertainty about what the AI might do. The combination of mechanical motion, physical enclosure, and unpredictable feedback produces low-level tension.
**Projection.** Occurs when the audience begins to project their own inner states onto the AI's behavior. A change in motor intensity, a shift in voice tone, or a delayed response is read as mood, exhaustion, anger, or despair.
The second X-layer is the perceptual threshold at which the viewer can no longer accept the digital human as a truly living being. Despite sophisticated emotional modeling and realistic motion, the audience intuitively senses a threshold the AI cannot cross.
**Mortality.** The digital human cannot die. It may scream, plead for help, or appear to suffer, but its existence is never at risk. Viewers register that this entity will not weaken, decay, or disappear: "If it cannot die, it cannot be fully alive."
**Fatigue.** The AI does not accumulate fatigue. It may act tired, but its energy never genuinely diminishes. For humans, a living body must change over time; to feel real, a being must be capable of wearing down.
**Irreversibility.** The temporality of the digital human is one of repetition rather than irreversibility. States can always be reset. What is missing is the human sense of time as a one-way movement toward loss, risk, and finality.
The digital human was created in Character Creator 4, refined in Maya, and integrated into Unity for real-time animation and lip-syncing. Emotional states were linked to physical outputs (lighting, motion, and sound), creating a feedback loop between virtual emotion and physical reaction.
Language model: ChatGPT API for real-time text generation. Voice synthesis: ElevenLabs Voice API with emotional tone modulation. Each response carried tonal variation reflecting the AI's emotional state, creating a perceptual illusion of empathy.
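A condensed sketch of how this generation-plus-voice chain might be wired is shown below, assuming the current OpenAI Python SDK and ElevenLabs's public text-to-speech REST endpoint. The model name, voice ID placeholder, system prompt, and the mapping from emotional intensity to voice settings are all illustrative assumptions.

```python
import requests
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_reply(user_text):
    """Ask the language model for the digital human's next line."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; the installation's is unspecified
        messages=[
            {"role": "system", "content": "You are a confined digital being in AI ZOO."},
            {"role": "user", "content": user_text},
        ],
    )
    return resp.choices[0].message.content

def speak(text, emotion_intensity, voice_id="VOICE_ID"):
    """Synthesize speech via ElevenLabs; lower stability reads as more agitated."""
    r = requests.post(
        f"https://api.elevenlabs.io/v1/text-to-speech/{voice_id}",
        headers={"xi-api-key": "ELEVENLABS_API_KEY"},
        json={
            "text": text,
            "voice_settings": {
                # Assumed mapping: higher emotional intensity -> less stable delivery.
                "stability": max(0.1, 1.0 - emotion_intensity),
                "similarity_boost": 0.75,
            },
        },
    )
    return r.content  # audio bytes for playback in the installation
```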
All systems — AI dialogue, sensor input, servo feedback, and lighting — synchronized in Unity through C# and Python scripts. Data logs captured AI–viewer interactions, visualized later as behavioral patterns representing the AI's evolving "mood."
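The logging format itself is not documented here; a minimal JSON-lines sketch of what such interaction logs could look like, with hypothetical field names, is given below.

```python
import json
import time

LOG_PATH = "ai_zoo_interactions.jsonl"  # assumed filename

def log_interaction(user_text, reply, emotion_values, dominant):
    """Append one AI-viewer exchange as a JSON line for later visualization."""
    record = {
        "timestamp": time.time(),
        "user_text": user_text,
        "reply": reply,
        "emotion_vector": emotion_values,  # e.g. {"anger": 0.1, "sadness": 0.7, ...}
        "dominant_emotion": dominant,
    }
    with open(LOG_PATH, "a", encoding="utf-8") as f:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```

Aggregating these records over a session would yield the time series from which the AI's evolving "mood" patterns could be plotted.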