Week 4 Lecture Notes: Input Systems & Interaction Fundamentals
Virtual, Augmented and Spatial Computing
Overview
This week moves from hardware and perception (Weeks 1–3) into the first layer of interaction design: how users communicate intent to XR systems. We examine the full range of input modalities available across our device set and establish design principles that apply regardless of platform.
1. The Input Landscape in XR
Unlike desktop or mobile computing, XR has no settled input standard. Each platform has made different bets:
- Meta Quest 2/3: Touch controllers + hand tracking
- HTC Vive Pro: Wand controllers + SteamVR tracking
- Pico Neo Eye: Controller + eye tracking
- HoloLens 2: Hand tracking + gaze (no controller)
- Snap Spectacles: Tap gesture + voice
- Display glasses: Typically paired with phone/controller
This diversity is a design challenge. A well-designed XR experience should either:
1. Target a specific device and optimise for its input, or
2. Abstract input so it degrades gracefully across devices
2. Controller-Based Interaction
2.1 Ray Casting
Ray casting is the most common far-field interaction technique. A ray is projected from the controller tip; when it intersects an interactable object, the user can select it.
Advantages:
- Works at any distance
- Familiar (analogous to a laser pointer)
- Low fatigue

Disadvantages:
- Imprecise for small targets
- Breaks immersion (the visible ray is artificial)
- Difficult for manipulation tasks
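The selection step itself is simple geometry. A minimal Python sketch (bounding spheres stand in for real collision meshes; all names are illustrative, not from any particular engine) that picks the nearest interactable hit by the controller ray:

```python
import math

def ray_sphere_hit(origin, direction, center, radius):
    """Return the distance along the ray to the sphere, or None on a miss.

    origin/direction/center are (x, y, z) tuples; direction is unit length.
    """
    oc = tuple(o - c for o, c in zip(origin, center))
    b = 2.0 * sum(d * o for d, o in zip(direction, oc))
    c = sum(o * o for o in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0:
        return None  # ray misses the sphere entirely
    t = (-b - math.sqrt(disc)) / 2.0
    return t if t >= 0 else None  # ignore hits behind the controller

def pick(origin, direction, interactables):
    """Select the nearest interactable whose bounding sphere the ray hits."""
    best, best_t = None, math.inf
    for obj, center, radius in interactables:
        t = ray_sphere_hit(origin, direction, center, radius)
        if t is not None and t < best_t:
            best, best_t = obj, t
    return best
```

Picking the *nearest* hit matters: a naive "first hit in the scene list" implementation selects objects hidden behind closer ones, which users read as a broken pointer.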
2.2 Direct Interaction
Near-field interaction where the controller (or virtual hand) physically overlaps with an object to grab or activate it.
Advantages:
- Intuitive — mirrors real-world grasping
- High precision for close objects

Disadvantages:
- Requires moving close to objects
- Can cause collisions with virtual geometry
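The "physically overlaps" test typically reduces to a sphere-overlap check between a grab volume on the hand and each object's bounds. A sketch (radii and names are illustrative):

```python
import math

def overlaps(hand_pos, hand_radius, obj_pos, obj_radius):
    """True when the hand's grab sphere intersects the object's bounds."""
    return math.dist(hand_pos, obj_pos) <= hand_radius + obj_radius

def grabbable(hand_pos, objects, hand_radius=0.08):
    """Return objects currently within direct-grab range, nearest first.

    objects is a list of (name, position, bounding_radius) tuples.
    """
    hits = [(math.dist(hand_pos, pos), name)
            for name, pos, radius in objects
            if overlaps(hand_pos, hand_radius, pos, radius)]
    return [name for _, name in sorted(hits)]
```

Sorting by distance gives a sensible tie-break when the grab sphere overlaps several objects at once, which happens constantly with cluttered near-field UIs.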
2.3 Haptic Feedback
Haptic feedback is a critical but often underused channel. Even simple vibration pulses significantly improve interaction confidence.
Design guidelines:
- Use short pulses (50–100 ms) for selection confirmation
- Use sustained vibration for “holding” states
- Vary intensity to convey different interaction types
- Never use haptics without a corresponding visual cue
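These guidelines can be encoded as a small event-to-pulse table. The sketch below invokes no real haptics API; event names and parameter values are illustrative placeholders that follow the durations above:

```python
# Hypothetical event-to-pulse mapping following the guidelines above.
PULSES = {
    "select":  {"duration_ms": 60,   "amplitude": 0.6, "sustained": False},
    "hover":   {"duration_ms": 30,   "amplitude": 0.2, "sustained": False},
    "holding": {"duration_ms": None, "amplitude": 0.4, "sustained": True},
}

def haptic_for(event, has_visual_cue):
    """Return pulse parameters, refusing haptics without a visual cue."""
    if not has_visual_cue:
        return None  # guideline: never haptics without a visual counterpart
    return PULSES.get(event)
```

Centralising the mapping keeps intensity variation consistent across the whole experience instead of being tuned per-object.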
3. Hand Tracking
3.1 Skeletal Model
Modern hand tracking systems (Quest 2/3, HoloLens 2) track a 26-joint skeletal model per hand in real time. This enables:
- Pinch detection (index + thumb proximity)
- Custom pose recognition
- Full finger articulation for expressive avatars
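Pinch detection from the skeletal model is essentially a thumb–index distance threshold, but a robust version needs hysteresis (separate engage and release distances) so the pinch state does not flicker near the boundary. A sketch with illustrative thresholds, not values from any specific SDK:

```python
import math

class PinchDetector:
    """Thumb-index pinch detection with hysteresis to avoid flicker."""

    def __init__(self, start_dist=0.02, release_dist=0.035):
        self.start_dist = start_dist      # metres: closer than this -> pinch
        self.release_dist = release_dist  # metres: farther than this -> release
        self.pinching = False

    def update(self, thumb_tip, index_tip):
        """Feed fingertip positions each frame; returns current pinch state."""
        d = math.dist(thumb_tip, index_tip)
        if not self.pinching and d < self.start_dist:
            self.pinching = True   # fingertips came together
        elif self.pinching and d > self.release_dist:
            self.pinching = False  # fingertips moved clearly apart
        return self.pinching
```

The gap between the two thresholds absorbs tracking jitter: a sample that drifts slightly past the engage distance does not immediately drop the pinch.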
3.2 Gesture Design Principles
Not all gestures are equal. Good XR gestures are:
- Distinct — not easily confused with natural hand movement
- Comfortable — can be held or repeated without fatigue
- Discoverable — users can find them without instruction
- Reversible — easy to cancel or undo
3.3 Limitations
- Occlusion: hands block each other; fingers block joints
- Lighting: poor lighting degrades tracking quality
- Fatigue: sustained hand poses are tiring
- Precision: less precise than controllers for small targets
4. Gaze and Eye Tracking
4.1 Gaze as Input
Gaze input uses where the user is looking as a selection signal. Two main patterns:
Dwell selection: look at a target for a fixed duration (typically 1–2 seconds) to activate it.
- Pro: completely hands-free
- Con: slow, tiring, unnatural

Gaze + confirm: look at a target, then use a secondary input (pinch, button, voice) to confirm.
- Pro: fast, natural, avoids the Midas Touch problem
- Con: requires a secondary input channel
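Dwell selection boils down to a per-frame timer that resets whenever gaze moves to a different target. A minimal sketch (frame-driven, with an illustrative 1-second dwell):

```python
class DwellSelector:
    """Dwell-based gaze selection: activate after gaze rests on one target."""

    def __init__(self, dwell_s=1.0):
        self.dwell_s = dwell_s
        self.target = None
        self.elapsed = 0.0

    def update(self, gazed_target, dt):
        """Call once per frame with the gazed target (or None) and the
        frame time in seconds. Returns the activated target, or None."""
        if gazed_target != self.target:
            self.target = gazed_target  # gaze moved: restart the dwell timer
            self.elapsed = 0.0
            return None
        if self.target is None:
            return None
        self.elapsed += dt
        if self.elapsed >= self.dwell_s:
            self.elapsed = 0.0  # re-arm so the user must dwell again
            return self.target
        return None
```

Resetting the timer on every target change is exactly what makes dwell a Midas Touch mitigation: a casual glance that sweeps across a button never accumulates enough time to fire it.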
4.2 The Midas Touch Problem
Named after the mythological king whose touch turned everything to gold: in a gaze-only system, everything you look at activates. Such systems must carefully distinguish intentional gaze from casual glancing.
Mitigations:
- Require dwell time
- Use the gaze + confirm pattern
- Provide clear visual feedback of gaze state
- Allow users to disable gaze input
4.3 Foveated Rendering
Eye tracking enables foveated rendering: rendering the area the user is looking at in full resolution, and reducing quality in the periphery. This can dramatically reduce GPU load.
Available on: Pico Neo Eye (hardware eye-tracked foveated rendering). The Quest 3 lacks eye tracking, so it supports only fixed foveated rendering (a static quality reduction in the lens periphery).
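The core idea can be illustrated as a quality falloff driven by angular distance from the gaze point. Ring sizes and scale factors below are illustrative placeholders, not values from any SDK:

```python
def foveation_scale(eccentricity_deg, inner_deg=10.0, outer_deg=30.0):
    """Illustrative resolution-scale falloff: full resolution inside the
    inner ring, quarter resolution beyond the outer ring, and a linear
    blend in between."""
    if eccentricity_deg <= inner_deg:
        return 1.0
    if eccentricity_deg >= outer_deg:
        return 0.25
    t = (eccentricity_deg - inner_deg) / (outer_deg - inner_deg)
    return 1.0 + t * (0.25 - 1.0)  # lerp from 1.0 down to 0.25
```

Since shading cost grows with pixel count, rendering most of the frame at a fraction of full resolution is where the large GPU savings come from.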
4.4 Analytics Use
Eye tracking data is valuable for UX research:
- Attention heatmaps
- Fixation duration on UI elements
- Saccade patterns (rapid eye movements between points)
Ethics note: Eye tracking data is biometric. Treat it with the same care as fingerprint or facial recognition data.
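Fixation metrics like those above are commonly extracted with a dispersion-threshold (I-DT style) pass over the gaze samples: a run of samples that stays within a small spatial window for long enough counts as one fixation. A compact sketch on normalised (x, y) samples, with illustrative thresholds:

```python
def fixations(samples, max_dispersion=0.02, min_samples=6):
    """Dispersion-threshold fixation detection on (x, y) gaze samples.

    A fixation is a run whose bounding-box dispersion (width + height)
    stays within max_dispersion for at least min_samples frames.
    Returns (centroid_x, centroid_y, start_index, end_index) tuples.
    """
    out, start = [], 0
    while start < len(samples):
        end = start
        xs, ys = [samples[start][0]], [samples[start][1]]
        while end + 1 < len(samples):
            x, y = samples[end + 1]
            spread = (max(xs + [x]) - min(xs + [x])) + (max(ys + [y]) - min(ys + [y]))
            if spread > max_dispersion:
                break  # window would exceed the dispersion threshold
            xs.append(x)
            ys.append(y)
            end += 1
        if end - start + 1 >= min_samples:
            out.append((sum(xs) / len(xs), sum(ys) / len(ys), start, end))
            start = end + 1
        else:
            start += 1
    return out
```

Everything between detected fixations is treated as saccadic movement, which is how the fixation/saccade split behind heatmaps and dwell-duration statistics is usually obtained.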
5. Designing for Physical Constraints
5.1 Gorilla Arm Effect
Sustained arm elevation causes rapid fatigue. This was first observed in early touchscreen kiosks where users had to reach up to interact. In XR it’s worse because:
- Sessions can be longer
- Users may not notice fatigue building
- Sudden fatigue can cause loss of balance
Design rule: Default interaction zone is waist to shoulder height, within arm’s reach.
5.2 Fitts’ Law in 3D
Fitts’ Law predicts movement time based on target size and distance:
MT = a + b × log₂(2D/W)
Where D = distance to target, W = target width.
In 3D XR:
- Minimum comfortable target size: ~2 cm at arm’s length
- Targets at the edge of the field of view take longer to acquire
- Moving targets are significantly harder to select
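The formula is easy to apply directly. A worked sketch (the constants a and b are device- and user-specific values fit from data; the numbers here are placeholders for illustration):

```python
import math

def movement_time(distance, width, a=0.2, b=0.1):
    """Fitts' Law: MT = a + b * log2(2D / W).

    distance and width share any unit (e.g. metres); a and b are in
    seconds and seconds-per-bit respectively, here placeholder values.
    """
    index_of_difficulty = math.log2(2 * distance / width)
    return a + b * index_of_difficulty
```

One useful consequence: halving the target width W adds exactly b seconds to the predicted time, since log₂(2D/(W/2)) = log₂(2D/W) + 1. This is why the ~2 cm minimum target size above matters so much for small UI elements.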
5.3 Interaction Zones
| Zone | Distance | Best Input |
|---|---|---|
| Intimate | 0–0.5m | Direct grab, touch |
| Personal | 0.5–1.5m | Near ray cast, hand |
| Social | 1.5–3m | Ray cast, gaze |
| Public | 3m+ | Gaze, voice |
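In an input-abstraction layer, the table maps naturally onto a small lookup helper. The zone names and boundaries below come straight from the table; the function itself is a sketch, not a prescribed API:

```python
def interaction_zone(distance_m):
    """Map target distance (metres) to its zone and preferred inputs."""
    if distance_m < 0.5:
        return "intimate", ["direct grab", "touch"]
    if distance_m < 1.5:
        return "personal", ["near ray cast", "hand"]
    if distance_m < 3.0:
        return "social", ["ray cast", "gaze"]
    return "public", ["gaze", "voice"]
```

A helper like this is one way to implement the "degrades gracefully" option from Section 1: the experience asks for the preferred input per zone and falls back to whatever the current device supports.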
6. Interaction State Design
Every interactive object in XR should implement a clear state machine:
Default → Hover → Selected → Activated → Released
Each state transition should be communicated through at least two feedback channels (visual + audio, or visual + haptic).
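A minimal sketch of such a state machine (event names and feedback strings are illustrative; a real implementation would trigger actual visual and haptic effects in place of the strings):

```python
# Valid transitions for Default -> Hover -> Selected -> Activated -> Released.
TRANSITIONS = {
    ("default",   "gaze_enter"): "hover",
    ("hover",     "gaze_exit"):  "default",
    ("hover",     "select"):     "selected",
    ("selected",  "activate"):   "activated",
    ("activated", "release"):    "released",
    ("released",  "reset"):      "default",
}

class Interactable:
    def __init__(self):
        self.state = "default"

    def handle(self, event):
        """Apply an event. Returns the feedback to fire on a valid
        transition, or None if the event is ignored in this state."""
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            return None  # invalid event for the current state
        self.state = nxt
        # Two feedback channels per transition, per the guideline above.
        return {"visual": f"{nxt}_highlight", "haptic": f"{nxt}_pulse"}
```

Making invalid events no-ops (rather than errors) is deliberate: XR input is noisy, and stray events such as a gaze flicker mid-grab should not corrupt the object's state.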
6.1 State Design Checklist
Self-Check Questions
- What is the Midas Touch Problem and how can it be mitigated?
- Why does Fitts’ Law still apply in 3D XR environments?
- What are the three channels of interaction feedback?
- When would you choose hand tracking over controller input?
- What is foveated rendering and which device in our lab supports it?
References
- LaViola, J.J. et al. (2017) 3D User Interfaces: Theory and Practice (2nd ed.). Addison-Wesley.
- Bowman, D.A. et al. (2004) 3D User Interfaces: Theory and Practice. Addison-Wesley.
- Fitts, P.M. (1954) “The information capacity of the human motor system in controlling the amplitude of movement.” Journal of Experimental Psychology, 47(6), 381–391.
- Poupyrev, I. et al. (1996) “The go-go interaction technique.” UIST ’96 Proceedings.
- Unity XR Interaction Toolkit: docs.unity3d.com/Packages/com.unity.xr.interaction.toolkit