HyperSkill User's Guide

Natural Conversations


Last updated 10 months ago

HyperSkill bridges the gap between traditional learning and interactive experiences with its Speech-to-Text (ASR) and Text-to-Speech (TTS) features. Let's explore how these functionalities can elevate your VR/AR/Web/Desktop simulations.

Speech-to-Text Transcription (ASR): Capture Learner Speech

  • Dialogue Recording: HyperSkill's ASR feature automatically captures and transcribes user speech within your simulations. Captured speech can drive dialogue branching through utterance triggers, or roleplaying through AI Chat.

  • Analyze Trainee Communication: All recorded dialogue will appear in the HyperSkill dashboard for future review. Authors can evaluate responses, identify areas for improvement, or assess communication styles at any time.

In addition to spoken dialogue, HyperSkill also supports text-based input on desktop and web. For more information, visit the Desktop page.

Activating Speech-to-Text in HyperSkill

To enable Speech-to-Text (ASR) and capture spoken dialogue within your simulations, follow these steps:

  1. Enter Edit Mode: In HyperSkill Desktop or Web, navigate to your simulations list and enter edit mode for the simulation to which you want to add ASR.

  2. Access Settings: Locate the settings menu within the tabs bar.

  3. Enable Microphone Input: Within the settings menu, find the options for microphone input labeled "Microphone". Ensure the toggles for "Enable Microphone Input" and "Microphone Always Listening" are switched on.

Once you've enabled these settings, HyperSkill will be ready to capture spoken dialogue within your simulations using the device's microphone.

Text-to-Speech Generation (TTS): Breathe Life into Virtual Characters

  • Voice Customization: HyperSkill offers a variety of voice options to choose from. Select a voice that best suits the character's personality, gender, and the overall tone of your simulation. (Free plan limitations apply)

Using Text-to-Speech in HyperSkill:

Text-to-Speech functionality is integrated with various HyperSkill state actions. These actions allow you to trigger speech generation at specific points within your simulation. For example, you could use a Text-to-Speech action to:

  • Have a virtual instructor deliver introductory remarks.

  • Make characters react to learner choices with spoken feedback.

  • Provide audio cues and instructions throughout the simulation.

Supported Text-to-Speech Technologies:

HyperSkill offers two TTS options depending on your subscription plan:

  • Google Text-to-Speech: This service is available to all users and provides high-quality, natural-sounding voices in English, Spanish, and Hindi.

  • ElevenLabs Text-to-Speech (Paid Plans): Upgrade your plan to access ElevenLabs, which offers high-quality voices across multiple languages.

Configuring Text-to-Speech Settings:

You can define the Text-to-Speech engine (Google or ElevenLabs) and the spoken language within your simulation settings. Navigate to Settings > Conversational AI in edit mode to access these options.

Fine-tuning Speech with SSML:

HyperSkill's TTS supports Speech Synthesis Markup Language (SSML). This allows you to add specific instructions for the TTS engine, further customizing the generated speech. With SSML, you can control aspects like:

  • Speech rate: Adjust the speed of the narration to match your simulation's pace.

  • Pitch: Modify the character's vocal pitch to create a more distinct personality.

  • Emphasis: Highlight specific words or phrases for dramatic effect.

  • Pauses: Introduce pauses for a more natural flow of conversation.
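As a rough illustration of the controls listed above, the sketch below assembles an SSML fragment in Python that could then be pasted into a speech action's text field. The helper name `ssml_line` and its parameters are hypothetical and are not part of HyperSkill. The `<prosody>`, `<emphasis>`, and `<break>` elements come from the standard SSML specification; whether a given TTS engine honors each of them is an assumption to verify by previewing the simulation.

```python
from xml.sax.saxutils import escape


def ssml_line(text, rate="medium", pitch="+0st", pause_ms=0, emphasize=None):
    """Return an SSML fragment for one spoken line (hypothetical helper).

    The fragment is deliberately NOT wrapped in <speak>, since the guide
    states that the <speak> tag does not need to be added.
    """
    # Escape &, < and > so plain text cannot break the markup.
    body = escape(text)

    # Emphasis: wrap the first occurrence of the chosen phrase.
    if emphasize:
        marked = escape(emphasize)
        body = body.replace(
            marked, f'<emphasis level="strong">{marked}</emphasis>', 1
        )

    # Speech rate and pitch: both are attributes of <prosody>.
    fragment = f'<prosody rate="{rate}" pitch="{pitch}">{body}</prosody>'

    # Pauses: append a <break> after the line when requested.
    if pause_ms > 0:
        fragment += f'<break time="{pause_ms}ms"/>'
    return fragment


print(
    ssml_line(
        "Please put on your safety goggles.",
        rate="slow",
        pitch="-2st",
        pause_ms=500,
        emphasize="safety goggles",
    )
)
```

The printed fragment slows the line down, lowers the pitch by two semitones, stresses "safety goggles", and ends with a half-second pause. Generating the markup from a helper like this is only a convenience; the same tags can be typed by hand.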

Voice options in HyperSkill:

You can choose from any of the following voices in HyperSkill:

  • Matthew

  • Justin

  • Joey

  • Salli

  • Kimberly

  • Kendra

  • Joanna

  • Ivy

  • Brian

  • Amy

  • Emma

  • Aditi

  • Raveena

  • Russell

  • Nicole

When selecting voices for ElevenLabs in HyperSkill, you may notice duplicates. This occurs because ElevenLabs offers fewer voice options than Google, so several voice names produce the same voice. The voices within each of the following groups are duplicates of one another:

  • Kendra, Joanna, Ivy, Amy, Emma, Aditi, and Raveena

  • Joey and Russell

Tips and Common Errors

  • Experiment with different voice options to find the perfect fit for your characters. (ElevenLabs plan limitations apply)

  • Preview your TTS implementation to confirm that the audio is generated correctly at experience time. Text-to-speech may occasionally fail on the apostrophe (') character, typically when the text was pasted into HyperSkill rather than typed. To fix the issue, delete the apostrophe and retype it.

  • Consider the pacing and intonation of the generated speech for optimal impact.

  • Use SSML tags to fine-tune the speech and create a more nuanced performance by your virtual characters.

To review the supported SSML elements, see the Google SSML elements reference. You do not need to add the <speak> tag.
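For the apostrophe issue mentioned in the tips, one possible way to clean text before pasting it into HyperSkill is sketched below. It rests on an assumption the guide does not confirm: that the failing character is a typographic ("curly") apostrophe introduced by a word processor or web page. The function names are hypothetical and the script runs outside HyperSkill.

```python
# Assumption: pasted text carries typographic apostrophes that the TTS
# pipeline mishandles. Map the common look-alikes to the ASCII apostrophe.
TYPOGRAPHIC_APOSTROPHES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (the usual culprit)
    "\u02bc": "'",  # modifier letter apostrophe
    "\u2032": "'",  # prime
}
_TABLE = str.maketrans(TYPOGRAPHIC_APOSTROPHES)


def find_suspect_characters(text):
    """Return (index, character) pairs for every apostrophe look-alike."""
    return [(i, ch) for i, ch in enumerate(text) if ch in TYPOGRAPHIC_APOSTROPHES]


def normalize_apostrophes(text):
    """Replace apostrophe look-alikes with the plain ASCII apostrophe."""
    return text.translate(_TABLE)


pasted = "Don\u2019t touch the valve."
print(find_suspect_characters(pasted))  # position of the curly apostrophe
print(normalize_apostrophes(pasted))    # Don't touch the valve.
```

If the cleaned text still fails to generate audio, retyping the apostrophe directly in HyperSkill, as described in the tips, remains the documented fix.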
