No description

JavaScript 83.6%
CSS 11.9%
HTML 3.9%
Shell 0.6%

Find a file

samuel a1d3895290 new forgejo		2026-06-04 05:34:38 +00:00
public	new forgejo	2026-06-04 05:34:38 +00:00
src	new forgejo	2026-06-04 05:34:38 +00:00
.gitignore	new forgejo	2026-06-04 05:34:38 +00:00
index.html	new forgejo	2026-06-04 05:34:38 +00:00
package-lock.json	new forgejo	2026-06-04 05:34:38 +00:00
package.json	new forgejo	2026-06-04 05:34:38 +00:00
README.md	new forgejo	2026-06-04 05:34:38 +00:00
realtime-voice-translate.service	new forgejo	2026-06-04 05:34:38 +00:00
server.js	new forgejo	2026-06-04 05:34:38 +00:00
start.sh	new forgejo	2026-06-04 05:34:38 +00:00

README.md

May contain hallucinations!

Realtime voice translate

Minimal full-screen mobile voice translator.

Uses ElevenLabs Scribe v2 Realtime and OpenAI GPT-5.4-mini

This app and README was written using OpenAI Codex GPT-5.5.

What It Does

Uses ElevenLabs Scribe v2 Realtime through the official client-side token flow.
Keeps the ElevenLabs API key on the server.
Uses VAD and partial transcripts.
Detects the source language from the script of each partial transcript.
Sends each partial transcript to OpenAI for translation without waiting for earlier partials.
Uses gpt-5.4-mini with reasoning effort none for translation.
Displays only the latest available translation for the latest segment.
Bottom-aligns the full latest translation and clips older overflowing lines off-screen.
Shows ??? if script-based language detection fails for the latest partial.
Supports English/Hindi and English/Russian language pairs from a dropdown.
Shows a Settings button during translation that disconnects Scribe and resets the current conversation.
Clears the current conversation when the page is hidden, closed, or unloaded.
Ignores stale translation responses after a reset.
Requests a screen wake lock while translating when the browser supports it.
Optimizes the display for mobile use, with a horizontal phone recommendation and Add to Home Screen note for fullscreen mode.

Run

For local/manual testing after dependencies and tokens are present:

npm install
npm run build
source /root/all_tokens.sh
HTTP_PORT=80 HTTPS_PORT=443 \
TLS_CERT=/etc/letsencrypt/live/translate.samuelshadrach.com/fullchain.pem \
TLS_KEY=/etc/letsencrypt/live/translate.samuelshadrach.com/privkey.pem \
node server.js

Fresh Server Deployment

These steps are intended to be enough for a new OpenAI Codex instance or human operator to host the app on a fresh Ubuntu/Debian server.

Assumptions:

Domain: translate.samuelshadrach.com
App directory: /root/realtime-voice-translate
Public repo: https://github.com/samuel-da-shadrach/realtime-voice-translate.git
Runtime user: root
HTTP/HTTPS ports: 80 and 443
TLS: Let's Encrypt certificate at /etc/letsencrypt/live/translate.samuelshadrach.com/

1. DNS

Create an A record:

translate.samuelshadrach.com -> SERVER_IPV4_ADDRESS

For initial Let's Encrypt setup, use DNS-only mode in Cloudflare if possible. After the certificate works, Cloudflare proxying can be enabled with SSL/TLS mode set to Full or Full strict.

2. System Packages

Install Node.js, npm, git, and certbot:

apt update
apt install -y nodejs npm git certbot

If the distro Node.js is too old for Vite/React tooling, install a current LTS Node.js release and then rerun npm install.

3. Clone

Clone this repo into the path expected by the included service files:

cd /root
git clone https://github.com/samuel-da-shadrach/realtime-voice-translate.git realtime-voice-translate
cd /root/realtime-voice-translate

If you clone elsewhere, update both start.sh and realtime-voice-translate.service.

4. Tokens

Create /root/all_tokens.sh:

cat >/root/all_tokens.sh <<'EOF'
export ELEVENLABS_API_KEY='replace-with-elevenlabs-api-key'
export OPENAI_API_KEY='replace-with-openai-api-key'
EOF
chmod 600 /root/all_tokens.sh

Do not commit this file. It is intentionally outside the repo.

5. Build

cd /root/realtime-voice-translate
npm install
npm run build

6. TLS Certificate

Make sure nothing is already listening on port 80, then issue the certificate:

certbot certonly --standalone -d translate.samuelshadrach.com

Expected files:

/etc/letsencrypt/live/translate.samuelshadrach.com/fullchain.pem
/etc/letsencrypt/live/translate.samuelshadrach.com/privkey.pem

The app can run HTTP-only if these files are absent, but microphone access in browsers requires HTTPS for normal use.

7. Install Service

cd /root/realtime-voice-translate
chmod +x start.sh
cp realtime-voice-translate.service /etc/systemd/system/realtime-voice-translate.service
systemctl daemon-reload
systemctl enable realtime-voice-translate.service
systemctl restart realtime-voice-translate.service

8. Verify

Check service status:

systemctl status realtime-voice-translate.service

Check logs:

journalctl -u realtime-voice-translate.service -f

Check health:

curl -sS http://127.0.0.1/health
curl -sS https://translate.samuelshadrach.com/health

Open:

https://translate.samuelshadrach.com

9. Firewall

Ensure inbound TCP 80 and 443 are allowed by the server firewall and any cloud firewall.

10. Common Failure Points

ELEVENLABS_API_KEY is not set: /root/all_tokens.sh is missing, unreadable, or does not export ELEVENLABS_API_KEY.
OPENAI_API_KEY is not set: /root/all_tokens.sh is missing, unreadable, or does not export OPENAI_API_KEY.
Browser microphone does not work: use HTTPS, not plain HTTP.
HTTPS does not start: check the Let's Encrypt cert/key paths in realtime-voice-translate.service.
Service starts but domain fails: check DNS, Cloudflare SSL mode, firewall, and whether ports 80/443 are already in use.
Certbot standalone fails: stop anything using port 80, verify DNS points to this server, and rerun certbot.

Service

The included realtime-voice-translate.service is the systemd unit used on the server.