溪水旁2.0,一年之后
前言:去年六月的一句承诺
去年六月,我在这里写下了《溪水旁2.0计划书》:模块化首页、播客式语音、多人同听、文字高光、经文映射、祷告墙、好友关注……写的时候野心勃勃,读起来像是下个月就能全部上线。
一年多过去了。今天想老老实实记一笔账:到底做出了什么,踩了哪些坑,哪些设想还安静地躺在待办清单里。剧透一下——答案和计划书不太一样,但我觉得更有意思。
真正动工是从今年一月的一次头脑风暴开始的:技术栈定为 React Native + GraphQL,一套代码打三个平台。而真正密集的开发,集中在四月十一日到二十一日之间——八九个批次,其中好几天是从白天写到深夜,把一个单文件 HTML 原型,变成了一个能登录、能收藏、能搜索、还能"开口说话"的真实应用。代码全部放在 github.com/Waye/BytheStream2.0,好奇细节的可以自己翻。
八个页面,一个原型
一切从一个叫 xishuipang-prototype.html 的单文件原型开始——所有页面的交互先在一个文件里走一遍,感觉对了,再动真格。
真格是 Expo SDK 52 + React Native 0.76.9 + TypeScript,配 Expo Router 的文件路由和 Zustand 状态管理,一套代码同时跑在 iOS、Android 和网页上。首页最先成型:三套主题(暖白、深色、护眼),顶部导航、公告轮播、最新文章、我的收藏、往期期刊,加一个常驻底部的 Mini Player——地基一天基本搭完。
五天后,剩下的八个页面一口气全部实现:期刊详情做成了 Spotify 专辑式的曲目列表,文章阅读器带字号调节和简繁切换,播放队列、全部期刊、收藏管理、搜索结果、登录页、用户中心,一个不落。播放队列排序当时图省事用了上下箭头按钮而不是拖拽,理由很实在——“跨平台兼容”。真正的拖拽排序留进了待办清单,到现在还没轮到它。
通上电:真数据、真后端,和一份没被用上的设计研究
页面有了骨架,接下来是让它活起来。后端选了 Fastify 配 Mercurius GraphQL,DataLoader 防 N+1 查询,MongoDB Atlas 存内容,简繁版本靠 slug 后缀 _s / _t 区分。前端从 mock 数据切到 Apollo Client 真实查询,搜索页做出了每页十条的无限滚动。
图片系统意外地成了个大工程:文章正文里历史遗留的 <filename.jpg> 标记要用正则解析出来,期刊封面要从封面封底文章里智能提取。顺手做了个优化——期刊列表原本要发 85 次数据库计数查询,改成一次聚合查询,加载时间肉眼可见地缩短。
打磨视觉语言的过程中,还认认真真把 Pinterest 的设计系统拆解了一遍——色板、Pin Sans 字体、圆角尺度、暖调的中性色——写成了一份完整的参考文档。最后《溪水旁》走的是完全不同的路:暖白、深色、护眼,外加春夏秋冬四套季节主题。那份 Pinterest 研究基本没直接用上,文档倒是还留着,就当是"我们确实做过功课"的纪念品。
音频,或者说:一天之内我是怎么踩完十个坑的
这一年里最难、也最值得的一块,是把《溪水旁》从"能读"变成"能听"。
方案很朴素:本地用 MeloTTS 把文章批量合成语音,Python 脚本管生成,一个轻量 HTTP 服务用 Range 请求的方式把 mp3 吐给前端,GraphQL 只负责"告诉你音频在哪",不管字节怎么传。特意没上 HLS 切片——体量还不到那个份上,纯 MP3 加 Range 请求,浏览器和 expo-av 天然支持拖进度、断点续传,够用就好。
但通往"够用"的路,一天之内连续踩了十个坑,大致按出现顺序:日语分词词典下载卡在最后一兆;英文发音词典的下载请求莫名被本地代理拦截;MeloTTS 的参数对象没有该有的方法,直接报错;HuggingFace 模型下载慢到怀疑人生,换了个镜像才解决;苹果自家的 GPU 和模型的数据精度对不上,索性整个禁用 GPU,退回 CPU 跑;一个记录进度的索引文件死活是空的,查到最后发现是单篇测试脚本压根没往里写数据;后端用了 ESM 模块,__dirname 这个 Node.js 老熟人突然不认得了;Python 环境保护机制不让直接装包,得先激活虚拟环境;还有两次,中文括号和缩进差异让自动替换脚本判断失败。
十个坑踩完,第一篇真实文章——第85期陈力的《西班牙游记》——合成出了 12 分 26 秒的语音,2.85MB,由 27 个语音片段拼接而成。生成出的欢迎音频,用编辑部的口吻先念出两处经文,其中一处正是刊名"溪水旁"的出处——诗篇二十三篇那句"他使我躺卧在青草地上,领我到幽静的溪水旁"——末尾再落一句"现在,它也会朗读给你听"。这句话,此刻终于不再只是一句文案。
试听后的反馈很直接:“还行”、“有些地方不清楚”、“口音有问题”。分析下来,口音偏南方是训练语料的问题,模型内部改不了;断句衔接处偶尔的电音是拼接痕迹,加了淡入淡出缓解,但没根治。于是有了一个务实的决定:先用现在这套跑通的 demo 上线 MVP,等换到那台配 RTX 5070 的机器、上 IndexTTS 1.5 之后,再把全部 1352 篇文章重新合成一遍——用 Mac 的 CPU 全量跑要三十多个小时,换那张显卡估计几个小时就能跑完。存储也算过账:全量音频大概 3.4GB,Cloudflare R2 的免费额度是 10GB、零出站流量费,预算里留的十五美元,大概率一分都花不出去。
“登录”两个字,两天工作量
看起来最简单的一个功能——邮箱、Google、Facebook 三路登录——实际做完花了两天工作量。后端的 JWT 骨架其实两天前就写好了,真正吃时间的,是 Google Cloud Console 和 Facebook Developers 后台那一整串配置。
Google Cloud Console 建项目、配同意屏幕、注册 Web / iOS / Android 三个 OAuth 客户端,Android 那个还得现造一个调试签名密钥。中途不小心把下载下来的客户端密钥文件发进了聊天记录,虽然后端其实根本用不上那个密钥,还是老老实实把整个客户端删了重建——安全习惯这种事,多一道手续总没错。
Facebook 那边同样一路踩坑:一次因为没勾选邮箱权限,报错说权限不合法;一次因为没把跳转用的域名加进应用的域名白名单,报错说域名不在列表里。两处都是那种"配置界面上一个小勾没打"的错误,调试起来比听上去磨人得多。
最有成就感的一刻,是验证账号合并逻辑那次:先用 Google 登录,退出后换 Facebook 登录,同一个邮箱——两次都进的是同一个账号,收藏的两篇文章原封不动还在。后端逻辑其实很朴素:优先按登录方式加用户 ID 精确匹配,没匹配上就退而按邮箱找,找到了就把新的登录方式绑定上去,全都找不到才新建用户。
顺便重写了 Mini Player 的播放暂停图标——不用表情符号,改用几何图形手绘三角形和双竖线,原因是有些系统会把表情符号渲染成完全不搭调的样子。这类"看起来很小"的修复,往往才是最吃时间的那一类。
一个索引,两百四十倍:一次深夜的性能侦探故事
OAuth 打通的当天晚上,发现点 Mini Player 里的文章标题跳转到文章页时,经常转圈转到天荒地老。
排查过程活像一场推理游戏。怀疑缓存没命中?查了配置,策略本来就是对的,排除。怀疑没做数据预取?确实是个可以优化的点,但补丁没打进去,先记下待办。怀疑 React 渲染慢?开发者工具显示所有网络请求都是毫秒级完成,排除。怀疑打包工具断连?确实断过,重启后问题依旧,排除。最后把网络时间线拉到最细才看出真相——那两秒多的延迟里,压根没发出任何请求,说明卡顿根本不在网络,而是某处在空转。
顺着这条线查到了数据库:文章查询用的两个字段组合,数据库里根本没有对应的复合索引,每次查询都要在一千八百多篇文档里做一次全表扫描。补上这一个索引,创建耗时五十毫秒,查询耗时从 2.4 秒直接降到 10 毫秒——两百四十倍的提速,胜负手只是一行 createIndex。
顺带牵出另外两个真 bug:一处跳转逻辑在拼接链接时把期号前缀意外砍掉了,导致文章页永远查错期号;另一处是加载状态判断写反了:文章数据为空但加载已经结束时,页面选择继续显示“加载中”而不是“未找到”,任何格式不对的链接因此会安安静静地转圈到永远,不给用户任何提示。
现在,以及还没做完的
盘点到这里,得老实说一句:现在的《溪水旁2.0》,和一年前计划书里描绘的那个应用,不完全是同一个东西。
已经做出来、而且真的能用的部分相当扎实:全文搜索、期刊详情、文章阅读、多主题切换、播放队列、收藏云同步、Google / Facebook / 邮箱三路登录,还有那套从零搭起来的语音合成管线。首页跑着一套加权推荐算法——用收藏过期号的中位数当"兴趣中心",同一个作者最多推两篇——未登录或者还没收藏过的用户,则退回展示最新热门往期。
计划书里那些更"社交"的设想——文字高光、经文双向映射、祷告与见证墙、好友关注和评论、多人同听——目前都还没动工,安静地躺在待办清单靠后的位置。这不算遗憾,更像是老实说出了优先级排序:先把"读"和"听"这两件事做扎实,社区互动留到下一程。
真正拦在上线前面的,其实是几件不那么性感、但绕不开的事:Google 和 Facebook 的应用审核(要写隐私政策、说明数据用途)、剩下 1351 篇文章的语音全量生成、把音频搬上 Cloudflare R2,还有一个至今没敲定的服务器托管方案。
三条教训,写给一年后的自己
这一年记下的教训,最后浓缩成了三条,抄在这里,也算是留给未来某次深夜排查问题时的自己:
- 网页应用第一次加载慢,先看数据库有没有索引——十次有八次,锅在这里。
- 别用 Python 的 heredoc 去改 JavaScript 字符串,反斜杠会被吃掉两遍,不值得。
- “加载中或者没数据就转圈”是一种反模式。加载、出错、没找到、正常显示,应该是四种不同的画面;安安静静地无限转圈,是能想到的最差的用户体验。
计划书里写下的那些设想,会一个一个再捡起来。但如果说这一年学到了什么,大概是:写下承诺是容易的,让承诺能跑、能登录、能被人听见,靠的是一千八百多篇文档、十个坑,和一行迟到的 createIndex。
By the streams 2.0: One Year Later
Preface: A Promise Made Last June
Last June, I wrote the By the streams 2.0 project plan right here — a long, ambitious list: a modular homepage, podcast-style audio, synchronized group listening, text highlighting, scripture mapping, a prayer wall, friend-following. Reading it back, it sounds like all of it was going to ship the following month.
Over a year has passed. Today I want to honestly settle the accounts: what actually got built, which pitfalls got hit, and which ideas are still sitting quietly in the backlog. Spoiler — the answer doesn’t quite match the plan, but I think it’s more interesting this way.
Real work started with a brainstorm this past January, where the stack got settled: React Native plus GraphQL, one codebase for three platforms. The truly intensive build happened between April 11th and 21st: eight or nine batches, several of those days running from daylight into the middle of the night, turning a single-file HTML prototype into a real app that can log in, save favorites, search, and, remarkably, talk. The code all lives at github.com/Waye/BytheStream2.0, for anyone curious about the details.
Eight Pages, One Prototype
It all started with a single file called xishuipang-prototype.html — every page’s interactions rehearsed there first, before anything got built for real.
“For real” meant Expo SDK 52, React Native 0.76.9, and TypeScript, with Expo Router’s file-based routing and Zustand for state, one codebase running on iOS, Android, and web at once. The homepage took shape first: three themes (warm white, dark, sepia eye-care), a top nav, an announcement carousel, latest articles, favorites, past volumes, and a Mini Player fixed to the bottom — the foundation was mostly there in a day.
Five days later, the remaining eight pages landed in one push: a volume detail page styled like a Spotify album track list, an article reader with font-size controls and simplified/traditional switching, a play queue, a full volume browser, favorites management, search results, a login page, a profile page — nothing skipped. Queue reordering used simple up/down buttons instead of drag-and-drop, for a practical reason: cross-platform compatibility. Real drag-and-drop went straight into the backlog, and it’s still waiting its turn.
Powering On: Real Data, a Real Backend, and a Design Study That Never Shipped
With the skeleton in place, next came making it live. The backend went with Fastify and Mercurius GraphQL, DataLoader to avoid N+1 queries, and MongoDB Atlas for content, with simplified and traditional versions told apart by an _s / _t slug suffix. The frontend swapped mock data for real Apollo Client queries, and the search page got infinite scroll at ten results per page.
The image system turned into an unexpected undertaking: legacy <filename.jpg> markers embedded in article text had to be parsed out with regex, and volume covers had to be intelligently extracted from front- and back-cover articles. A small optimization paid off well too — the volume list used to fire 85 separate database count queries; collapsing that into a single aggregation made load times visibly shorter.
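The marker parsing can be sketched roughly like this. It is a minimal illustration, not the code in the repo: the function name, the segment shape, and the list of accepted extensions are all assumptions.

```typescript
// Hypothetical helper: split article text into plain-text and image segments.
// The <name.ext> marker shape comes from the post; the extension list is assumed.
type Segment =
  | { kind: "text"; value: string }
  | { kind: "image"; filename: string };

function splitImageMarkers(body: string): Segment[] {
  // A fresh RegExp per call, so the global flag's lastIndex never leaks between calls.
  const pattern = /<([^<>\s]+\.(?:jpe?g|png|gif))>/gi;
  const segments: Segment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null = pattern.exec(body);
  while (match !== null) {
    // Keep any text that sits between the previous marker and this one.
    if (match.index > cursor) {
      segments.push({ kind: "text", value: body.slice(cursor, match.index) });
    }
    segments.push({ kind: "image", filename: match[1] });
    cursor = match.index + match[0].length;
    match = pattern.exec(body);
  }
  // Trailing text after the last marker.
  if (cursor < body.length) {
    segments.push({ kind: "text", value: body.slice(cursor) });
  }
  return segments;
}
```

Angle-bracket text without a recognized image extension, such as `<b>`, is left inside the surrounding text segment rather than treated as an image.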
While refining the visual language, I also put real effort into reverse-engineering Pinterest’s design system (the color palette, the Pin Sans typeface, the border-radius scale, its warm sand-toned neutrals) and wrote it up as a complete reference document. In the end, By the streams took a completely different path: warm white, dark, sepia eye-care, plus four seasonal themes for spring, summer, autumn, and winter. That Pinterest research never really made it into the shipped product. The document is still sitting there, though: proof, at least, that the homework got done.
Audio, or: How I Hit Ten Walls in One Day
If one piece of this year was the hardest, and the most worth it, it was turning By the streams from something you read into something you can listen to.
The plan was simple enough: batch-synthesize articles locally with MeloTTS, let a Python pipeline handle generation, and hand the resulting mp3 files to a lightweight HTTP server speaking Range requests. GraphQL’s only job is to say where the audio lives — not to move the bytes itself. HLS segmentation got deliberately skipped; the scale didn’t call for it. Plain MP3 over HTTP Range is natively supported by browsers and expo-av alike — seeking and resuming just work.
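For anyone who hasn't looked at Range requests up close, the server-side decision is small. The sketch below only covers the single-range forms a player typically sends; the function and type names are made up for illustration, and the real audio server may handle this differently.

```typescript
// Outcome of interpreting a "Range: bytes=..." header for a file of `size` bytes.
type RangeResult =
  | { status: 200 } // no usable Range header: send the whole file
  | { status: 206; start: number; end: number; contentRange: string }
  | { status: 416; contentRange: string }; // range starts past the end of the file

function resolveRange(header: string | undefined, size: number): RangeResult {
  if (header === undefined) return { status: 200 };
  // Only single ranges are handled here; anything else is ignored (full response).
  const m = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (m === null || (m[1] === "" && m[2] === "")) return { status: 200 };

  let start: number;
  let end: number;
  if (m[1] === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = parseInt(m[2], 10);
    if (suffix === 0) return { status: 416, contentRange: `bytes */${size}` };
    start = Math.max(size - suffix, 0);
    end = size - 1;
  } else {
    start = parseInt(m[1], 10);
    // Open-ended "bytes=N-" runs to the end; an oversized end is clamped.
    end = m[2] === "" ? size - 1 : Math.min(parseInt(m[2], 10), size - 1);
  }

  if (start >= size) return { status: 416, contentRange: `bytes */${size}` };
  if (end < start) return { status: 200 }; // invalid spec (last < first): ignore the header
  return { status: 206, start, end, contentRange: `bytes ${start}-${end}/${size}` };
}
```

A 206 response would then carry `Content-Range` as computed, `Content-Length` of `end - start + 1`, and `Accept-Ranges: bytes`, which is what lets the player seek and resume.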
But the road to “just works” ran through ten walls in a single day, roughly in this order: a Japanese tokenizer dictionary download stalling at the very last megabyte; an English pronunciation dictionary download getting mysteriously intercepted by a local proxy; a parameter object inside MeloTTS missing a method it was supposed to have, throwing a bare error; Hugging Face model downloads so slow they needed a mirror; Apple’s own GPU disagreeing with the model’s numeric precision, resolved by disabling it entirely and falling back to CPU; a progress-tracking index file that stayed stubbornly empty, traced to a single-article test script that had never written to it in the first place; a backend __dirname — an old Node.js friend — suddenly going missing because the project used ES modules; Python’s environment protections refusing package installs until a virtual environment got activated; and twice, Chinese parentheses and indentation mismatches breaking an automated string-replace script.
Ten walls later, the first real article — “A Trip to Spain” by Chen Li, from Issue 85 — came out as 12 minutes 26 seconds of audio, 2.85MB, stitched from 27 chunks. The generated welcome track opens with two lines of Psalms in the editorial team’s own voice — one of them the very source of the magazine’s name, Psalm 23:2’s “he leadeth me beside the still waters” — and closes with the line: “now, it can be read aloud to you, too.” For the first time, that line stopped being just copy.
Feedback after a first listen was blunt: “decent,” “some parts unclear,” “the accent’s off.” Digging in: the accent leans southern because of the training corpus, which can’t be fixed inside the model; the occasional glitchy transitions are chunk-boundary artifacts, softened with fade-in and fade-out but not eliminated. That led to a practical call: ship the MVP on the current working demo, then, once a machine with an RTX 5070 comes online running IndexTTS 1.5, re-synthesize all 1,352 articles from scratch. A full run on the Mac’s CPU would take thirty-plus hours; on that GPU it should take an estimated few hours. The storage math got worked out too: the full audio library comes to roughly 3.4GB, comfortably inside Cloudflare R2’s 10GB free tier with zero egress fees, so the $15 budgeted for it will likely go entirely unspent.
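The fade trick at chunk boundaries is simple enough to show in a few lines. The real pipeline is a Python script working on actual audio; the version below is only a conceptual sketch on plain sample arrays, with a linear ramp and function names that are assumptions.

```typescript
// Apply a linear fade-in and fade-out over `fadeSamples` samples at both edges of a chunk.
function fadeEdges(chunk: number[], fadeSamples: number): number[] {
  const out = chunk.slice();
  // Never let the two ramps overlap on very short chunks.
  const n = Math.min(fadeSamples, Math.floor(out.length / 2));
  for (let i = 0; i < n; i++) {
    const gain = i / n; // 0 at the very edge, rising toward 1 further in
    out[i] *= gain;
    out[out.length - 1 - i] *= gain;
  }
  return out;
}

// Concatenate chunks after fading each one, so every seam passes through silence.
function stitch(chunks: number[][], fadeSamples: number): number[] {
  let result: number[] = [];
  for (const chunk of chunks) {
    result = result.concat(fadeEdges(chunk, fadeSamples));
  }
  return result;
}
```

Because each chunk ramps down to zero and the next ramps up from zero, the hard discontinuity at the seam goes away. It does not fix differences in tone between chunks, which matches the "softened but not eliminated" result above.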
Two Words, “Log In,” Two Days of Work
The feature that looked simplest on paper (email, Google, and Facebook login) took about two days of work. The JWT backend skeleton had already been written two days earlier; what ate the time was the long chain of configuration inside Google Cloud Console and Facebook Developers.
Google Cloud Console meant creating a project, configuring the consent screen, and registering three separate OAuth clients for Web, iOS, and Android — the Android one needing a freshly generated debug signing key. Partway through, a downloaded client secret file accidentally made it into a chat log; even though the backend never actually needs that secret, the whole client got deleted and rebuilt from scratch out of caution. Good security habits rarely hurt.
Facebook brought its own trail of small stumbles: one error for forgetting to enable the email permission scope, another for forgetting to add the redirect domain to the app’s allowed domains list. Both were the kind of “one missing checkbox” mistakes that take far longer to debug than they sound like they should.
The most satisfying moment came at verification: log in with Google, log out, log back in with Facebook using the same email — and land in the exact same account both times, with both saved favorites intact. The logic behind it is simple: match first by login provider and provider ID, fall back to matching by email, link the new provider if a match turns up, and only create a new account when nothing matches at all.
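That three-step lookup can be sketched as follows. It runs against an in-memory array rather than the real database, and the type names, the numeric IDs, and the case-insensitive email comparison are assumptions for illustration. Linking by email also assumes the provider has verified that address.

```typescript
type Provider = "email" | "google" | "facebook";

interface Identity {
  provider: Provider;
  providerId: string;
}

interface User {
  id: number;
  email: string;
  identities: Identity[];
}

// Hypothetical resolver: exact provider match, then email fallback, then create.
function resolveUser(users: User[], login: Identity & { email: string }): User {
  // 1. Exact match on (provider, providerId).
  for (const user of users) {
    for (const ident of user.identities) {
      if (ident.provider === login.provider && ident.providerId === login.providerId) {
        return user;
      }
    }
  }
  // 2. Fall back to email, and link the new provider to the existing account.
  const wanted = login.email.toLowerCase();
  for (const user of users) {
    if (user.email.toLowerCase() === wanted) {
      user.identities.push({ provider: login.provider, providerId: login.providerId });
      return user;
    }
  }
  // 3. Nothing matched: create a new account.
  const created: User = {
    id: users.length + 1,
    email: login.email,
    identities: [{ provider: login.provider, providerId: login.providerId }],
  };
  users.push(created);
  return created;
}
```

Running the Google-then-Facebook scenario through it gives one user with two linked identities, which is why the favorites survive the switch.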
Along the way, the Mini Player’s play/pause icons got rebuilt from scratch — hand-drawn triangles and bars instead of emoji characters, since some systems render emoji in ways that clash with everything around them. The fixes that look smallest are often the ones that eat the most time.
One Index, 240x: A Late-Night Performance Mystery
The same night OAuth finally worked, tapping an article title from the Mini Player started producing a spinner that never seemed to stop.
The investigation played out like a proper mystery. Was the cache missing? The config showed the strategy was already correct. Ruled out. Missing data prefetching? A real optimization opportunity, but the patch hadn’t landed, so it went on the to-do list. Slow React rendering? Dev tools showed every network request finishing in milliseconds. Ruled out. A disconnected bundler? It had dropped at some point, but restarting it didn’t fix anything. Ruled out. Only after zooming all the way into the network timeline did the truth appear: more than two seconds passed with no request in flight at all. The delay wasn’t in the network; something was spinning in place.
Following that thread led to the database: the two fields used to look up an article had no matching compound index, so every query triggered a full scan across more than 1,800 documents. Adding that one index took fifty milliseconds to build and dropped query time from 2.4 seconds to 10 milliseconds — a 240x speedup, decided by a single line of createIndex.
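To make the difference concrete, here is an in-memory analogy, not MongoDB itself. The two field names aren't given above, so `volume` and `slug` are assumptions, and a real MongoDB index is a B-tree rather than the plain keyed lookup used here.

```typescript
interface Article {
  volume: number;
  slug: string;
  title: string;
}

// Without an index: every lookup may walk the whole collection.
function findByScan(
  articles: Article[],
  volume: number,
  slug: string
): { doc: Article | null; examined: number } {
  let examined = 0;
  for (const a of articles) {
    examined++;
    if (a.volume === volume && a.slug === slug) return { doc: a, examined };
  }
  return { doc: null, examined };
}

// With a compound key: build once, then each lookup is a single keyed access.
function buildIndex(articles: Article[]): Record<string, number> {
  const index: Record<string, number> = Object.create(null);
  articles.forEach((a, i) => {
    index[a.volume + "/" + a.slug] = i;
  });
  return index;
}

function findByIndex(
  articles: Article[],
  index: Record<string, number>,
  volume: number,
  slug: string
): Article | null {
  const pos = index[volume + "/" + slug];
  return pos === undefined ? null : articles[pos];
}

// In the MongoDB shell the equivalent fix would look something like this
// (collection and field names hypothetical):
//   db.articles.createIndex({ volume: 1, slug: 1 })
```

In the worst case the scan touches every document before it answers, while the keyed lookup touches one. That gap is the same shape as the 2.4 seconds versus 10 milliseconds above.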
Two real bugs turned up along the way, too: one routing path was silently stripping the volume-number prefix off article links, so the article page always looked up the wrong volume; the other was a loading-state check written backwards — when article data came back empty but loading had technically finished, the page kept showing a spinner instead of a “not found” message, so any malformed link would spin forever without ever telling the user anything was wrong.
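The second bug is easier to see in code. The "buggy" function below is a reconstruction of the kind of check described, not the original line, and the state shape and screen names are assumptions.

```typescript
type Screen = "loading" | "error" | "not-found" | "article";

interface QueryState<T> {
  loading: boolean;
  error?: string;
  data?: T | null;
}

// Reconstructed anti-pattern: "no data" is treated the same as "still loading",
// so an empty result spins forever.
function pickScreenBuggy<T>(state: QueryState<T>): Screen {
  return state.loading || !state.data ? "loading" : "article";
}

// Four states, four screens, checked in order.
function pickScreen<T>(state: QueryState<T>): Screen {
  if (state.loading) return "loading";
  if (state.error !== undefined) return "error";
  if (state.data === undefined || state.data === null) return "not-found";
  return "article";
}
```

With `{ loading: false, data: null }`, the first version keeps returning `"loading"` and the second returns `"not-found"`, which is exactly the case a malformed link produces.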
Where Things Stand, and What’s Still Left
Being honest here: today’s By the streams 2.0 isn’t quite the same app described in last year’s plan.
What’s built and working is solid: full-text search, volume detail pages, article reading, multiple themes, a play queue, cloud-synced favorites, three-way login through Google, Facebook, and email, and an entire text-to-speech pipeline built from scratch. The homepage runs a weighted recommendation algorithm that uses the median of a reader’s favorited volume numbers as an “interest center” and caps any one author at two picks. Anyone not logged in, or without favorites yet, gets the latest popular back issues instead.
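In outline, the recommendation logic looks something like this. The actual weighting isn't spelled out above, so ranking purely by distance from the median volume is an assumption, as are the type and function names.

```typescript
interface Candidate {
  id: string;
  author: string;
  volume: number;
}

function median(values: number[]): number {
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Hypothetical ranking: closest to the "interest center" first, at most two per author.
function recommend(
  candidates: Candidate[],
  favoritedVolumes: number[],
  fallback: Candidate[],
  limit: number
): Candidate[] {
  // No favorites yet (or not logged in): use the fallback list as-is.
  if (favoritedVolumes.length === 0) return fallback.slice(0, limit);

  const center = median(favoritedVolumes);
  const ranked = candidates
    .slice()
    .sort((a, b) => Math.abs(a.volume - center) - Math.abs(b.volume - center));

  const perAuthor: Record<string, number> = Object.create(null);
  const picks: Candidate[] = [];
  for (const c of ranked) {
    const used = perAuthor[c.author] || 0;
    if (used >= 2) continue; // author cap reached, skip
    perAuthor[c.author] = used + 1;
    picks.push(c);
    if (picks.length === limit) break;
  }
  return picks;
}
```

Using the median rather than the mean keeps one stray favorite from an early issue from dragging the center away from where most of the reader's favorites sit.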
The more “social” ideas from the original plan — text highlighting, two-way scripture links, a prayer and testimony wall, friend-following and comments, synchronized group listening — haven’t been started, and sit quietly near the bottom of the backlog. That’s less a disappointment than an honest reflection of priorities: get reading and listening solid first, save community features for the next stretch.
What’s actually standing between this and launch is a handful of unglamorous but unavoidable items: Google and Facebook app review (a privacy policy, data-use disclosures), synthesizing audio for the remaining 1,351 articles, moving audio storage to Cloudflare R2, and a hosting decision that still hasn’t been made.
Three Lessons, Written for Next Year’s Self
The lessons from this year boil down to three, written down here as a note for some future late-night debugging session:
- When a web app is slow on first load, check the database indexes first — eight times out of ten, that’s where the problem is.
- Don’t use a Python heredoc to edit JavaScript strings; backslashes get eaten twice, and it’s never worth the trouble.
- “Show a spinner while loading or when there’s no data” is an anti-pattern. Loading, error, not-found, and loaded should be four distinct screens — a silent, endless spinner is about the worst experience there is.
The ideas written into last year’s plan will get picked back up, one at a time. But if this year taught me anything, it’s this: writing down a promise is easy. Making it run, letting people log in, and letting it be heard takes something closer to eighteen hundred documents, ten walls, and one line of createIndex that showed up a little late.