Paragraph breaks and sentence punctuation stop a category lookup from accidentally spanning unrelated thoughts.
IOTA-1 (ɩ≃1)
Language Converter
Type English, for example "How are you today", and convert it to ɩ≃1 (Iota-1). Paste ɩ≃1 back in and convert it to English. Protocol5 breaks paragraphs into sentences, checks Category.Categories string segments first, then Category.Words, and then maps against Category.ISO10646 glyphs for public-symbol output.
English ⇄ ɩ≃1
Convert Your Language
English is the active human-language lane. The live Protocol5 path prefers stored SQL corpus vectors, then falls back to public seed concepts when the corpus is unavailable.
Grammar order
Paragraphs become sentences; sentences become segments.
For English to ɩ≃1 (Iota-1), the converter first splits the input at paragraph and sentence boundaries. Inside each sentence it tries the longest stored English segment first, so a phrase in Category.Categories, such as a category label or three-word segment, can match before the system falls back to individual rows in Category.Words. The resulting vector is then compared against public glyph candidates in Category.ISO10646, with each ranked candidate labeled by its ranking lane and each response summarized by ranking-lane counts.
Longer English category text is tested before single-word fallback so phrases can keep their meaning together.
If no stored category segment exists, each English word can still map through its stored vector.
English vectors are ranked against searchable public Unicode glyph rows for visible ɩ≃1 output.
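The matching order described above can be sketched as a greedy longest-match pass. This is a minimal illustration only: `SegmentMatcherSketch`, its in-memory sets, and the simplified sentence-splitting rule are hypothetical stand-ins for the stored Category.Categories and Category.Words lookups, not the Protocol5 implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: the class name, in-memory sets, and splitting rules are
// assumptions for illustration, not the Protocol5 code or its SQL schema.
public static class SegmentMatcherSketch
{
    // Splits text at paragraph breaks and sentence punctuation (simplified rule).
    public static List<string> SplitSentences(string text) =>
        text.Split(new[] { '.', '!', '?', '\n', '\r' }, StringSplitOptions.RemoveEmptyEntries)
            .Select(s => s.Trim())
            .Where(s => s.Length > 0)
            .ToList();

    // At each position, tries the longest phrase against the category set first,
    // shrinking one word at a time, then falls back to the single-word set.
    public static List<(string Segment, string Source)> Segment(
        string sentence, ISet<string> categories, ISet<string> words, int maxPhraseWords = 3)
    {
        string[] tokens = sentence.ToLowerInvariant()
            .Split(new[] { ' ', ',', ';', ':' }, StringSplitOptions.RemoveEmptyEntries);
        var result = new List<(string Segment, string Source)>();
        int i = 0;
        while (i < tokens.Length)
        {
            bool matched = false;
            int longest = Math.Min(maxPhraseWords, tokens.Length - i);
            for (int len = longest; len >= 1 && !matched; len--)
            {
                string phrase = string.Join(" ", tokens, i, len);
                if (categories.Contains(phrase))
                {
                    result.Add((phrase, "Categories"));
                    i += len;
                    matched = true;
                }
            }
            if (matched) continue;

            // Single-word fallback; words with no stored row are flagged, not dropped.
            result.Add((tokens[i], words.Contains(tokens[i]) ? "Words" : "Unmatched"));
            i++;
        }
        return result;
    }
}
```

Because each sentence is segmented on its own, a phrase can never match across a sentence or paragraph boundary.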
Hard rules
What IOTA-1 (ɩ≃1) is allowed to be
- Approximate: ɩ≃1 compares conceptual weights. Results are ranked semantic neighbors, not exact translations or lossless conversions.
- Public symbols: Use assigned ISO/IEC 10646 / Unicode characters and public standard sequences. Private-use areas are prohibited.
- No secret map: No proprietary dictionary, hidden bilingual table, or private codebook may become the conversion authority.
- Research filter: Discard claims that reject the project because exact conversion is impossible. Keep evidence about security, Unicode processing, and vector limits.
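To make "ranked semantic neighbors" concrete, here is a minimal sketch that scores candidates by cosine similarity and returns the top few. Cosine similarity is an assumption chosen for illustration; the source does not state which distance metric the converter uses, and `NeighborRankingSketch` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: cosine similarity is an assumed metric, and the
// class and method names are illustrative, not part of the package.
public static class NeighborRankingSketch
{
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vector dimensions must match.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // A zero vector has no direction, so treat it as unrelated.
        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the closest candidates in descending score order. Every result
    // is a neighbor with a score, never a claim of exact equivalence.
    public static List<(string Symbol, double Score)> Rank(
        float[] query, IReadOnlyDictionary<string, float[]> candidates, int limit) =>
        candidates
            .Select(c => (Symbol: c.Key, Score: CosineSimilarity(query, c.Value)))
            .OrderByDescending(c => c.Score)
            .Take(limit)
            .ToList();
}
```

Returning the scores alongside the symbols keeps the approximation visible to the caller instead of hiding it behind a single answer.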
Public symbols
Assigned characters are the visible source material.
IOTA-1 (ɩ≃1) uses public Unicode characters and standard public sequences as reviewable symbol candidates. The converter may rank candidates by vector proximity, but the symbol inventory itself must stay inspectable through public character assignments and public metadata.
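One way to enforce the private-use prohibition is to check each Unicode scalar value against the three private-use ranges defined by the Unicode Standard. The sketch below is an illustration under that assumption; `PublicSymbolGuardSketch` is a hypothetical name, and the package's actual check may differ.

```csharp
using System;
using System.Text;

// Hypothetical sketch: the class name is illustrative. The ranges are the
// Unicode private-use areas (BMP PUA and the two supplementary PUA planes).
public static class PublicSymbolGuardSketch
{
    public static bool IsPrivateUse(Rune rune)
    {
        int v = rune.Value;
        return (v >= 0xE000 && v <= 0xF8FF)        // Private Use Area (BMP)
            || (v >= 0xF0000 && v <= 0xFFFFD)      // Supplementary Private Use Area-A
            || (v >= 0x100000 && v <= 0x10FFFD);   // Supplementary Private Use Area-B
    }

    // Enumerates scalar values rather than UTF-16 code units, so
    // supplementary-plane characters are examined as whole code points.
    public static bool ContainsPrivateUse(string text)
    {
        foreach (Rune rune in text.EnumerateRunes())
        {
            if (IsPrivateUse(rune)) return true;
        }
        return false;
    }
}
```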
Implementation status
Current Protocol5-hosted capability map.
The hosted demo is deliberately split between read-only public endpoints and local-only mutation tooling. That keeps website exploration useful while preserving the rule that embedding population happens off the public web host.
Protocol5.com uses the same-server SQL category corpus first when available: Category.Categories string segments, then Category.Words, then Category.ISO10646 glyph candidates for ɩ≃1 output.
Search endpoints are read-only surfaces over stored public vectors or caller-supplied embeddings, with optional local LM Studio text embedding when configured. Disconnected package paths stay separate.
Vector population belongs in the local WPF desktop runner and scripts, not in public web mutation routes.
Facade
One stable C# entry point
Consumer projects use IJustAnIotaConverterFacade for conversion, meaning queries, and round trips. The included factory can create a database-only converter from the public seed registry or wire an LM Studio embedding provider.
Logic layer
Approximate semantic engine
The logic layer owns NFC normalization, System.Text.Rune scalar handling, grapheme grouping, semantic segmentation, candidate ranking, vector evidence summaries, source evidence atlas output, approximation labels, and private-use rejection.
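The text-preparation steps named above map onto standard .NET APIs. The following is a minimal sketch of NFC normalization, scalar enumeration, and grapheme grouping; `TextPreparationSketch` is a hypothetical helper, not the logic layer's actual type.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical sketch: the class name is illustrative. Only standard .NET
// calls are used (string.Normalize, EnumerateRunes, StringInfo).
public static class TextPreparationSketch
{
    // NFC composes base characters with combining marks where a precomposed
    // form exists, so equivalent inputs compare equal.
    public static string NormalizeNfc(string input) =>
        input.Normalize(NormalizationForm.FormC);

    // Unicode scalar values, one per code point (surrogate pairs are joined).
    public static List<int> ScalarValues(string input)
    {
        var result = new List<int>();
        foreach (Rune rune in input.EnumerateRunes())
        {
            result.Add(rune.Value);
        }
        return result;
    }

    // User-perceived characters: a base character plus its combining marks
    // stays together as one text element.
    public static List<string> Graphemes(string input)
    {
        var result = new List<string>();
        TextElementEnumerator elements = StringInfo.GetTextElementEnumerator(input);
        while (elements.MoveNext())
        {
            result.Add(elements.GetTextElement());
        }
        return result;
    }
}
```

Normalizing before lookup means a decomposed "e" plus combining acute accent and the precomposed "é" resolve to the same stored row.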
Repository
ADO.NET over SQL Server vectors
Repository interfaces stay persistence-agnostic. The SQL Server implementation stores English anchors and public symbol embeddings in vector columns and uses VECTOR_DISTANCE, VECTOR_SEARCH, and DiskANN indexes where available.
Local AI
LM Studio is optional
LM Studio can generate embeddings, rerank candidates, and verbalize results. After the database is populated, DatabaseOnly mode must still return a basic gist without live AI.
Pipeline
Public metadata to approximate meaning
| Step | Owner | Rule |
|---|---|---|
| Build symbol atlas | Unicode ingestion worker | Use UCD, CLDR, Unihan, emoji data, source versions, and public provenance. |
| Generate embeddings | Deterministic seed builder or LM Studio adapter | Use descriptor text, anchors, code-point evidence, and source provenance, not bare code point numbers as the only semantic evidence. |
| Store vectors | ADO.NET SQL repository | Persist dimensions, model, prompt profile, public source, and private-use exclusion state. |
| Query gist | Logic layer | Compose stored English or symbol vectors when live AI is unavailable. |
| Return evidence | Facade | Always include mode, approximate status, ranked candidates, ranking lanes, ranking-lane counts, scores, provenance, and source atlas families. |
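The ranking-lane counts in the last row can be pictured as a simple grouping over the ranked candidates. This sketch is an assumption about shape only: the `RankedCandidate` record and the lane names used in the usage note are hypothetical and do not reflect the package's actual response types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: the record, its fields, and the class name are
// illustrative stand-ins, not the package's response contract.
public static class LaneSummarySketch
{
    public sealed record RankedCandidate(string Symbol, double Score, string Lane);

    // Counts how many ranked candidates came from each ranking lane.
    public static Dictionary<string, int> CountByLane(IEnumerable<RankedCandidate> candidates) =>
        candidates
            .GroupBy(c => c.Lane)
            .ToDictionary(g => g.Key, g => g.Count());
}
```

For example, two candidates labeled with a made-up lane `"corpus"` and one labeled `"seed"` would summarize to `corpus = 2` and `seed = 1`, which lets a caller see at a glance how much of a response came from each source.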
Developer path
Use it without SQL Server first
The Protocol5.com host uses the local SQL category corpus. The package can still run immediately from its public seed registry for disconnected .NET apps, deterministic smoke tests, UI prototypes, and the disconnected JustAnIota.com WordPress path before a populated SQL Server vector store is available.
```csharp
IJustAnIotaConverterFacade converter = JustAnIotaConverterFactory.CreateDefaultDatabaseOnly();
ConversionResponse response = await converter.ConvertAsync(new ConversionRequest
{
    Input = "good help",
    Direction = ConversionDirection.EnglishToIota,
    Mode = ConversionMode.DatabaseOnly,
    ResultLimit = 3
});
// response.Output is "好救"; response.Trace exposes concept IDs, scores, code points, and provenance.
```
Implementation status
Current C# slice
The repository now contains a usable Protocol5.JustAnIota package slice: facade and logic interfaces, a public seed registry, deterministic descriptor embeddings, segmented database-only conversion, segment-level vector evidence, source evidence atlas output, candidate ranking lanes, semantic similarity comparison, public Unicode safety checks, an optional LM Studio embedding adapter, SQL Server vector repository primitives, a hosted Protocol5 demo API, and MSTest contract coverage. Future work adds curated public metadata ingestion and SQL integration tests against a vector-capable SQL Server instance.