Version: v1.0
Author: Arnie Widdowson
System: TranSymbolics / Gyrator Plug-in Framework
Date: [Autogenerated at save]
This document defines the folder structure used to save and restore full transformer context snapshots, including symbolic extensions such as obfuscation, encryption, and plug-in metadata. The structure supports both human auditing and automated traversal.
All saved contexts reside under a root directory:
/media/krusty/gm/gm194/context/
Each save creates a timestamp-named subdirectory using this pattern:
m<M>d<D>x<h>x<m>x<s>
Where: <M> = month, <D> = day, <h> = hour, <m> = minute, <s> = second (values are not zero-padded).
Example: m30d2x15x23x48x10
This ensures natural sort order and time uniqueness.
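As a minimal sketch of the naming rule (assuming, from the placeholder letters, that <M>, <D>, <h>, <m>, <s> mean month, day, hour, minute, and second; snapshot_dir is a hypothetical helper, not part of the framework):

```python
from datetime import datetime
from pathlib import Path

ROOT = Path("/media/krusty/gm/gm194/context")

def snapshot_dir(now=None):
    """Build the m<M>d<D>x<h>x<m>x<s> directory name under the context root."""
    now = now or datetime.now()
    name = f"m{now.month}d{now.day}x{now.hour}x{now.minute}x{now.second}"
    return ROOT / name

save_dir = snapshot_dir()
save_dir.mkdir(parents=True, exist_ok=True)
```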
m30d2x15x23x48x10/
├── input_ids.npy
├── attention_mask.npy
├── position_ids.npy
├── past_key_values/
│   ├── 0_key.npy
│   ├── 0_value.npy
│   ├── 1_key.npy
│   ├── 1_value.npy
│   └── ...
├── metadata/
│   ├── config.txt
│   ├── symbolic_map.npy
│   ├── obfchain.txt
│   └── keylog.txt
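A minimal save-side sketch that writes this layout (save_context is a hypothetical helper; all tensors are assumed to be CuPy arrays, converted with cp.asnumpy as described in the past_key_values section below):

```python
import numpy as np
import cupy as cp
from pathlib import Path

def save_context(save_dir, input_ids, attention_mask, position_ids, past_key_values):
    """Write one snapshot in the documented layout. All tensors are CuPy arrays."""
    save_dir = Path(save_dir)
    kv_dir = save_dir / "past_key_values"
    meta_dir = save_dir / "metadata"
    kv_dir.mkdir(parents=True, exist_ok=True)
    meta_dir.mkdir(parents=True, exist_ok=True)

    np.save(save_dir / "input_ids.npy", cp.asnumpy(input_ids))
    np.save(save_dir / "attention_mask.npy", cp.asnumpy(attention_mask))
    np.save(save_dir / "position_ids.npy", cp.asnumpy(position_ids))

    # One key/value pair per transformer layer: 0_key.npy, 0_value.npy, 1_key.npy, ...
    for layer, (key, value) in enumerate(past_key_values):
        np.save(kv_dir / f"{layer}_key.npy", cp.asnumpy(key))
        np.save(kv_dir / f"{layer}_value.npy", cp.asnumpy(value))

    # metadata/ (config.txt, symbolic_map.npy, obfchain.txt, keylog.txt) is filled in separately.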
input_ids.npy
What it is: A file that stores the actual text as numbers. These numbers are token IDs, which are how the model reads and understands text.
Format: A 2D table of numbers (a NumPy array). Each row is one piece of text (like a sentence). Each number in the row is one token.
[ [101, 2074, 1037, 2307, 2154, 102], [101, 2129, 2024, 2017, 102, 0]]
Example:
101 might mean “[START]”,
102 might mean “[END]”,
the other IDs map to words like “just”, “a”, “good”, “day”, etc.,
0 at the end is padding
Why it matters: This is the actual input that was given to the model. Without it, the saved memory (the “context”) makes no sense.
Symbolic use: You might change the IDs using a secret map, so the text is hidden unless the map is known. That’s symbolic obfuscation.
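A minimal sketch of that obfuscation (the permutation layout and vocabulary size are assumptions, not the framework's defined format; the seed value just echoes the keylog.txt example):

```python
import numpy as np

VOCAB_SIZE = 256000                      # assumed, model dependent
rng = np.random.default_rng(seed=42918)  # seed recorded in keylog.txt

# Random permutation of the vocabulary: true ID -> obfuscated ID
forward_map = rng.permutation(VOCAB_SIZE)
# Inverse permutation: obfuscated ID -> true ID (what symbolic_map.npy stores)
inverse_map = np.argsort(forward_map)

input_ids = np.load("input_ids.npy")
np.save("input_ids.npy", forward_map[input_ids])   # hide the text
np.save("metadata/symbolic_map.npy", inverse_map)  # needed to undo it
```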
attention_mask.npy
What it is: A matching table that tells the model which tokens are real and which are just padding.
Format: Same size as input_ids.npy, just 1s and 0s.
[ [1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 0]]
What it means: 1 = real token, 0 = ignore this (padding)
Why it matters: It controls which tokens affect the model’s attention. If this is wrong, the model could focus on meaningless input.
Symbolic use: Usually not altered symbolically. But it can be encrypted for storage if needed.
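For illustration, the mask can be derived directly from padded IDs (the padding ID of 0 is taken from the example above):

```python
import numpy as np

input_ids = np.array([[101, 2074, 1037, 2307, 2154, 102],
                      [101, 2129, 2024, 2017,  102,   0]])

PAD_ID = 0
attention_mask = (input_ids != PAD_ID).astype(np.int64)
# [[1 1 1 1 1 1]
#  [1 1 1 1 1 0]]
```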
position_ids.npy
What it is: Tells the model where each token appears in the full conversation.
Format: A 2D array of integers indicating position indexes.
[ [0, 1, 2, 3, 4, 5], [6, 7, 8, 9, 10, 11]]
Why it matters: These must align with the cached positions. Mismatches can cause attention collapse or drift.
Symbolic use: Should remain exact. Minor symbolic drift only under controlled transformation.
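A sketch of what "align with the cached positions" means in practice: new position IDs must continue from the number of tokens already held in the cache. The offset logic here is a general illustration, not a TranSymbolics API.

```python
import numpy as np

past_key = np.load("past_key_values/0_key.npy")   # (batch, heads, tokens, head_dim)
cached_len = past_key.shape[2]                     # tokens already held in the cache

new_token_count = 4
# Positions for the next tokens continue from the cached length;
# restarting at 0 would misalign attention against the restored cache.
new_position_ids = np.arange(cached_len, cached_len + new_token_count)[None, :]
```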
past_key_values/
What it is: The model's memory: the key/value tensors cached for each transformer layer.
Imagine each transformer layer holding index cards. Each card has a key (what it remembers) and a value (what it retrieved). These are filed in a stack—one for each layer.
past_key_values/
├── 0_key.npy
├── 0_value.npy
├── 1_key.npy
├── 1_value.npy
...
├── N_key.npy
├── N_value.npy
Each key/value tensor is shaped like:
(batch, heads, tokens, head_dim)
Example: (1, 8, 128, 64) means: batch = 1, attention heads = 8, cached tokens = 128, head dimension = 64.
These are saved as uncompressed .npy files after converting from CuPy:
np.save(path, cp.asnumpy(tensor))  # copy from GPU (CuPy) to host memory, then write .npy
They hold the actual memory of the transformer. Saving them allows the model to skip replaying the full prompt.
You can symbolically transform them with operations such as rotate_heads or xor_mask; each transformation must be logged in obfchain.txt.
If any tensor is corrupted, missing, or mismatched, the model may hallucinate, freeze, or collapse.
Each key/value tensor encodes: “What did this token mean at that layer, through that head?” This is symbolic state, not just math.
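A load-side counterpart, sketched under the same assumptions (per-layer files named <layer>_key.npy / <layer>_value.npy, tensors moved back to the GPU with cp.asarray; load_past_key_values is a hypothetical helper):

```python
import numpy as np
import cupy as cp
from pathlib import Path

def load_past_key_values(kv_dir):
    """Rebuild the per-layer (key, value) list from saved .npy files."""
    kv_dir = Path(kv_dir)
    layers = sorted(int(p.stem.split("_")[0]) for p in kv_dir.glob("*_key.npy"))
    past = []
    for layer in layers:
        key = cp.asarray(np.load(kv_dir / f"{layer}_key.npy"))
        value = cp.asarray(np.load(kv_dir / f"{layer}_value.npy"))
        if key.shape != value.shape:
            raise ValueError(f"layer {layer}: key/value shape mismatch")
        past.append((key, value))
    return past
```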
config.txt
What it is: Human-readable text metadata. Used to match model configuration and cache format.
model: Gemma-2B
dtype: float16
cache_impl: standard
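A sketch of checking this file before anything else is restored (read_config is a hypothetical helper; the expected values mirror the example above):

```python
def read_config(path):
    """Parse 'key: value' lines from config.txt into a dict."""
    cfg = {}
    with open(path) as fh:
        for line in fh:
            if ":" in line:
                key, value = line.split(":", 1)
                cfg[key.strip()] = value.strip()
    return cfg

cfg = read_config("metadata/config.txt")
assert cfg.get("model") == "Gemma-2B", "snapshot was saved from a different model"
assert cfg.get("dtype") == "float16", "dtype mismatch: cache tensors will not align"
```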
symbolic_map.npy
What it is: NumPy array mapping obfuscated token IDs back to their true values.
Used to reverse symbolic masking during contextload().
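The reverse step is a single indexed lookup, sketched here under the assumption that the array is indexed by obfuscated ID:

```python
import numpy as np

symbolic_map = np.load("metadata/symbolic_map.npy")   # obfuscated ID -> true ID
obfuscated_ids = np.load("input_ids.npy")
true_ids = symbolic_map[obfuscated_ids]                # recover the original tokens
```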
obfchain.txt
What it is: Describes the sequence of symbolic and/or encryption steps applied during contextsave().
rotate_heads layer=0 order=[3,0,2,1]
xor_mask value seed=982734
remap_ids map=symbolic_map.npy
keylog.txt
What it is: Records key material, agent names, prompts, or symbolic signatures related to this save.
seed: 42918
user: agent_f
keyhash: ab7721a3...
source: symbolic prompt #12
During contextload(), the recovery sequence follows:
1. config.txt, to verify model compatibility
2. obfchain.txt (in reverse order)
3. input_ids.npy, if obfuscated
Each op in gyrobf() must leave an entry in obfchain.txt.
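A sketch of that ordering (load_obfchain is a hypothetical parser; the inverse transforms are stand-ins for gyrobf() steps, and only the reverse traversal of obfchain.txt is the point here):

```python
from pathlib import Path

def load_obfchain(path):
    """One recorded operation per line: '<op> [target] key=value ...'."""
    ops = []
    for line in Path(path).read_text().splitlines():
        if not line.strip():
            continue
        name, *rest = line.split()
        kwargs = dict(tok.split("=", 1) for tok in rest if "=" in tok)
        targets = [tok for tok in rest if "=" not in tok]
        ops.append((name, targets, kwargs))
    return ops

# contextload() order: 1) verify config, 2) undo the chain in reverse, 3) restore IDs.
ops = load_obfchain("metadata/obfchain.txt")
for name, targets, kwargs in reversed(ops):
    print("undo", name, targets, kwargs)   # dispatch to the matching inverse transform here
```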
# | Gyrator Element | File or Source | Description |
---|---|---|---|
1 | Token IDs | input_ids.npy | Raw token sequence |
2 | Attention Mask | attention_mask.npy | Binary mask |
3 | Position IDs | position_ids.npy | Absolute positions |
4 | Past Keys | past_key_values/*_key.npy | KV memory: keys |
5 | Past Values | past_key_values/*_value.npy | KV memory: values |
6 | Model Config | config.txt | Architecture and dtype |
7 | Cache Format | config.txt (cache_impl) | Cache structure (standard, etc.) |
8 | Cache Position | position_ids + KV size | Offset for continuation |
9 | Symbolic Map | symbolic_map.npy | Token remapping table |
10 | Obf Chain | obfchain.txt | Transformation log |
11 | Key Log | keylog.txt | Seed, user, session |
12 | Context Metadata | config.txt, protocol.txt | Format and versioning |