AbstractPhil posted an update 2 days ago
The Rosetta Stone geometric vocabulary and the ramp-up in capacity.

What makes this particular invariant special is that it has shown up in every structure I've tested so far. I had Claude write up the direct article based on what we built together, and I've tested it on many substructures. The current version is flawed, and I have a series of answers for making it more accurate.

First, a reconstruction from the ground up. Each shape will be built specifically upward from its substructure to the point of inductive deviance. This will be slower at first and then gain speed as I optimize, just as the last system did.

The "saddle" problem; the system detected saddles because there wasn't enough deviance in the shapes to attenuate to more cardinality and more aligned substructures. The blobs were around 30-40% of the overall patches, which interpolated into the others produced a fair approximation.
It MOST DEFINITELY did see those shapes in their voxel complexity. This is real.

You can play with a public Claude artifact dedicated to viewing the current shape spectrum, and from it see exactly why it's flawed:
https://claude.ai/public/artifacts/bf1256c7-726d-4943-88ad-d6addb263b8b

Third, the flawed and repetitive shapes. I rapid-prototyped, so there are multiple redundant shapes that simply don't classify well, or at all. On top of that, rotation doesn't help much of the time, or doesn't exist for many shapes. This will be rectified in the next variation.

Fourth, projecting to a shared latent space as a catalyst, allowing subjective geoflow-matched step variance to grow rather than relying on direct classification alone. This should, in theory, allow full channel-to-channel invariant features to be mapped from structure to structure, with the very formula that encapsulates them baked directly into the math rather than recovered through substructure analysis.
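As a rough illustration of what "projecting to a shared latent space" could look like, here is a minimal sketch of my own, not the actual implementation; the structure names, feature dimensions, and the SharedLatentProjector module are all assumptions:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedLatentProjector(nn.Module):
    """Maps structure-specific feature vectors into one common latent space."""
    def __init__(self, in_dims: dict, latent_dim: int = 256):
        super().__init__()
        # One small projection head per structure type (dims are placeholders).
        self.heads = nn.ModuleDict({
            name: nn.Sequential(nn.Linear(d, latent_dim), nn.GELU(),
                                nn.Linear(latent_dim, latent_dim))
            for name, d in in_dims.items()
        })

    def forward(self, name: str, feats: torch.Tensor) -> torch.Tensor:
        # L2-normalize so downstream cosine comparisons are well-behaved.
        return F.normalize(self.heads[name](feats), dim=-1)

proj = SharedLatentProjector({"voxel": 512, "patch": 384})
voxel_z = proj("voxel", torch.randn(8, 512))
patch_z = proj("patch", torch.randn(8, 384))
invariance_map = voxel_z @ patch_z.T  # channel-to-channel cosine similarity
```

The point of the sketch is only that, once everything lives in one normalized space, invariant features from different structures become directly comparable instead of being recovered through per-structure classification.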

There are many challenges between here and there, so stay tuned, my friends, as I plot the geometric language of pretrained AI.

So far I've found that the most meaningful and reusable representations can be formed through a gated geometric hierarchy. I'm currently running roughly 50k images through the VAE to assess the capacity of the model's components before a refactor or reassessment. So far the results are promising: synthetic, supervised, local-patch geometric contribution bias looks like a very real possibility. The model learns to predict the classification elements, after which it no longer requires the transformer blocks, so the gates can be snapped off and the model turned into a fragment of its larger self. A form of hardened crystal.
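For concreteness, here's a hedged sketch (my own, not the actual architecture) of what "snapping off" could look like: a block whose transformer branch is gated, and which can be reduced to a small feed-forward fragment once the gates have hardened. Dimensions, module names, and the snap_off method are illustrative assumptions.

```python
import torch
import torch.nn as nn

class GatedGeoBlock(nn.Module):
    def __init__(self, dim: int = 256, heads: int = 4):
        super().__init__()
        # Gate decides how much of the branch output is mixed back in.
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())
        self.attn = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.local = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
        self.use_attn = True  # flips to False once the gates harden

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (batch, patches, dim)
        g = self.gate(x)
        branch = self.attn(x) if self.use_attn else self.local(x)
        return x + g * branch

    def snap_off(self) -> None:
        # "Crystallize": freeze the gate and drop the transformer branch entirely,
        # keeping only the cheap local path for inference.
        self.use_attn = False
        self.attn = None
        for p in self.gate.parameters():
            p.requires_grad_(False)
```

In this sketch the local path would still need to be trained or distilled to stand in for the transformer branch; the only point is to show how the gate survives while the heavy blocks are discarded.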

The gates are nearly deterministic between training runs; the classification elements, however, are non-deterministic, which means the model is learning to bias specific routes beyond the current stage in order to satisfy its classification goals. The gates themselves are producing usable feature information, though, so the outlook for the refactor is promising.
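A simple way to quantify that claim, as a sketch of my own (the assumption being that each trained run can be called to return its gate activations and its classification logits for a batch):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def cross_run_agreement(run_a, run_b, batch: torch.Tensor) -> dict:
    # Both runs are assumed to return (gate_activations, class_logits).
    gates_a, logits_a = run_a(batch)
    gates_b, logits_b = run_b(batch)
    return {
        # High value -> gates behave near-deterministically across runs.
        "gate_cosine": F.cosine_similarity(gates_a.flatten(1), gates_b.flatten(1)).mean().item(),
        # Lower / noisier value -> the classification elements diverge between runs.
        "logit_cosine": F.cosine_similarity(logits_a, logits_b).mean().item(),
    }
```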

[images]

So far the patch features are showing the most robust reusability potential, but that's only about 120 images or so in total; the 50k, 15-category test will be the real measure.

Surprisingly, the gate statistics are essentially useless: they are nearly identical across all stages.

The 50k test is complete, using synthetic data extracted from Flux for another project:
https://huggingface.co/datasets/AbstractPhil/synthetic-characters
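If you want to poke at the same data, it should be loadable with the datasets library (the split name below is an assumption; check the dataset card for the actual splits and columns):

```python
from datasets import load_dataset

# Pull the synthetic character set referenced above.
ds = load_dataset("AbstractPhil/synthetic-characters", split="train")
print(ds)           # inspect the available columns before wiring it into an eval loop
print(ds[0].keys())
```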

This is more than enough inference information to get a fair measure of which features are the most helpful and which aren't so useful.

The results, as well as the runner, are here:
https://huggingface.co/AbstractPhil/grid-geometric-multishape/tree/main/50k_results

It requires the cell-1 model code, and then it'll run.
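To grab just the results folder from that repo, something like the following should work (the folder name matches the link above; everything else is a generic huggingface_hub call):

```python
from huggingface_hub import snapshot_download

# Download only the 50k_results folder from the model repo.
local_dir = snapshot_download(
    repo_id="AbstractPhil/grid-geometric-multishape",
    allow_patterns=["50k_results/*"],
)
print(local_dir)
```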

So what we do here is snap off the classifier and use the various features in cosine-similarity conjunction. The tested model's accuracy is roughly 93% with 3-4 shapes sharing space in the patches, so this can be greatly expanded, but that requires additional computational power.
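As a hedged sketch of that "snap off the classifier" step (not the repo's code; extract_patch_feats, its output shape, and the prototype scheme are assumptions), classification then reduces to cosine similarity against per-class prototypes:

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def build_prototypes(extract_patch_feats, images, labels):
    # extract_patch_feats is assumed to return (N, patches, dim); pool over patches.
    feats = F.normalize(extract_patch_feats(images).mean(dim=1), dim=-1)
    classes = labels.unique(sorted=True)
    return classes, torch.stack([feats[labels == c].mean(dim=0) for c in classes])

@torch.no_grad()
def classify_by_cosine(extract_patch_feats, images, classes, prototypes):
    feats = F.normalize(extract_patch_feats(images).mean(dim=1), dim=-1)
    sims = feats @ F.normalize(prototypes, dim=-1).T   # (N, num_classes)
    return classes[sims.argmax(dim=-1)]
```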

The 3-4 shape shared space should be more than enough pretraining for this hypothesis, which keeps gaining potency as something beyond a mere possibility. This is most definitely a measurable phenomenon. Geometric structure can be analyzed and compacted into useful discriminative features in order to apply a learned bias. How useful are those features? Well, they're pretty discriminative, so more tests are needed.

[images]

This leaves many questions. Chief among them, the one that has to be answered: can the patches be made smaller if the mathematics are condensed and the shared attention is expanded, and how many patches can this actually support within a nearly instant computation window?

Does this require the geometric transformers to train, or can it learn useful features independently?

Can this benefit from captured embeds in differential conjunction, sharing space with a powerful text encoder such as Qwen 2.5 Instruct?

Will these patches actually provide attention use down the chain to a diffusion model, or will the mechanism simply get lost in the noise?
