Conversation
Again, I could not test this, but dae41d2 is how I would organize it. Notes: we're cutting a dense, no-bias logits layer off the end and swapping `cross_entropy` for `l1_loss`. This is a price I would pay for being able to run on non-token data.
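A rough sketch of that loss swap, with made-up shapes and variable names (not the repo's actual code), assuming the model now emits continuous features instead of vocab logits:

```python
import torch
import torch.nn.functional as F

batch, seq, dim, vocab = 2, 16, 64, 256  # hypothetical sizes

# Token objective (what the dense, no-bias logits layer fed into):
logits = torch.randn(batch, seq, vocab)
targets = torch.randint(0, vocab, (batch, seq))
token_loss = F.cross_entropy(logits.transpose(1, 2), targets)

# Continuous objective after the swap: compare predicted features
# against target features directly with an L1 loss.
pred_feats = torch.randn(batch, seq, dim)
target_feats = torch.randn(batch, seq, dim)
continuous_loss = F.l1_loss(pred_feats, target_feats)
```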
@jordandekraker hey Jordan, thanks for the pull request! any chance you could disentangle this so it can support both language modeling and your use-case (with a few tests)? are you seeing something with this architecture on continuous data?
@jordandekraker it may be faster if i just build it for you, but you'll have to share with me what you are seeing. just reach out over Signal |
The README code worked for me. The changes to get away from tokens MAY be a bit deeper than I thought: I'm not sure whether additional classes and modules will also need to be updated to have a features dimension. Even though I cannot run it all locally, I think I prefer to chat via GitHub, but I can move if that is prohibitive for you.
This is a pretty simple PR: it removes the embedder so people can feed in other (flattened) data types, such as embedded video frames, audio, or other continuous inputs. We also remove the softmax and logits. From the sampler, we remove `min_p_filter` and `gumbel_sample`. For tokens, it is recommended to do embedding and logits outside `mac_transformer.py`. I wasn't able to run `train_mac.py` due to incompatible dependencies, so I left it alone, but it should be simple to add an embedder and logits there. `min_p_filter` and `gumbel_sample` could possibly be added back in somewhere else (utils?).
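A minimal sketch of doing embedding and logits outside the model, as recommended above. The `core` module stands in for the modified transformer (continuous features in, continuous features out); the wrapper class and its arguments are placeholders, not the repo's actual API:

```python
import torch
from torch import nn

class TokenWrapper(nn.Module):
    """Wraps a continuous-in / continuous-out transformer with an external
    embedder and logits head, so token data still works after this PR."""

    def __init__(self, core: nn.Module, num_tokens: int, dim: int):
        super().__init__()
        self.embed = nn.Embedding(num_tokens, dim)                # replaces the removed embedder
        self.core = core                                          # e.g. the modified mac_transformer module
        self.to_logits = nn.Linear(dim, num_tokens, bias=False)   # replaces the removed logits layer

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        feats = self.embed(token_ids)   # (batch, seq) -> (batch, seq, dim)
        feats = self.core(feats)        # continuous features in, continuous features out
        return self.to_logits(feats)    # back to vocab logits for cross-entropy / sampling
```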