Refactor tokenization #191

Merged
merged 19 commits into from
Aug 6, 2023

@mivanit mivanit commented Aug 3, 2023

Refactor to be compatible with maze-dataset versions 0.2.1 and onwards.

See PRs:

See related issues:

These changes also revert changes in #118, to be consistent with underscores only appearing once in the special tokens.

After upgrading transformer_lens to 1.4.0,
`HookedTransformer.process_weights_()` no longer accepts
the keyword arg `move_state_dict_to_device`.

I'm not sure this was important in the first place.
If any issues come up, move the state dict to the device manually in
`ZanjHookedTransformer._load_state_dict_wrapper()`, which is where all of
this was happening originally.
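If the manual fallback is ever needed, it could look roughly like the sketch below. This is an illustration only, not the code in this PR: `state_dict_to_device` is a hypothetical helper name, and it assumes every value in the state dict exposes a `.to(device)` method, as `torch.Tensor` does.

```python
from typing import Any, Dict


def state_dict_to_device(state_dict: Dict[str, Any], device: Any) -> Dict[str, Any]:
    """Return a new dict with every entry moved to `device`.

    Hypothetical helper: works with any value that has a `.to(device)`
    method (for example `torch.Tensor`). The input dict is not modified.
    """
    return {name: tensor.to(device) for name, tensor in state_dict.items()}
```

Inside `_load_state_dict_wrapper()`, something like this could be applied to the incoming state dict before it is handed to the weight-processing step, assuming the target device is known at that point.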
since we removed tokenizer stuff from the dataset

The `eval_model.ipynb` notebook has a function `testdata_plot_predicted_path`
which was using `model.zanj_model_config` to get the tokenizer. That attribute
is missing from the `RandomBaseline` class, since it only inherits from `HookedTransformer`.

To fix this:

- `ZanjHookedTransformer` now has a `config` property which just
  accesses the `zanj_model_config` used by the parent `ConfiguredModel`
- `testdata_plot_predicted_path` now uses `model.config` everywhere
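The property described above can be sketched as follows. The classes here are simplified, hypothetical stand-ins (the `Sketch` suffix marks them as such) rather than the real `ConfiguredModel` and `ZanjHookedTransformer`, and the config is a plain dict purely for illustration.

```python
class ConfiguredModelSketch:
    """Hypothetical stand-in for the parent `ConfiguredModel`."""

    def __init__(self, zanj_model_config):
        # the parent stores the config under `zanj_model_config`
        self.zanj_model_config = zanj_model_config


class ZanjHookedTransformerSketch(ConfiguredModelSketch):
    """Hypothetical stand-in for `ZanjHookedTransformer`."""

    @property
    def config(self):
        # read-only alias: returns the same object the parent stores
        return self.zanj_model_config


def get_model_config(model):
    """Illustrative caller: depends only on `model.config`."""
    return model.config
```

With this shape, calling code only needs `model.config`, so any model object that exposes a `config` attribute can be passed in without the caller knowing about `zanj_model_config`.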