andyrdt

Popular repositories

  1. refusal_direction Public

    Code and results accompanying the paper "Refusal in Language Models Is Mediated by a Single Direction".

    Python 79 16

  2. andyrdt.github.io Public

    SCSS

  3. CircuitsVis Public

    Forked from TransformerLensOrg/CircuitsVis

    Mechanistic Interpretability Visualizations using React

    Jupyter Notebook

  4. mi Public

    Repo to track miscellaneous mi stuff

    Jupyter Notebook 3

  5. llm-attacks Public

    Forked from llm-attacks/llm-attacks

    Python

  6. SycophancySteering Public

    Forked from nrimsky/CAA

    Modulating sycophancy in llama-2 via activation steering

    Python