For more people to learn about the EleutherAI/minetest project #19

Open
Pantyhose-X opened this issue Jan 10, 2023 · 0 comments


Pantyhose-X commented Jan 10, 2023

Please add the following content at the beginning of README.md:

# Alignment Environments for Minetest: Exploratory Phase Outline


## What is the project?
The goal of this project is to provide a rich and easily moddable environment that alignment researchers can use to test many aspects of alignment and alignment techniques. From there, there are a few potential directions, depending on feasibility, interest, etc.:
- More specialized challenge environments that grapple with specific problems in alignment
- Datasets and base models that can be used to train/fine-tune models
- Live servers where humans can interact with agents
- A more involved project to develop alignment techniques for agents

## What is the motivation behind this project?
The motivation of this project is to follow in the footsteps of projects like [AI safety gridworlds](https://www.deepmind.com/open-source/ai-safety-gridworlds), [MineRL BASALT](https://minerl.io/basalt/), and [MineDojo](https://minedojo.org/), and then to build on top of them. While alignment research is currently mostly theoretical, we want to “make the rubber meet the road” in an environment that is both rich in the complexity of tasks it supports and easy to extend to test any particular situation. Minetest appears very well suited for this.

Specific examples of why we might want this are:

- Current interpretability research is mostly not “grounded”: it deals with “fully observable” settings such as image classification and text generation, where there are no latent factors beyond what is directly fed to or output by the NN. In a real system, we will want to be able to detect and interpret such latent factors.
- Major “alignment tactics” currently employed assume a hard boundary between the AI and the task/environment/human. We can’t fully test this assumption in a Minetest environment, but we can get much closer, and therefore verify that our systems don’t have failure modes akin to reward hacking.
- None of the existing RL environments based on Minecraft are “live”, so it’s difficult to study human-machine interactions with them, which are likely to be an important facet of future alignment research.

## Who is part of the project?
The primary contact for the project will be @AI_WAIFU, with @JDP as the secondary contact. There is also interest from @ac, @FieryVictoryStar, @harfe, and @triggerhappygandi.

## Do you need a separate channel for discussion?
Yes, it should be called “#alignment-minetest”.

## Do you need compute?
Probably not. The project should be mostly conventional software development, and Minetest runs on a potato, so compute likely won’t be necessary for the exploratory phase. We might need some later to train RL agents, run a server, or run multiple concurrent instances.

## What are the deliverables for the exploratory phase of this project?
- A report on the Minetest codebase, how it works, and how to modify it. (This is probably just going to be links to existing docs.)
- Candidate “AI alignment test setups” that we can use to study different facets of alignment
- A design and roadmap for releasing a full environment
- A design and roadmap for any additional work we want to do on top of the environment
- A full project proposal, using the new EAI project proposal template, that contains the above
- A prototype OpenAI Gym-like environment that can launch a gym client, receive fake keyboard/mouse input, and return frames from the Minetest game
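
As a rough illustration of what the prototype environment in the last item could look like, here is a minimal sketch. It does not talk to Minetest at all: `StubMinetestClient`, `MinetestEnvSketch`, `Action`, and the frame size are hypothetical names and values made up for this example, and the stub simply records inputs and returns blank frames. The `reset()`/`step()` shape mirrors the classic Gym convention of returning `(observation, reward, done, info)`.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical frame size; a real client would report its own resolution.
FRAME_WIDTH, FRAME_HEIGHT = 64, 48


@dataclass
class Action:
    """Fake keyboard/mouse input for one step (illustrative field names)."""
    keys: Tuple[str, ...] = ()  # e.g. ("forward", "jump")
    mouse_dx: int = 0           # horizontal mouse movement in pixels
    mouse_dy: int = 0           # vertical mouse movement in pixels


class StubMinetestClient:
    """Stand-in for a real Minetest client process (assumption: not the real API).

    It records the inputs it receives and returns blank RGB frames.
    """

    def __init__(self) -> None:
        self.received: List[Action] = []
        self.running = False

    def launch(self) -> None:
        self.running = True
        self.received.clear()

    def send_input(self, action: Action) -> None:
        self.received.append(action)

    def grab_frame(self) -> bytes:
        # One byte per channel, three channels per pixel, all black.
        return bytes(FRAME_WIDTH * FRAME_HEIGHT * 3)

    def close(self) -> None:
        self.running = False


class MinetestEnvSketch:
    """Gym-style wrapper exposing reset() and step() around a client."""

    def __init__(self, client: StubMinetestClient, max_steps: int = 100) -> None:
        self.client = client
        self.max_steps = max_steps
        self.steps = 0

    def reset(self) -> bytes:
        # Launch (or relaunch) the client and return the first frame.
        self.client.launch()
        self.steps = 0
        return self.client.grab_frame()

    def step(self, action: Action) -> Tuple[bytes, float, bool, Dict]:
        # Forward the fake input, then read back the next frame.
        self.client.send_input(action)
        self.steps += 1
        frame = self.client.grab_frame()
        reward = 0.0  # placeholder: no task is defined at this stage
        done = self.steps >= self.max_steps
        return frame, reward, done, {"steps": self.steps}

    def close(self) -> None:
        self.client.close()
```

A caller would construct `MinetestEnvSketch(StubMinetestClient())`, call `reset()` once, and then call `step(Action(keys=("forward",)))` in a loop until `done` is true. Keeping the client behind a small interface like this would let the stub be swapped for a real Minetest process later without changing agent code.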

## What is the rough timeline for the exploratory phase?
- The goal is for the exploratory phase to take roughly 2-3 months, ending sometime in December 2022.
# [minetest](https://github.com/minetest/minetest)