artecs-group/Tensorflow-Container-Scheduler

Documentation

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow provides stable Python and C++ APIs, as well as a non-guaranteed backward-compatible API for other languages.

Keep up-to-date with release announcements and security updates by subscribing to [email protected]. See all the mailing lists.

Installation Steps (Docker Malleable Tensorflow 2.0.0)

1- Download this folder

2- Unzip the folder wherever you want.

3- Open a Linux command console.

4- Enter the tensorflow_master folder from the command console.

5- Execute "docker build -f ./tensorflow-cpu.Dockerfile -t tf ."

6- Execute "docker run -v $(pwd):/my-devel -it tf". A container of the built image will open.

Run multiple consoles in docker

1- Run the image with docker run.

2- Find the container id with "docker ps".

3- Run docker exec -it container_id bash.

Run Tensorflow Examples

To run a tensorflow example, four environment variables must be exported:

  • export MIN_ENV_THREADS: the minimum number of threads used for execution.
  • export MAX_ENV_THREADS: the maximum number of threads used for execution.
  • export ITERATION_DOWN: the graph node at which the number of threads will decrease.
  • export ITERATION_UP: the graph node at which the number of threads will increase.
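A minimal sketch of reading and validating these four variables before launching an example (the helper name and checks are illustrative, not part of the actual container code):

```python
import os

# The four variables the container expects to be exported.
REQUIRED = ("MIN_ENV_THREADS", "MAX_ENV_THREADS",
            "ITERATION_DOWN", "ITERATION_UP")

def read_thread_config(env=os.environ):
    # Fail early if any variable was not exported.
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise RuntimeError(f"export these variables first: {missing}")
    cfg = {name: int(env[name]) for name in REQUIRED}
    if cfg["MIN_ENV_THREADS"] > cfg["MAX_ENV_THREADS"]:
        raise ValueError("MIN_ENV_THREADS must not exceed MAX_ENV_THREADS")
    return cfg

# Equivalent of: export MIN_ENV_THREADS=1; export MAX_ENV_THREADS=8; ...
example_env = {"MIN_ENV_THREADS": "1", "MAX_ENV_THREADS": "8",
               "ITERATION_DOWN": "20", "ITERATION_UP": "10"}
cfg = read_thread_config(example_env)
print(cfg["MAX_ENV_THREADS"])  # 8
```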

Then run the examples located in the /tensorflow_src/tensorflow_examples folder:

  • cd tensorflow_src/tensorflow_examples

To run the VGG example:

  • python keras_example_VGG.py NUM_INTER NUM_INTRA

To run the Resnet50 example:

  • python keras_example_resnet.py NUM_INTER NUM_INTRA

The script execution_script.sh runs an example while varying the number of inter and intra threads from 1 to 32, testing all combinations. It takes 5 parameters:

  • minimum number of threads.
  • maximum number of threads.
  • the graph node at which the number of threads will decrease.
  • the graph node at which the number of threads will increase.
  • name of the model example (VGG or Resnet50).
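The sweep performed by execution_script.sh can be sketched as a loop over all (inter, intra) combinations. The function name and command template below are illustrative; the real script also passes the ITERATION_DOWN/ITERATION_UP nodes through the environment:

```python
from itertools import product

# Generate one command line per (inter, intra) combination in the range.
def sweep_commands(model_script, lo=1, hi=32):
    return [f"python {model_script} {inter} {intra}"
            for inter, intra in product(range(lo, hi + 1), repeat=2)]

cmds = sweep_commands("keras_example_VGG.py")
print(len(cmds))   # 32 * 32 = 1024 combinations
print(cmds[0])     # python keras_example_VGG.py 1 1
```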

Save changes in Docker Container

"docker commit" allows you to save a snapshot of your container as a docker image so you can return to it later. Like any docker image, it can be moved to a different machine. Docker's git-like features really shine here -- you can roll back to previous commits, and pushing images with docker push is fast since it pushes only the differences.

Quick overview:

1- On the host machine that is running docker, look up the name or container id of the running container using docker ps. (You can also assign your own choice of name to the container when calling docker run and then use that).

2- Save the running container as a docker image, e.g. docker commit container_id username/imagename. Optionally, you can include a commit message with -m. Once the container is committed, you can stop or remove it without losing data.

3- Push the image to the Docker Hub: docker push username/imagename. Be sure to use a private image (either on the Hub or in a private registry) if necessary: just create the private image name on the Hub before pushing. (Alternatively, you can save the container as a tarball with docker save and download it for future use. This approach does not benefit from transferring only the changed layers, so prefer docker push/pull when possible.)

NOTE: If you start an instance with a linked volume, docker commit will not capture changes to that volume.

Bibliography: https://github.com/rocker-org/rocker/wiki/How-to-save-data

Use Tensorboard in Docker Container

Run the saved image by binding port 6006 of the container to port 6006 of the local host. Use the following command:

docker run -p 0.0.0.0:6006:6006 -it container_name:latest

Then, run TensorBoard with the --bind_all option. This exposes your TensorBoard instance to the network on both IPv4 and IPv6 (where available). It is mutually exclusive with --host. Use the following command:

tensorboard --logdir PATH_PROFILE --bind_all

Use Linux signals to change parallelism

To change the parallelism at runtime, we first look for the PID of the process that runs the tensorflow instance with the following command:

  • pidof python example_name.py

where example_name.py is keras_example_resnet.py or keras_example_VGG.py.

If we want to modify the inter parallelism, we must use the signal 10 to decrease or 12 to increase.

If we want to modify the intra parallelism, we must use the signal 16 to decrease or 17 to increase.

Finally we execute the kill command to send the signal to the program:

  • kill -SIGNAL_NUMBER PID_PROCESS
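A minimal sketch (not the actual TensorFlow patch) of how a process can react to these four signals. On x86-64 Linux, signal numbers 10, 12, 16 and 17 correspond to SIGUSR1, SIGUSR2, SIGSTKFLT and SIGCHLD; the variable names below are illustrative:

```python
import os
import signal

inter_threads = 4  # hypothetical current inter-op parallelism
intra_threads = 4  # hypothetical current intra-op parallelism

def adjust(kind, delta):
    # Build a handler that bumps the chosen thread count up or down.
    def handler(signum, frame):
        global inter_threads, intra_threads
        if kind == "inter":
            inter_threads = max(1, inter_threads + delta)
        else:
            intra_threads = max(1, intra_threads + delta)
    return handler

for signum, kind, delta in [(10, "inter", -1), (12, "inter", +1),
                            (16, "intra", -1), (17, "intra", +1)]:
    try:
        signal.signal(signum, adjust(kind, delta))
    except (OSError, ValueError):
        pass  # some platforms reserve these numbers for other purposes

os.kill(os.getpid(), 12)   # same effect as running: kill -12 PID_PROCESS
print(inter_threads)       # 5
```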

Install

See the TensorFlow install guide for the pip package, GPU support, Docker containers, and building from source.

To install the current release for CPU-only:

$ pip install tensorflow

Use the GPU package for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow-gpu

Nightly binaries are available for testing using the tf-nightly and tf-nightly-gpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Contribution guidelines

If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs. Please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.

The TensorFlow project strives to abide by generally accepted best practices in open-source software development:

CII Best Practices Contributor Covenant

Continuous build status

Official Builds

Build Type | Status | Artifacts
Linux CPU | Status | PyPI
Linux GPU | Status | PyPI
Linux XLA | Status | TBA
macOS | Status | PyPI
Windows CPU | Status | PyPI
Windows GPU | Status | PyPI
Android | Status | Download
Raspberry Pi 0 and 1 | Status Status | Py2 Py3
Raspberry Pi 2 and 3 | Status Status | Py2 Py3

Community Supported Builds

Build Type | Status | Artifacts
Linux AMD ROCm GPU Nightly | Build Status | Nightly
Linux AMD ROCm GPU Stable Release | Build Status | Release 1.15 / 2.x
Linux s390x Nightly | Build Status | Nightly
Linux s390x CPU Stable Release | Build Status | Release
Linux ppc64le CPU Nightly | Build Status | Nightly
Linux ppc64le CPU Stable Release | Build Status | Release 1.15 / 2.x
Linux ppc64le GPU Nightly | Build Status | Nightly
Linux ppc64le GPU Stable Release | Build Status | Release 1.15 / 2.x
Linux CPU with Intel® MKL-DNN Nightly | Build Status | Nightly
Linux CPU with Intel® MKL-DNN Stable Release | Build Status | Release 1.15 / 2.x
Red Hat® Enterprise Linux® 7.6 CPU & GPU, Python 2.7, 3.6 | Build Status | 1.13.1 PyPI

Resources

Learn more about the TensorFlow community and how to contribute.

License

Apache License 2.0
