artecs-group/Tensorflow-Container-Scheduler

Documentation

TensorFlow is an end-to-end open source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that lets researchers push the state-of-the-art in ML and developers easily build and deploy ML-powered applications.

TensorFlow was originally developed by researchers and engineers working on the Google Brain team within Google's Machine Intelligence Research organization to conduct machine learning and deep neural networks research. The system is general enough to be applicable in a wide variety of other domains, as well.

TensorFlow provides stable Python and C++ APIs, as well as a non-guaranteed backward-compatible API for other languages.

Keep up-to-date with release announcements and security updates by subscribing to [email protected]. See all the mailing lists.

Installation Steps (Docker Malleable Tensorflow 2.0.0)

1- Download this folder

2- Unzip the folder wherever you want.

3- Open a Linux command console.

4- Enter the tensorflow_master folder from the command console.

5- Execute "docker build -f ./tensorflow-cpu.Dockerfile -t tf ."

6- Execute "docker run -v $(pwd):/my-devel -it tf". A container of the built image will open.

Run multiple consoles in docker

1- Run the image with docker run.

2- Find the container id with "docker ps".

3- Run docker exec -it container_id bash.

Run Tensorflow Examples

To run a tensorflow example, four environment variables must be exported:

  • export MIN_ENV_THREADS: the minimum number of threads used for execution.
  • export MAX_ENV_THREADS: the maximum number of threads used for execution.
  • export ITERATION_DOWN: the graph node at which the number of threads will decrease.
  • export ITERATION_UP: the graph node at which the number of threads will increase.
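A minimal sketch of reading and validating these four variables before launching an example (the helper name and checks are illustrative, not part of the actual container code):

```python
import os

# The four variables the container expects to be exported.
REQUIRED = ("MIN_ENV_THREADS", "MAX_ENV_THREADS",
            "ITERATION_DOWN", "ITERATION_UP")

def read_thread_config(env=os.environ):
    # Fail early if any variable was not exported.
    missing = [name for name in REQUIRED if name not in env]
    if missing:
        raise RuntimeError(f"export these variables first: {missing}")
    cfg = {name: int(env[name]) for name in REQUIRED}
    if cfg["MIN_ENV_THREADS"] > cfg["MAX_ENV_THREADS"]:
        raise ValueError("MIN_ENV_THREADS must not exceed MAX_ENV_THREADS")
    return cfg

# Equivalent of: export MIN_ENV_THREADS=1; export MAX_ENV_THREADS=8; ...
example_env = {"MIN_ENV_THREADS": "1", "MAX_ENV_THREADS": "8",
               "ITERATION_DOWN": "20", "ITERATION_UP": "10"}
cfg = read_thread_config(example_env)
print(cfg["MAX_ENV_THREADS"])  # 8
```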

Then run the examples located in the /tensorflow_src/tensorflow_examples folder:

  • cd tensorflow_src/tensorflow_examples

To run the VGG example:

  • python keras_example_VGG.py NUM_INTER NUM_INTRA

To run the Resnet50 example:

  • python keras_example_resnet.py NUM_INTER NUM_INTRA

The script execution_script.sh runs an example while varying the number of inter and intra threads from 1 to 32, testing all combinations. It takes 5 parameters:

  • minimum number of threads.
  • maximum number of threads.
  • the graph node at which the number of threads will decrease.
  • the graph node at which the number of threads will increase.
  • name of the model example (VGG or Resnet50).
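The sweep performed by execution_script.sh can be sketched as a loop over all (inter, intra) combinations. The function name and command template below are illustrative; the real script also passes the ITERATION_DOWN/ITERATION_UP nodes through the environment:

```python
from itertools import product

# Generate one command line per (inter, intra) combination in the range.
def sweep_commands(model_script, lo=1, hi=32):
    return [f"python {model_script} {inter} {intra}"
            for inter, intra in product(range(lo, hi + 1), repeat=2)]

cmds = sweep_commands("keras_example_VGG.py")
print(len(cmds))   # 32 * 32 = 1024 combinations
print(cmds[0])     # python keras_example_VGG.py 1 1
```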

Save changes in Docker Container

"docker commit" allows you to save a snapshot of your container as a docker image so you can return to it later. Like any docker image, it can be moved to a different machine. Docker's git-like features really shine here -- you can roll back to previous commits, and pushing images with docker push is fast since it pushes only the differences.

Quick overview:

1- On the host machine that is running docker, look up the name or container id of the running container using docker ps. (You can also assign your own choice of name to the container when calling docker run and then use that).

2- Save the running container as a docker image, e.g. docker commit container_id username/imagename. Optionally, you can include a commit message with -m. Once the container is committed, you can stop or remove it without losing data.

3- Push the image to the Docker Hub: docker push username/imagename. Be sure to use a private image (either on the Hub or in a private registry) if necessary: just create the private image name on the Hub before pushing. (Alternatively, you can save the container as a tarball with docker save and download it for future use. This approach does not benefit from transferring only the changed layers, so prefer docker push/pull when possible.)

NOTE: If you start an instance with a linked volume, docker commit will not capture changes to that volume.

Bibliography: https://github.com/rocker-org/rocker/wiki/How-to-save-data

Use Tensorboard in Docker Container

Run the saved image by binding port 6006 of the container to port 6006 of the local host. Use the following command:

docker run -p 0.0.0.0:6006:6006 -it container_name:latest

Then, run TensorBoard with the --bind_all option. This exposes your TensorBoard instance to the network on both IPv4 and IPv6 (where available). It is mutually exclusive with --host. Use the following command:

tensorboard --logdir PATH_PROFILE --bind_all

Use Linux signals to change parallelism

To change the parallelism at runtime, we first look for the PID of the process that runs the tensorflow instance with the following command:

  • pidof python example_name.py

where example_name.py is keras_example_resnet.py or keras_example_VGG.py.

If we want to modify the inter parallelism, we must use the signal 10 to decrease or 12 to increase.

If we want to modify the intra parallelism, we must use the signal 16 to decrease or 17 to increase.

Finally we execute the kill command to send the signal to the program:

  • kill -SIGNAL_NUMBER PID_PROCESS
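A minimal sketch (not the actual TensorFlow patch) of how a process can react to these four signals. On x86-64 Linux, signal numbers 10, 12, 16 and 17 correspond to SIGUSR1, SIGUSR2, SIGSTKFLT and SIGCHLD; the variable names below are illustrative:

```python
import os
import signal

inter_threads = 4  # hypothetical current inter-op parallelism
intra_threads = 4  # hypothetical current intra-op parallelism

def adjust(kind, delta):
    # Build a handler that bumps the chosen thread count up or down.
    def handler(signum, frame):
        global inter_threads, intra_threads
        if kind == "inter":
            inter_threads = max(1, inter_threads + delta)
        else:
            intra_threads = max(1, intra_threads + delta)
    return handler

for signum, kind, delta in [(10, "inter", -1), (12, "inter", +1),
                            (16, "intra", -1), (17, "intra", +1)]:
    try:
        signal.signal(signum, adjust(kind, delta))
    except (OSError, ValueError):
        pass  # some platforms reserve these numbers for other purposes

os.kill(os.getpid(), 12)   # same effect as running: kill -12 PID_PROCESS
print(inter_threads)       # 5
```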

Install

See the TensorFlow install guide for the pip package, GPU support, Docker containers, and building from source.

To install the current release for CPU-only:

$ pip install tensorflow

Use the GPU package for CUDA-enabled GPU cards (Ubuntu and Windows):

$ pip install tensorflow-gpu

Nightly binaries are available for testing using the tf-nightly and tf-nightly-gpu packages on PyPI.

Try your first TensorFlow program

$ python
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
3
>>> hello = tf.constant('Hello, TensorFlow!')
>>> hello.numpy()
'Hello, TensorFlow!'

For more examples, see the TensorFlow tutorials.

Contribution guidelines

If you want to contribute to TensorFlow, be sure to review the contribution guidelines. This project adheres to TensorFlow's code of conduct. By participating, you are expected to uphold this code.

We use GitHub issues for tracking requests and bugs. Please see TensorFlow Discuss for general questions and discussion, and direct specific questions to Stack Overflow.

The TensorFlow project strives to abide by generally accepted best practices in open-source software development:

CII Best Practices Contributor Covenant

Continuous build status

Official Builds

Build Type | Status | Artifacts
Linux CPU | Status | PyPI
Linux GPU | Status | PyPI
Linux XLA | Status | TBA
macOS | Status | PyPI
Windows CPU | Status | PyPI
Windows GPU | Status | PyPI
Android | Status | Download
Raspberry Pi 0 and 1 | Status Status | Py2 Py3
Raspberry Pi 2 and 3 | Status Status | Py2 Py3

Community Supported Builds

Build Type | Status | Artifacts
Linux AMD ROCm GPU Nightly | Build Status | Nightly
Linux AMD ROCm GPU Stable Release | Build Status | Release 1.15 / 2.x
Linux s390x Nightly | Build Status | Nightly
Linux s390x CPU Stable Release | Build Status | Release
Linux ppc64le CPU Nightly | Build Status | Nightly
Linux ppc64le CPU Stable Release | Build Status | Release 1.15 / 2.x
Linux ppc64le GPU Nightly | Build Status | Nightly
Linux ppc64le GPU Stable Release | Build Status | Release 1.15 / 2.x
Linux CPU with Intel® MKL-DNN Nightly | Build Status | Nightly
Linux CPU with Intel® MKL-DNN Stable Release | Build Status | Release 1.15 / 2.x
Red Hat® Enterprise Linux® 7.6 CPU & GPU, Python 2.7, 3.6 | Build Status | 1.13.1 PyPI

Resources

Learn more about the TensorFlow community and how to contribute.

License

Apache License 2.0
