Contributing to TensorFlow Community
Installing TensorFlow (TF) is fairly well documented in the official documentation. However, when I was rewriting some of its parts, I needed both the stable and the developer versions. Here I will go through the TensorFlow installation for both versions, each inside its own virtual environment.
- Install Using pip (non-dev)
- Install from master Branch
- File an issue report and submit a pull request (PR)
Install Using pip (non-dev)
Note: You might need the non-developer, stable version of TF if you work on other machine learning projects.
We will be using a virtual environment named tf-env. Note that you will need to have virtualenv and virtualenvwrapper installed.
If you want to install a specific version rather than whatever PyPI currently serves, you can get the path to that version here and hand it to pip through the TF_BINARY_URL variable (you will need to set TF_BINARY_URL to the binary that matches your own system).
And you are done! If you are not planning on working with the source code, just use the pip installation. Whenever you need to use TensorFlow, switch to the environment (workon tf-env with virtualenvwrapper) and you are ready to go.
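The pip route described above can be sketched roughly as follows. This is a hedged outline rather than the exact commands from the official guide: the run helper and the install_stable_tf function are hypothetical names introduced here, and the script only prints the commands unless DRY_RUN=0 is set.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical helper: create the tf-env environment and install stable TF.
install_stable_tf() {
    # mkvirtualenv is a shell function provided by virtualenvwrapper; it only
    # exists after virtualenvwrapper.sh has been sourced in the current shell.
    run mkvirtualenv tf-env
    if [ -n "${TF_BINARY_URL:-}" ]; then
        # Pin a specific build: TF_BINARY_URL must point to the binary
        # that matches your own system.
        run pip install --upgrade "$TF_BINARY_URL"
    else
        # Otherwise take the current stable release from PyPI.
        run pip install --upgrade tensorflow
    fi
}

install_stable_tf

# Later, in a new shell, re-enter the environment with:
#   workon tf-env
```

Leaving TF_BINARY_URL unset falls back to PyPI, which matches the simple case in the text.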
Install from master Branch
I am assuming you are not using the “easier” pip installation because you want to contribute to the community. That means you need to have a forked repository, so let’s start from GitHub.
Create a fork and clone it
Navigate to the GitHub TensorFlow repo and fork it using the Fork button in the top right.
At this point, if you navigate to your own GitHub account, you will find a new repository named .../tensorflow. You can find the clone path on the repository page. If you didn’t set up SSH, you can pick the HTTPS option instead.
Notice that the origin remote points to the repository in your own GitHub account. You might want to add a second remote for the official repository, so you can easily synchronize your repo with it.
Now, whenever you want to synchronize your repository with the official one, fetch from that remote and merge its master branch into your own.
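One possible way to wire this up is sketched below. The remote name upstream is a common convention and an assumption here, and the two helper functions are hypothetical names, not something provided by git or TF. The merge uses --ff-only as a cautious choice, so it refuses to run if your master has diverged from the official one.

```shell
# Hypothetical helper: register the official repository as a second remote.
add_upstream() {
    # $1: URL (or local path) of the official repository
    git remote add upstream "$1"
}

# Hypothetical helper: bring the local master up to date with the official one.
sync_with_upstream() {
    git fetch upstream
    git checkout master
    # Fast-forward only: fails instead of creating a merge commit
    # if the local master has commits the official one lacks.
    git merge --ff-only upstream/master
}

# Usage, from inside your clone (needs network access):
#   add_upstream https://github.com/tensorflow/tensorflow.git
#   sync_with_upstream
```

After syncing, a plain git push to origin updates your fork on GitHub as well.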
Compile (Python 2.7)
I am assuming you are using brew on OS X, and that you have Python installed.
Note that I am installing the Python packages globally, and not in my virtual environment.
This would probably be a good time to create a virtual environment for our dev version of TF. I am going to use tf-dev-env for the development version of TensorFlow.
Run the configuration script (./configure). I will not go through every question, as they are well documented on the TF website.
Build and install (dev)
Now we can install TF from source. Because we are installing for development, meaning we will be modifying and testing it many times, we want symlinks in our site-packages directory rather than copied versions.
Notice the --copt=-march=native option passed to the build. It tells the compiler to optimize for the CPU of the machine doing the build. That makes TF faster on our own machine, but the resulting binary may not run on machines with a different CPU.
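As a rough sketch, the build step with that flag might look like the following. The bazel target is the pip package target from the TF source tree as I remember it, so treat it as an assumption and check it against your checkout. The run helper and build_tf_pip_package are hypothetical names, and the script only prints the command unless DRY_RUN=0 is set. The symlinking step is left out on purpose, because the exact paths depend on the TF version.

```shell
# Dry run by default: commands are printed, not executed.
# Set DRY_RUN=0 (from the root of the TF source tree) to actually build.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Hypothetical helper wrapping the build command.
build_tf_pip_package() {
    # -c opt                 build in optimized mode
    # --copt=-march=native   pass -march=native to the compiler, so the code
    #                        is tuned to the CPU of the build machine
    run bazel build -c opt --copt=-march=native \
        //tensorflow/tools/pip_package:build_pip_package
}

build_tf_pip_package
```

Dropping --copt=-march=native gives a slower but more portable build, which is the safer choice if the result will be copied to other machines.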
Done! You can test your installed version of TF simply by importing tensorflow in a Python session.
File an issue report and submit a pull request (PR)
Before you decide to change anything in the source, you absolutely HAVE TO check whether a fix for it is already pending. That is why there is a tracking system for all features, issues, and bugs. Go to the main TF GitHub page, click on the “Issues” button, and use some keywords to find a relevant post.
Only if you did not find a matching issue should you create a “New Issue”.
Finding a bug and creating an issue
Suppose you found a bug. For example, you look through the source code, and in the file tensorflow/configure you notice the following:
#!/usr/bin/env bash
set -e
set -o pipefail
# Find out the absolute path to where ./configure resides
pushd `dirname $0` #> /dev/null
SOURCE_BASE_DIR=`pwd -P`
popd > /dev/null
.........
Notice what the last three lines of the excerpt do: pushd switches to the directory that contains the script and saves the previous directory on the directory stack, pwd -P stores the resolved path of that directory in $SOURCE_BASE_DIR, and popd returns to the original directory. Most probably this is done for the cases when the script is run from a directory different from the TF root.
We should not remove this logic (there might be a reason why it is there), but what we also notice is that the redirect after pushd is commented out, so pushd is not silenced and prints the directory stack whenever we run the configure script.
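To see what the redirect changes, here is a small self-contained sketch. The two scripts, noisy.sh and quiet.sh, are hypothetical stand-ins that copy only the relevant lines of configure into a temporary directory; they are not part of TF.

```shell
# Hypothetical stand-in scripts that mimic the top of ./configure.
demo_dir=$(mktemp -d)

# Variant 1: the redirect is commented out, so pushd prints the directory stack.
cat > "$demo_dir/noisy.sh" <<'EOF'
#!/usr/bin/env bash
pushd "$(dirname "$0")" #> /dev/null
SOURCE_BASE_DIR=$(pwd -P)
popd > /dev/null
EOF

# Variant 2: the output of pushd is sent to /dev/null.
cat > "$demo_dir/quiet.sh" <<'EOF'
#!/usr/bin/env bash
pushd "$(dirname "$0")" > /dev/null
SOURCE_BASE_DIR=$(pwd -P)
popd > /dev/null
EOF

# Capture what each variant writes to standard output.
noisy_output=$(bash "$demo_dir/noisy.sh")
quiet_output=$(bash "$demo_dir/quiet.sh")

echo "noisy: '${noisy_output}'"
echo "quiet: '${quiet_output}'"
```

The noisy variant should report the directory stack, while the quiet one should report an empty string; the value of SOURCE_BASE_DIR is the same in both.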
You can see the issue I have submitted about it here
Submitting a Pull Request
To submit a Pull Request (PR), you first need to check if there is a fix pending. Just search the same way you would search for an issue, but under the “Pull requests” tab on GitHub. If you don’t find one, create a new PR, describe what it fixes, mention the issue number that it resolves, and you are done.
Let us silence the pushd and submit it as a PR.
Create a branch with the fix
Never commit your changes directly to the master branch. This causes confusion and is generally not the right way of doing it; you will have to roll back a lot if you do.
Create a new branch called silence-pushd (git checkout -b silence-pushd).
Implement your changes
Modify the files you need, and test the final result. In my case, I will just silence the pushd in the configure script like so:
#!/usr/bin/env bash
set -e
set -o pipefail
# Find out the absolute path to where ./configure resides
pushd `dirname $0` > /dev/null
SOURCE_BASE_DIR=`pwd -P`
popd > /dev/null
.........
Now add, commit, and push the changes to the silence-pushd branch on origin.
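The branch, commit, and push steps can be rehearsed safely in a throwaway sandbox before touching your real fork. In this sketch a local bare repository stands in for your fork on GitHub, the configure file is reduced to two lines, and the identity and commit messages are placeholders; all of these are assumptions made for illustration.

```shell
# A bare repository stands in for your fork on GitHub,
# so nothing here touches a real remote.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/fork.git"
git clone -q "$sandbox/fork.git" "$sandbox/tensorflow" 2>/dev/null
cd "$sandbox/tensorflow" || exit 1
git symbolic-ref HEAD refs/heads/master          # make sure the first branch is master
git config user.name "Example Contributor"       # placeholder identity
git config user.email "contributor@example.com"

# Seed master with a noisy stand-in for the configure script.
printf '%s\n' '#!/usr/bin/env bash' 'pushd `dirname $0` #> /dev/null' > configure
git add configure
git commit -q -m "Add configure"
git push -q origin master

# 1. Branch off master instead of committing to it directly.
git checkout -q -b silence-pushd

# 2. Make the change: enable the redirect after pushd.
printf '%s\n' '#!/usr/bin/env bash' 'pushd `dirname $0` > /dev/null' > configure

# 3. Stage, commit, and publish the branch to origin.
git add configure
git commit -q -m "Silence pushd in configure"
git push -q -u origin silence-pushd
```

In your real clone only the three numbered steps apply; the rest of the script exists to build the sandbox.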
Submit a PR
Now you can go ahead and submit a PR on GitHub. Just write the description, and you are done. You can see the submitted PR here