Building XGBoost on OS X
XGBoost is a C++ library that implements gradient boosting and increasingly shows up in descriptions of winning solutions on Kaggle. There are bindings for R and Python, but the library itself has to be built from source. When I ran make, I got a stream of errors about missing headers and unsupported OpenMP. Well, not the first time.
Progress does not stand still, and the procedure has become somewhat simpler:
- /usr/bin/ruby -e "$(curl -fsSL raw.githubusercontent.com/Homebrew/install/master/install)"
- brew install gcc --without-multilib
- pip install xgboost
- Conquer the top of the Kaggle leaderboards
Readers report that on startup XGBoost may complain that the library /usr/local/lib/gcc/5/libgomp.1.dylib was not found. In that case, locate the library and put it at the specified path.
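The search for the missing library can be sketched in Python. This is a minimal sketch, not part of the original procedure: the helper names `find_libgomp` and `link_command` are hypothetical, and the search root `/usr/local/Cellar` is an assumption about where Homebrew keeps its gcc files. The script only prints a suggested command and never changes anything itself.

```python
from pathlib import Path, PurePosixPath


def find_libgomp(search_root, name="libgomp.1.dylib"):
    """Return all regular files called `name` below `search_root`, sorted by path."""
    root = Path(search_root)
    if not root.is_dir():
        return []
    return sorted(p for p in root.rglob(name) if p.is_file())


def link_command(found, expected="/usr/local/lib/gcc/5/libgomp.1.dylib"):
    """Build (but do not run) a shell command that links `found` to `expected`."""
    expected = PurePosixPath(expected)
    return f"mkdir -p {expected.parent} && ln -s {found} {expected}"


if __name__ == "__main__":
    # Assumption: Homebrew's gcc lives somewhere under /usr/local/Cellar.
    for candidate in find_libgomp("/usr/local/Cellar"):
        print(link_command(candidate))
```

If several copies are found, pick the one that belongs to the gcc version you actually installed before running the printed command.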
Previously, you had to do the following:
- Download Xcode
- Install command line tools
- Build Clang with OpenMP Support
- Build Intel OpenMP Library
- Register the path to the OpenMP library and the corresponding headers
- Build XGBoost
- Install the Python bindings
- Conquer the top of the Kaggle leaderboards
1. Download Xcode
Xcode can be downloaded for free from the App Store. After installation, running gcc -v in the terminal should print something like this:
Configured with: --prefix=/Applications/Xcode.app/Contents/Developer//usr --with-gxx-include-dir=/usr/include/c++/4.2.1
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.3.0
Thread model: posix
2. Install command line tools
If you skip this step, the compiler will not be able to find the standard C and C++ libraries. In the terminal, run
xcode-select --install
and follow the instructions.
3. Build Clang with OpenMP support
This version of the compiler supports OpenMP directives for parallelization and is developed by Intel engineers. Hopefully this branch will one day be merged back into trunk, and OpenMP will be available in stock Clang out of the box. Apparently, at some point clang-omp could be installed with brew, but those happy days are gone. So, we compile the compiler:
mkdir clang-omp && cd clang-omp
git clone https://github.com/clang-omp/llvm
git clone https://github.com/clang-omp/compiler-rt llvm/projects/compiler-rt
git clone -b clang-omp https://github.com/clang-omp/clang llvm/tools/clang
mkdir build && cd build
cmake ../llvm -DCMAKE_BUILD_TYPE=Release
make -j 4
If the machine has more than 4 cores, it makes sense to raise the number in the last command.
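The right number of jobs can be derived from the machine instead of guessed. Below is a small sketch using only the Python standard library; the helper name `make_command` is hypothetical, and using one job per logical core is a common rule of thumb rather than anything the build requires.

```python
import os


def make_command(jobs=None):
    """Return the make invocation as an argument list, one job per logical core."""
    if jobs is None:
        # os.cpu_count() may return None when the count cannot be determined.
        jobs = os.cpu_count() or 1
    return ["make", "-j", str(jobs)]


if __name__ == "__main__":
    # Print the command so it can be copied into the terminal.
    print(" ".join(make_command()))
```

The list form can also be passed directly to `subprocess.run` from inside the build directory.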
4. Build the Intel OpenMP library
The library for OpenMP support is also built from source. Download, unpack, build:
mkdir build && cd build
cmake ..
make -j 4
5. Register the path to the OpenMP library and the corresponding headers
So that the compiler and linker can find the components they need while building XGBoost, you have to register the paths to them. To do this, add the following lines to ~/.bash_profile:
export C_INCLUDE_PATH=PATH_TO_LIBOMP/libomp/exports/common/include/:$C_INCLUDE_PATH
export CPLUS_INCLUDE_PATH=PATH_TO_LIBOMP/libomp/exports/common/include/:$CPLUS_INCLUDE_PATH
export LIBRARY_PATH=PATH_TO_LIBOMP/libomp/exports/mac_32e/lib/:$LIBRARY_PATH
export DYLD_LIBRARY_PATH=PATH_TO_LIBOMP/libomp/exports/mac_32e/lib/:$DYLD_LIBRARY_PATH
As you might guess, PATH_TO_LIBOMP is the path to the folder where the library is located. For the changes to take effect, you must run the command
source ~/.bash_profile
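A typo in PATH_TO_LIBOMP only shows up later as a confusing compiler error, so it may be worth checking the directories first. The following is a minimal sketch under the assumption that the build produced the layout used in the export lines; the helper name `missing_openmp_paths` is hypothetical.

```python
from pathlib import Path


def missing_openmp_paths(path_to_libomp):
    """Return the expected paths that do not exist under `path_to_libomp`."""
    root = Path(path_to_libomp) / "libomp" / "exports"
    expected = [
        root / "common" / "include" / "omp.h",  # header included by OpenMP programs
        root / "mac_32e" / "lib",               # directory added to LIBRARY_PATH
    ]
    return [p for p in expected if not p.exists()]


if __name__ == "__main__":
    # Replace the placeholder with the real location before running.
    for path in missing_openmp_paths("PATH_TO_LIBOMP"):
        print("missing:", path)
```

An empty result means both the header and the library directory are where the export lines expect them.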
You need to make sure that everything works correctly. To do this, create a sample program
#include <omp.h>
#include <stdio.h>
#include <iostream>

int main(int argc, char** argv) {
    std::cout << "Hello!" << std::endl;
    // The statement below runs once in every thread of the OpenMP team.
    #pragma omp parallel
    printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(), omp_get_num_threads());
    return 0;
}
and try to compile it:
PATH_TO_CLANGOMP/clang-omp/build/bin/clang++ sample.cpp -o sample -fopenmp
If everything is fine, running the program prints messages from several threads.
6. Build XGBoost
We are almost there. The xgboost folder contains a Makefile; edit its first lines as follows:
export CC = PATH_TO_CLANGOMP/clang-omp/build/bin/clang
export CXX = PATH_TO_CLANGOMP/clang-omp/build/bin/clang++
export MPICXX = mpicxx
export LDFLAGS= -pthread -lm
export CFLAGS = -Wall -O3 -msse2 -Wno-unknown-pragmas -funroll-loops -fopenmp
Build:
make -j 4
Victory.
7. Install the Python bindings
cd wrapper
python setup.py install
You can verify that the XGBoost bindings work using the demo scripts in xgboost/demo.
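For a quicker check than the demo scripts, a smoke test along these lines may be enough. It is only a sketch: the helper names `binding_installed` and `smoke_test` are hypothetical, the four-row dataset is made up for illustration, and it assumes NumPy is installed alongside the bindings.

```python
import importlib.util


def binding_installed(module_name="xgboost"):
    """Return True if `module_name` can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


def smoke_test():
    """Train a tiny model; return its predictions, or None if xgboost is missing."""
    if not binding_installed():
        return None
    import numpy as np
    import xgboost as xgb

    # Toy data, made up for illustration: the label is 1 when the feature is large.
    X = np.array([[0.1], [0.2], [0.8], [0.9]])
    y = np.array([0, 0, 1, 1])
    dtrain = xgb.DMatrix(X, label=y)
    booster = xgb.train({"objective": "binary:logistic"}, dtrain, num_boost_round=5)
    return booster.predict(dtrain)


if __name__ == "__main__":
    preds = smoke_test()
    print("xgboost is not installed" if preds is None else preds)
```

If the import fails with a message about libgomp or OpenMP, the problem is in the native library rather than in the Python package.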
8. Conquer the top of the Kaggle leaderboards
Suddenly the power went out, and this most important part was unfortunately lost. It will be covered in the future.