SMART 2

A Comprehensive Analysis Tool for Bisulfite Sequencing Data

Detailed INSTALL Guide For SMART-BS-Seq

Time-stamp: <2018-04-20 15:11:20 Hongbo Liu>

Please check the following instructions to complete your installation.


Prerequisites

SMART-BS-Seq requires Python 2.7; version 2.7.12 is recommended.
scipy (>=0.18.1), numpy (>=1.11.2), statsmodels (>=0.8.0 and <=0.10.2), pandas (==0.19.2) are required to run SMART-BS-Seq.
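
As a rough illustration of the version constraints above, the following sketch checks a version string against them. The names `REQUIREMENTS`, `version_tuple` and `satisfies` are hypothetical helpers written for this guide and are not part of SMART-BS-Seq. Versions are compared as integer tuples because a plain string comparison would rank "0.10.2" below "0.8.0".

```python
import re

# Constraints copied from the prerequisites listed above.
REQUIREMENTS = {
    "scipy": [(">=", "0.18.1")],
    "numpy": [(">=", "1.11.2")],
    "statsmodels": [(">=", "0.8.0"), ("<=", "0.10.2")],
    "pandas": [("==", "0.19.2")],
}


def version_tuple(version):
    """Turn '0.10.2' into (0, 10, 2); parsing stops at the first non-numeric piece."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def satisfies(version, constraints):
    """Return True if `version` meets every (operator, bound) pair."""
    current = version_tuple(version)
    for operator, bound in constraints:
        limit = version_tuple(bound)
        if operator == ">=" and not current >= limit:
            return False
        if operator == "<=" and not current <= limit:
            return False
        if operator == "==" and current != limit:
            return False
    return True
```

For example, `satisfies("0.10.2", REQUIREMENTS["statsmodels"])` is true, while `satisfies("0.11.0", REQUIREMENTS["statsmodels"])` is false.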


1. Easy installation through PyPI

The easiest way to install SMART-BS-Seq is through PyPI using pip. Install pip first if it is not available on your system. pip will automatically install all required packages, including Numpy, Scipy and Pandas, if they are absent.

root user: pip install SMART-BS-Seq
non root user: pip install --user SMART-BS-Seq

Note 1: If you do not have permission to write to the system folders, you may get an error like "OSError: [Errno 13] Permission denied: '/usr/local/lib/python2.7'". To solve this, install SMART into your own home folder using `pip install --user SMART-BS-Seq`. SMART will then be installed under your home folder, for example "/home/hongbo.liu/.local/bin".
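
The choice between the two pip commands can be sketched as follows. This is only an illustration: `can_write` and `pip_install_command` are hypothetical helper names, and the sketch assumes that write access to the target library directory is what decides whether `--user` is needed.

```python
import os


def can_write(directory):
    """Return True if `directory` exists and the current user may create files in it."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


def pip_install_command(site_dir, package="SMART-BS-Seq"):
    """Build the pip command (as an argument list) suited to `site_dir`."""
    command = ["pip", "install"]
    if not can_write(site_dir):
        # No write permission: fall back to a per-user installation.
        command.append("--user")
    command.append(package)
    return command
```

Calling `pip_install_command("/usr/local/lib/python2.7")` returns the `--user` form when that directory is missing or not writable by the current user.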

Note 2: To upgrade SMART-BS-Seq, type `pip install -U SMART-BS-Seq` or `pip install --user -U SMART-BS-Seq`. pip will check the currently installed SMART-BS-Seq, compare its version with the one in the PyPI repository, and download and install the newer version when necessary.

Note 3: To uninstall SMART-BS-Seq, type `pip uninstall SMART-BS-Seq`. pip will remove all SMART files from the system.

Note 4: You may not want pip to resolve dependencies. For example, if you already have a working Scipy and Numpy, `pip install -U SMART-BS-Seq` may download the newest Scipy and Numpy but fail to compile and install them, which makes the whole installation fail. In that case, pass the `--no-deps` option to make pip skip all dependencies: `pip install -U --no-deps SMART-BS-Seq`.

Note 5: The installation directory of SMART may differ between operating systems:
- Linux (Ubuntu 16.04): */usr/local/lib/python2.7/dist-packages/SMART/*
- macOS (Sierra 10.12): */Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SMART/*
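
To find the installation directory on a given system, one option is to ask Python where it loads the package from. The sketch below assumes the package is importable under the name `SMART`, which is inferred from the directory names above. `package_dir` is a hypothetical helper.

```python
import importlib
import os


def package_dir(name):
    """Return the directory holding package `name`, or None if it cannot be located."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    path = getattr(module, "__file__", None)
    if path is None:
        return None
    return os.path.dirname(os.path.abspath(path))
```

`package_dir("SMART")` returns `None` when the package is not installed for the interpreter that runs the snippet.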

Note 6: Please make sure the statsmodels version is between 0.8.0 and 0.10.2, and that pandas == 0.19.2.
- statsmodels > 0.10.2 will report the error "ImportError: No module named abc"
- pandas > 0.19.2 has removed some key functions needed by SMART2


2. Use SMART without pip installation

SMART2 supports running directly without pip installation. To run SMART directly, download the source distribution "SMART-BS-Seq-2.2.8.tar.gz" from https://pypi.org/project/SMART-BS-Seq/#files, unpack the tarball, and open a command terminal. Add the directory where you unpacked SMART-BS-Seq to PYTHONPATH, then change to the folder ./SMART-BS-Seq-2.2.8/bin. SMART can now be used for data analysis.


$ cd /home/hongbo.liu/tools/Pacages/SMART2
$ tar zxvf SMART-BS-Seq-2.2.8.tar.gz
$ export PYTHONPATH="${PYTHONPATH}:/home/hongbo.liu/tools/Pacages/SMART2/SMART-BS-Seq-2.2.8"
$ cd SMART-BS-Seq-2.2.8/bin
$ ./SMART -h
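
When SMART is launched from another Python script rather than from an interactive shell, the same PYTHONPATH change can be applied to the child process only. This is a minimal sketch; `env_with_pythonpath` is a hypothetical helper, and the unpack directory is whatever location was used in the steps above.

```python
import os


def env_with_pythonpath(source_dir, environ=None):
    """Return a copy of the environment with `source_dir` appended to PYTHONPATH."""
    env = dict(os.environ if environ is None else environ)
    existing = env.get("PYTHONPATH", "")
    # Skip empty pieces so no stray separator appears when PYTHONPATH was unset.
    env["PYTHONPATH"] = os.pathsep.join(p for p in (existing, source_dir) if p)
    return env
```

The returned dictionary can be passed as the `env` argument of `subprocess.call`, with `cwd` set to the unpacked `bin` folder, to run `./SMART -h` without changing the environment of the calling shell.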

3. Configure environment variables (if needed)

After running the setup script, you might need to add the install location to your PYTHONPATH and PATH environment variables. The process for doing this varies on each platform, but the general concept is the same across platforms.

PYTHONPATH

To set up your PYTHONPATH environment variable, you'll need to add the value PREFIX/lib/pythonX.Y/site-packages to your existing PYTHONPATH. In this value, X.Y stands for the major.minor version of Python you are using (such as 2.7; you can find this with sys.version[:3] from a Python command line). PREFIX is the install prefix where you installed SMART-BS-Seq. If you did not specify a prefix on the command line, SMART-BS-Seq will be installed using Python's sys.prefix value.

On Linux, using bash
$ export PYTHONPATH=/usr/local/lib/python2.7/dist-packages/SMART:$PYTHONPATH

On macOS, using bash
$ export PYTHONPATH=/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/SMART:$PYTHONPATH
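
The PREFIX/lib/pythonX.Y/site-packages value described above can also be computed from within Python. The sketch below uses a hypothetical helper, `site_packages_path`. It reads `sys.version_info` instead of `sys.version[:3]`: the two agree on Python 2.7, but the string slice would give "3.1" on Python 3.10 or later. Some distributions use `dist-packages` instead, as in the Ubuntu path shown earlier, so the result is only a starting point.

```python
import posixpath
import sys


def site_packages_path(prefix, version_info=None):
    """Build PREFIX/lib/pythonX.Y/site-packages for a Unix-style layout."""
    info = sys.version_info if version_info is None else version_info
    xy = "%d.%d" % (info[0], info[1])
    return posixpath.join(prefix, "lib", "python" + xy, "site-packages")
```

Calling `site_packages_path(sys.prefix)` uses the running interpreter's own prefix and version.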

PATH

Just like your PYTHONPATH, you'll also need to add a new value to your PATH environment variable so that you can use the SMART-BS-Seq command line directly. Unlike the PYTHONPATH value, however, this time you'll need to add PREFIX/bin to your PATH environment variable. The process for updating this is the same as described above for the PYTHONPATH variable:

On Linux, using bash
$ export PATH=/usr/local/bin:$PATH

On macOS, using bash
$ export PATH=/Library/Frameworks/Python.framework/Versions/2.7/bin:$PATH
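
To confirm that the PATH change took effect, you can check whether the `SMART` command is found in any PATH directory. The following sketch walks the PATH entries by hand so that it also runs on Python 2.7. `find_on_path` is a hypothetical helper name.

```python
import os


def find_on_path(program, path=None):
    """Return the first executable file named `program` on `path`, or None."""
    search = os.environ.get("PATH", "") if path is None else path
    for directory in search.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None
```

`find_on_path("SMART")` returns the full path of the command, or `None` if PREFIX/bin still needs to be added to PATH.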

4. Testing in different system platforms

(1) Testing of SMART installation and usage on macOS 10.13.4 (32G memory and 100G disk storage) (Log file)

(2) Testing of SMART installation and usage on Ubuntu 16.04 (4G memory and 32G disk storage) (Log file)

(3) Testing of SMART installation and usage on CentOS 7.3.1611 (250G memory and 1T disk storage) (Log file)

[hongbo.liu@login01 SMART]$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: CentOS
Description: CentOS Linux release 7.3.1611 (Core)
Release: 7.3.1611
Codename: Core
Python: 2.7.12

[hongbo.liu@login01 SMART]$ pip install --user SMART-BS-Seq
Collecting SMART-BS-Seq
Using cached https://files.pythonhosted.org/packages/5f/66/aa5194687293a3afbed30e60a2f82d4eb85bb06b78f01201bd8a1890a3a7/SMART_BS_Seq-2.2.8-py2-none-any.whl
Requirement already satisfied: pandas==0.19.2 in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from SMART-BS-Seq)
Requirement already satisfied: statsmodels>=0.8.0 in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from SMART-BS-Seq)
Requirement already satisfied: numpy>=1.11.2 in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages/numpy-1.14.2-py2.7-linux-x86_64.egg (from SMART-BS-Seq)
Requirement already satisfied: scipy>=0.18.1 in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from SMART-BS-Seq)
Requirement already satisfied: pytz>=2011k in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from pandas==0.19.2->SMART-BS-Seq)
Requirement already satisfied: python-dateutil in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from pandas==0.19.2->SMART-BS-Seq)
Requirement already satisfied: patsy in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages (from statsmodels>=0.8.0->SMART-BS-Seq)
Requirement already satisfied: six>=1.5 in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages/six-1.11.0-py2.7.egg (from python-dateutil->pandas==0.19.2->SMART-BS-Seq)
Installing collected packages: SMART-BS-Seq
Successfully installed SMART-BS-Seq-2.2.8
You are using pip version 9.0.1, however version 10.0.0 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
[hongbo.liu@login01 SMART]$ SMART -v
SMART 2.2.8
[hongbo.liu@login01 SMART]$ SMART -h
usage: SMART [-h] [-t {DeNovoDMR,DMROI,DMC,Segment}] [-r REGION_OF_INTEREST]
             [-c CASE_CONTROL_MATRIX] [-n PROJECT_NAME] [-o OUTPUT_FOLDER]
             [-MR MISS_VALUE_REPLACE] [-AG PERCENTAGE_OF_AVAILABLE_GROUPS]
             [-MS METHYLATION_SPECIFICITY] [-ED EUCLIDEAN_DISTANCE]
             [-SM SIMILARITY_ENTROPY] [-CD CPG_DISTANCE] [-CN CPG_NUMBER]
             [-SL SEGMENT_LENGTH] [-PD P_DMR] [-PM P_METHYLMARK]
             [-PC P_DMR_CASECONTROL] [-AD ABSMEANMETHDIFFER] [-v]
             MethylMatrix

[hongbo.liu@login01 SMART]$ cd /primary/home/hongbo.liu/.local/lib/python2.7/site-packages/SMART/Example
[hongbo.liu@login01 Example]$ SMART -t DeNovoDMR -c Case_control_matrix.txt MethylMatrix_Test.txt

2018-04-20 17:36:13 ***************Project SMART parameters**************
{'Project_Name': 'SMART', 'Euclidean_Distance': 0.2, 'CpG_Distance': 500, 'AbsMeanMethDiffer': 0.3, 'Percentage_of_Available_Groups': 1.0, 'p_DMR_CaseControl': 0.05, 'CpG_Number': 5, 'Output_Folder': '', 'Miss_Value_Replace': 0.5, 'Segment_Length': 20, 'p_DMR': 0.05, 'Project_Type': 'DeNovoDMR', 'p_MethylMark': 0.05, 'Similarity_Entropy': 0.6, 'MethylMatrix': 'MethylMatrix_Test.txt', 'Region_of_interest': '', 'Case_control_matrix': 'Case_control_matrix.txt', 'Methylation_Specificity': 0.5}

2018-04-20 17:36:13 *****************Project SMART Start*****************
2018-04-20 17:36:13 Start to check methylation data and fill missing value...
2018-04-20 17:36:16 Start genome segmentation ...
2018-04-20 17:36:16 Finish segmentation for chrY
2018-04-20 17:36:26 Finish segmentation for chr18
2018-04-20 17:36:27 Finish segmentation for chr20
2018-04-20 17:36:27 Finish segmentation for chr14
2018-04-20 17:36:28 Finish segmentation for chr1
2018-04-20 17:36:31 Finish segmentation for chr3
2018-04-20 17:36:31 Finish segmentation for chr11
2018-04-20 17:36:31 Finish segmentation for chr8
2018-04-20 17:36:32 Finish segmentation for chr21
2018-04-20 17:36:32 Finish segmentation for chr19
2018-04-20 17:36:32 Finish segmentation for chr2
2018-04-20 17:36:32 Finish segmentation for chr16
2018-04-20 17:36:33 Finish segmentation for chr10
2018-04-20 17:36:33 Finish segmentation for chr13
2018-04-20 17:36:33 Finish segmentation for chr9
2018-04-20 17:36:33 Finish segmentation for chr15
2018-04-20 17:36:34 Finish segmentation for chr22
2018-04-20 17:36:34 Finish segmentation for chr12
2018-04-20 17:36:34 Finish segmentation for chrX
2018-04-20 17:36:34 Finish segmentation for chr6
2018-04-20 17:36:34 Finish segmentation for chr4
2018-04-20 17:36:35 Finish segmentation for chr5
2018-04-20 17:36:35 Finish segmentation for chr7
2018-04-20 17:36:35 Finish segmentation for chr17
2018-04-20 17:36:35 Start to merge small segments...
2018-04-20 17:36:35 Finish merging segments for chr1
2018-04-20 17:36:35 Finish merging segments for chr2
2018-04-20 17:36:36 Finish merging segments for chr3
2018-04-20 17:36:36 Finish merging segments for chr4
2018-04-20 17:36:36 Finish merging segments for chr5
2018-04-20 17:36:36 Finish merging segments for chr6
2018-04-20 17:36:36 Finish merging segments for chr7
2018-04-20 17:36:36 Finish merging segments for chr8
2018-04-20 17:36:36 Finish merging segments for chr9
2018-04-20 17:36:36 Finish merging segments for chr10
2018-04-20 17:36:36 Finish merging segments for chr11
2018-04-20 17:36:36 Finish merging segments for chr12
2018-04-20 17:36:36 Finish merging segments for chr13
2018-04-20 17:36:36 Finish merging segments for chr14
2018-04-20 17:36:36 Finish merging segments for chr15
2018-04-20 17:36:36 Finish merging segments for chr16
2018-04-20 17:36:36 Finish merging segments for chr17
2018-04-20 17:36:36 Finish merging segments for chr18
2018-04-20 17:36:36 Finish merging segments for chr19
2018-04-20 17:36:36 Finish merging segments for chr20
2018-04-20 17:36:36 Finish merging segments for chr21
2018-04-20 17:36:37 Finish merging segments for chr22
2018-04-20 17:36:37 Finish merging segments for chrX
2018-04-20 17:36:37 Finish merging segments for chrY
2018-04-20 17:36:37 Start to identify DMRs...
2018-04-20 17:36:37 Finish DMR identification for chrY
2018-04-20 17:36:39 Finish DMR identification for chr20
2018-04-20 17:36:39 Finish DMR identification for chr3
2018-04-20 17:36:39 Finish DMR identification for chr14
2018-04-20 17:36:40 Finish DMR identification for chr6
2018-04-20 17:36:40 Finish DMR identification for chr21
2018-04-20 17:36:40 Finish DMR identification for chr2
2018-04-20 17:36:40 Finish DMR identification for chr9
2018-04-20 17:36:40 Finish DMR identification for chr1
2018-04-20 17:36:40 Finish DMR identification for chr16
2018-04-20 17:36:40 Finish DMR identification for chr22
2018-04-20 17:36:40 Finish DMR identification for chr15
2018-04-20 17:36:40 Finish DMR identification for chr18
2018-04-20 17:36:40 Finish DMR identification for chr19
2018-04-20 17:36:41 Finish DMR identification for chr10
2018-04-20 17:36:41 Finish DMR identification for chrX
2018-04-20 17:36:41 Finish DMR identification for chr8
2018-04-20 17:36:41 Finish DMR identification for chr11
2018-04-20 17:36:41 Finish DMR identification for chr4
2018-04-20 17:36:41 Finish DMR identification for chr13
2018-04-20 17:36:41 Finish DMR identification for chr5
2018-04-20 17:36:41 Finish DMR identification for chr7
2018-04-20 17:36:41 Finish DMR identification for chr17
2018-04-20 17:36:41 Finish DMR identification for chr12
2018-04-20 17:36:41 Finish DMR identification for all chromesomes

2018-04-20 17:36:41 ********************Project Summary******************
2018-04-20 17:36:41 Chromesomes : chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY
2018-04-20 17:36:41 Sample Number : 18
2018-04-20 17:36:41 Sample Names : G1_1,G1_2,G2_1,G2_2,G3_1,G3_2,G3_3,G4_1,G4_2,G4_3,G5_1,G5_2,G6_1,G6_2,G7_1,G7_2,G7_3,G7_4
2018-04-20 17:36:41 Group Number : 7
2018-04-20 17:36:41 Group Names : G7,G6,G5,G4,G3,G2,G1
2018-04-20 17:36:41 Number of total CpG sites in all chromesomes :24000
2018-04-20 17:36:41 Number of CpG sites with methylation in all groups :11837
2018-04-20 17:36:41 Number of missing values that have been filled :18911
2018-04-20 17:36:41 Small Segment Number :5097
2018-04-20 17:36:41 Merged Segment Number :591
2018-04-20 17:36:41 DMR Number :329
2018-04-20 17:36:41 NonDMR Number :262
2018-04-20 17:36:41 NonDMR-UniHypo Segment Number :59
2018-04-20 17:36:41 NonDMR-UnipLow Segment Number :19
2018-04-20 17:36:41 NonDMR-UnipHigh Segment Number :70
2018-04-20 17:36:41 NonDMR-UniHyper Segment Number :114
2018-04-20 17:36:41 MethyMark Segment Number :312
2018-04-20 17:36:41 MethyMark-HypoMark Segment Number :132
2018-04-20 17:36:41 MethyMark-HyperMark Segment Number :180
2018-04-20 17:36:41 G7 HypoMark Number: 16
2018-04-20 17:36:41 G7 HyperMark Number: 2
2018-04-20 17:36:41 G6 HypoMark Number: 0
2018-04-20 17:36:41 G6 HyperMark Number: 84
2018-04-20 17:36:41 G5 HypoMark Number: 21
2018-04-20 17:36:41 G5 HyperMark Number: 7
2018-04-20 17:36:41 G4 HypoMark Number: 36
2018-04-20 17:36:41 G4 HyperMark Number: 14
2018-04-20 17:36:41 G3 HypoMark Number: 7
2018-04-20 17:36:41 G3 HyperMark Number: 43
2018-04-20 17:36:41 G2 HypoMark Number: 29
2018-04-20 17:36:41 G2 HyperMark Number: 28
2018-04-20 17:36:41 G1 HypoMark Number: 23
2018-04-20 17:36:41 G1 HyperMark Number: 2
2018-04-20 17:36:41 G3_G4 Case Hypo DMR Number: 30
2018-04-20 17:36:41 G3_G4 Case Hyper DMR Number: 141
2018-04-20 17:36:41 G1_G3 Case Hypo DMR Number: 145
2018-04-20 17:36:41 G1_G3 Case Hyper DMR Number: 27
2018-04-20 17:36:41 G1_G2 Case Hypo DMR Number: 139
2018-04-20 17:36:41 G1_G2 Case Hyper DMR Number: 54
2018-04-20 17:36:41 *********************Summary End*********************

2018-04-20 17:36:41 Detailed results in /primary/home/hongbo.liu/.local/lib/python2.7/site-packages/SMART/Example/SMART20180420173613/
2018-04-20 17:36:41 For any questions, visit http://fame.edbc.org/smart/ or contact Hongbo Liu (hongbo919@gmail.com)

2018-04-20 17:36:41 ***************Project SMART Finished!***************
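
If you want to compare the result of your own test run with the log above, the summary block can be read into a dictionary. This is a minimal sketch that assumes the log has the same layout as the output shown here: a timestamp prefix, "key : value" lines, and the "Project Summary" and "Summary End" banners. `parse_summary` is a hypothetical helper and not part of SMART.

```python
import re

# Matches the "YYYY-MM-DD HH:MM:SS " prefix that starts each log line.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")


def parse_summary(log_text):
    """Collect the 'key : value' lines between the summary banners into a dict."""
    summary = {}
    inside = False
    for line in log_text.splitlines():
        message = TIMESTAMP.sub("", line.strip())
        if "Project Summary" in message:
            inside = True
            continue
        if "Summary End" in message:
            break
        if inside and ":" in message:
            key, value = message.split(":", 1)
            summary[key.strip()] = value.strip()
    return summary
```

All values are kept as strings, so a count such as the DMR number needs an explicit `int(...)` before numeric comparison.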