Confs and train_data hold the basic data the software requires to perform prediction. Testing_data holds the input sequences for prediction; it is generated by the steps source_data --> precessing_data --> testing_data. The predictions produced by the prediction models are converted to a uniform format, codingf. The files in the codingf directory are then combined into an "array" format that is convenient for further analysis.
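
The step that combines the codingf files into one array could look roughly like the sketch below. The file layout is not described here, so the sketch assumes (hypothetically) that each file in the codingf directory holds one model's predictions as whitespace-separated numbers, one value per sequence position. The function name combine_codingf is also made up for illustration and is not part of the original software.

```python
from pathlib import Path


def combine_codingf(codingf_dir):
    """Combine per-model prediction files into one row-per-model table.

    Assumption (not confirmed by the original description): every file in
    codingf_dir contains whitespace-separated numeric predictions, one
    value per sequence position.
    """
    names, rows = [], []
    # Sort the paths so the row order is reproducible between runs.
    for path in sorted(Path(codingf_dir).iterdir()):
        if not path.is_file():
            continue
        values = [float(token) for token in path.read_text().split()]
        names.append(path.stem)
        rows.append(values)

    # The rows only form a rectangular array if all models cover the
    # same number of positions.
    lengths = {len(row) for row in rows}
    if len(lengths) > 1:
        raise ValueError(f"inconsistent prediction lengths: {sorted(lengths)}")
    return names, rows
```

The function returns the model names alongside the rows, so each row of the combined array can still be traced back to the file it came from.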