Hierarchy-Based File Fragment Classification
File fragment classification is an essential problem in digital forensics. Although several attempts have been made to solve this challenging problem, a general solution has not been found. In this work, we propose a hierarchical machine-learning-based approach for file fragment classification that uses optimized support vector machines (SVMs) as the base classifiers. The approach places more general classifiers at the top level of the hierarchy and more specialized, fine-grained classifiers at the lower levels. We also propose a primitive taxonomy of file types that can be used to perform hierarchical classification.
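The two-level idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the taxonomy (`TAXONOMY`), the byte-histogram feature, and the toy fragments are hypothetical, and a simple nearest-centroid classifier stands in for the optimized SVMs so that the example has no third-party dependencies. A top-level classifier first picks a group, then that group's own classifier picks the file type.

```python
from collections import Counter
import math

# Hypothetical two-level taxonomy; groups and members are illustrative only.
TAXONOMY = {"text": ["html", "csv"], "binary": ["jpg", "gz"]}


def byte_histogram(fragment):
    """Normalized frequency of each of the 256 possible byte values."""
    counts = Counter(fragment)
    n = len(fragment) or 1
    return [counts.get(b, 0) / n for b in range(256)]


class NearestCentroid:
    """Stand-in base classifier (the approach described above uses SVMs)."""

    def fit(self, X, y):
        sums, counts = {}, Counter(y)
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
        self.centroids = {l: [v / counts[l] for v in s] for l, s in sums.items()}
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda l: math.dist(x, self.centroids[l]))


class HierarchicalClassifier:
    """General classifier at the top, one specialized classifier per group."""

    def __init__(self, taxonomy, make_base=NearestCentroid):
        self.taxonomy = taxonomy
        self.parent = {leaf: g for g, leaves in taxonomy.items() for leaf in leaves}
        self.make_base = make_base

    def fit(self, X, y):
        # Top level: learn to separate the groups.
        self.top = self.make_base().fit(X, [self.parent[l] for l in y])
        # Lower level: within each group, learn to separate its file types.
        self.leaf = {}
        for group in self.taxonomy:
            rows = [(x, l) for x, l in zip(X, y) if self.parent[l] == group]
            if rows:
                gx, gy = zip(*rows)
                self.leaf[group] = self.make_base().fit(gx, gy)
        return self

    def predict_one(self, x):
        group = self.top.predict_one(x)
        return self.leaf[group].predict_one(x)


def make_fragment(pattern, size=512, offset=0):
    """Toy 512-byte fragment built by repeating a byte pattern (synthetic data)."""
    return (pattern * (size // len(pattern) + 2))[offset:offset + size]


PATTERNS = {
    "html": b"<p>text</p>",
    "csv": b"1,2,3\n",
    "jpg": bytes([0xFF, 0xD8, 0xE0, 0x90]),
    "gz": bytes([0x8B, 0x9C, 0xF3, 0xA1]),
}

labels = list(PATTERNS)
X_train = [byte_histogram(make_fragment(PATTERNS[l])) for l in labels]
clf = HierarchicalClassifier(TAXONOMY).fit(X_train, labels)

# Classify a fragment cut at a different offset from the training one.
unseen = byte_histogram(make_fragment(PATTERNS["html"], offset=5))
print(clf.predict_one(unseen))
```

Because the base classifier is passed in as `make_base`, any object with `fit` and `predict_one` methods could be substituted, for example a wrapper around an SVM implementation. One consequence of this design is that a mistake at the top level cannot be corrected further down the hierarchy.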
We evaluate our model on a dataset of 14 file types, with 1,000 fragments of 512 bytes per file type, drawn from a subset of the publicly available govdocs1 corpus from Digital Corpora. Our experiment shows results comparable to those in the present literature, with an average accuracy of 67.78% and an F1-measure of 65% using 10-fold cross-validation. We then improve on the hierarchy and find better results, with an increase in the F1-measure of 1%. Finally, we present our assessment and observations, and conclude the paper by discussing the scope of future research.
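The evaluation protocol can be sketched in a few lines. The sketch below makes two assumptions that the text does not state: folds are stratified so that each holds the same number of fragments per file type, and the F1-measure is macro-averaged over the file types. The function names are hypothetical helpers, not part of any library.

```python
from collections import defaultdict


def stratified_folds(labels, k=10):
    """Assign sample indices to k folds, round-robin within each class."""
    folds = [[] for _ in range(k)]
    per_class = defaultdict(list)
    for i, label in enumerate(labels):
        per_class[label].append(i)
    for indices in per_class.values():
        for j, i in enumerate(indices):
            folds[j % k].append(i)
    return folds


def accuracy(y_true, y_pred):
    """Fraction of fragments whose predicted type matches the true type."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (macro averaging assumed)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Dataset shape from the text: 14 file types, 1,000 fragments each.
all_labels = [file_type for file_type in range(14) for _ in range(1000)]
folds = stratified_folds(all_labels, k=10)
print(len(folds), len(folds[0]))
```

With this dataset shape, each of the 10 folds holds 1,400 fragments, 100 per file type. In each round one fold serves as the test set and the remaining nine as the training set, and the reported figures are averages over the 10 rounds.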