Harvest - a tool to classify large collections of files and directories
Harvest is a compact, fast and portable tool that scans files and folders to recognise their type. Classification is based on file extensions and on a simple fuzzy-logic analysis of folder contents, which determines whether a folder is related to video, audio or text materials.
Harvest makes it easy to list folders by type or year, to move them or to categorize them for tagged filesystems. It can process approximately 1 GB of stored files per second and is operated from a terminal.
Harvest is designed to operate on folders without scattering their contents: it assesses the type of a folder from the files it contains, but never moves those files out of that folder. For instance, it works very well for moving around large collections of downloaded torrent folders.
:floppy_disk: Installation
Harvest requires zsh to be installed and works on all desktop platforms supported by it (GNU/Linux, Apple/OSX and MS/Windows).
From inside the source directory, just type:
git submodule update --init --recursive
make
sudo make install
to install into /usr/local/share/harvest.
The environment variable HARVEST_PREFIX can be set when running harvest to point to a different installation directory. Using harvest on operating systems other than GNU/Linux or BSD may require adjusting this variable.
:video_game: Usage
To scan all files and directories found in a folder:
harvest /path/to/folder
To scan only the files (non recursive):
harvest /path/to/folder files
After scanning, results are printed to the screen and also saved in a local cache. It is then possible to list all video hits from the most recent scan:
harvest ls video
The harvest ls command lists one hit per line as comma-separated values, so that its output can be piped into other programs for further processing. The CSV format is:
[dir|file],TYPE,year,path_to_file
The TYPE field is one of the strings returned by ls file-extension-list/data, which is the catalogue of file types maintained in the file-extension-list project.
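Since the path is the last field of the CSV format above, it may itself contain commas, so a consumer should split only on the first three. The following is a minimal POSIX shell sketch of that idea, run on made-up sample lines rather than real harvest output; the function name paths_of_year is hypothetical and not part of Harvest.

```shell
#!/bin/sh
# Made-up sample in the documented format: [dir|file],TYPE,year,path_to_file
sample='dir,video,2016,/home/user/Downloads/Some Film
file,audio,2014,/home/user/Downloads/track.mp3
file,audio,2017,/home/user/Downloads/live, part 2.ogg'

# paths_of_year YEAR: read CSV lines on stdin and print the path of every hit
# whose year field equals YEAR. read assigns everything after the third comma
# to "path", so commas and spaces inside the path survive.
paths_of_year() {
    while IFS=, read -r kind type year path; do
        [ "$year" = "$1" ] && printf '%s\n' "$path"
    done
    return 0
}

printf '%s\n' "$sample" | paths_of_year 2017
```

With a real scan in the cache, the sample would be replaced by piping harvest ls into the function.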
To move harvested audio files to a "Sound" folder in your home directory:
harvest mv audio ~/Sound/
To move all harvested files to a new destination folder (destination must already exist and be a writable directory):
harvest mv all ~/destination
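Because the destination must already exist and be writable, a wrapper script may want to check this before moving anything. The sketch below shows one way to do so in POSIX shell; the helper name ensure_dest and the demo path are hypothetical, and the actual harvest call is left commented out.

```shell
#!/bin/sh
# ensure_dest DIR: create DIR if it is missing, then succeed only if DIR
# is a directory that the current user can write to.
ensure_dest() {
    mkdir -p -- "$1" && [ -d "$1" ] && [ -w "$1" ]
}

# Hypothetical destination used only for this demonstration
dest="${TMPDIR:-/tmp}/harvest-demo-dest.$$"

if ensure_dest "$dest"; then
    echo "ready: $dest"
    # harvest mv all "$dest"   # the actual move, not executed in this sketch
else
    echo "cannot use $dest as a destination" >&2
fi
```

The check is deliberately conservative: if the path exists but is a regular file, mkdir fails and nothing is moved.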
So, for instance, a simple script using harvest to move all downloaded audio and video files into different home folders would look like:
#!/bin/sh
harvest ~/Downloads
harvest mv video ~/Video
harvest mv audio ~/Music
Or a short concatenation of commands that will delete all harvested code files and directories:
# -f4- keeps commas inside paths; NUL separators keep paths with spaces intact
harvest ls code | cut -d, -f4- | tr '\n' '\0' | xargs -0 rm -rf
Or a Zsh script to move all files into "Archive/YEAR" to distribute files according to the year in which they were created:
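#!/usr/bin/env zsh
# Iterate over each CSV line printed by harvest ls
for i in ${(f)"$(harvest ls)"}; do
    year=${i[(ws:,:)3]}   # third comma-separated field: year
    file=${i[(ws:,:)4]}   # fourth comma-separated field: path
    mkdir -p ~/Archive/$year
    mv $file ~/Archive/$year/
done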
In the previous script one can group by file type instead of year by changing year=${i[(ws:,:)3]} into type=${i[(ws:,:)2]} and using $type in place of $year in the mkdir and mv lines.
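For shells other than Zsh, the same field splitting can be approximated with the read builtin. The sketch below is a dry run under that assumption: it only prints the moves it would perform, uses made-up input lines in place of harvest ls, and the function name plan_archive is hypothetical.

```shell
#!/bin/sh
# plan_archive BASE: read "[dir|file],TYPE,year,path" lines on stdin and print,
# without executing anything, the move each hit would get under BASE/YEAR/.
plan_archive() {
    while IFS=, read -r kind type year path; do
        [ -n "$path" ] || continue   # skip blank or malformed lines
        printf 'mv -- "%s" "%s/%s/"\n' "$path" "$1" "$year"
    done
}

# Made-up sample input standing in for the output of: harvest ls
printf '%s\n' \
    'file,audio,2014,/home/user/Downloads/track.mp3' \
    'dir,video,2016,/home/user/Downloads/Some Film' |
    plan_archive "$HOME/Archive"
```

Printing "$type" instead of "$year" in the destination groups the hits by file type. The printed lines are meant for review only, since paths containing quote characters would not be safe to paste back into a shell.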
:heart_eyes: Acknowledgements
Harvest is Copyright (C) 2014-2018 by the Dyne.org Foundation
Harvest is designed, written and maintained by Denis "Jaromil" Roio
This source code is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
This source code is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. Please refer to the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this source code; if not, write to: Free Software Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.