Dirkeeper - a small utility to keep directories clean and move files around
Raise your hand if this has happened to you too: an application that has been running on a server for years ends up filling the whole disk and blocking the entire server.
Too many files or no more space on disk
This is a pretty common problem with applications that generate output on disk: in one way or another, they also leave temporary files or logs spread across one or more directories. Sometimes you don't notice until it's too late, when you already have a directory with tens of thousands of small or empty files, or no space left on disk.
These are two faces of the same problem: a directory with too many files becomes impossible to navigate, and a full disk causes weird application behavior and crashes, sometimes even making it impossible to log into the server.
When a new file is generated, copy it over there
Another common request I get, still related to files generated by applications, is: "as soon as a new file is generated, please copy it over there so that other application can process it". And not only that: if the file is named this way, move it here; if it is named that other way, move it there; and so on...
To handle these kinds of requests without having to check all of the servers by hand periodically, and to play around with file handling in Go, I created Dirkeeper, a small command-line utility that can be used directly or inside scripts to automate these kinds of tasks.
Dirkeeper
The tool is quite simple at the moment and has a few commands:
    Directory management utilities

    Usage:
      dirkeeper [command]

    Available Commands:
      cleanold    clean old files
      completion  Generate the autocompletion script for the specified shell
      help        Help about any command
      match       match and process files
      watch       watch for new files and process them based on config rules

    Flags:
      -h, --help   help for dirkeeper

    Use "dirkeeper [command] --help" for more information about a command.
cleanold - clean old files
    clean old files

    Usage:
      dirkeeper cleanold [flags]

    Flags:
      -d, --directory strings   List of directories to cleanup
          --dry-run             Only check for old files without deleting
      -h, --help                help for cleanold
          --max-age int         Maximum age of the file in days
The cleanold command is pretty simple: give it a comma-separated list of directories and the maximum number of days since the file creation, and dirkeeper will delete all the files older than the given age.
You can also specify the --dry-run flag to only list the matching files without deleting them.
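The cleanup described above can be sketched in a few lines of Go. This is only an illustration and not Dirkeeper's actual code: the names isOlderThan, cleanOld and demo are hypothetical, and the sketch uses the modification time as the file age, because the Go standard library does not expose the creation time in a portable way.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// isOlderThan reports whether a file last modified at modTime is more than
// maxAgeDays days old at the instant now.
func isOlderThan(modTime, now time.Time, maxAgeDays int) bool {
	return now.Sub(modTime) > time.Duration(maxAgeDays)*24*time.Hour
}

// cleanOld selects the regular files directly inside dir that are older than
// maxAgeDays and removes them unless dryRun is set. It returns the selected paths.
func cleanOld(dir string, maxAgeDays int, dryRun bool) ([]string, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, err
	}
	now := time.Now()
	var selected []string
	for _, entry := range entries {
		if !entry.Type().IsRegular() {
			continue // skip subdirectories, symlinks and special files
		}
		info, err := entry.Info()
		if err != nil {
			return selected, err
		}
		if !isOlderThan(info.ModTime(), now, maxAgeDays) {
			continue
		}
		path := filepath.Join(dir, entry.Name())
		selected = append(selected, path)
		if !dryRun {
			if err := os.Remove(path); err != nil {
				return selected, err
			}
		}
	}
	return selected, nil
}

// demo creates a temporary directory holding one fresh file and one file
// backdated by 10 days, runs cleanOld with a 7-day limit, and returns how
// many files were selected and how many are left on disk.
func demo(dryRun bool) (int, int, error) {
	dir, err := os.MkdirTemp("", "cleanold-demo")
	if err != nil {
		return 0, 0, err
	}
	defer os.RemoveAll(dir)

	for _, name := range []string{"old.log", "new.log"} {
		if err := os.WriteFile(filepath.Join(dir, name), []byte("x"), 0o644); err != nil {
			return 0, 0, err
		}
	}
	tenDaysAgo := time.Now().Add(-10 * 24 * time.Hour)
	if err := os.Chtimes(filepath.Join(dir, "old.log"), tenDaysAgo, tenDaysAgo); err != nil {
		return 0, 0, err
	}

	selected, err := cleanOld(dir, 7, dryRun)
	if err != nil {
		return 0, 0, err
	}
	left, err := os.ReadDir(dir)
	if err != nil {
		return 0, 0, err
	}
	return len(selected), len(left), nil
}

func main() {
	selected, left, err := demo(true)
	fmt.Println(selected, left, err)
}
```

In the dry run both files stay on disk and only the backdated one is reported; calling demo(false) removes it for real.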
match - execute a task on matching files
    match and process files

    Usage:
      dirkeeper match [flags]

    Flags:
      -a, --action string      Action to execute
          --dest-dir string    Destination directory
      -d, --directory string   Base directory
          --dry-run            Do not execute action
      -h, --help               help for match
          --max-age int        Min file age in minutes
          --pattern strings    List of file name patterns
          --prefix strings     List of file name prefixes
          --suffix strings     List of file name suffixes
The match command checks one or more directories for files matching a given pattern and executes an action when a match is found.
The list of directories can be specified with the -d or --directory flag and is mandatory.
The matching rules can be specified using one or more of the --pattern, --prefix and --suffix flags, where prefix and suffix accept a comma-separated list of file name prefixes and suffixes, while pattern accepts a comma-separated list of regular expressions that the file name has to match.
In addition to these rules, the --max-age flag can be specified to set the minimum age of the file, in minutes, before the action is executed. This is useful when a file takes several seconds to be generated, because it prevents the action from running on a partial file.
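As an illustration of how such rules could be evaluated, here is a minimal Go sketch. It is not taken from Dirkeeper's source: matchesAny and oldEnough are hypothetical names, a file is assumed to match as soon as any single prefix, suffix or pattern matches, and regular expressions are assumed to be unanchored (they may match anywhere in the file name).

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
	"time"
)

// matchesAny reports whether name satisfies at least one prefix, suffix or
// regular expression. An invalid regular expression is returned as an error.
func matchesAny(name string, prefixes, suffixes, patterns []string) (bool, error) {
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true, nil
		}
	}
	for _, s := range suffixes {
		if strings.HasSuffix(name, s) {
			return true, nil
		}
	}
	for _, expr := range patterns {
		re, err := regexp.Compile(expr)
		if err != nil {
			return false, fmt.Errorf("invalid pattern %q: %w", expr, err)
		}
		if re.MatchString(name) {
			return true, nil
		}
	}
	return false, nil
}

// oldEnough reports whether a file last modified at modTime has reached the
// minimum age, so that a file still being written is left alone.
func oldEnough(modTime, now time.Time, minAge time.Duration) bool {
	return now.Sub(modTime) >= minAge
}

func main() {
	now := time.Now()
	modTime := now.Add(-5 * time.Minute) // pretend the file was written 5 minutes ago

	ok, err := matchesAny("report-2024.csv", []string{"report-"}, []string{".tmp"}, nil)
	fmt.Println(ok, err, oldEnough(modTime, now, 2*time.Minute))
}
```

Note that with this approach an empty prefix or suffix matches every file name, so a real tool would probably want to reject or skip empty values.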
When one of the matching rules is triggered, the action specified with the -a or --action flag is executed. Valid actions are copy, move, delete or copy-delete. The latter is useful when the system does not allow moving the file directly, for example when working on remotely mounted directories.
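To make the copy-delete idea concrete, here is a hedged Go sketch; copyFile, copyDelete and demo are hypothetical names, not Dirkeeper's real functions. In Go, a plain os.Rename typically fails when source and destination are on different file systems, so the sketch copies the content first and removes the source only after the copy has been written and closed successfully.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// copyFile copies the content of src into dst, creating or truncating dst.
func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		os.Remove(dst) // best effort: do not leave a partial copy behind
		return err
	}
	return out.Close()
}

// copyDelete copies src into dstDir under the same base name and deletes
// src only once the copy has succeeded. It returns the destination path.
func copyDelete(src, dstDir string) (string, error) {
	dst := filepath.Join(dstDir, filepath.Base(src))
	if err := copyFile(src, dst); err != nil {
		return "", err
	}
	if err := os.Remove(src); err != nil {
		return dst, err
	}
	return dst, nil
}

// demo moves a small file between two temporary directories and reports
// whether the source is gone and what the destination contains.
func demo() (bool, string, error) {
	srcDir, err := os.MkdirTemp("", "copydelete-src")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(srcDir)
	dstDir, err := os.MkdirTemp("", "copydelete-dst")
	if err != nil {
		return false, "", err
	}
	defer os.RemoveAll(dstDir)

	src := filepath.Join(srcDir, "data.txt")
	if err := os.WriteFile(src, []byte("hello"), 0o644); err != nil {
		return false, "", err
	}
	dst, err := copyDelete(src, dstDir)
	if err != nil {
		return false, "", err
	}
	content, err := os.ReadFile(dst)
	if err != nil {
		return false, "", err
	}
	_, statErr := os.Stat(src)
	return os.IsNotExist(statErr), string(content), nil
}

func main() {
	fmt.Println(demo())
}
```

Deleting the source as the very last step means that a failure halfway through leaves the original file untouched.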
watch - keep watching for new files and execute tasks on matching rule
    watch for new files and process them based on config rules

    Usage:
      dirkeeper watch [flags]

    Flags:
      -c, --config string   Config file
          --debug           Enable debug log
          --frequency int   Watch frequency in seconds (default 10)
      -h, --help            help for watch
The watch command is similar to the match one, but runs periodically with a frequency in seconds specified by the --frequency flag.
The other main difference is that the watch command takes a config file as input, so that it's not required to specify all the parameters on the command line.
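A periodic scan like this can be sketched in Go with a time.Ticker. This is an assumption about one possible implementation, not a description of Dirkeeper's internals; watchLoop and demo are hypothetical names, and the scan callback stands in for "check the directories and apply the rules".

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// watchLoop calls scan immediately and then once per interval, until the
// context is cancelled.
func watchLoop(ctx context.Context, every time.Duration, scan func()) {
	ticker := time.NewTicker(every)
	defer ticker.Stop()
	for {
		if ctx.Err() != nil {
			return // stop before starting another scan
		}
		scan()
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			// next iteration runs the next scan
		}
	}
}

// demo runs the loop with a short interval and cancels it after n scans,
// returning how many scans were actually executed.
func demo(n int) int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()

	count := 0
	watchLoop(ctx, 10*time.Millisecond, func() {
		count++
		if count == n {
			cancel()
		}
	})
	return count
}

func main() {
	fmt.Println(demo(3))
}
```

In a real command the interval would come from the --frequency flag, and the context would typically be cancelled when the process receives an interrupt signal.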
The config file is in YAML format; here is an example:
    watch:
      # Dry run indicates if the action should be executed or only logged
      dryRun: false
      # Can have a list of input directories to watch
      directories:
        # The path of the directory to watch
        - name: "/test/input"
          # The list of rules to apply
          rules:
            # The action to execute for every matching file, can be copy, move or delete
            - action: "move"
              pattern:
                # The list of patterns to match, as regular expressions on file name
                - "RY59A.*"
              prefix:
                # The list of prefixes to match
              suffix:
                # The list of suffixes to match
              # The destination directory for the copy or move actions
              destination: "/tmp/test/outputA"
            - action: "delete"
              pattern:
                # Regular expression of file name
                - "RY59B.*"
              destination: "/tmp/test/outputB"
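For readers who want to picture how such a config maps onto Go types, here is a rough sketch. The struct names, the nesting and the yaml tags are assumptions derived from the keys in the example, not Dirkeeper's real types. The config is built by hand instead of being parsed, to keep the sketch free of third-party YAML libraries, and firstRule (a hypothetical helper) only looks at the pattern list and assumes that the first matching rule wins.

```go
package main

import (
	"fmt"
	"regexp"
)

// Rule mirrors one entry of the "rules" list in the example config.
type Rule struct {
	Action      string   `yaml:"action"`
	Pattern     []string `yaml:"pattern"`
	Prefix      []string `yaml:"prefix"`
	Suffix      []string `yaml:"suffix"`
	Destination string   `yaml:"destination"`
}

// Directory mirrors one watched directory and its rules.
type Directory struct {
	Name  string `yaml:"name"`
	Rules []Rule `yaml:"rules"`
}

// Watch mirrors the content of the top-level "watch" key.
type Watch struct {
	DryRun      bool        `yaml:"dryRun"`
	Directories []Directory `yaml:"directories"`
}

// Config is the root of the config file.
type Config struct {
	Watch Watch `yaml:"watch"`
}

// exampleConfig returns the same values as the YAML example, written by hand.
func exampleConfig() Config {
	return Config{Watch: Watch{
		DryRun: false,
		Directories: []Directory{{
			Name: "/test/input",
			Rules: []Rule{
				{Action: "move", Pattern: []string{"RY59A.*"}, Destination: "/tmp/test/outputA"},
				{Action: "delete", Pattern: []string{"RY59B.*"}, Destination: "/tmp/test/outputB"},
			},
		}},
	}}
}

// firstRule returns the first rule of dir with a pattern matching name.
// The boolean result is false when no rule matches.
func firstRule(dir Directory, name string) (Rule, bool, error) {
	for _, rule := range dir.Rules {
		for _, expr := range rule.Pattern {
			re, err := regexp.Compile(expr)
			if err != nil {
				return Rule{}, false, err
			}
			if re.MatchString(name) {
				return rule, true, nil
			}
		}
	}
	return Rule{}, false, nil
}

func main() {
	cfg := exampleConfig()
	rule, ok, err := firstRule(cfg.Watch.Directories[0], "RY59A001.txt")
	fmt.Println(rule.Action, rule.Destination, ok, err)
}
```

With these values, a file named RY59A001.txt is routed to the move rule and a file named RY59B001.txt to the delete rule, while any other name matches nothing.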
Contributions and suggestions are welcome
I created this little utility for my customers, but it is open source and available on GitHub for anyone to use and extend.
If you have ideas for additional commands, or if you find bugs, you can open an issue directly on GitHub or contact me via Twitter @fabiomarininet.